Forget hard drives and data centers, DNA is the future of data storage

Every day insane amounts of data are produced, and we're running out of storage solutions. Therefore, scientists are looking at DNA to store all of the content that's generated on the internet.
Rupendra Brahambhatt
Could DNA be the next big thing in data storage?


Within three years, the world will run out of data centers to store even half of the total data that internet users will have produced. This shocking claim was made by Tom De Greef, a professor of synthetic biology at Eindhoven University of Technology (TU/e). 

So does that mean the internet will crash by 2026? Not if tech companies start using synthetic DNA instead of hard drives to store their data. It may sound far-fetched, but according to De Greef and his team, DNA strands can store large amounts of digital data and have several advantages over modern-day data centers.

In their recently published study, the TU/e team proposed a technique that promises to make DNA data storage practical, scalable, and highly efficient. 

Can storing data on DNA really be done?

DNA is best known for storing an organism’s genetic information, which shapes much of its physical and mental makeup. Interestingly, this isn’t the first time scientists have proposed using synthetic DNA (DNA made in a lab) to store digital data.

The idea was first proposed in the 1980s, and about 30 years later, in 2012, a group of scientists at Harvard University stored a 52,000-word book on pieces of DNA. They reported a storage density of roughly 1.3 petabytes (about 1.3 million GB) per gram of DNA.

However, their approach had several limitations, which is why DNA storage didn’t emerge as a mainstream data storage solution. Since then, many other scientists have tried to make the process efficient and scalable, but some challenges remain unresolved.

For instance, encoding data on DNA and then extracting it again requires high-tech resources, making it a very expensive data storage solution. Moreover, the current method used for reading data from DNA, which relies on the polymerase chain reaction (PCR), can introduce errors and may also damage the DNA strand on which the data is stored.

Here is how PCR-based reading works when you need a certain file stored on the DNA: “A matching primer is used to search the goop and attach to the required DNA strand. This DNA is then copied millions of times so that the system can find it and read the file,” the study authors note.

The problem is that every time you read the data, the reading process degrades the original DNA. “It becomes difficult to read multiple files at once. Also, if you want to read multiple files simultaneously, you need multiple primer pairs doing their work at the same time. This creates many errors in the copying process,” they added.
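The primer lookup described above can be pictured with a toy model. The Python sketch below is only an illustration, not the method used in the study: it assumes a simple mapping of two bits per base, uses made-up primer sequences as file addresses, and treats a primer as a prefix tag rather than a flanking primer pair. It also skips the copying step and the errors that come with it.

```python
# Toy model of primer-based random access in a pooled DNA archive.
# All names and sequences here are hypothetical, chosen for illustration.

BASES = "ACGT"  # two bits per base: A=00, C=01, G=10, T=11 (an assumed mapping)


def encode(data: bytes) -> str:
    """Map each byte to four bases, two bits per base."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): read four bases back into one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def read_file(pool: list[str], primer: str) -> bytes:
    """Select the strands that start with the primer and decode their payload."""
    hits = [s for s in pool if s.startswith(primer)]
    return b"".join(decode(s[len(primer):]) for s in hits)


# Two files share one pool; each is tagged with its own primer sequence.
PRIMER_A = "ACGTACGTAC"
PRIMER_B = "TTGGCCAATT"
pool = [PRIMER_A + encode(b"hello"), PRIMER_B + encode(b"world")]

print(read_file(pool, PRIMER_A))
```

In this simplified picture every file sits in the same pool, which hints at why reading several files at once gets messy: each extra file needs its own primer working in the same mixture.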

Using DNA microcapsules for safer and better data storage

Color-coded microcapsules.

In their study, De Greef and his team propose a modified method called thermoconfined PCR. This new technique promises to overcome the challenges scientists face with regular PCR. The authors created microcapsules made of proteins and polymers that act as compartmentalized DNA files.

Each microcapsule is linked to one data file. The capsules are designed so that when they are heated above 122°F (50°C), each one seals itself.

This enables the PCR process to occur independently in each capsule, leaving very little room for error. When the temperature normalizes, the original file remains linked to the capsule, but the copied files leave it. This saves the original data and DNA from deterioration.

By contrast, in the standard non-compartmentalized DNA storage method, PCR doesn’t occur separately for each file, which leads to data degradation and copying errors.

The study authors used their novel approach to simultaneously read 25 files stored on compartmentalized DNA. The results were positive and surprising. Commenting on the results, De Greef said, “We currently stand at a loss of 0.3 percent after three reads, compared to 35 percent with the existing method.”
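To get a feel for what those two percentages mean over many reads, here is a back-of-the-envelope sketch in Python. It rests on an assumption the article does not make: that "loss" is the fraction of the original material that can no longer be recovered, and that the same fraction is lost on every read. The function names are hypothetical.

```python
# Back-of-the-envelope: what the quoted loss figures imply per read,
# ASSUMING the same fraction is lost on every read (not stated in the article).
import math


def per_read_retention(total_loss: float, reads: int) -> float:
    """Fraction of material kept per read, given total loss after `reads` reads."""
    return (1.0 - total_loss) ** (1.0 / reads)


def reads_until_half(total_loss: float, reads: int) -> float:
    """Number of reads after which half the material would remain."""
    return math.log(0.5) / math.log(per_read_retention(total_loss, reads))


capsule = reads_until_half(0.003, 3)   # 0.3 percent loss after three reads
standard = reads_until_half(0.35, 3)   # 35 percent loss after three reads
print(f"microcapsules: ~{capsule:.0f} reads, standard PCR: ~{standard:.1f} reads")
```

Under that constant-loss assumption, the microcapsule approach would allow hundreds of reads before half the material is gone, while the existing method would get there in about five.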

The study suggests that the capsules can also be color-coded using fluorescent labels. This would make it easy to sort and search information in large DNA data libraries. “Now it’s just a matter of waiting until the costs of DNA synthesis fall further. The technique will then be ready for application,” De Greef added.

Benefits of DNA storage over data centers

Humans generate over 2 × 10^18 bytes of data every day. Every post, Google search, blog, and upload creates new data, and all of it is stored in data centers located around the globe.

Data centers are large facilities that house numerous servers. They occupy large areas of land and constantly consume large amounts of electricity. The study authors claim that DNA can reduce both the land and the power required to store data.

This is because DNA offers high storage density; one gram of DNA would be enough to store 215 million GB of data. According to one estimate, all the data humanity has produced so far could fit in DNA stored in a single room.
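To put the density figure in perspective, here is a quick calculation using only the numbers quoted in this article: about 2 × 10^18 bytes generated per day and 215 million GB per gram. It assumes decimal gigabytes (10^9 bytes) and ignores the redundancy, addressing, and error-correction overhead a real archive would need, so treat the result as a rough lower bound.

```python
# Rough arithmetic using the figures quoted in the article (decimal units assumed).
GB = 10**9                       # bytes per gigabyte
BYTES_PER_GRAM = 215e6 * GB      # 215 million GB per gram of DNA
DAILY_BYTES = 2e18               # data generated per day

grams_per_day = DAILY_BYTES / BYTES_PER_GRAM
grams_per_year = grams_per_day * 365

print(f"{grams_per_day:.1f} g of DNA per day, {grams_per_year / 1000:.1f} kg per year")
```

Under those assumptions, a full day of the world's data works out to less than 10 grams of DNA.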

Moreover, while current data servers need to be upgraded over time and can preserve data for only 10, 20, or 50 years, DNA can keep your content intact for millions of years. Also, DNA can be kept in facilities that require significantly less power than data servers.

All these factors make DNA storage a very promising technology. Hopefully, further research will bring down the cost of DNA synthesis and take this approach mainstream.  

The study was published in the journal Nature Nanotechnology.

Study Abstract:

DNA has emerged as an attractive medium for archival data storage due to its durability and high information density. Scalable parallel random access to information is a desirable property of any storage system. For DNA-based storage systems, however, this still needs to be robustly established. Here we report on a thermoconfined polymerase chain reaction, which enables multiplexed, repeated random access to compartmentalized DNA files. The strategy is based on localizing biotin-functionalized oligonucleotides inside thermoresponsive, semipermeable microcapsules. At low temperatures, microcapsules are permeable to enzymes, primers and amplified products, whereas at high temperatures, membrane collapse prevents molecular crosstalk during amplification. Our data show that the platform outperforms non-compartmentalized DNA storage compared with repeated random access and reduces amplification bias tenfold during multiplex polymerase chain reaction. Using fluorescent sorting, we also demonstrate sample pooling and data retrieval by microcapsule barcoding. Therefore, the thermoresponsive microcapsule technology offers a scalable, sequence-agnostic approach for repeated random access to archival DNA files.