The amount of data in the world is doubling every two years, according to a 2014 estimate by EMC, but data storage technologies are having a hard time keeping up. That's led some researchers to look for a solution in biology rather than computers.
DNA, the molecule that stores genetic information in living cells, has been studied for years as a possible medium for storing human-generated data. And this week, researchers in New York unveiled a new technique that allows DNA to store more data than ever before.
With the help of a California startup that makes synthetic DNA, researchers Yaniv Erlich and Dina Zielinski of Columbia University and the New York Genome Center were able to transfer a complete computer operating system, a very old movie, a $50 Amazon gift card and more into a tiny speck of DNA. That achieves a data storage density of 215 petabytes per gram. In theory, that means it would be possible to store all the information humans have ever generated in the space of a single room.
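To put the 215-petabytes-per-gram figure in perspective, here is a rough back-of-the-envelope sketch. The density is the number reported above; the one-zettabyte archive is a purely hypothetical input chosen for illustration, not an estimate of how much data humanity has produced.

```python
# Density reported for the experiment, in petabytes per gram of DNA.
DENSITY_PB_PER_GRAM = 215

def grams_needed(petabytes: float) -> float:
    """Return the mass of DNA (in grams) needed to hold the given data volume."""
    return petabytes / DENSITY_PB_PER_GRAM

# Hypothetical archive size: 1 zettabyte = 1,000,000 petabytes.
hypothetical_archive_pb = 1_000_000
print(f"{grams_needed(hypothetical_archive_pb):,.0f} grams")
```

Under that assumption, the archive would need a few kilograms of DNA, ignoring any packaging, redundancy or handling overhead.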
A Million Times More Data-Dense
"DNA has several advantages to store information," Erlich said in a video provided by Columbia University. "The first thing is that it's very compact. In effect, it's about one million times more compact than what you can get when you use a regular digital media."
The technique developed by Erlich and Zielinski uses a "DNA fountain" algorithm that randomly packs binary code into data droplets that can be mapped onto the building blocks of DNA: the nucleotides C, G, A and T. Each data droplet was also marked with a barcode to help with decoding the DNA back into the original digital information.
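The idea can be illustrated with a minimal sketch. This is not the researchers' actual code: the two-bits-per-nucleotide mapping, the two-byte barcode, the uniform choice of how many segments to combine, and all function names are assumptions made for the example. Each "droplet" here is the XOR of a pseudo-randomly chosen set of data segments, prefixed with the seed that selected them, which plays the role of the barcode.

```python
import random

# Assumed mapping of two bits to one nucleotide (illustrative only).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def bytes_to_dna(data: bytes) -> str:
    """Translate bytes into a strand of A/C/G/T, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def dna_to_bytes(strand: str) -> bytes:
    """Reverse of bytes_to_dna."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def make_droplet(segments: list, seed: int) -> str:
    """Build one droplet from equal-length segments; seed must fit in two bytes."""
    rng = random.Random(seed)
    # Assumption: pick the number of segments uniformly at random.
    count = rng.randint(1, len(segments))
    chosen = rng.sample(range(len(segments)), count)
    payload = bytearray(len(segments[0]))
    for index in chosen:
        for i, byte in enumerate(segments[index]):
            payload[i] ^= byte
    barcode = seed.to_bytes(2, "big")  # hypothetical two-byte barcode
    return bytes_to_dna(barcode + bytes(payload))

def read_barcode(strand: str) -> int:
    """Recover the seed from the first eight bases (two bytes) of a droplet."""
    return int.from_bytes(dna_to_bytes(strand[:8]), "big")
```

For example, `make_droplet([b"ab", b"cd"], seed=7)` yields a 16-base strand whose first eight bases decode back to the seed 7. A decoder that knows the seed can regenerate the same random choice of segments, which is what makes it possible to reassemble the original data from enough droplets.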
Erlich and Zielinski used this technique to pack a large amount of digital data into a small drop of synthetic DNA produced by a California-based startup called Twist Bioscience. They were then able to retrieve all the original information without a single error from that drop of DNA.
Encoded in that one drop were a computer operating system; a French movie from 1895 called "Arrival of a Train at La Ciotat"; a $50 Amazon gift card; a computer virus; a plaque like the ones carried on the Pioneer spacecraft launched in 1972 and 1973; and a study written by information theorist Claude Shannon in 1948.
Compact, Long-Lasting and Never Obsolete
In addition to being able to store vast amounts of information in a tiny space, DNA offers several other advantages for data storage, Erlich said. Because it's a liquid, it can take any shape. And it doesn't become obsolete the way cassette tapes and CDs have, he said.
DNA also has the advantage of being both long-lasting and robust. Even after thousands of years, DNA can retain all its original information intact, as was recently demonstrated by scientists who were able to analyze genetic data found in a 45,000-year-old woolly mammoth specimen.
While Erlich and Zielinski were able to beat the DNA data storage density record set just last year by researchers at Microsoft and the University of Washington, Erlich warned this doesn't mean that today's massive data centers will be replaced by test tubes of synthetic DNA anytime soon. Cost, for example, remains a barrier. It cost $7,000 to create the synthetic DNA with their encoded data, and $2,000 more to extract that information back into the original formats.
"My thoughts are that basically that it will take more time -- 5 to 7 years is too short," Erlich said recently in an interview with IEEE Spectrum. "There are still more challenges to solve here."
Posted: 2017-03-04 @ 4:24pm PT
Most of the information that is being collected is "write once" except for random occasional reads to see if it's there at all. Even data mining won't read most of it. And even if it does it isn't worth it. Most becomes meaningless when its subject dies. So why keep it?
Fine work on the DNA for a more permanent record, however, of what is kept.