A Practical Approach to Ultra Long-Term Data Storage on DNA

Some of the oldest information our society has access to is the genetic information of ancient species, recovered from the DNA of fossil bone (up to 300,000 years old). However, DNA in solution or in our living bodies is a relatively vulnerable molecule, sensitive to light, radicals, and even air (humidity). Inspired by the ancient samples, we developed a chemical synthesis method for the fabrication of “Synthetic Fossils”: a glass-like material in which DNA is encapsulated and as well protected as in the prehistoric specimens. Driven by current developments in medical diagnostics, the costs of DNA synthesis (writing) and sequencing (reading) are rapidly decreasing, allowing us to envision DNA as a high-density medium for long-term information storage. As errors still occur during writing, reading, and even storage, error correction methods have to be introduced. For this purpose we have implemented a concatenated Reed-Solomon coding scheme adapted to the particularities of DNA storage. Using a small prototype (ca. 100 kB of digital information) and simulating long-term storage by accelerated aging, we were able to demonstrate the longevity of the stored data. Besides presenting the methods used, I will also discuss the current state of DNA data storage, both in terms of opportunities and in terms of future development needs.
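To make the idea of writing digital information as DNA concrete, here is a minimal sketch of a naive mapping at two bits per nucleotide. This is an illustrative assumption, not the encoding used in the work described above: the base assignment (00 to A, 01 to C, 10 to G, 11 to T) and the function names `bytes_to_dna` and `dna_to_bytes` are hypothetical, and the sketch includes no error correction, which a practical scheme would layer on top.

```python
# Assumed (hypothetical) mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for any character other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    message = b"Hi"
    strand = bytes_to_dna(message)
    print(strand)
    print(dna_to_bytes(strand))
```

A mapping like this has no protection against synthesis, storage, or sequencing errors: a single substituted base silently corrupts a byte, which is why an error-correcting code is needed in practice.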

Date:
Speakers: Robert Grass
Affiliation: ETH Zurich

Series: Microsoft Research Talks