ABSTRACT
Demand for data storage is growing exponentially, but the capacity of existing storage media is not keeping up. Using DNA to archive data is an attractive possibility because it is extremely dense, with a raw limit of 1 exabyte/mm³ (10⁹ GB/mm³), and long-lasting, with an observed half-life of over 500 years. This paper presents an architecture for a DNA-based archival storage system. It is structured as a key-value store and leverages common biochemical techniques to provide random access. We also propose a new encoding scheme that offers controllable redundancy, trading off reliability for density. We demonstrate feasibility, random access, and robustness of the proposed encoding with wet lab experiments involving 151 kB of synthesized DNA and a 42 kB random-access subset, and with simulation experiments of larger sets calibrated to the wet lab experiments. Finally, we highlight trends in biotechnology that indicate the impending practicality of DNA storage for much larger datasets.
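The two headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal (SI) units and a simple exponential decay model, and the function names are hypothetical rather than taken from the paper. Because the stated half-life is "over 500 years", the decay estimate is a conservative lower bound.

```python
# Assumption: decimal (SI) units, 1 EB = 10**18 bytes and 1 GB = 10**9 bytes.
EXABYTE = 10**18
GIGABYTE = 10**9


def exabytes_to_gigabytes(exabytes):
    """Convert exabytes to gigabytes under the SI assumption above."""
    return exabytes * EXABYTE // GIGABYTE


def fraction_remaining(years, half_life_years=500.0):
    """Fraction of material left after `years`, assuming plain exponential
    decay (an illustrative model, not the paper's own analysis)."""
    return 0.5 ** (years / half_life_years)


if __name__ == "__main__":
    # 1 EB/mm^3 expressed in GB/mm^3
    print(exabytes_to_gigabytes(1))
    # Fraction remaining after one and two half-lives
    print(fraction_remaining(500))
    print(fraction_remaining(1000))
```

Under these assumptions, 1 exabyte corresponds to 10⁹ GB, and at least half of the material would remain after 500 years.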
A DNA-Based Archival Storage System
ASPLOS '16