Keeping bits safe: how hard can it be?

Published: 01 November 2010

Abstract

As storage systems grow larger and larger, protecting their data for long-term storage is becoming ever more challenging.



Published in

Communications of the ACM, Volume 53, Issue 11
November 2010
112 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/1839676

Copyright © 2010 Copyright is held by the owner/author(s)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• research-article
• Popular
• Refereed
