DOI: 10.1145/1066157.1066203
Article

Fossilized index: the linchpin of trustworthy non-alterable electronic records

Published: 14 June 2005

ABSTRACT

As critical records are increasingly stored in electronic form, which tends to make for easy destruction and clandestine modification, it is imperative that they be properly managed to preserve their trustworthiness, i.e., their ability to provide irrefutable proof and accurate details of events that have occurred. The need for proper record keeping is further underscored by the recent corporate misconduct and ensuing attempts to destroy incriminating records. Currently, the industry practice and regulatory requirements (e.g., SEC Rule 17a-4) rely on storing records in WORM storage to immutably preserve the records. In this paper, we contend that simply storing records in WORM storage is increasingly inadequate to ensure that they are trustworthy. Specifically, with the large volume of records that are typical today, meeting the ever more stringent query response time requires the use of direct access mechanisms such as indexes. Relying on indexes for accessing records could, however, provide a means for effectively altering or deleting records, even those stored in WORM storage.

In this paper, we establish the key requirements for a fossilized index that protects the records from such logical modification. We also analyze current indexing methods to determine how they fall short of these requirements. Based on our insights, we propose the Generalized Hash Tree (GHT). Using both theoretical analysis and simulations with real system data, we demonstrate that the GHT can satisfy the requirements of a fossilized index with performance and cost that are comparable to regular indexing techniques such as the B-tree.
We further note that as records are indexed on multiple fields to facilitate search and retrieval, the records can be reconstructed from the corresponding index entries even after the records expire and are disposed of. Therefore, we also present a novel method to eliminate this disclosure risk by allowing an index entry to be effectively disposed of when its record expires.
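
The abstract does not describe the internals of the GHT, so the following Python sketch is not the authors' structure. It is a minimal illustration, under stated assumptions, of the general idea of an index that resists logical modification: every slot is write-once, and the slots probed for a key are fixed by hashing the key, so a later insertion cannot displace or hide an entry that is already committed. The names `WormSlots` and `WriteOnceHashIndex`, the SHA-256 hash, and the level sizes are all hypothetical choices made for illustration.

```python
import hashlib


class WormSlots:
    """Hypothetical write-once store: each slot accepts exactly one write."""

    def __init__(self):
        self._slots = {}

    def write(self, address, value):
        if address in self._slots:
            raise PermissionError(f"slot {address} is already committed")
        self._slots[address] = value

    def read(self, address):
        return self._slots.get(address)


class WriteOnceHashIndex:
    """Illustrative index whose probe path for a key is fixed by hashing.

    Level i holds base_size * growth**i slots (assumed sizes). An insert
    commits to the first empty slot on the key's path, and nothing is
    ever moved or overwritten afterwards.
    """

    def __init__(self, store, base_size=4, growth=2, max_levels=16):
        self.store = store
        self.base_size = base_size
        self.growth = growth
        self.max_levels = max_levels

    def _slot(self, key, level):
        # Mix the level into the hash so each level probes independently.
        size = self.base_size * self.growth ** level
        digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
        return (level, int.from_bytes(digest[:8], "big") % size)

    def insert(self, key, record_id):
        for level in range(self.max_levels):
            address = self._slot(key, level)
            if self.store.read(address) is None:
                self.store.write(address, (key, record_id))
                return address
        raise RuntimeError("no free slot within max_levels")

    def lookup(self, key):
        # Walk the same fixed path; the first empty slot ends the search,
        # because inserts always fill the earliest empty slot on the path.
        matches = []
        for level in range(self.max_levels):
            entry = self.store.read(self._slot(key, level))
            if entry is None:
                break
            if entry[0] == key:
                matches.append(entry[1])
        return matches
```

In this sketch, inserting a second entry under an existing key only adds a slot further along that key's path, so `lookup` still returns the original entry first, and any attempt to rewrite a committed slot raises `PermissionError`. Disposal of expired index entries, which the abstract also mentions, is outside the scope of this sketch.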

      • Published in

        SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data
        June 2005
        990 pages
        ISBN: 1595930604
        DOI: 10.1145/1066157
        • Conference Chair: Fatma Ozcan

        Copyright © 2005 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 785 of 4,003 submissions, 20%
