ABSTRACT
Antiquity is a wide-area distributed storage system designed to provide a simple storage service for applications like file systems and back-up. The design assumes that all servers eventually fail and attempts to maintain data despite those failures. Antiquity uses a secure log to maintain data integrity, replicates each log on multiple servers for durability, and uses dynamic Byzantine fault-tolerant quorum protocols to ensure consistency among replicas. We present Antiquity's design and an experimental evaluation with global and local testbeds. Antiquity has been running for over two months on 400+ PlanetLab servers storing nearly 20,000 logs totaling more than 84 GB of data. Despite constant server churn, all logs remain durable.
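The abstract does not spell out how the secure log is built. As a rough illustration of how an append-only log can protect data integrity, the sketch below assumes a plain hash chain in which each entry's digest covers the previous digest. The names `SecureLog`, `GENESIS`, and `verify` are hypothetical and are not Antiquity's API; replication, quorum protocols, and signing of the log head are left out.

```python
import hashlib

# Hypothetical starting value for the chain (an assumption, not from the paper).
GENESIS = hashlib.sha256(b"").digest()


class SecureLog:
    """Append-only log in which each digest covers the previous digest."""

    def __init__(self):
        self.entries = []      # list of (data, digest) pairs
        self.head = GENESIS    # digest of the most recent entry

    def append(self, data: bytes) -> bytes:
        # Chaining: the new digest depends on the entire history so far.
        digest = hashlib.sha256(self.head + data).digest()
        self.entries.append((data, digest))
        self.head = digest
        return digest


def verify(entries, expected_head: bytes) -> bool:
    """Recompute the chain and compare it with a trusted head digest."""
    prev = GENESIS
    for data, digest in entries:
        if hashlib.sha256(prev + data).digest() != digest:
            return False
        prev = digest
    return prev == expected_head


log = SecureLog()
log.append(b"block 1")
log.append(b"block 2")
ok = verify(log.entries, log.head)

# A replica that alters an earlier entry no longer matches the trusted head.
tampered = [(b"block X", log.entries[0][1]), log.entries[1]]
bad = verify(tampered, log.head)
```

Under these assumptions, a client that holds only the trusted head digest can check any copy of the log returned by an untrusted server.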
Antiquity: exploiting a secure log for wide-area distributed storage
EuroSys'07 Conference Proceedings