article

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

Authors:
Antony Rowstron

Microsoft Research, Cambridge, United Kingdom

Microsoft Research, Cambridge, United Kingdom
View Profile

,
Peter Druschel

Rice University, Houston, TX

Rice University, Houston, TX
View Profile

Authors Info & Claims

ACM SIGOPS Operating Systems Review Volume 35 Issue 5Dec. 2001pp 188–201https://doi.org/10.1145/502059.502053

Published:21 October 2001Publication History

ACM SIGOPS Operating Systems Review

Abstract

This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internet-based overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache additional copies of popular files.In the PAST system, storage nodes and files are each assigned uniformly distributed identifiers, and replicas of a file are stored at nodes whose identifier matches most closely the file's identifier. This statistical assignment of files to storage nodes approximately balances the number of files stored on each node. However, non-uniform storage node capacities and file sizes require more explicit storage load balancing to permit graceful behavior under high global storage utilization; likewise, non-uniform popularity of files requires caching to minimize fetch distance and to balance the query load.We present and evaluate PAST, with an emphasis on its storage management and caching system. Extensive trace-driven experiments show that the system minimizes fetch distance, that it balances the query load for popular files, and that it displays graceful degradation of performance as the global storage utilization increases beyond 95%.

References

1 Napster. http://www.napster.com/.]]Google Scholar
2 The Gnutella protocol specification, 2000. http: / / dss.clip2.com/GnuteUaProtocolO4.pdf.]]Google Scholar
3 W. Adjie-Winoto, E. Schwartz, H. Baiakrishnan, and J. Lilley. The design and implementation of an intentional naming system. In Proc. SOSP'99, Kiawah Island, SC, Dec. 1999.]] Google ScholarDigital Library
4 Y. Amir, A. Peterson, and D. Shaw. Seamlessly selecting the best copy from Internet-wide replicated web servers. In Proc. 12th Symposium on Distributed Computing, Andros, Greece, Sept. 1998.]] Google ScholarDigital Library
5 R. Anderson. The Eternity service. In Proc. PRAGOCRYPT'96, pages 242-252. CTU Publishing House, 1996. Prague, Czech Republic.]]Google Scholar
6 T. Anderson, M. Dahlin, J. Neefe, D. Patterson, D. RoseUi, and R. Wang. Serverless network file systems. In Proe. 15th A CM SOSP, Copper Mountain, CO, Dec. 1995.]] Google ScholarDigital Library
7 F. Bennett, D. Clarke, J. B. Evans, A. Hopper, A. Jones, and D. Leask. Piconet - embedded mobile networking. IBEE Personal Communications, 4(5):8-15, October 1997.]]Google ScholarCross Ref
8 W. J. Bolosky, J. R. Douceur, D. Ely, and M. Theimer. Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs. In Proc. SIGMBTRICS'2000, Santa Clara, CA, 2000.]] Google ScholarDigital Library
9 M. Bowman, L. L. Peterson, and A. Yeatts. Univers: An attribute-based name server. Software -- Practice and Experience, 20(4):403-424, Apr. 1990.]] Google ScholarDigital Library
10 L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In Proc. IEEE Infoeom'g9, New York, NY, Mar. 1999.]]Google ScholarCross Ref
11 P. Cao and S. Irani. Cost-aware WWW proxy caching algorithms. In Proc. USENIX Symposium on Internet Technologies and Systems (USITS), Monterey, CA, Dec. 1997.]] Google ScholarDigital Library
12 D. R. Cheriton and T. P. Mann. Decentralizing a global naming service for improved performance and fault tolerance. ACM Trans. Comput. Syst., 7(2):147-183, May 1989.]] Google ScholarDigital Library
13 I. Clarke, O. Sandberg, B. Wiley, and T. W. Hong. Freenet: A distributed anonymous information storage and retrieval system. In Workshop on Design Issues in Anonymity and Unobservability, pages 311-320, July 2000. ICSI, Berkeley, CA, USA.]] Google ScholarDigital Library
14 F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Wide-area cooperative storage with CFS. In Proc. ACM SOSP'01, Banff, Canada, Oct. 2001.]] Google ScholarDigital Library
15 It. Dingledine, M. J. Freedman, and D. Molnar. The Free Haven project: Distributed anonymous storage service. In Proc. Workshop on Design Issues in Anonymity and Unobservability, Berkeley, CA, July 2000.]] Google ScholarDigital Library
16 P. Druschel and A. Rowstron. PAST: A large-scale, persistent peer-to-peer storage utility. In Proc. HotOS VIII, Schloss Elman, Germany, May 2001.]] Google ScholarDigital Library
17 J. Jannotti, D. K. Gifford, K. L. Johnson, M. F. Kaashoek, and J. W. O'Toole. Overcast: Reliable multicasting with an overlay network. In Proc. OSDI 2000, San Diego, CA, October 2000.]] Google ScholarDigital Library
18 J. Kangasharju, J. W. Roberts, and K. W. Ross. Performance evaluation of redirection schemes in content distribution networks. In Proc. 4th Web Caching Workshop, San Diego, CA, Mar. 1999.]]Google Scholar
19 J. Kangasharju and K. W. Ross. A replicated architecture for the domain name system. In Proc. IEEE Infocom 2000, Tel Aviv, Israel, Max. 2000.]]Google ScholarCross Ref
20 J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao. Oceanstore: An architecture for global-scale persistent store. In Proc. ASPLOS'2000, Cambridge, MA, November 2000.]] Google ScholarDigital Library
21 B. Lampson. Designing a global name service. In Proc. Fifth Symposium on the Principles of Distributed Computing, pages 1-10, Minaki, Canada, Aug. 1986.]] Google ScholarDigital Library
22 J. Li, J. 3annotti, D. S. J. D. Couto, D. R. Karger, and R. Morris. A scalable location service for geographical ad hoc routing. In Proc. of A CM MOBICOM 2000, Boston, MA, August 2000.]] Google ScholarDigital Library
23 J. S. Plank. A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems. Software -- Practice and Experience, 27(9):995-1012, Sept. 1997.]] Google ScholarDigital Library
24 C. G. Plaxton, R. Rajaraman, and A. W. Richa. Accessing nearby copies of replicated objects in a distributed environment. Theory of Computing Systems, 32:241-280, 1999.]]Google ScholarCross Ref
25 S. Ratnasamy, P. Francis, M. Handiey, R. Karp, and S. Shenker. A scalable content-addressable network. In Proc. ACM SIGCOMM'01, San Diego, CA, Aug. 2001.]] Google ScholarDigital Library
26 J. Reynolds. RFC 1309: Technical overview of directory services using the X.500 protocol, Mar. 1992.]]Google Scholar
27 A. Rowstron and P. Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proc. IFIP/A CM Middleware 2001, Heidelberg, Germany, Nov. 2001.]] Google ScholarDigital Library
28 S. Saroiu, P. K. Gummadi, and S. D. Gribble. A measurement study of peer-to-peer file sharing systems. Technical Roport UW-CSE-01-06-02, University of Washington, July 2001.]]Google Scholar
29 M. A. Sheldon, A. Duda, R. Weiss, and D. K. Gifford. Discover: A resource discovery system based on content routing. In Proe. 3rd International World Wide Web Conference, Darmstadt, Germany, 1995.]] Google ScholarDigital Library
30 I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proc. A CM SIGCOMM'01, San Diego, CA, Aug. 2001.]] Google ScholarDigital Library
31 B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph. Tapestry: An infrastructure for fault-resilient wide-area location and routing. Technical Report UCB//CSD-01-1141, U. C. Berkeley, April 2001.]] Google ScholarDigital Library

Index Terms

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
1. Information systems
  1. Information storage systems
    1. Storage management
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        File systems management
        Memory management
    2. Software system structures
      1. Distributed systems organizing principles

Recommendations

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
SOSP '01: Proceedings of the eighteenth ACM symposium on Operating systems principles

This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internet-based overlay network of storage nodes that cooperatively route file ...
Read More
Storage management in large distributed object-based storage systems
Read More
LSM-tree managed storage for large-scale key-value store
SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing

Key-value stores are increasingly adopting LSM-trees as their enabling data structure in the backend storage, and persisting their clustered data through a file system. A file system is expected to not only provide file/directory abstraction to organize ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM SIGOPS Operating Systems Review Volume 35, Issue 5
Dec. 2001
243 pages
ISSN:0163-5980
DOI:10.1145/502059
Issue’s Table of Contents
SOSP '01: Proceedings of the eighteenth ACM symposium on Operating systems principles
October 2001
254 pages
ISBN:1581133898
DOI:10.1145/502034
General Chair:
Keith Marzullo
University of California, San Diego
,
Program Chair:
M. Satyanarayanan
Carnegie Mellon University and Intel Research, Pittsburgh
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 21 October 2001
Check for updates
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 821
  Total Citations
  View Citations
- 2,879
  Total Downloads
- Downloads (Last 12 months)41
- Downloads (Last 6 weeks)10
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

ACM SIGOPS Operating Systems Review

Abstract

References

Cited By

Index Terms

Recommendations

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

Storage management in large distributed object-based storage systems

LSM-tree managed storage for large-scale key-value store