Ceph: reliable, scalable, and high-performance distributed storage

January 2007

Author: Sage A. Weil, University of California, Santa Cruz
Adviser: Scott A. Brandt, University of California, Santa Cruz

Publisher: University of California at Santa Cruz, Computer and Information Sciences Dept., 265 Applied Sciences Building, Santa Cruz, CA, United States

ISBN: 978-0-549-31235-2

Order Number: AAI3288383

Pages: 221

Abstract

As the size and performance requirements of storage systems have increased, file system designers have looked to new architectures to facilitate system scalability. The emerging object-based storage paradigm diverges from server-based (e.g., NFS) and SAN-based storage systems by coupling processors and memory with disk drives, allowing systems to delegate low-level file system operations (e.g., allocation and scheduling) to object storage devices (OSDs) and decouple I/O (read/write) from metadata (file open/close) operations. Even recent object-based systems inherit a variety of decades-old architectural choices going back to early UNIX file systems, however, limiting their ability to effectively scale.

This dissertation shows that device intelligence can be leveraged to provide reliable, scalable, and high-performance file service in a dynamic cluster environment. It presents a distributed metadata management architecture that provides excellent performance and scalability by adapting to highly variable system workloads while tolerating arbitrary node crashes. A flexible and robust data distribution function places data objects in a large, dynamic cluster of storage devices, simplifying metadata and facilitating system scalability, while providing a uniform distribution of data, protection from correlated device failure, and efficient data migration. This placement algorithm facilitates the creation of a reliable and scalable object storage service that distributes the complexity of consistent data replication, failure detection, and recovery across a heterogeneous cluster of semi-autonomous devices.

These architectural components, which have been implemented in the Ceph distributed file system, are evaluated under a variety of workloads that show superior I/O performance, scalable metadata management, and failure recovery.
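The abstract describes a data distribution function that places objects across a dynamic cluster without per-object allocation tables, while keeping data migration small when devices change. The following is a minimal sketch of that general idea using rendezvous (highest-random-weight) hashing, which is a simpler stand-in technique and not the placement algorithm presented in the dissertation. The names `score`, `place`, the `osdN` device labels, and the replica count are all hypothetical, and the sketch ignores device weights and failure domains.

```python
import hashlib


def score(object_name: str, device: str) -> int:
    """Deterministic pseudo-random score for an (object, device) pair."""
    digest = hashlib.sha256(f"{object_name}|{device}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, devices: list[str], replicas: int = 2) -> list[str]:
    """Pick the `replicas` devices with the highest scores for this object.

    Any client that knows the device list computes the same answer,
    so no central lookup table is needed.
    """
    ranked = sorted(devices, key=lambda d: score(object_name, d), reverse=True)
    return ranked[:replicas]


# Hypothetical cluster of 8 devices and 1000 objects.
devices = [f"osd{i}" for i in range(8)]
objects = [f"obj{i}" for i in range(1000)]

before = {o: place(o, devices) for o in objects}

# Simulate the failure of one device and recompute placement.
survivors = [d for d in devices if d != "osd3"]
after = {o: place(o, survivors) for o in objects}

# Only objects that had a replica on the failed device change placement.
moved = [o for o in objects if before[o] != after[o]]
print(f"{len(moved)} of {len(objects)} objects changed placement")
```

In this sketch, removing a device only affects objects that previously had a replica on it; every other object keeps its exact replica set, which is the kind of limited data migration the abstract refers to.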

Contributors

Scott Alan Brandt
University of California, Santa Cruz
- Publication Years: 1998 - 2016
- Publication counts: 68
- Citation count: 1,653
- Available for Download: 24
- Downloads (cumulative): 20,268
- Downloads (12 months): 580
- Downloads (6 weeks): 62
- Average Downloads per Article: 845
- Average Citation per Article: 24

Sage A Weil
Red Hat, Inc.
- Publication Years: 2004 - 2020
- Publication counts: 10
- Citation count: 737
- Available for Download: 8
- Downloads (cumulative): 22,700
- Downloads (12 months): 1,898
- Downloads (6 weeks): 206
- Average Downloads per Article: 2,838
- Average Citation per Article: 74

Recommendations

Ceph: a scalable, high-performance distributed file system
OSDI '06: Proceedings of the 7th symposium on Operating systems design and implementation

We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution ...

Ceph: a scalable, high-performance distributed file system
OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7

We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution ...

Using ceph's BlueStore as object storage in HPC storage framework
CHEOPS '21: Proceedings of the Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems

In times of ever-increasing data sizes, data management and insightful analysis are amidst the most severe challenges of high-performance computing. While high-level libraries such as NetCDF, HDF5, and ADIOS2, as well as the associated self-describing ...
