Ceph: reliable, scalable, and high-performance distributed storage
Publisher:
  • University of California at Santa Cruz
  • Computer and Information Sciences Dept. 265 Applied Sciences Building Santa Cruz, CA
  • United States
ISBN:978-0-549-31235-2
Order Number:AAI3288383
Pages:221
Abstract

As the size and performance requirements of storage systems have increased, file system designers have looked to new architectures to facilitate system scalability. The emerging object-based storage paradigm diverges from server-based (e.g., NFS) and SAN-based storage systems by coupling processors and memory with disk drives, allowing systems to delegate low-level file system operations (e.g., allocation and scheduling) to object storage devices (OSDs) and decouple I/O (read/write) from metadata (file open/close) operations. However, even recent object-based systems inherit a variety of decades-old architectural choices going back to early UNIX file systems, limiting their ability to scale effectively.

This dissertation shows that device intelligence can be leveraged to provide reliable, scalable, and high-performance file service in a dynamic cluster environment. It presents a distributed metadata management architecture that provides excellent performance and scalability by adapting to highly variable system workloads while tolerating arbitrary node crashes. A flexible and robust data distribution function places data objects in a large, dynamic cluster of storage devices, simplifying metadata and facilitating system scalability, while providing a uniform distribution of data, protection from correlated device failure, and efficient data migration. This placement algorithm facilitates the creation of a reliable and scalable object storage service that distributes the complexity of consistent data replication, failure detection, and recovery across a heterogeneous cluster of semi-autonomous devices.

These architectural components, which have been implemented in the Ceph distributed file system, are evaluated under a variety of workloads that show superior I/O performance, scalable metadata management, and failure recovery.

Contributors
  • University of California, Santa Cruz
  • Red Hat, Inc.
