As the size and performance requirements of storage systems have increased, file system designers have looked to new architectures to facilitate system scalability. The emerging object-based storage paradigm diverges from server-based (e.g., NFS) and SAN-based storage systems by coupling processors and memory with disk drives, allowing systems to delegate low-level file system operations (e.g., allocation and scheduling) to object storage devices (OSDs) and to decouple I/O (read/write) from metadata (file open/close) operations. However, even recent object-based systems inherit a variety of decades-old architectural choices going back to early UNIX file systems, which limits their ability to scale effectively.
This dissertation shows that device intelligence can be leveraged to provide reliable, scalable, and high-performance file service in a dynamic cluster environment. It presents a distributed metadata management architecture that provides excellent performance and scalability by adapting to highly variable system workloads while tolerating arbitrary node crashes. A flexible and robust data distribution function places data objects in a large, dynamic cluster of storage devices, simplifying metadata and facilitating system scalability, while providing a uniform distribution of data, protection from correlated device failure, and efficient data migration. This placement algorithm facilitates the creation of a reliable and scalable object storage service that distributes the complexity of consistent data replication, failure detection, and recovery across a heterogeneous cluster of semi-autonomous devices.
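The idea of a computed placement function, as opposed to a stored allocation table, can be illustrated with a minimal sketch. The code below does not reproduce the dissertation's placement algorithm; it substitutes rendezvous (highest-random-weight) hashing, a simpler and well-known technique with two of the properties named above: roughly uniform distribution and limited data movement when the device set changes. It ignores device weights and protection from correlated failures. The names `place` and `_score` and the device labels are hypothetical.

```python
import hashlib


def _score(object_id: str, device: str) -> int:
    """Deterministic pseudo-random score for an (object, device) pair."""
    digest = hashlib.sha256(f"{object_id}:{device}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_id: str, devices: list[str], replicas: int = 3) -> list[str]:
    """Return the devices that should hold replicas of object_id.

    Any client can compute this from the object name and the device list
    alone, so no per-object location table is needed.
    """
    ranked = sorted(devices, key=lambda d: _score(object_id, d), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    cluster = [f"osd{i}" for i in range(10)]
    print(place("object-42", cluster))
```

In this sketch, removing a device changes the placement only of objects that had a replica on it; every other object keeps its replica set, which is the sense in which migration stays limited.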
These architectural components, which have been implemented in the Ceph distributed file system, are evaluated under a variety of workloads that show superior I/O performance, scalable metadata management, and failure recovery.