Abstract
Data-intensive applications require extreme scaling of their underlying storage systems. Such scaling, together with the fact that storage systems must be implemented in actual data centers, increases the risk of data loss from failures of underlying components. Accurate engineering requires quantitatively predicting reliability, but this remains challenging due to the need to account for extreme scale, redundancy scheme type and strength, distribution architecture, and component dependencies. This article introduces CQSim-R, a tool suite for predicting the reliability of large-scale storage system designs and deployments. CQSim-R includes (a) direct calculations based on an only-drives-fail failure model and (b) an event-based simulator for detailed prediction that handles failures of and failure dependencies among arbitrary (drive or nondrive) components. These are based on a common combinatorial framework for modeling placement strategies. The article demonstrates CQSim-R using models of common storage systems, including replicated and erasure coded designs. New results, such as the poor reliability scaling of spread-placed systems and a quantification of the impact of data center distribution and rack-awareness on reliability, demonstrate the usefulness and generality of the tools. Analysis and empirical studies show the tools’ soundness, performance, and scalability.
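To make the placement comparison concrete, here is a minimal sketch of an only-drives-fail calculation for r-way replication. It is not CQSim-R's method or API: the function names, the uniform-random model of spread placement, and the disjoint-group model of partitioned placement are all assumptions made for illustration. Both functions return the probability of data loss given that exactly r of n drives have failed simultaneously.

```python
from math import comb, expm1, log1p


def loss_prob_spread(n: int, r: int, objects: int) -> float:
    """P(some object loses all r replicas | exactly r of n drives failed).

    Assumption: each object's replica set is an independent, uniformly
    random r-subset of the n drives (an idealized "spread" placement).
    """
    p_hit = 1 / comb(n, r)  # chance one object sits exactly on the failed set
    # 1 - (1 - p_hit)**objects, computed stably for very small p_hit
    return -expm1(objects * log1p(-p_hit))


def loss_prob_partitioned(n: int, r: int) -> float:
    """Same conditional probability for a partitioned placement.

    Assumption: drives are split into n // r disjoint groups of size r,
    every group holds at least one object, and each object lives
    entirely on one group.
    """
    groups = n // r
    return groups / comb(n, r)  # loss only if the failed set is a whole group


if __name__ == "__main__":
    n, r = 900, 3
    for objects in (10**4, 10**6, 10**8):
        print(objects, loss_prob_spread(n, r, objects), loss_prob_partitioned(n, r))
```

Under these assumptions the spread probability climbs toward 1 as the object count grows, while the partitioned probability stays fixed at the number of groups divided by C(n, r). That is one simple way to see why spread placement can scale poorly. The sketch ignores repair times, nondrive components, and failure dependencies, which the abstract says the event-based simulator handles.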
Index Terms
- Tools for Predicting the Reliability of Large-Scale Storage Systems