Abstract
This article addresses information storage reliability and security when personal computer disk drives are used in enterprise-class nearline and archival storage systems. The low cost of these serial ATA (SATA) PC drives is a tradeoff against drive reliability design and demonstration test levels, which are higher in the more expensive SCSI and Fibre Channel drives. The article discusses this tradeoff: SATA has the advantage that fewer, higher-capacity drives are needed for a given system storage capacity, which further reduces cost and allows higher drive failure rates, while additional storage system redundancy and drive failure prediction can maintain system data integrity using less reliable drives. RAID stripe failure probability is calculated using typical ATA and SCSI drive failure rates, for single and double parity data reconstruction failure, and for failure due to drive unrecoverable block errors. The reliability improvement from drive failure prediction is also calculated and can be significant. Today's SATA drive specifications for unrecoverable block errors appear to allow stripe reconstruction failure, and additional in-drive parity blocks are suggested as a solution. The possibility of using low-cost disks for data backup and archiving, replacing higher-cost magnetic tape, is discussed. This requires a significantly lower RAID stripe failure probability, and suitable drive technology alternatives are discussed. The failure rate of nonoperating drives is estimated using failure analysis results from ≈4000 drives, and nonoperating RAID stripe failure rates are estimated from it. User data security needs to be assured in addition to reliability, and must extend past the point where physical control of drives is lost, such as when drives are removed from systems for data vaulting, repair, sale, or discard. Today, over a third of resold drives contain unerased user data.
Security is proposed via the existing SATA drive secure-erase command, the existing SATA drive password commands, or data encryption. Finally, backup and archival disk storage is compared to magnetic tape, a technology with a proven reliability record over the full half-century of digital data storage. In contrast to disk drives, tape archives are not vulnerable to tape transport failure modes; only failure modes in the archived tapes and reels will make data unrecoverable.
Index Terms
- Reliability and security of RAID storage systems and D2D archives using SATA disk drives