article

Reliability and security of RAID storage systems and D2D archives using SATA disk drives

Authors:
Gordon F. Hughes, University of California San Diego, San Diego, CA
Joseph F. Murray, University of California San Diego, San Diego, CA

ACM Transactions on Storage, Volume 1, Issue 1, pp. 95–107
https://doi.org/10.1145/1044956.1044961

Published: 01 February 2005


Abstract

Information storage reliability and security are addressed for personal computer disk drives used in enterprise-class nearline and archival storage systems. The low cost of these serial ATA (SATA) PC drives is a tradeoff against drive reliability design and demonstration test levels, which are higher in the more expensive SCSI and Fibre Channel drives. This article discusses that tradeoff. SATA has the advantage that fewer, higher-capacity drives are needed for a given system storage capacity, which further reduces cost and allows higher drive failure rates; additional storage system redundancy and drive failure prediction can then maintain system data integrity using less reliable drives. RAID stripe failure probability is calculated using typical ATA and SCSI drive failure rates, for single- and double-parity data reconstruction failure and for failure due to drive unrecoverable block errors. The reliability improvement from drive failure prediction is also calculated, and can be significant. Today's SATA drive specifications for unrecoverable block errors appear to allow stripe reconstruction failure, and additional in-drive parity blocks are suggested as a solution. The possibility of using low-cost disk drives for data backup and archiving, replacing higher-cost magnetic tape, is discussed. This requires a significantly lower RAID stripe failure probability, and suitable drive technology alternatives are discussed. The failure rate of nonoperating drives is estimated using failure analysis results from ≈4000 drives, and nonoperating RAID stripe failure rates are thereby estimated.

User data security needs to be assured in addition to reliability, and must extend past the point where physical control of drives is lost, such as when drives are removed from systems for data vaulting, repair, sale, or discard. Today, over a third of resold drives contain unerased user data. Security is proposed via the existing SATA drive secure-erase command, the existing SATA drive password commands, or data encryption. Finally, backup and archival disk storage is compared to magnetic tape, a technology with a proven reliability record over the full half-century of digital data storage. In contrast to disk drives, tape archives are not vulnerable to tape transport failure modes: only failure modes in the archived tapes and reels will make data unrecoverable.
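
The kind of stripe failure calculation the abstract describes can be illustrated with a minimal sketch. It is not the article's model: it assumes independent drives with a constant failure rate, and treats a stripe as lost when more drives fail during a rebuild window than the parity can reconstruct. All function names and every number below (annual failure rate, rebuild time, drive capacity, unrecoverable error rate) are hypothetical values chosen for illustration, not figures taken from the article.

```python
import math

HOURS_PER_YEAR = 8760.0


def failure_prob(afr: float, hours: float) -> float:
    """Probability that one drive fails within `hours`, assuming a constant
    failure rate derived from the annual failure rate `afr`."""
    rate_per_hour = -math.log1p(-afr) / HOURS_PER_YEAR
    return -math.expm1(-rate_per_hour * hours)


def prob_at_least(k: int, n: int, p: float) -> float:
    """P[at least k of n independent drives fail], from the binomial law."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


def stripe_loss_per_year(n_drives: int, parity: int, afr: float,
                         rebuild_hours: float) -> float:
    """Approximate yearly probability that a stripe loses more drives than
    its parity can rebuild.

    Simplification (an assumption of this sketch): a first failure occurs
    about n_drives * afr times per year, and data is lost if `parity` or
    more of the remaining drives fail before the rebuild completes.
    """
    p_rebuild = failure_prob(afr, rebuild_hours)
    p_overlap = prob_at_least(parity, n_drives - 1, p_rebuild)
    return min(1.0, n_drives * afr * p_overlap)


def rebuild_read_error_prob(n_drives: int, drive_bytes: float,
                            error_rate_per_bit: float) -> float:
    """Probability of at least one unrecoverable read error while reading
    every surviving drive in full to rebuild one failed drive."""
    bits_read = (n_drives - 1) * drive_bytes * 8.0
    return -math.expm1(bits_read * math.log1p(-error_rate_per_bit))


if __name__ == "__main__":
    # Illustrative inputs only: 8-drive stripe, 3% AFR, 24-hour rebuild.
    for parity in (1, 2):
        loss = stripe_loss_per_year(8, parity, afr=0.03, rebuild_hours=24.0)
        print(f"parity={parity}: yearly stripe loss ~ {loss:.3e}")
    # Illustrative inputs only: 250 GB drives, 1 error per 1e14 bits read.
    ure = rebuild_read_error_prob(8, 250e9, 1e-14)
    print(f"rebuild hits an unrecoverable block: ~ {ure:.3f}")
```

Under these assumptions, the second parity drive lowers the multi-drive loss probability by orders of magnitude, while the chance of meeting an unrecoverable block during a single-parity rebuild depends only on how many bits must be read. That is consistent with the abstract's point that unrecoverable block errors, rather than whole-drive failures, can dominate stripe reconstruction failure.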


Index Terms

Reliability and security of RAID storage systems and D2D archives using SATA disk drives
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
2. Hardware
  1. Hardware test
  2. Robustness


Published in

ACM Transactions on Storage Volume 1, Issue 1
February 2005
131 pages
ISSN:1553-3077
EISSN:1553-3093
DOI:10.1145/1044956

Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Disk drive
SATA
SMART
archival storage
failure prediction
secure erase
storage resource management
storage systems architecture
Qualifiers
- article