
The automatic improvement of locality in storage systems

Published: 01 November 2005

Abstract

Disk I/O is increasingly the performance bottleneck in computer systems despite rapidly increasing disk data transfer rates. In this article, we propose Automatic Locality-Improving Storage (ALIS), an introspective storage system that automatically reorganizes selected disk blocks based on the dynamic reference stream to increase effective storage performance. ALIS is based on the observations that sequential data fetch is far more efficient than random access, that reducing seek distances alone produces only marginal performance improvements, and that the increasingly powerful processors and large memories in storage systems have ample capacity to reorganize the data layout and redirect the accesses so as to take advantage of rapid sequential data transfer. Using trace-driven simulation with a large set of real workloads, we demonstrate that ALIS considerably outperforms prior techniques, improving the average read performance by up to 50% for server workloads and by about 15% for personal computer workloads. We also show that the performance improvement persists as disk technology evolves. Since disk performance in practice is increasing by only about 8% per year, the benefit of ALIS may correspond to as much as several years of technological progress.
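To make the mechanism concrete, here is a minimal Python sketch of the heat-clustering idea described in the reviews below: count block accesses, then copy the hottest blocks into a contiguous reorganized region and redirect reads there. Everything here is illustrative; the class and method names, the in-memory counter, and the fixed-size region are simplifying assumptions, not ALIS's actual data structures or algorithm.

```python
from collections import Counter

class HeatClusteringSketch:
    """Toy model of heat-based block rearrangement (illustrative only)."""

    def __init__(self, region_start, region_size):
        self.heat = Counter()      # block address -> access count
        self.remap = {}            # original block -> reorganized copy
        self.region_start = region_start
        self.region_size = region_size

    def record_access(self, block):
        # Observe the dynamic reference stream.
        self.heat[block] += 1

    def reorganize(self):
        # Copy the hottest blocks, in heat order, into a contiguous region,
        # so reading one hot block sequentially prefetches its neighbors.
        hottest = [b for b, _ in self.heat.most_common(self.region_size)]
        self.remap = {b: self.region_start + i for i, b in enumerate(hottest)}

    def resolve(self, block):
        # Redirect a request to the reorganized copy when one exists.
        return self.remap.get(block, block)
```

In a real system, the reorganization pass would run in the background during idle time, and writes would need to keep the original blocks and their reorganized copies consistent.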


Reviews

Veronica Lagrange

Based on the observation that sequential data fetch is more efficient than random access, the authors propose automatic techniques and heuristics to physically reorganize disk blocks. Three heuristics are explained in great detail and evaluated with the help of trace-driven simulations: heat clustering, run clustering, and a combination of both. The heat heuristic locates hot spots on the disk and copies the blocks from those spots to adjacent locations. The run heuristic is based on the observation that some workloads contain long read sequences that are often repeated; these sequences are likewise copied to adjacent locations. All three heuristics are fully analyzed and evaluated for two different storage systems.

Two sets of real-life workloads provide input to the simulation: one contains traces from 14 personal computers used by different types of users, and the other contains traces from three servers (a file server, a time-sharing system, and a database server). The simulation is divided into two steps, "collect data and apply heuristics" and production, with performance data collected during the production step. Results differ across workloads: improvements in average read response time range from negative to as much as 50 percent. There is, however, no data on the overhead incurred by the system during the block reorganization step, which is assumed to cost very little and to be completed during the system's idle time.
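The run heuristic described above can be sketched roughly as follows, assuming a simple pair-counting approach. The function name, the min_repeats threshold, and the single-successor simplification are assumptions made for illustration, not the paper's actual run-discovery algorithm, which must also cope with interleaved requests.

```python
from collections import Counter

def find_repeated_runs(trace, min_repeats=2):
    """Rough sketch of run discovery: count how often each (block, next)
    pair occurs consecutively in the read trace, then chain pairs that
    repeat at least min_repeats times into candidate runs that could be
    copied out to a contiguous location."""
    pair_counts = Counter(zip(trace, trace[1:]))
    # Keep one frequent successor per block (a deliberate simplification).
    successor = {a: b for (a, b), n in pair_counts.items() if n >= min_repeats}

    runs, seen = [], set()
    for start in trace:
        if start in seen or start not in successor:
            continue
        run, block = [start], start
        # Follow the chain of frequent successors, avoiding cycles.
        while block in successor and successor[block] not in run:
            block = successor[block]
            run.append(block)
        if len(run) > 1:
            runs.append(run)
            seen.update(run)
    return runs
```

For a trace such as [1, 2, 3, 9, 1, 2, 3, 7, 1, 2, 3], this returns the repeated run [1, 2, 3], which would then be laid out contiguously so that it can be fetched with a single sequential transfer.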

Jingping Long

Modern computer systems store data in a hierarchy: registers, level 1 cache, level 2 cache, main memory, and hard disk. The hard disk has the largest capacity but the poorest read and write (I/O) performance, and significant research effort has gone into easing this bottleneck. Part of that effort focuses on improving data locality on the disk. The value of doing so rests on a general observation: related data items tend to be used together. If, when the system fetches a requested item from the disk, it also loads (prefetches) the related items into a faster level of the hierarchy (say, the level 2 cache), there is a good chance that the processing unit's next request will be served from the level 2 cache rather than from the disk, and I/O speed will improve. Prefetching is naturally more effective when the related items are stored as close together as possible; in other words, improving data locality can be expected to improve I/O performance. To design a scheme for improving data locality, it is important to correctly identify the relations among data items, if such relations exist.

The authors point out that, in many cases, related data items are not only accessed within a short time period, but also in a predictable order. Hence, they argue, related items should be laid out not only spatially close to each other, but also in the original access order, and they believe this second point is, to a large extent, the more important one. This is the paper's main contribution: previous work on disk data locality has mostly tried to store related items close together, often breaking the original order. Based on this idea, the authors design a system that automatically improves data locality on disk, and they test it using a software simulator driven by actual disk workload traces.

Although the authors demonstrate some promising results, their conclusion that the system can significantly improve disk I/O in general lacks support. An obvious reason is that, in many cases, there may be no relations among the data items; it is also quite common for the order in which related items are accessed to be unpredictable. Moreover, the workload traces the authors test all come from office computers used by professionals; the results would have been more valuable had they also included traces recorded on household personal computers, which make up the majority of computers in the world.
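The reviewer's central point, that preserving access order matters beyond mere proximity, can be made concrete with a toy cost model. The constants and layouts below are invented for illustration and are not measurements or parameters from the paper.

```python
def read_cost(layout_pos, access_order, seek_cost=5.0, transfer_cost=1.0):
    """Toy cost model: every access pays transfer_cost, plus seek_cost
    whenever the next block is not physically adjacent to the previous."""
    cost, prev = 0.0, None
    for block in access_order:
        pos = layout_pos[block]
        if prev is not None and pos != prev + 1:
            cost += seek_cost
        cost += transfer_cost
        prev = pos
    return cost

access_order = ["a", "b", "c", "d"]
in_order  = {"a": 0, "b": 1, "c": 2, "d": 3}   # close together, access order kept
clustered = {"a": 0, "c": 1, "b": 2, "d": 3}   # close together, order broken
print(read_cost(in_order, access_order))   # 4.0: one sequential sweep
print(read_cost(clustered, access_order))  # 19.0: three seeks despite proximity
```

Under this model, the blocks in both layouts occupy the same four adjacent positions, yet the order-preserving layout is read in a single sweep while the merely clustered one pays a seek on nearly every request.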
