
Intelligent storage: Cross-layer optimization for soft real-time workload

Published: 01 August 2006

Abstract

In this work, we develop an intelligent storage system framework for soft real-time applications. Modern software systems consist of a collection of layers, and information is exchanged across the layers via well-defined interfaces. Because these interface definitions are strict and inflexible, information specific to one layer cannot be passed to other layers. In practice, exploiting this information across layers can greatly enhance the performance, reliability, and manageability of the system. We address this limitation of legacy interface definitions by enabling intelligence in the storage system. The objective is to enable a lower-layer entity, for example, a physical or block device, to infer the semantic and contextual information about application behavior that cannot be passed via the legacy interface. Based upon the knowledge obtained by the intelligence module, the system can perform a number of actions to improve its performance, reliability, security, and manageability. Our intelligent storage system focuses on optimizing I/O subsystem performance for soft real-time applications. Our intelligence framework consists of three components: the workload monitor, the workload analyzer, and the system optimizer. The workload monitor maintains a window of recent I/O requests and extracts feature vectors at regular intervals. The workload analyzer is trained to determine the class of the incoming workload from the feature vector. The system optimizer performs various actions to tune the storage system for a given workload. We use confidence-rated boosting to train the workload analyzer. This learner achieves higher than 97% accuracy in workload class prediction. We develop a prototype intelligent storage system on a legacy operating system platform. The system optimizer performs (1) dynamic adjustment of the file-system-level read-ahead size; (2) dynamic adjustment of the I/O request size; and (3) filtering of I/O requests. We examine the effect of this autonomic optimization experimentally. We find that storage-level proactive optimization greatly enhances the efficiency of the underlying storage system. The intelligence module developed in this work is not restricted to performance optimization. It can also serve as a classification engine for generic autonomic computing environments, for example, management, diagnosis, and security.
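The monitor, analyzer, and optimizer pipeline described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `IORequest` fields, the four features, the window size, the class names, and the read-ahead values are all hypothetical. In particular, `classify` is a hand-written threshold rule standing in for the trained confidence-rated boosting learner, which the abstract does not specify in detail.

```python
from collections import deque
from dataclasses import dataclass
from statistics import mean


@dataclass
class IORequest:
    """One block-level request; the field set is a hypothetical simplification."""
    timestamp: float  # arrival time in seconds
    sector: int       # starting sector
    size: int         # length in sectors
    is_read: bool


class WorkloadMonitor:
    """Keeps a bounded window of recent requests and summarizes it on demand."""

    def __init__(self, window_size=64):
        # deque with maxlen drops the oldest request automatically
        self.window = deque(maxlen=window_size)

    def observe(self, req):
        self.window.append(req)

    def feature_vector(self):
        reqs = list(self.window)
        if len(reqs) < 2:
            return None  # not enough history to compute pairwise features
        pairs = list(zip(reqs, reqs[1:]))
        # A request is "sequential" if it starts where the previous one ended.
        sequential = sum(
            1 for prev, cur in pairs if cur.sector == prev.sector + prev.size
        )
        return {
            "read_ratio": sum(r.is_read for r in reqs) / len(reqs),
            "mean_size": mean(r.size for r in reqs),
            "sequential_ratio": sequential / len(pairs),
            "mean_interarrival": mean(
                cur.timestamp - prev.timestamp for prev, cur in pairs
            ),
        }


def classify(features):
    """Placeholder analyzer: fixed thresholds, NOT the boosted learner."""
    if features["sequential_ratio"] > 0.8 and features["read_ratio"] > 0.9:
        return "streaming"
    return "other"


def choose_readahead_kb(workload_class):
    """Placeholder optimizer action; the sizes are made-up example values."""
    return 512 if workload_class == "streaming" else 128


# Feed the monitor a perfectly sequential, read-only request stream.
monitor = WorkloadMonitor(window_size=8)
for i in range(8):
    monitor.observe(
        IORequest(timestamp=i * 0.04, sector=i * 64, size=64, is_read=True)
    )

features = monitor.feature_vector()
workload = classify(features)
readahead = choose_readahead_kb(workload)
print(workload, readahead)
```

In a real system the analyzer would be trained offline on labeled feature vectors, and the optimizer would apply its decision through whatever tuning interface the platform exposes. The sketch only shows how the three components hand data to one another.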


Published in ACM Transactions on Storage, Volume 2, Issue 3, August 2006, 149 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/1168910

Copyright © 2006 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States
