DOI: 10.1145/774572.774669
Article

Synthesis of customized loop caches for core-based embedded systems

Published: 10 November 2002

ABSTRACT

Embedded system programs tend to spend much of their time in small loops. Introducing a very small loop cache into the instruction memory hierarchy has thus been shown to substantially reduce instruction fetch energy. However, loop caches come in many sizes and variations, and using the configuration that is best on average may actually increase energy consumption for a specific program. We therefore introduce a loop cache exploration tool that analyzes a particular program's profile, rapidly explores the possible configurations, and generates the configuration with the greatest power savings. We introduce a simulation-based approach and show that a customized loop cache yields good energy savings. We also introduce a fast estimation-based approach that obtains nearly the same results in seconds rather than tens of minutes or hours.
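The exploration step described in the abstract can be illustrated with a small sketch. This is not the authors' tool: the `Loop` profile record, the candidate sizes, and every energy number below are hypothetical, and the model assumes a simple loop cache that is filled during the first iteration of each loop execution and serves the remaining iterations only when the whole loop fits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loop:
    size: int        # loop body length in instructions
    executions: int  # number of times the loop is entered
    iterations: int  # average iterations per execution


# Hypothetical energy per fetch from main instruction memory (arbitrary units).
MAIN_MEMORY_ENERGY = 1.0


def loop_cache_energy(cache_size: int) -> float:
    """Assumed energy per loop cache access; grows with cache size."""
    return 0.02 + 0.0005 * cache_size


def estimate_energy(loops, other_fetches, cache_size):
    """Estimate total fetch energy for one cache size (0 means no loop cache)."""
    energy = other_fetches * MAIN_MEMORY_ENERGY
    for loop in loops:
        total = loop.size * loop.executions * loop.iterations
        if cache_size and loop.size <= cache_size:
            lc = loop_cache_energy(cache_size)
            # The first iteration of each execution is fetched from main
            # memory and written into the loop cache.
            fill = loop.size * loop.executions
            hits = total - fill
            energy += fill * (MAIN_MEMORY_ENERGY + lc) + hits * lc
        else:
            # Loop does not fit: every fetch goes to main memory.
            energy += total * MAIN_MEMORY_ENERGY
    return energy


def best_configuration(loops, other_fetches, candidate_sizes):
    """Return the candidate size with the lowest estimated fetch energy."""
    return min(candidate_sizes,
               key=lambda s: estimate_energy(loops, other_fetches, s))


# Made-up profile: one hot 20-instruction loop and one cold 200-instruction loop.
profile = [Loop(size=20, executions=100, iterations=50),
           Loop(size=200, executions=1, iterations=2)]
candidates = [0, 16, 32, 64, 256]
best = best_configuration(profile, other_fetches=1000, candidate_sizes=candidates)
```

With this made-up profile, the smallest cache that holds the hot loop wins: a larger cache would also capture the cold loop, but under the assumed cost model its higher per-access energy outweighs that gain. This mirrors the abstract's point that the best configuration depends on the specific program.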

Published in

ICCAD '02: Proceedings of the 2002 IEEE/ACM International Conference on Computer-Aided Design
November 2002
793 pages
ISBN: 0780376072
DOI: 10.1145/774572

Copyright © 2002 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 457 of 1,762 submissions (26%)
