Abstract
Energy consumption is a major concern in many embedded computing systems. Several studies have shown that cache memories account for about 50% of the total energy consumed in these systems. The performance of a given cache architecture is determined, to a large degree, by the behavior of the application executing on it. Desktop systems must accommodate a very wide range of applications, so the manufacturer usually fixes the cache architecture as a best compromise given current applications, technology, and cost. Embedded systems, by contrast, are designed to run a small range of well-defined applications. In this context, a cache architecture tuned to that narrow range of applications can deliver both higher performance and lower energy consumption. We introduce a novel cache architecture intended for embedded microprocessor platforms. The cache has three software-configurable parameters that can be tuned to particular applications. First, the cache's associativity can be configured to be direct-mapped, two-way, or four-way set-associative, using a novel technique we call way concatenation. Second, the cache's total size can be configured by shutting down ways. Finally, the cache's line size can be configured to be 16, 32, or 64 bytes. A study of 23 programs drawn from the Powerstone, MediaBench, and Spec2000 benchmark suites shows that the configurable cache, tuned to each program, saved energy for every program compared with both a conventional four-way set-associative cache and a conventional direct-mapped cache, with average savings in memory-access energy of over 40%.
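To make the three configurable parameters concrete, the following is a minimal sketch that enumerates a possible configuration space and shows how an address would be split under each configuration. The associativities and line sizes come from the abstract. The bank size of 2 KB, the choice of 1, 2, or 4 active banks, and the names `CacheConfig`, `configurations`, and `split_address` are assumptions made for illustration only. The address split is the conventional tag/index/offset decomposition and is not a model of the way-concatenation circuit itself.

```python
from typing import NamedTuple

BANK_BYTES = 2048                 # assumption: each physical way (bank) holds 2 KB
ACTIVE_BANK_CHOICES = (1, 2, 4)   # assumption: way shutdown leaves 1, 2, or 4 banks on
ASSOCIATIVITIES = (1, 2, 4)       # from the abstract: direct-mapped, two-way, four-way
LINE_SIZES = (16, 32, 64)         # from the abstract: line size in bytes


class CacheConfig(NamedTuple):
    size_bytes: int
    associativity: int
    line_bytes: int

    @property
    def num_sets(self) -> int:
        # Total lines divided by the number of lines per set.
        return self.size_bytes // (self.associativity * self.line_bytes)


def configurations() -> list[CacheConfig]:
    """Enumerate every (size, associativity, line size) combination.

    Associativity cannot exceed the number of active banks; a lower
    associativity is assumed to be reached by concatenating active banks.
    """
    configs = []
    for banks in ACTIVE_BANK_CHOICES:
        for assoc in ASSOCIATIVITIES:
            if assoc > banks:
                continue
            for line in LINE_SIZES:
                configs.append(CacheConfig(banks * BANK_BYTES, assoc, line))
    return configs


def split_address(addr: int, cfg: CacheConfig) -> tuple[int, int, int]:
    """Return (tag, set index, byte offset) for addr under cfg."""
    offset_bits = cfg.line_bytes.bit_length() - 1   # log2 of a power of two
    index_bits = cfg.num_sets.bit_length() - 1
    offset = addr & (cfg.line_bytes - 1)
    index = (addr >> offset_bits) & (cfg.num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    for cfg in configurations():
        print(cfg, "sets:", cfg.num_sets, "split of 0x1234:", split_address(0x1234, cfg))
```

Under these assumptions the sketch yields 18 configurations: one bank allows only direct-mapped operation, two banks allow direct-mapped or two-way, and four banks allow all three associativities, each combined with three line sizes. A tuning flow would evaluate an application against each candidate and keep the one with the lowest energy.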
Index Terms
- A highly configurable cache for low energy embedded systems