The growing disparity between processor and memory performance has made cache misses increasingly expensive. Additionally, data and instruction caches are not always used efficiently, resulting in large numbers of cache misses. Therefore, the importance of cache performance improvements at each level of the memory hierarchy will continue to grow. For numeric programs there are several known compiler techniques for optimizing data cache performance. However, integer (non-numeric) programs often have irregular access patterns that are more difficult for the compiler to optimize. In the past, cache management techniques such as cache bypassing were implemented manually at the machine-language-programming level. As the available chip area grows, it makes sense to spend more resources to allow intelligent control over the cache management.

The objective of this dissertation is to improve cache effectiveness, taking advantage of the growing chip area, utilizing run-time adaptive cache management techniques, and optimizing both performance and cost of implementation. Specifically, the aim is to increase cache effectiveness for integer programs. This dissertation proposes a microarchitecture scheme where the hardware determines data placement within the cache hierarchy based on dynamic referencing behavior. This scheme is fully compatible with existing instruction set architectures. This dissertation also examines the theoretical upper bounds on the cache hit ratio that the proposed techniques can provide, for several integer applications. Then, detailed trace-driven simulations of several integer applications are used to show that the implementations described in this dissertation can achieve performance close to that of the upper bound.
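To make the idea of placement driven by dynamic referencing behavior concrete, the following is a minimal trace-driven sketch, not the dissertation's actual design. It assumes a direct-mapped cache and a hypothetical per-block reference counter table (`counts`); the function name `simulate`, the parameters, and the bypass rule (keep the resident block when it has been referenced more often than the incoming one) are all illustrative assumptions.

```python
from collections import defaultdict


def simulate(trace, num_sets, block_size=32, bypass=False):
    """Return the hit ratio of a direct-mapped cache over an address trace.

    Illustrative only: the counter table and the bypass rule are assumptions,
    not the mechanism described in the dissertation.
    """
    cache = [None] * num_sets      # block number resident in each set
    counts = defaultdict(int)      # hypothetical per-block reference counters
    hits = 0
    for addr in trace:
        block = addr // block_size
        index = block % num_sets
        counts[block] += 1         # record the reference before the lookup
        if cache[index] == block:
            hits += 1
            continue
        resident = cache[index]
        # Bypass: on a miss, leave the resident block in place when it has
        # been referenced more often than the incoming block.
        if bypass and resident is not None and counts[resident] > counts[block]:
            continue
        cache[index] = block       # normal allocation on a miss
    return hits / len(trace) if trace else 0.0


# Toy trace: one frequently reused block (address 0) interleaved with
# single-use blocks that all map to the same set.
trace = [0, 128, 0, 256, 0, 384, 0, 512]
print(simulate(trace, num_sets=4))
print(simulate(trace, num_sets=4, bypass=True))
```

On this toy trace the conventional cache misses on every reference, while the bypassing variant keeps the reused block resident once its counter exceeds those of the single-use blocks, raising the hit ratio from 0 to 0.25. Real traces and hardware counter tables would behave differently; the sketch only shows the shape of the decision.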