ABSTRACT
In this work, we propose a new organization for the last-level shared cache of a multicore system. Our design is based on the observation that the Next-Use distance, measured as the number of intervening misses between the eviction of a line and its next use, falls within a predictable range of values for lines brought in by a given delinquent PC. We exploit this correlation to improve the performance of shared caches in multicore architectures by proposing the NUcache organization.
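To make the Next-Use distance concrete, the following minimal sketch measures it over a toy access trace. It is an illustration under stated assumptions, not the NUcache mechanism itself: the function name `next_use_distances`, the `(pc, line_address)` trace format, and the fully associative LRU cache used to decide evictions are all hypothetical choices made for this example.

```python
from collections import OrderedDict, defaultdict


def next_use_distances(trace, capacity):
    """Group Next-Use distances by the PC that brought each line in.

    trace: iterable of (pc, line_address) pairs (assumed format).
    capacity: number of lines in a fully associative LRU cache (assumed policy).
    """
    cache = OrderedDict()            # line_address -> PC that brought the line in
    evicted = {}                     # line_address -> (PC, miss count at eviction)
    distances = defaultdict(list)
    misses = 0
    for pc, addr in trace:
        if addr in cache:
            cache.move_to_end(addr)  # hit: mark as most recently used
            continue
        # Miss. If the line was evicted earlier, this access is its next use,
        # and the distance is the number of misses seen since that eviction.
        if addr in evicted:
            owner_pc, misses_at_eviction = evicted.pop(addr)
            distances[owner_pc].append(misses - misses_at_eviction)
        misses += 1
        if len(cache) >= capacity:
            victim, victim_pc = cache.popitem(last=False)  # evict the LRU line
            evicted[victim] = (victim_pc, misses)
        cache[addr] = pc
    return dict(distances)


# Two-line cache: PC 1 loads A and B, PC 2 loads C and D, then A and B are reused.
trace = [(1, "A"), (1, "B"), (2, "C"), (2, "D"), (1, "A"), (1, "B")]
result = next_use_distances(trace, capacity=2)
```

In this trace, each of the two lines brought in by PC 1 sees exactly one intervening miss between its eviction and its next use, so both distances are grouped under that PC. A per-PC histogram of such values is the kind of data in which the abstract's "predictable range" would appear.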
Index Terms
- NUcache: a multicore cache organization based on next-use distance