ABSTRACT
Sectored caches have been used for many years to reconcile a small tag array with a small or medium block size. In a sectored cache, a single address tag is associated with a sector consisting of several cache lines, while validity, dirty, and coherency tags are associated with each of the inner cache lines.

Maintaining a small tag array is a major issue in many cache designs (e.g. L2 caches). Using a sectored cache is a design trade-off between a small tag array, which is possible with a large line size, and low memory traffic, which requires a small line size.

This technique has been used in many cache designs, including small on-chip microprocessor caches and large external second-level caches. Unfortunately, on some applications the miss ratio of a sectored cache is significantly higher than the miss ratio of a non-sectored cache (factors higher than two are commonly observed), so a significant part of the potential performance may be wasted in miss penalties.

Usually in a cache, a cache line location is statically linked to one and only one address tag word location. In the decoupled sectored cache we introduce in this paper, this monolithic association is broken; the address tag location associated with a cache line location is dynamically chosen at fetch time among several possible locations.

The tag volume of a decoupled sectored cache is in the same range as the tag volume of a traditional sectored cache, but the hit ratio of a decoupled sectored cache is very close to the hit ratio of a non-sectored cache. A decoupled sectored cache will allow the same level of performance as a non-sectored cache, but at a significantly lower hardware cost.
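The decoupling idea can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the design evaluated in the paper: the cache is direct-mapped, each sector holds `n_tags` candidate address tags, each line slot stores a small pointer (`tag_ptr`) to the tag it currently uses, and tags are replaced round-robin. All class, field, and parameter names are hypothetical. With `n_tags=1` the model behaves like a traditional sectored cache.

```python
class Sector:
    """One sector: a few address tags shared by several line slots."""

    def __init__(self, n_tags, lines_per_sector):
        self.tags = [None] * n_tags                 # candidate address tags
        self.valid = [False] * lines_per_sector     # per-line validity bit
        self.tag_ptr = [0] * lines_per_sector       # per-line pointer into self.tags
        self.next_victim = 0                        # round-robin choice (assumption)


class DecoupledSectoredCache:
    """Toy direct-mapped cache; addresses are line addresses, not byte addresses."""

    def __init__(self, n_sectors, lines_per_sector, n_tags):
        self.n_sectors = n_sectors
        self.lines_per_sector = lines_per_sector
        self.sectors = [Sector(n_tags, lines_per_sector) for _ in range(n_sectors)]

    def access(self, line_addr):
        """Return True on a hit; on a miss, install the line and return False."""
        offset = line_addr % self.lines_per_sector
        sector_addr = line_addr // self.lines_per_sector
        sector = self.sectors[sector_addr % self.n_sectors]
        tag = sector_addr // self.n_sectors

        if sector.valid[offset] and sector.tags[sector.tag_ptr[offset]] == tag:
            return True

        # Miss: choose the tag location at fetch time.
        if tag in sector.tags:
            ptr = sector.tags.index(tag)            # reuse an already present tag
        else:
            ptr = sector.next_victim                # replace one address tag
            sector.next_victim = (ptr + 1) % len(sector.tags)
            for i in range(self.lines_per_sector):  # lines using the old tag die
                if sector.valid[i] and sector.tag_ptr[i] == ptr:
                    sector.valid[i] = False
            sector.tags[ptr] = tag

        sector.valid[offset] = True
        sector.tag_ptr[offset] = ptr
        return False


def count_hits(cache, trace):
    """Count hits for a sequence of line addresses."""
    return sum(cache.access(addr) for addr in trace)


if __name__ == "__main__":
    trace = [0, 17, 0, 17]  # two sectors that map to the same cache sector
    print(count_hits(DecoupledSectoredCache(4, 4, n_tags=1), trace))
    print(count_hits(DecoupledSectoredCache(4, 4, n_tags=2), trace))
```

In this toy trace, lines 0 and 17 belong to different memory sectors that map to the same cache sector but occupy different line slots. With a single tag per sector they evict each other on every access (0 hits); with two candidate tags both lines stay resident (2 hits). Lines that map to the same line slot still conflict, since only the tag association is decoupled.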