Multilevel cache hierarchies
Publisher:
  • University of Washington
  • Computer Science Dept., FR-35, 112 Sieg Hall, Seattle, WA
  • United States
Order Number: AAI9013828
Pages: 149
Abstract

We advocate multilevel caching for the design of high-performance cache systems. We suggest that a multilevel inclusion property be imposed on multilevel cache hierarchies to simplify I/O and cache coherency. We give necessary and sufficient conditions for imposing the inclusion property in fully- and set-associative caches that allow different block sizes at different levels of the hierarchy. Our simulation results show that imposing the inclusion property greatly reduces cache-coherence disturbance to first-level caches.
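The inclusion property described above can be maintained by back-invalidation: whenever the second-level cache evicts a block, any copy of that block is also removed from the first level, so the L2 contents always form a superset of the L1 contents. A minimal sketch, assuming fully associative LRU caches of a single block size (the dissertation's conditions cover set associativity and differing block sizes):

```python
from collections import OrderedDict

class Cache:
    """Fully associative LRU cache holding `capacity` blocks (illustrative only)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()          # block address -> True, in LRU order

    def access(self, block):
        """Return (hit, evicted_block); on a miss the block is inserted."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # refresh LRU position
            return True, None
        victim = None
        if len(self.blocks) >= self.capacity:
            victim, _ = self.blocks.popitem(last=False)  # evict the LRU block
        self.blocks[block] = True
        return False, victim

    def invalidate(self, block):
        self.blocks.pop(block, None)

class InclusiveHierarchy:
    """Two-level hierarchy that enforces inclusion via back-invalidation."""
    def __init__(self, l1_size, l2_size):
        assert l2_size >= l1_size            # necessary for inclusion to be possible
        self.l1, self.l2 = Cache(l1_size), Cache(l2_size)

    def access(self, block):
        hit, _ = self.l1.access(block)
        if hit:
            return "L1 hit"
        hit, l2_victim = self.l2.access(block)
        if l2_victim is not None:
            self.l1.invalidate(l2_victim)    # back-invalidate to preserve inclusion
        return "L2 hit" if hit else "miss"

    def inclusion_holds(self):
        return set(self.l1.blocks) <= set(self.l2.blocks)
```

With inclusion enforced this way, coherence traffic from other processors need only probe the second-level cache; the first level is disturbed only when the probed block is actually present there.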

We examine three multiprocessor structures with a two-level cache hierarchy and discuss the feasibility of imposing the inclusion property in these structures. We explore in detail one particular structure, namely a shared-bus organization with a two-level virtual-real cache hierarchy. We show how the second-level cache can easily be extended to solve the synonym problem that results from using a virtually addressed cache at the first level. We also propose solutions to the context-switching overhead and cache-coherence problems in the context of a two-level virtual-real cache hierarchy. Our simulation results show that this organization has a performance advantage over a hierarchy of physically addressed caches in a multiprocessor environment.
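To illustrate the synonym problem and the kind of second-level extension described above, here is a hedged sketch (not the dissertation's design): a direct-mapped virtual L1 backed by a physical L2 whose tags record the L1 virtual index currently holding each block. When a physical block hits in L2 under a different virtual index than the one now requested, the stale L1 copy (the synonym) is invalidated before the new copy is installed, so at most one L1 copy of any physical block exists.

```python
L1_SETS = 4      # assumed number of sets in the direct-mapped virtual L1

l1 = {}          # virtual index -> physical block held there
l2 = {}          # physical block -> virtual index of its L1 copy

def access(vaddr, paddr):
    """Access through virtual address `vaddr` mapping to physical block `paddr`."""
    vindex = vaddr % L1_SETS
    if l1.get(vindex) == paddr:
        return "L1 hit"
    if paddr in l2:                      # physical block present in L2
        old = l2[paddr]
        if old != vindex and l1.get(old) == paddr:
            del l1[old]                  # invalidate the synonym copy in L1
        l1[vindex] = paddr               # install under the requested virtual index
        l2[paddr] = vindex               # remember where the L1 copy now lives
        return "L2 hit"
    # L2 miss: fetch from memory (not modelled) and fill both levels
    l1[vindex] = paddr
    l2[paddr] = vindex
    return "miss"
```

Because synonym detection happens entirely at the second level, the fast first-level virtual cache needs no reverse-translation hardware of its own.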

Finally, we propose improvements to current trace-driven cache simulations to make them faster and more economical. We attack the large time and space demands of cache simulation in two ways. First, we reduce the program traces while still allowing exact performance results to be obtained from the reduced traces. Second, we devise an algorithm that produces performance results for a variety of metrics (hit ratio, write-back counts, bus traffic) for a large number of set-associative write-back caches in a single simulation run. The trace-reduction and efficient-simulation techniques are extended to parallel multiprocessor cache simulations. Our simulation results show that our approach substantially reduces the disk space needed to store the program traces and can dramatically speed up cache simulations while still producing exact results.
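The one-pass idea can be sketched with the classic LRU stack-distance technique: under LRU, the depth of a reference in the recency stack determines whether it hits in a fully associative cache of any capacity, so a single pass over the trace yields hit ratios for all cache sizes at once. The dissertation's algorithm extends this style of single-run evaluation to set-associative write-back caches and additional metrics; the sketch below covers only the fully associative LRU case:

```python
def stack_distances(trace):
    """Yield the LRU stack distance of each reference in `trace`."""
    stack = []                           # recency stack, most recent at index 0
    for block in trace:
        if block in stack:
            d = stack.index(block) + 1   # depth in the stack (1-based)
            stack.remove(block)
        else:
            d = float("inf")             # cold miss: misses in every cache
        stack.insert(0, block)           # block becomes most recently used
        yield d

def hit_ratios(trace, capacities):
    """Hit ratio of a fully associative LRU cache at each capacity, in one pass."""
    dists = list(stack_distances(trace))
    return {c: sum(d <= c for d in dists) / len(trace) for c in capacities}
```

A reference hits in a cache of capacity c exactly when its stack distance is at most c, which is why one simulation run suffices for every capacity of interest.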

Contributors
  • Intel Corporation, Asia Pacific
  • University of Washington
