Translation-lookaside buffer consistency in highly-parallel shared-memory multiprocessors
Publisher:
  • New York University
  • 202 Tisch Hall Washington Square New York, NY
  • United States
Order Number:UMI Order No. GAX91-34697
Abstract

To implement virtual memory efficiently, virtual-to-physical address translation information is stored in page tables and cached in translation-lookaside buffers (TLBs). In multiprocessors with multiple TLBs, page-table modifications can leave outdated entries in TLBs, and using those entries can cause erroneous memory accesses.
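
The consistency problem described above can be illustrated with a toy model. This is only a sketch: the `TLB` class, the dictionary-based page table, and the page and frame numbers are hypothetical and are not taken from the dissertation.

```python
# Toy model: one shared page table and one private TLB per processor.
# All names and numbers here are hypothetical, for illustration only.

page_table = {0x10: 0xA0}          # virtual page number -> physical frame number


class TLB:
    def __init__(self):
        self.entries = {}          # cached copies of page-table entries

    def translate(self, vpn):
        if vpn not in self.entries:            # TLB miss: reload from the page table
            self.entries[vpn] = page_table[vpn]
        return self.entries[vpn]


tlb_p0, tlb_p1 = TLB(), TLB()
tlb_p0.translate(0x10)             # both processors cache the mapping
tlb_p1.translate(0x10)

page_table[0x10] = 0xB0            # the page is remapped to a new frame
tlb_p0.entries.pop(0x10)           # only processor 0 drops its cached copy

fresh = tlb_p0.translate(0x10)     # miss, so the current mapping is reloaded
stale = tlb_p1.translate(0x10)     # hit on the outdated entry
```

In this model `fresh` holds the new frame while `stale` still holds the old one, which is the kind of erroneous access a consistency scheme has to prevent.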

We propose three new solutions to this TLB consistency problem. Unlike other solutions for highly parallel shared-memory multiprocessors, they require no interprocessor synchronization or communication, and they neither interrupt processor execution nor introduce unnecessary serialization. The cost of these solutions is embodied in the cost of TLB reloads, which load translation information for referenced pages into TLBs. Two of the solutions assume TLBs at processors, and one assumes TLBs at memory.

We study the performance of these solutions in scalable multiprocessor architectures with multi-stage interconnection networks, using a trace-driven simulation system capable of simulating a range of architectures from a single address trace.

Our results show that system performance improves if TLBs are located at memory, rather than processors, provided that memory is organized as multiple paging arenas, where the mapping of pages to arenas is fixed.
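
One possible reading of a fixed page-to-arena mapping with memory-based TLBs is sketched below. The modulo mapping, the arena count, and the helper names `arena_of`, `map_page`, and `translate` are assumptions made for illustration; the abstract does not say how the dissertation assigns pages to arenas.

```python
# Hypothetical sketch: memory is split into paging arenas, each with its own
# page table and its own memory-side TLB. The mapping of pages to arenas is fixed.

NUM_ARENAS = 2                                      # assumed arena count


def arena_of(vpn, num_arenas=NUM_ARENAS):
    """Assumed fixed mapping: a page always belongs to the same arena."""
    return vpn % num_arenas


page_tables = [dict() for _ in range(NUM_ARENAS)]   # one page table per arena
tlbs = [dict() for _ in range(NUM_ARENAS)]          # one TLB per arena


def map_page(vpn, frame):
    """Install or change a mapping; only the owning arena's TLB can cache it."""
    a = arena_of(vpn)
    page_tables[a][vpn] = frame
    tlbs[a].pop(vpn, None)                          # drop the single cached copy, if any


def translate(vpn):
    """Translate at the owning arena, reloading its TLB on a miss."""
    a = arena_of(vpn)
    if vpn not in tlbs[a]:
        tlbs[a][vpn] = page_tables[a][vpn]
    return a, tlbs[a][vpn]


map_page(7, 0xA0)
first = translate(7)       # (arena, frame) before the remap
map_page(7, 0xB0)
second = translate(7)      # same arena, updated frame
```

In this toy model a page's translation can be cached in only one TLB, so a page-table change touches a single cached copy. Whether the dissertation's memory-based solution works this way is not stated in the abstract.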

A class of parallel workloads can produce a number of TLB reloads, R, that grows linearly with N. A set of our simulations for processor-based TLBs validates this model.

A processor-based TLB reload costs O(log N) because of network transit. Thus, the overhead of managing processor-based TLBs, whether or not consistency is ensured, grows as R log N.
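
The growth argument above can be sketched numerically. The sketch assumes that N is the number of processors, that a reload crosses log2 N stages of a network built from 2x2 switches, and that each processor contributes a constant number of reloads; the function name and the constants are hypothetical and are not measurements from the dissertation.

```python
import math


def reload_overhead(n_processors, reloads_per_processor=100.0, cost_per_stage=1.0):
    """Total reload overhead under the assumed model R = c * N, cost = k * log2(N)."""
    reloads = reloads_per_processor * n_processors       # R grows linearly with N
    cost_per_reload = cost_per_stage * math.log2(n_processors)
    return reloads * cost_per_reload                     # grows as R log N


small = reload_overhead(64)      # 64 processors, 6 network stages
large = reload_overhead(128)     # 128 processors, 7 network stages
ratio = large / small            # doubling N more than doubles the overhead
```

Under these assumptions, doubling N from 64 to 128 multiplies the overhead by 2 * 7 / 6, slightly more than 2, which is the superlinear growth that R log N implies when R is linear in N.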

The cost of a memory-based TLB reload within a paging arena can be made smaller than that of a processor-based TLB reload, since additional network transits are not required. Simulation results show that memory-based TLBs with one paging arena exhibit generally larger miss rates than processor-based TLBs of equal size, and the related overhead is generally larger. Memory-based TLBs with two paging arenas produce smaller miss rates than processor-based TLBs of equal size, and the related overhead is generally smaller. For memory-based TLBs to maintain low overhead for large machines, it is likely that the number of paging arenas must grow as O(N).

Contributors
  • The University of Texas at El Paso
