Abstract
In modern processors, the dynamic translation of virtual addresses to support virtual memory is done before or in parallel with the first-level cache access. As processor technology improves at a rapid pace and the working sets of new applications grow insatiably, the latency and bandwidth demands on the TLB (Translation Lookaside Buffer) are becoming increasingly difficult to meet. The situation is worse in multiprocessor systems, which run larger applications and are plagued by the TLB consistency problem. We evaluate and compare five options for virtual address translation in the context of COMAs (Cache Only Memory Architectures). The dynamic address translation mechanism can be located after the cache access, provided the cache is virtual. In a particular design, which we call V-COMA for Virtual COMA, the physical address concept and the traditional TLB are eliminated. While still supporting virtual memory, V-COMA reduces the address translation overhead to a minimum. V-COMA scales well and works better in systems with a large number of processors. As a machine running on virtual addresses, V-COMA provides a simple and consistent hardware model to the operating system and the compiler, in which further optimization opportunities are possible.
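The effect of where translation sits relative to the cache lookup can be illustrated with a toy model. The sketch below is not the paper's simulator and does not model V-COMA itself. It assumes an unbounded cache (cold misses only), a made-up page size and line size, and a hypothetical helper `count_translations`. It only counts how often a translation is needed when it happens before the cache access versus only on a miss in a virtual cache.

```python
PAGE_SIZE = 4096   # assumed page size in bytes (illustrative)
LINE_SIZE = 64     # assumed cache line size in bytes (illustrative)


def count_translations(addresses, translate_before_cache):
    """Count address translations for a trace of virtual addresses.

    Toy model: the cache is an unbounded set of line numbers, so each
    line misses exactly once (cold misses only).
    """
    page_table = {}       # virtual page number -> physical frame number
    cached_lines = set()  # line numbers currently held in the cache
    translations = 0

    def translate(vaddr):
        nonlocal translations
        translations += 1
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        # Allocate frames in first-touch order (hypothetical policy).
        frame = page_table.setdefault(vpn, len(page_table))
        return frame * PAGE_SIZE + offset

    for vaddr in addresses:
        if translate_before_cache:
            # Physical cache: translate first, then look up the physical line.
            line = translate(vaddr) // LINE_SIZE
            cached_lines.add(line)
        else:
            # Virtual cache: look up the virtual line, translate only on a miss.
            line = vaddr // LINE_SIZE
            if line not in cached_lines:
                translate(vaddr)
                cached_lines.add(line)
    return translations


trace = [0, 8, 64, 72, 4096, 0, 8]
print(count_translations(trace, translate_before_cache=True))   # 7
print(count_translations(trace, translate_before_cache=False))  # 3
```

In this trace every access is translated in the first case, while the virtual cache translates only on its three cold misses. Real designs must also handle synonyms, protection, and finite capacity, none of which this sketch attempts.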
Options for dynamic address translation in COMAs
ISCA '98: Proceedings of the 25th annual international symposium on Computer architecture