Abstract
We have developed a new technique for evaluating cache-coherent, shared-memory computers. The Wisconsin Wind Tunnel (WWT) runs a parallel shared-memory program on a parallel computer (the CM-5) and uses execution-driven, distributed, discrete-event simulation to accurately calculate program execution time. WWT is a virtual prototype that exploits similarities between the system under design (the target) and an existing evaluation platform (the host). The host directly executes all target program instructions and memory references that hit in the target cache. WWT's shared memory uses the CM-5 memory's error-correcting code (ECC) as valid bits for a fine-grained extension of shared virtual memory. Only memory references that miss in the target cache trap to WWT, which simulates a cache-coherence protocol. WWT correctly interleaves target machine events and calculates target program execution time. Because WWT runs on parallel computers, it has greater speed and memory capacity available than a uniprocessor-hosted simulator. WWT's simulation time decreases as target system size increases for fixed-size problems and holds roughly constant as the target system and problem scale.
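The split between directly executed cache hits and trapped cache misses can be illustrated with a toy model. The sketch below is not WWT's implementation: the class `TargetNode`, the block size, and the cycle costs `HIT_CYCLES` and `MISS_CYCLES` are all hypothetical, chosen only for illustration. It models a single target node, uses a set of block numbers as a stand-in for the per-block valid bits, and leaves out the coherence protocol and the interleaving of events across nodes.

```python
from dataclasses import dataclass, field

# Hypothetical costs in target cycles; the abstract does not give these values.
HIT_CYCLES = 1
MISS_CYCLES = 100


@dataclass
class TargetNode:
    """Toy model of one target processor and its cache (all names are assumptions)."""
    block_size: int = 32
    valid_blocks: set = field(default_factory=set)  # stand-in for per-block valid bits
    clock: int = 0    # simulated target time in cycles
    misses: int = 0

    def access(self, addr: int) -> None:
        """Model one memory reference by the target program."""
        block = addr // self.block_size
        if block in self.valid_blocks:
            # Hit: in the scheme described above, the host executes this directly.
            self.clock += HIT_CYCLES
        else:
            # Miss: the reference "traps" to the simulator.
            self.handle_miss(block)

    def handle_miss(self, block: int) -> None:
        """Stand-in for the simulated protocol: charge latency, mark the block valid."""
        self.misses += 1
        self.clock += MISS_CYCLES
        self.valid_blocks.add(block)

    def invalidate(self, block: int) -> None:
        """Stand-in for a coherence action that clears a block's valid bit."""
        self.valid_blocks.discard(block)


def run(addresses):
    """Replay a sequence of addresses on a fresh node and return it."""
    node = TargetNode()
    for addr in addresses:
        node.access(addr)
    return node
```

In this model only the miss path does any simulation work, which mirrors the reason the abstract gives for trapping only on misses: references that hit proceed at the cost of ordinary execution.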
The Wisconsin Wind Tunnel: virtual prototyping of parallel computers
SIGMETRICS '93: Proceedings of the 1993 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems