ABSTRACT
Time Warp is an optimistic synchronization protocol for parallel discrete event simulation that coordinates the available parallelism through its rollback and antimessage mechanisms. In this paper we present the results of a strong scaling study of the ROSS simulator running Time Warp with reverse computation and executing the well-known PHOLD benchmark on Lawrence Livermore National Laboratory's Sequoia Blue Gene/Q supercomputer. The benchmark has 251 million PHOLD logical processes and was executed in several configurations up to a peak of 7.86 million MPI tasks running on 1,966,080 cores. At the largest scale it processed 33 trillion events in 65 seconds, yielding a sustained speed of 504 billion events/second using 120 racks of Sequoia. This is by far the highest event rate reported by any parallel discrete event simulation to date, whether running PHOLD or any other benchmark. Additionally, we believe it is likely to be the largest number of MPI tasks ever used in any computation of any kind to date.
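The headline figures above can be cross-checked with simple arithmetic. The sketch below is an illustration only: it uses the rounded values quoted in the abstract, and all function and variable names are hypothetical rather than part of ROSS.

```python
# Rounded figures as quoted in the abstract (assumed inputs, not exact counts).
EVENTS_PROCESSED = 33e12      # "33 trillion events"
WALL_CLOCK_SECONDS = 65       # "65 seconds"
REPORTED_RATE = 504e9         # "504 billion events/second"
MPI_TASKS = 7.86e6            # "7.86 million MPI tasks"
CORES = 1_966_080             # cores at the largest scale
LOGICAL_PROCESSES = 251e6     # "251 million PHOLD logical processes"


def event_rate(events: float, seconds: float) -> float:
    """Sustained event rate in events per second."""
    return events / seconds


def relative_gap(value: float, reference: float) -> float:
    """Relative difference between a computed value and a reference value."""
    return abs(value - reference) / reference


computed_rate = event_rate(EVENTS_PROCESSED, WALL_CLOCK_SECONDS)
rate_gap = relative_gap(computed_rate, REPORTED_RATE)
tasks_per_core = MPI_TASKS / CORES
lps_per_task = LOGICAL_PROCESSES / MPI_TASKS

print(f"computed rate : {computed_rate:.3e} events/s")
print(f"gap vs report : {rate_gap:.2%}")
print(f"tasks per core: {tasks_per_core:.2f}")
print(f"LPs per task  : {lps_per_task:.2f}")
```

With these rounded inputs, the computed rate lands within about 1% of the reported 504 billion events/second, and the counts suggest roughly 4 MPI tasks per core and roughly 32 LPs per task at the largest scale.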
ROSS exhibited a super-linear speedup throughout the strong scaling study, with more than a 97x speed improvement from scaling the number of cores by only 60x (from 32,768 to 1,966,080). We attribute this to significant cache-related performance acceleration as we moved to higher scales with fewer LPs per core.
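To make the super-linear claim concrete, the speedup can be compared with the growth in core count. This is a minimal sketch using only the numbers quoted above; the names are hypothetical, and 97 is treated as a lower bound because the abstract says "more than 97x".

```python
BASE_CORES = 32_768
PEAK_CORES = 1_966_080
REPORTED_SPEEDUP = 97.0  # lower bound taken from the abstract


def core_scaling(base_cores: int, peak_cores: int) -> float:
    """Factor by which the core count grew."""
    return peak_cores / base_cores


def parallel_efficiency(speedup: float, scaling: float) -> float:
    """Speedup divided by resource growth; values above 1.0 are super-linear."""
    return speedup / scaling


scaling = core_scaling(BASE_CORES, PEAK_CORES)
efficiency = parallel_efficiency(REPORTED_SPEEDUP, scaling)

print(f"core scaling: {scaling:.0f}x")
print(f"efficiency  : {efficiency:.2f}")
```

An efficiency above 1.0 means each added core contributed more than its proportional share, which is consistent with the cache effect the authors describe as the number of LPs per core shrinks.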
Prompted by historical performance results, we propose a new, long-term performance metric called Warp Speed that grows logarithmically with the PHOLD event rate. As we define it, our maximum speed of 504 billion PHOLD events/sec corresponds to Warp 2.7.
We suggest that the results described here are significant because they demonstrate that direct simulation of planetary-scale discrete event models is now, in principle at least, within reach.
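The abstract does not give the formula for Warp Speed, only that it is logarithmic in the event rate and that 504 billion events/sec maps to Warp 2.7. One definition consistent with both statements, assumed here for illustration, is the base-10 logarithm of the event rate minus 9. The sketch below uses that assumed form; the function names are hypothetical.

```python
import math

# Assumed offset: chosen so that 504e9 events/s maps to Warp 2.7.
# The abstract does not state this constant.
WARP_OFFSET = 9.0


def warp_speed(events_per_second: float) -> float:
    """Assumed metric: log10 of the PHOLD event rate minus a fixed offset."""
    return math.log10(events_per_second) - WARP_OFFSET


def rate_for_warp(warp: float) -> float:
    """Inverse of warp_speed under the same assumption."""
    return 10.0 ** (warp + WARP_OFFSET)


peak_warp = warp_speed(504e9)
print(f"Warp at 504e9 events/s: {peak_warp:.1f}")
print(f"Rate needed for Warp 3: {rate_for_warp(3.0):.1e} events/s")
```

Under this assumption, each whole step in Warp Speed corresponds to a tenfold increase in event rate, which is what makes a logarithmic metric suited to tracking progress over long periods.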
Index Terms
- Warp speed: executing time warp on 1,966,080 cores