DOI: 10.1145/2486092.2486107
research-article

Interference resilient PDES on multi-core systems: towards proportional slowdown

Published: 19 May 2013

ABSTRACT

Parallel Discrete Event Simulation (PDES) harnesses the power of parallel processing to improve the performance and capacity of simulation, supporting bigger models, in more detail, and for more scenarios. PDES engines are typically designed and evaluated assuming a homogeneous parallel computing system that is dedicated to the simulation application. In this paper, we first show that interference from other users, even a single process in an arbitrarily large parallel environment, can dramatically slow down the simulation. We define a new metric, which we call proportional slowdown, that represents the idealized target for graceful slowdown in the presence of interference. We identify some of the reasons why simulators fall far short of proportional slowdown. Based on these observations, we design alternative simulation scheduling and mapping algorithms that are better able to tolerate interference. More precisely, the most resilient simulators allow dynamic mapping of simulation event execution to processing resources (a work pool model). However, this model has significant overhead and can substantially impact locality. Thus, we propose a locality-aware adaptive dynamic-mapping (LADM) algorithm for PDES on multi-core systems. LADM reduces the number of active threads in the presence of interference, so that no thread is left stalled by context switching. We show that LADM can substantially reduce the impact of interference while maintaining memory locality, narrowing the gap to proportional slowdown. LADM and similar techniques can also help in situations where there is load imbalance or processor heterogeneity.
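The abstract names proportional slowdown without giving a formula. As a rough illustration only, the sketch below assumes one plausible reading: runtime grows in inverse proportion to the compute capacity left to the simulation. It contrasts that ideal with a simplified lock-step model, also an assumption, in which every thread waits for the slowest one. The function names and the 8-core scenario are hypothetical and are not taken from the paper.

```python
def proportional_slowdown(cores: int, lost_capacity: float) -> float:
    """Idealized slowdown factor under the ASSUMED definition:
    runtime scales as total capacity / remaining capacity.

    cores         -- number of cores the simulation would otherwise own
    lost_capacity -- cores' worth of capacity taken by interfering work
    """
    if cores <= 0 or not 0 <= lost_capacity < cores:
        raise ValueError("need cores > 0 and 0 <= lost_capacity < cores")
    return cores / (cores - lost_capacity)


def lockstep_slowdown(slowest_share: float) -> float:
    """Slowdown in a simplified model where all threads advance at the
    pace of the slowest thread, which receives `slowest_share` of a core."""
    if not 0 < slowest_share <= 1:
        raise ValueError("slowest_share must be in (0, 1]")
    return 1.0 / slowest_share


if __name__ == "__main__":
    # Hypothetical scenario: 8 cores, one interfering process that
    # time-shares a single core equally with one simulation thread.
    ideal = proportional_slowdown(cores=8, lost_capacity=0.5)
    lockstep = lockstep_slowdown(slowest_share=0.5)
    print(f"ideal (proportional) slowdown: {ideal:.3f}x")
    print(f"lock-step slowdown:            {lockstep:.3f}x")
```

Under these assumptions, losing half a core out of eight costs under 7% in the ideal case, while a statically mapped simulation that is paced by its slowest thread runs at half speed. That gap is the kind of shortfall the abstract describes.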

        Published in

        SIGSIM PADS '13: Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
        May 2013
        426 pages
        ISBN: 9781450319201
        DOI: 10.1145/2486092

        Copyright © 2013 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        SIGSIM PADS '13 paper acceptance rate: 29 of 75 submissions, 39%. Overall acceptance rate: 398 of 779 submissions, 51%.