Interference resilient PDES on multi-core systems: towards proportional slowdown

Authors:
Jingjing Wang, Binghamton University, Binghamton, NY, USA
Nael Abu-Ghazaleh, Binghamton University, Binghamton, NY, USA
Dmitry Ponomarev, State University of New York at Binghamton, Binghamton, NY, USA

SIGSIM PADS '13: Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, May 2013, Pages 115–126
https://doi.org/10.1145/2486092.2486107

Published: 19 May 2013

ABSTRACT

Parallel Discrete Event Simulation (PDES) harnesses parallel processing to improve the performance and capacity of simulation, supporting bigger models, in more detail and for more scenarios. PDES engines are typically designed and evaluated assuming a homogeneous parallel computing system that is dedicated to the simulation application. In this paper, we first show that interference from other users, even a single process in an arbitrarily large parallel environment, can dramatically slow down the simulation. We define a new metric, which we call proportional slowdown, that represents the idealized target for graceful slowdown in the presence of interference. We identify some of the reasons why simulators fall far short of proportional slowdown. Based on these observations, we design alternative simulation scheduling and mapping algorithms that are better able to tolerate interference. More precisely, the most resilient simulators allow dynamic mapping of simulation event execution to processing resources (a work pool model). However, this model has significant overhead and can substantially hurt locality. Thus, we propose a locality-aware adaptive dynamic-mapping (LADM) algorithm for PDES on multi-core systems. LADM reduces the number of active threads in the presence of interference, which avoids having threads disabled due to context switching. We show that LADM can substantially reduce the impact of interference while maintaining memory locality, reducing the gap with proportional slowdown. LADM and similar techniques can also help in situations where there is load imbalance or processor heterogeneity.
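
The abstract does not give a formula for proportional slowdown. As an illustration only, the Python sketch below assumes one plausible reading: the ideal slowdown equals the ratio of total compute capacity to the capacity left after interference. The function names `proportional_slowdown`, `gap_to_proportional`, and `active_threads`, and the idea of counting interference in "core-equivalents", are hypothetical and are not taken from the paper. The last helper only mirrors the stated idea of running fewer threads when cores are contended; it is not the LADM algorithm.

```python
def proportional_slowdown(num_cores: int, lost_capacity: float) -> float:
    """Ideal slowdown under an ASSUMED capacity-ratio definition.

    num_cores:      cores the simulation would own on a dedicated machine.
    lost_capacity:  core-equivalents consumed by interfering processes
                    (hypothetical unit, e.g. 0.5 if one core is shared 50/50).
    """
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    if not 0.0 <= lost_capacity < num_cores:
        raise ValueError("lost_capacity must be in [0, num_cores)")
    return num_cores / (num_cores - lost_capacity)


def gap_to_proportional(observed_slowdown: float, ideal_slowdown: float) -> float:
    """How many times worse the observed slowdown is than the ideal target."""
    if ideal_slowdown <= 0:
        raise ValueError("ideal_slowdown must be positive")
    return observed_slowdown / ideal_slowdown


def active_threads(num_cores: int, interfered_cores: int) -> int:
    """Hypothetical policy: one simulation thread per uncontended core.

    This only illustrates 'reduce the number of active threads under
    interference'; it is not the paper's LADM policy.
    """
    if num_cores <= 0 or interfered_cores < 0:
        raise ValueError("invalid core counts")
    return max(1, num_cores - interfered_cores)


if __name__ == "__main__":
    ideal = proportional_slowdown(num_cores=8, lost_capacity=0.5)
    print(f"ideal slowdown on 8 cores with half a core lost: {ideal:.3f}")
    print(f"gap if the simulator actually runs 2x slower: "
          f"{gap_to_proportional(2.0, ideal):.3f}")
    print(f"threads to keep active with 1 contended core: {active_threads(8, 1)}")
```

Under this assumed definition, a dedicated machine (`lost_capacity=0.0`) has an ideal slowdown of exactly 1.0, and the gap ratio shows how far a measured run is from that target.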

Index Terms

1. Computing methodologies
  1. Modeling and simulation
    1. Simulation types and techniques
      1. Discrete-event simulation
      2. Massively parallel and high-performance simulations

Published in
SIGSIM PADS '13: Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
May 2013
426 pages
ISBN: 9781450319201
DOI: 10.1145/2486092
General Chair: Margaret L. Loper, Georgia Institute of Technology, USA
Program Chair: Gabriel A. Wainer, Carleton University, Canada
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
interference
multi-cores
pdes
proportional slowdown
Qualifiers
- research-article
Acceptance Rates
SIGSIM PADS '13 Paper Acceptance Rate: 29 of 75 submissions, 39%
Overall Acceptance Rate: 398 of 779 submissions, 51%