research-article

FNM: An Enhanced Null-Message Algorithm for Parallel Simulation of Multicore Systems

Published: 29 January 2016

Abstract

As multicore computer systems become increasingly complex, parallel simulation is becoming an important tool for exploring the design space and evaluating design tradeoffs. The key to the success of parallel simulation is the ability to maintain a high degree of parallelism under synchronization constraints. This article presents FNM, an enhanced Null-message algorithm that uses domain-specific knowledge to improve the performance of the basic Null-message algorithm. Based on their runtime states, the components of the simulation model can make a conservative forecast of future interprocess events. The forecast information is carried in the enhanced Null-messages, and, by combining the forecasts from both sides of an interprocess link, FNM can achieve a dynamic system lookahead that is much greater than what the static system structure provides. This improved lookahead allows better exploitation of the simulation model's inherent parallelism and leads to better performance. Compared with the basic Null-message algorithm, FNM greatly reduces the number of Null-messages and, as a result, improves parallel simulation performance, while guaranteeing simulation correctness just as the basic Null-message algorithm does. In tests on cycle-level models with up to 128 cores, FNM shows good scalability and proves to be an effective method.
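
The abstract does not give FNM's actual rules, so the following is only a minimal sketch of the general idea, under stated assumptions: a Null-message promises that no event will arrive before some timestamp, and a forecast can push that promise later than the static lookahead alone would. The names `LinkEndpoint`, `own_forecast`, `peer_promise`, and both functions are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class LinkEndpoint:
    """One side of an interprocess link (all fields are illustrative)."""
    clock: int             # current local simulation time, in cycles
    static_lookahead: int  # minimum link delay implied by the static model structure
    own_forecast: int      # earliest time this side could start an output on its own


def basic_null_timestamp(end: LinkEndpoint) -> int:
    """Simplified basic promise: nothing arrives before clock + lookahead."""
    return end.clock + end.static_lookahead


def forecast_null_timestamp(end: LinkEndpoint, peer_promise: int) -> int:
    """Assumed enhanced promise combining the local forecast with the peer's.

    An output is triggered either by local activity (not before own_forecast)
    or by an input from the peer (not before peer_promise), and never before
    the current clock.
    """
    earliest_trigger = max(end.clock, min(end.own_forecast, peer_promise))
    return earliest_trigger + end.static_lookahead


core_side = LinkEndpoint(clock=100, static_lookahead=5, own_forecast=160)
peer_promise = 140  # hypothetical bound carried in the peer's last Null-message

basic = basic_null_timestamp(core_side)
enhanced = forecast_null_timestamp(core_side, peer_promise)
print(basic, enhanced)  # prints "105 145"
```

In this toy setting the enhanced promise is never earlier than the basic one, which is why fewer Null-messages would be needed to advance the receiver's clock by the same amount.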

    • Published in

      ACM Transactions on Modeling and Computer Simulation, Volume 26, Issue 2
      January 2016
      152 pages
      ISSN: 1049-3301
      EISSN: 1558-1195
      DOI: 10.1145/2875131

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 29 January 2016
      • Accepted: 1 February 2015
      • Revised: 1 December 2014
      • Received: 1 April 2014

      Qualifiers

      • research-article
      • Research
      • Refereed
