DOI: 10.1145/1774088.1774582
research-article

Accelerating multi-core simulators

Published: 22 March 2010

ABSTRACT

Simulation is an important means of evaluating new microarchitectures. With the advent of multi-core (CMP) platforms, simulators are becoming larger and more complex. However, with the availability of CMPs with larger caches and higher operating frequencies, the wall clock time required to simulate an application has become comparatively shorter. Reducing this simulation time further is a great challenge, especially for multi-threaded workloads, because of the indeterminacy introduced by threads executing simultaneously. In this paper, we propose a technique for speeding up multi-core simulation. The models of the processor core and cache are replaced with functional models to achieve speedup. A timed Petri net model is used to estimate the execution time of the processor, and the memory access latencies are estimated using hit/miss information obtained from the functional model of the cache. This model can be used to predict the performance of data-parallel applications or multiprogramming workloads on CMP platforms with various cache hierarchies and a shared bus interconnect. The error in the estimated execution time of an application is within 6%. The average speedup achieved ranges from 2x to 4x over the cycle-accurate simulator.
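The idea of deriving memory latency from the hit/miss output of a functional cache model can be illustrated with a minimal sketch. This is not the authors' model: the paper uses a timed Petri net for processor timing, whereas the sketch below substitutes a simple additive cycle estimate. The class `DirectMappedCache`, the function `estimate_cycles`, and all latency values (`base_cpi`, `hit_latency`, `miss_latency`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DirectMappedCache:
    """Functional cache model (hypothetical): records hits and misses, no timing."""
    num_lines: int = 64
    line_size: int = 64  # bytes per cache line
    tags: dict = field(default_factory=dict)  # line index -> stored tag
    hits: int = 0
    misses: int = 0

    def access(self, addr: int) -> bool:
        """Look up one address; return True on a hit, False on a miss."""
        block = addr // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags.get(index) == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # fill the line, evicting any previous block
        self.misses += 1
        return False


def estimate_cycles(cache, num_instructions, base_cpi=1.0,
                    hit_latency=2, miss_latency=100):
    """Additive estimate (an assumption, not the paper's timed Petri net):
    core cycles plus a fixed latency charged per hit and per miss."""
    return (num_instructions * base_cpi
            + cache.hits * hit_latency
            + cache.misses * miss_latency)


def run_example():
    """Replay a tiny made-up address trace, one memory access per instruction."""
    cache = DirectMappedCache(num_lines=4, line_size=16)
    trace = [0, 4, 16, 0, 64, 0]
    for addr in trace:
        cache.access(addr)
    return cache.hits, cache.misses, estimate_cycles(cache, len(trace))


if __name__ == "__main__":
    print(run_example())  # (2, 4, 410.0)
```

Because the cache model only classifies accesses and never advances a clock, it runs much faster than a cycle-accurate cache; all timing is recovered afterwards from the hit and miss counts. A shared bus or multi-level hierarchy would need contention and per-level latencies that this sketch does not model.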


    Published in

      SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing
      March 2010
      2712 pages
      ISBN: 9781605586397
      DOI: 10.1145/1774088

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      SAC '10 Paper Acceptance Rate: 364 of 1,353 submissions, 27%
      Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%