ABSTRACT
This paper proposes a synchronization approach for fast and accurate Multi-Core Instruction-Set Simulation (MCISS). An ideal MCISS should run accurately and in real time. To achieve accurate simulation results, a lock-step approach, which synchronizes every cycle, is commonly used. However, this approach introduces immense overhead and lowers the simulation speed. Instead of synchronizing every cycle, our approach synchronizes the MCISS based on the data dependencies among the simulated programs. The synchronization overhead is therefore greatly reduced while accurate simulation results are still ensured. With the proposed approach applied, the simulation speed of MCISS generally ranges from 40 to 1,000 million instructions per second (MIPS).
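To make the contrast between lock-step and dependency-based synchronization concrete, the following minimal sketch counts synchronization points under each scheme for two toy instruction traces. It is an illustration only, not the paper's algorithm: the `Instr` type, the two counting functions, and the assumption that a synchronization is needed exactly once per shared-memory access are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical trace entry for one simulated instruction."""
    cycles: int            # simulated cycles the instruction takes
    shared: bool = False   # True if it reads or writes shared data (assumed sync point)


def lockstep_sync_points(traces):
    """Lock-step: every core synchronizes once per simulated cycle,
    so the count equals the cycle length of the longest trace."""
    return max(sum(i.cycles for i in trace) for trace in traces)


def dependency_sync_points(traces):
    """Dependency-based (simplified): synchronize only at instructions
    that touch shared data, where cross-core ordering can matter."""
    return sum(1 for trace in traces for i in trace if i.shared)


# Two toy cores, 100 simulated cycles each, with one shared access apiece.
core0 = [Instr(1)] * 98 + [Instr(2, shared=True)]
core1 = [Instr(1)] * 50 + [Instr(2, shared=True)] + [Instr(1)] * 48
traces = [core0, core1]

print("lock-step sync points:       ", lockstep_sync_points(traces))
print("dependency-based sync points:", dependency_sync_points(traces))
```

In this toy case the lock-step scheme synchronizes 100 times while the dependency-based scheme synchronizes twice, which is the kind of overhead reduction the abstract describes. Real programs with frequent shared accesses would see a smaller gap.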
An effective synchronization approach for fast and accurate multi-core instruction-set simulation