ABSTRACT
Analyzing parallel programs is hard mainly because their behavior changes from run to run. We present an execution capture and deterministic replay system that enables repeatable analysis of parallel programs. Our goal is to provide an easy-to-use framework for capturing, deterministically replaying, and analyzing the execution of large programs with reasonable runtime and disk usage. Our system, called PinPlay, is built on the popular Pin dynamic instrumentation system and is therefore easy to use. PinPlay extends Pin-based analysis in two ways: it provides a tool that captures one execution instance of a program as a set of log files called pinballs, and it allows Pin-based tools to run off the captured execution. Most Pintools can be trivially modified to work off pinballs, so they perform their usual analysis with guaranteed repeatability. Furthermore, capture and replay work across operating systems (Windows to Linux) because the pinball format is independent of the operating system. We have used PinPlay to analyze and deterministically debug large parallel programs running trillions of instructions. This paper describes the design of PinPlay and its applications to analyses such as simulation point selection, tracing, and debugging.
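The capture/replay idea can be illustrated with a toy sketch. This is not PinPlay's mechanism or the pinball format, neither of which the abstract details; it only assumes that the one source of nondeterminism is the order in which two simulated threads are scheduled. The names `run`, `record`, and `replay` are hypothetical and chosen for this illustration.

```python
import random


def run(steps_per_thread, choose):
    """Interleave toy 'threads'; `choose(runnable)` picks which one runs next."""
    pending = {tid: list(range(n)) for tid, n in steps_per_thread.items()}
    trace = []
    while any(pending.values()):
        runnable = sorted(t for t, steps in pending.items() if steps)
        tid = choose(runnable)
        trace.append((tid, pending[tid].pop(0)))  # thread `tid` executes its next step
    return trace


def record(steps_per_thread, seed=None):
    """Run with a random scheduler and log every scheduling decision."""
    rng = random.Random(seed)
    log = []

    def choose(runnable):
        tid = rng.choice(runnable)  # the nondeterministic event
        log.append(tid)             # capture it so it can be reproduced later
        return tid

    return run(steps_per_thread, choose), log


def replay(steps_per_thread, log):
    """Re-run the program, taking scheduling decisions from the log."""
    decisions = iter(log)

    def choose(runnable):
        tid = next(decisions)
        assert tid in runnable, "log does not match this program"
        return tid

    return run(steps_per_thread, choose)


if __name__ == "__main__":
    program = {"A": 3, "B": 3}
    original, log = record(program)
    # Any analysis run on the replayed execution sees the same interleaving.
    print(replay(program, log) == original)
```

Two calls to `record` may produce different interleavings, while every call to `replay` with the same log produces the same one. That is the property that lets an analysis tool be run repeatedly over a single captured execution.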
Index Terms
- PinPlay: a framework for deterministic replay and reproducible analysis of parallel programs