ABSTRACT
With the ubiquity of parallel commodity hardware, developers turn to high-level concurrency models such as the actor model to lower the complexity of concurrent software. However, debugging concurrent software is hard, especially for concurrency models with a limited set of supporting tools. Such tools often deal only with the underlying threads and locks, which obscures the view of higher-level concepts such as actors and messages and thereby introduces additional complexity.
To improve on this situation, we present a low-overhead record & replay approach for actor languages. It allows one to debug concurrency issues deterministically based on a previously recorded trace. Our evaluation shows that the average run-time overhead for tracing on benchmarks from the Savina suite is 10% (min. 0%, max. 20%). For Acme-Air, a modern web application, we see a maximum increase of 1% in latency for HTTP requests and about 1.4 MB/s of trace data. These results are a first step towards deterministic replay debugging of actor systems in production.
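The general idea behind record & replay can be illustrated with a small sketch. It is not the approach evaluated in the paper; it is a toy model that assumes the only source of non-determinism is the order in which one actor processes messages from different senders. All names (`Account`, `run`, `record`, `replay`, `OUTBOXES`) are hypothetical, and a seeded random number generator stands in for a real scheduler.

```python
import random
from collections import deque


class Account:
    """Toy actor whose final state depends on message processing order."""

    def __init__(self):
        self.balance = 10

    def receive(self, msg):
        op, amount = msg
        if op == "add":
            self.balance += amount
        elif op == "double":
            self.balance *= 2


def run(actor, outboxes, choose_sender, trace=None):
    """Deliver every message to `actor`; `choose_sender` picks whose turn it is.

    Messages from one sender stay in FIFO order; only the interleaving
    between senders varies. If `trace` is a list, each chosen sender ID
    is appended to it (record mode).
    """
    queues = {sender: deque(msgs) for sender, msgs in outboxes.items()}
    while any(queues.values()):
        ready = sorted(s for s, q in queues.items() if q)
        sender = choose_sender(ready)
        if trace is not None:
            trace.append(sender)
        actor.receive(queues[sender].popleft())
    return actor.balance


def record(outboxes, seed):
    """Run with a pseudo-random interleaving and log the sender order."""
    rng = random.Random(seed)  # assumption: stands in for scheduler non-determinism
    trace = []
    result = run(Account(), outboxes, rng.choice, trace)
    return result, trace


def replay(outboxes, trace):
    """Re-run the program, forcing the interleaving stored in `trace`."""
    remaining = iter(trace)

    def from_trace(ready):
        sender = next(remaining)
        assert sender in ready, "trace does not match this program"
        return sender

    return run(Account(), outboxes, from_trace)


# Hypothetical workload: two senders whose messages do not commute.
OUTBOXES = {
    "A": [("add", 5), ("add", 1)],
    "B": [("double", 0)],
}

if __name__ == "__main__":
    recorded_result, trace = record(OUTBOXES, seed=42)
    print(trace, recorded_result, replay(OUTBOXES, trace))
```

In this sketch the trace holds only one sender ID per processed message, not the message contents, which hints at why recording ordering information can be cheap. A real actor runtime would also have to capture other sources of non-determinism, such as external input.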
Index Terms
- Efficient and deterministic record & replay for actor languages