ABSTRACT
Register allocation is a mandatory task for almost every compiler and consumes a significant portion of compile time. In a just-in-time compiler, compile time is a particular issue because compilation happens during program execution and contributes to the overall application run time. Parallelization can help here. We developed a theoretical model for parallel register allocation and show that it can be used in practice without a negative impact on the quality of the allocation result. Doing so reduces compilation latency, i.e., the duration until the result of a compilation is available.
Our analysis shows that parallelization can theoretically decrease allocation latency by almost 50%. We implemented an initial prototype which reduces the register allocation latency by 28% when using four threads, compared to the single-threaded allocation.
- Bala, Vasanth, Evelyn Duesterwald, and Sanjeev Banerjia (2000). Dynamo: A Transparent Dynamic Optimization System. In: PLDI '00. ACM. Google ScholarDigital Library
- Barrett, Edd, Carl Friedrich Bolz-Tereick, Rebecca Killick, Sarah Mount, and Laurence Tratt (2017). Virtual Machine Warmup Blows Hot and Cold. In: Proc. ACM Program. Lang. Google ScholarDigital Library
- Blackburn, S. M. et al. (2006). The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In: OOPSLA'06. ACM Press. Google ScholarDigital Library
- Callahan, David and Brian Koblenz (1991). Register Allocation via Hierarchical Graph Coloring. In: SIGPLAN Not. Google ScholarDigital Library
- Chaitin, Gregory J, Marc A Auslander, Ashok K Chandra, John Cocke, Martin E Hopkins, and Peter W Markstein (1981). Register Allocation via Coloring. In: Computer languages. Google ScholarDigital Library
- Cytron, Ron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck (1991). Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. In: TOPLAS'91. Google ScholarDigital Library
- Duboscq, Gilles, Thomas Würthinger, Lukas Stadler, Christian Wimmer, Doug Simon, and Hanspeter Mössenböck (2013). An Intermediate Representation for Speculative Optimizations in a Dynamic Compiler. In: VMIL'13. Google ScholarDigital Library
- Eisl, Josef (2015). Trace Register Allocation. In: SPLASH Companion 2015. ACM. Google ScholarDigital Library
- Eisl, Josef, Matthias Grimmer, Doug Simon, Thomas Würthinger, and Hanspeter Mössenböck (2016). Trace-based Register Allocation in a JIT Compiler. In: PPPJ '16. ACM. Google ScholarDigital Library
- Eisl, Josef, Stefan Marr, Thomas Würthinger, and Hanspeter Mössenböck (2017). Trace Register Allocation Policies: Compile-time vs. Performance Trade-offs. In: ManLang 2017. ACM. Google ScholarDigital Library
- Fleming, Philip J. and John J. Wallace (1986). How Not To Lie With Statistics: The Correct Way To Summarize Benchmark Results. In: Communications of the ACM. Google ScholarDigital Library
- Gantt, Henry Laurence (1913). Work, Wages, and Profits. Second Edition. The Engineering Magazine Co.Google Scholar
- Hank, R.E., W.W. Hwu, and B.R. Rau (1995). Region-based compilation: an introduction and motivation. In: Proceedings of the 28th Annual International Symposium on Microarchitecture. Google ScholarDigital Library
- Häubl, Christian, Christian Wimmer, and Hanspeter Mössenböck (2013). Context-sensitive trace inlining for Java. In: Computer Languages, Systems & Structures. Google ScholarDigital Library
- Hennessy, John L. and David A. Patterson (2003). Computer Architecture: A Quantitative Approach. 3rd ed. Morgan Kaufmann Publishers Inc. <scp>isbn</scp>: 1558607242. Google ScholarDigital Library
- Hölzle, Urs, Craig Chambers, and David Ungar (1992). Debugging Optimized Code with Dynamic Deoptimization. In: SIGPLAN Not. Google ScholarDigital Library
- Kotzmann, Thomas, Christian Wimmer, Hanspeter Mössenböck, Thomas Rodriguez, Kenneth Russell, and David Cox (2008). Design of the Java HotSpot<sup>™</sup> client compiler for Java 6. In: TACO'08.Google Scholar
- Krintz, Chandra J., David Grove, Vivek Sarkar, and Brad Calder (2001). Reducing the overhead of dynamic compilation. In: Software: Practice and Experience.Google Scholar
- Leopoldseder, David, Lukas Stadler, Thomas Würthinger, Josef Eisl, Doug Simon, and Hanspeter Mössenböck (2018). Dominance-based Duplication Simulation (DBDS) - Code Duplication to Enable Compiler Optimizations. In: CGO'18. ACM. Google ScholarDigital Library
- Lueh, Guei-Yuan, Thomas Gross, and Ali-Reza Adl-Tabatabai (1997). "Global register allocation based on graph fusion". In: Lecture Notes in Computer Science. Springer Berlin Heidelberg.Google Scholar
- McIlroy, Ross (2018). V8 JavaScript Engine: Background compilation. url: https://v8project.blogspot.co.at/2018/03/background-compilation.html (visited on 03/28/2018).Google Scholar
- Oracle Corporation (2015). JRockit to HotSpot Migration Guide: Compilation Optimization. url: https://docs.oracle.com/javacomponents/jrockit-hotspot/migration-guide/comp-opt.htm (visited on 03/28/2018).Google Scholar
- Pinedo, Michael L. (2016). Scheduling: Theory, Algorithms, and Systems. 5th ed. Springer International Publishing.Google Scholar
- Poletto, Massimiliano and Vivek Sarkar (1999). Linear Scan Register Allocation. In: TOPLAS'99. Google ScholarDigital Library
- Prokopec, Aleksandar, David Leopoldseder, Gilles Duboscq, and Thomas Würthinger (2017). Making Collection Operations Optimal with Aggressive JIT Compilation. In: SCALA 2017. ACM. Google ScholarDigital Library
- Sewe, Andreas, Mira Mezini, Aibek Sarimbekov, and Walter Binder (2011). Da capo con scala. In: OOPSLA'11.Google Scholar
- Simon, Doug, Christian Wimmer, Bernhard Urban, Gilles Duboscq, Lukas Stadler, and Thomas Würthinger (2015). Snippets: Taking the High Road to a Low Level. In: TACO'15. Google ScholarDigital Library
- Stadler, Lukas, Gilles Duboscq, Hanspeter Mössenböck, Thomas Würthinger, and Doug Simon (2013). An Experimental Study of the Influence of Dynamic Compiler Optimizations on Scala Performance. In: SCALA'13. ACM. Google ScholarDigital Library
- Stadler, Lukas, Thomas Würthinger, and Hanspeter Mössenböck (2014). Partial Escape Analysis and Scalar Replacement for Java. In: CGO '14. ACM. Google ScholarDigital Library
- Traub, Omri, Glenn Holloway, and Michael D. Smith (1998). Quality and Speed in Linear-scan Register Allocation. In: PLDI '98. ACM. Google ScholarDigital Library
- Wimmer, Christian and Michael Franz (2010). Linear Scan Register Allocation on SSA Form. In: CGO'10. ACM. Google ScholarDigital Library
- Wimmer, Christian and Hanspeter Mössenböck (2005). Optimized Interval Splitting in a Linear Scan Register Allocator. In: VEE'05. ACM. Google ScholarDigital Library
Index Terms
- Parallel trace register allocation
Recommendations
Trace-based Register Allocation in a JIT Compiler
PPPJ '16: Proceedings of the 13th International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and ToolsState-of-the-art dynamic compilers often use global approaches, like Linear Scan or Graph Coloring, for register allocation. These algorithms consider the complete compilation unit for allocation, which increases the complexity of the implementation (...
Trace Register Allocation Policies: Compile-time vs. Performance Trade-offs
ManLang 2017: Proceedings of the 14th International Conference on Managed Languages and RuntimesRegister allocation is an integral part of compilation, regardless of whether a compiler aims for fast compilation or optimal code quality. State-of-the-art dynamic compilers often use global register allocation approaches such as linear scan. Recent ...
Trace register allocation
SPLASH Companion 2015: Companion Proceedings of the 2015 ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications: Software for HumanityThis paper proposes the idea of Trace Register Allocation, a register allocation approach that is tailored for just-in-time (JIT) compilation in the context of virtual machines with run-time feedback. The basic idea is to offload costly operations such ...
Comments