Abstract
Virtual machines (VMs) enable the distribution of programs in an architecture-neutral format, which can easily be interpreted or compiled. A long-running question in the design of VMs is whether a stack architecture or register architecture can be implemented more efficiently with an interpreter. We extend existing work on comparing virtual stack and virtual register architectures in three ways. First, our translation from stack to register code and optimization are much more sophisticated. The result is that we eliminate an average of more than 46% of executed VM instructions, with the bytecode size of the register machine being only 26% larger than that of the corresponding stack one. Second, we present a fully functional virtual-register implementation of the Java virtual machine (JVM), which supports Intel, AMD64, PowerPC and Alpha processors. This register VM supports inline-threaded, direct-threaded, token-threaded, and switch dispatch. Third, we present experimental results on a range of additional optimizations such as register allocation and elimination of redundant heap loads. On the AMD64 architecture the register machine using switch dispatch achieves an average speedup of 1.48 over the corresponding stack machine. Even using the more efficient inline-threaded dispatch, the register VM achieves a speedup of 1.15 over the equivalent stack-based VM.
- Alexander, W. and Wortman, D. 1975. Static and dynamic chracteristics of XPL programs. Computer 8, 11 (Nov.), 41--46. Google ScholarDigital Library
- Antonioli, D. N. and Pilz, M. 1998. Analysis of the Java class file format. Tech. rep. Google ScholarDigital Library
- Arnold, M., Hind, M., and Ryder, B. G. 2002. Online feedback-directed optimization of Java. In OOPSLA '02: Proceedings of the 17th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications. ACM Press, New York. 111--129. Google ScholarDigital Library
- Bell, J. R. 1973. Threaded code. Commun. ACM 16, 6, 370--372. Google ScholarDigital Library
- Berndl, M., Vitale, B., Zaleski, M., and Brown, A. D. 2005. Context threading: A flexible and efficient dispatch technique for virtual machine interpreters. In 2005 International Symposium on Code Generation and Optimization. Google ScholarDigital Library
- Briggs, P., Cooper, K. D., and Torczon, L. 1994. Improvements to graph coloring register allocation. ACM Transactions on Programming Languages and Systems 16, 3 (May), 428--455. Google ScholarDigital Library
- Bull, M., Smith, L., Westhead, M., Henty, D., and Davey, R. 2000. Benchmarking Java Grande applications. In Second Ineternational Conference and Exhibtion on the Practical Application of Java. Manchester, UK.Google Scholar
- Choi, J.-D., Grove, D., Hind, M., and Sarkar, V. 1999. Efficient and precise modeling of exceptions for the analysis of Java programs. In PASTE '99: Proceedings of the 1999 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering. ACM Press, New York. 21--31. Google ScholarDigital Library
- Clausen, L. R., Schultz, U. P., Consel, C., and Muller, G. 2000. Java bytecode compression for low-end embedded systems. ACM Trans. Program. Lang. Syst. 22, 3, 471--489. Google ScholarDigital Library
- Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., and Zadeck, F. K. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (Oct.), 451--490. Google ScholarDigital Library
- Davis, B., Beatty, A., Casey, K., Gregg, D., and Waldron, J. 2003. The case for virtual register machines. In Interpreters, Virtual Machines and Emulators (IVME '03). 41--49. Google ScholarDigital Library
- Diwan, A., McKinley, K. S., and Moss, J. E. B. 1998. Type-based alias analysis. In SIGPLAN Conference on Programming Language Design and Implementation. 106--117. Google ScholarDigital Library
- Eller, H. 2005. Optimizing interpreters with superinstructions. M.S. thesis, Institut für Computersprachen, Technische Universität Wien. http://www.complang.tuwien.ac.at/Dimplomarbeiten/eller05.ps.gz.Google Scholar
- Ertl, M. A. 1995. Stack caching for interpreters. In SIGPLAN '95 Conference on Programming Language Design and Implementation. 315--327. Google ScholarDigital Library
- Ertl, M. A. and Gregg, D. 2003. The structure and performance of efficient interpreters. The Journal of Instruction-Level Parallelism 5. http://www.jilp.org/vol5/.Google Scholar
- Ertl, M. A., Gregg, D., Krall, A., and Paysan, B. 2002. vmgen---A generator of efficient virtual machine interpreters. Software---Practice and Experience 32, 3, 265--294. Google ScholarDigital Library
- Ertl, M. A., Thalinger, C., and Krall, A. 2006. Superinstructions and replication in the Cacao JVM interpreter. Journal of .NET Technologies 4, 25--32. Journal papers from .NET Technologies 2006 conference.Google Scholar
- Fink, S., Knobe, K., and Sarkar, V. 2000. Unified analysis of array and object references in strongly typed languages. Lecture Notes in Computer Science Volume 1824 (Feb.), 155--174. Google ScholarDigital Library
- Gosling, J. 1995. Java Intermediate Bytecodes. In Proc. ACM SIGPLAN Workshop on Intermediate Representations. ACM Sigplan Notices, vol. 30:3. San Francisco, CA. 111--118. Google ScholarDigital Library
- Gregg, D., Beatty, A., Casey, K., Davis, B., and Nisbet, A. 2005. The case for virtual register machines. Science of Computer Programming, Special Issue on Interpreters Virtual Machines and Emulators 57, 319--338. Google ScholarDigital Library
- Ierusalimschy, R. L., de Figueiredo, H., and Celes, W. 1996. Lua---an extensible extension language. Software: Practice and Experience 26, 6, 635--652. Google ScholarDigital Library
- Ierusalimschy, R., de Figueiredo, L., and Celes, W. 2005. The implementation of Lua 5.0. Journal of Universal Computer Science 11, 7, 1159--1176. http://www.jucs.org/jucs_11_7/the_implementation_of_lua.Google Scholar
- Kistler, T. and Franz, M. 1999. A tree-based alternative to Java byte-codes. International Journal of Parallel Programming 27, 1, 21--33. Google ScholarDigital Library
- Klint, P. 1981. Interpretation techniques. Software---Practice and Experience 11, 963--973.Google ScholarCross Ref
- Krall, A. and Grafl, R. 1997. Cacao---a 64-bit JavaVM just-in-time compiler. Concurrency---Practice and Experience 9, 11, 1017--1030.Google ScholarCross Ref
- Lengauer, T. and Tarjan, R. E. 1979. A fast algorithm for finding dominators in a flowgraph. ACM Trans. Program. Lang. Syst. 1, 1, 121--141. Google ScholarDigital Library
- McGlashan, B. and Bower, A. 1999. The interpreter is dead (slow). Isn't it? In OOPSLA'99 Workshop: Simplicity, Performance and Portability in Virtual Machine Design.Google Scholar
- Mössenböck, H. 2000. Adding static single assignment form and a graph coloring register allocator to the Java Hotspot client compiler. Technical Report TR-15, Johannes Kepler University Linz Institute for Practical Computer Science, Altenbergerstra 69, A-4040 Linz.Google Scholar
- Muchnick, S. S. 1997. Advanced compiler design and implementation. Morgan Kaufmann, San Francisco, CA. Google ScholarDigital Library
- Myers, G. J. 1977. The case against stack-oriented instruction sets. Computer Architecture News 6, 3 (Aug.), 7--10. Google ScholarDigital Library
- Patterson, D. and Ditzel, D. 1980. The case for the reduced instruction set computer. Computer Architecture News 8, 6 (Oct.), 25--33. Google ScholarDigital Library
- Piumarta, I. and Riccardi, F. 1998. Optimizing direct threaded code by selective inlining. In SIGPLAN '98 Conference on Programming Language Design and Implementation. 291--300. Google ScholarDigital Library
- Pugh, W. 1999. Compressing Java class files. In PLDI '99: Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation. ACM Press, New York. 247--258. Google ScholarDigital Library
- Radhakrishnan, R., Vijaykrishnan, N., John, L. K., and Sivasubramaniam, A. 2000. Architectural issues in Java runtime systems. In Proceedings of the Sixth International Symposium on High-Performance Computer Architecture (8--12 Jan.). Toulouse, France. 387--398.Google Scholar
- Schulthess, P. and Mumprecht, E. 1977. Reply to the case against stack-oriented instruction sets. Computer Architecture News 6, 5 (Dec.), 24--27. Google ScholarDigital Library
- Shi, Y., Gregg, D., Beatty, A., and Ertl, M. A. 2005. Virtual machine showdown: stack versus registers. In ACM/SIGPLAN Conference on Virtual Execution Environments. ACM Press, New York. 153--163. Google ScholarDigital Library
- Shustek, L. 1978. Aanalysis and performance of computer instruction sets. Ph.D. thesis, Stanford University. Google ScholarDigital Library
- SPEC. 1998. SPEC releases SPEC JVM98, first industry-standard benchmark for measuring Java virtual machine performance. Press Release. http://www.spec.org/jvm98/press.html.Google Scholar
- Sugalski, D. 2002. Parrot in detail. In Yet Another Perl Conference (YAPC 02). Saint Louis, Missouri. http://www.parrotcode.org/talks/ParrotInDetail2.pdf.Google Scholar
- Sun-Microsystems. 2001. The Java Hotspot virtual machine. Tech. rep., Sun Microsystems Inc.Google Scholar
- Tip, F., Sweeney, P. F., Laffra, C., Eisma, A., and Streeter, D. 2002. Practical extraction techniques for Java. ACM Trans. Program. Lang. Syst. 24, 6, 625--666. Google ScholarDigital Library
- Vallée-Rai, R., Hendren, L., Sundaresan, V., Lam, P., Gagnon, E., and Co, P. 1999. Soot---a Java optimization framework. In Proceedings of CASCON 1999. 125--135.Google Scholar
- Vitale, B. and Abdelrahman, T. S. 2004. Catenation and specialization for Tcl virtual machine performance. In IVME '04: Proceedings of the 2004 workshop on Interpreters, Virtual Machines and Emulators. ACM Press, New York. 42--50. Google ScholarDigital Library
- Wiecek, C. 1982. A case study of the VAX 11 instruction set usage for compiler execution. In Proceedings of the Symposium on Architectural Support for Programming Languages and Operation Systems. IEEE/ACM, Palo Alto, California. 177--184. Google ScholarDigital Library
- Winterbottom, P. and Pike, R. 1997. The design of the Inferno virtual machine. In IEEE Compcon 97 Proceedings. San Jose, California. 241--244. Google ScholarDigital Library
Index Terms
- Virtual machine showdown: Stack versus registers
Recommendations
RegCPython: A Register-based Python Interpreter for Better Performance
Interpreters are widely used in the implementation of many programming languages, such as Python, Perl, and Java. Even though various JIT compilers emerge in an endless stream, interpretation efficiency still plays a critical role in program performance. ...
Virtual machine showdown: stack versus registers
VEE '05: Proceedings of the 1st ACM/USENIX international conference on Virtual execution environmentsVirtual machines (VMs) are commonly used to distribute programs in an architecture-neutral format, which can easily be interpreted or compiled. A long-running question in the design of VMs is whether stack architecture or register architecture can be ...
The case for virtual register machines
IVME '03: Proceedings of the 2003 workshop on Interpreters, virtual machines and emulatorsVirtual machines (VMs) are a popular target for language implementers. Conventional wisdom tells us that virtual stack architectures can be implemented with an interpreter more efficiently, since the location of operands is implicit in the stack ...
Comments