ABSTRACT
From a software engineering perspective, the Java programming language provides an attractive platform for writing numerically intensive applications. A major drawback hampering its widespread adoption in this domain has been its poor performance on numerical codes. This paper describes a prototype Java compiler which demonstrates that it is possible to achieve performance levels approaching those of current state-of-the-art C, C++ and Fortran compilers on numerical codes. We describe a new transformation called alias versioning that takes advantage of the simplicity of pointers in Java. This transformation, combined with other techniques that we have developed, enables the compiler to perform high order loop transformations (for better data locality) and parallelization completely automatically. We believe that our compiler is the first to have such capabilities of optimizing numerical Java codes. We achieve, with Java, between 80 and 100% of the performance of highly optimized Fortran code in a variety of benchmarks. Furthermore, the automatic parallelization achieves speedups of up to 3.8 on four processors. Combining this compiler technology with packages containing the features expected by programmers of numerical applications would enable Java to become a serious contender for implementing new numerical applications.
- 1.P. V. Artigas, M. Gupta, S. P. Midkiff, and J. E. Moreira. High performance numerical computing in Java: Language and compiler issues. In J. Ferrante et al., editors, 12th International Workshop on Languages and Compilers for Parallel Computing. Springer Vedag, August 1999. IBM Research Division report RC21482.]] Google ScholarDigital Library
- 2.U. Banerjee. Unimodular transformations of double loops. In Proc. Third Workshop on Programming Languages and Compilers for Parallel Computing, Irvine, California, August 1990.]]Google Scholar
- 3.U. Banerjee. Dependence Analysis. Loop Transformations for Restructuring Compilers. Kluwer Academic Publishers, Boston, MA, 1997.]] Google ScholarDigital Library
- 4.A. J. C. Bik and D. B. Gannon. A note on native level 1 BLAS in Java. Concurrency, Pract. Exp. (UK), 9(11):1091-1099, November 1997.]]Google Scholar
- 5.B. Blount and S. Chatterjee. An evaluation of Java for numerical computing. In Proceedings of lSCOPE'98, volume 1505 of Lecture Notes in Computer Science, pages 35-46. Springer Vedag, 1998.]] Google Scholar
- 6.R. E Boisvert, J. J. Dongarra, R. Pozo, K. A. Remington, and G. W. Stewart. Developing numerical libraries in Java. Concurrency, Pract. Exp. (UK), 10(11-13):1117-29, September-November 1998. ACM 1998 Workshop on Java for High-Performance Network Computing. URL: http : //www. cs. ucsb. edu/conferences / java98.]]Google Scholar
- 7.M. Byler, J. R. B. Davies, C. Huson, B. Leasure, and M. Wolfe. Multiple version loops. In Proceedings of the 1987 International Conference on Parallel Processing, pages 312-318, August 17-21 1987.]]Google Scholar
- 8.H. Casanova, J. Dongarra, and D. M. Doolin. Java access to numerical libraries. Concurrency, Pract. Exp. ( UK), 9( 11): 1279-91, November 1997. Java for Computational Science and Engineering - Simulation and Modeling II Las Vegas, NV, USA 21 June 1997.]]Google Scholar
- 9.M. Ciemiak and W. Li. Just-in-time optimization for high-performance Java programs. Concurrency, Pract. Exp. (UK), 9(11):1063-73, November 1997. Java for Computational Science and Engineering - Simulation and Modeling II, Las Vegas, NV, June 21, 1997.]]Google Scholar
- 10.K. D. Cooper and K. Kennedy. Faster interprocedural alias analysis. In Conference Record of the 16'th ACM Symposium on Principles of Programming Languages, pages 49-59, January 1989.]] Google ScholarDigital Library
- 11.M. Emami, R. Ghiya, and L. Hendren. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, pages 242-256, June 1994.]] Google ScholarDigital Library
- 12.R. Fitzgerald, T. B. Knoblock, E. Ruf, B. Steensgaard, and D. Tarditi. Marmot: an optimizing compiler for Java. Technical report, Microsoft Research, October 1998. URL: http : //research .microsoft. com/apl% discretionary//default, htm.]]Google Scholar
- 13.V. Getov, S. Flynn-Hummel, and S. Mintchev. High-performance parallel programming in Java: Exploiting native libraries. In ACM 1998 Workshop on Java for High-Performance Network Computing. ACM SIGPLAN, 1998. URL: http : //www. cs. ucsb. edu/ conferences / java 98.]]Google Scholar
- 14.M. Hind, M. Burke, R Carini, and J.-D. Choi. Interprocedural pointer alias analysis. ACM Transactions on Programming Languages and Systems, 21 (4):847-893, July 1999. available as IBM Research Report RC28752.]] Google ScholarDigital Library
- 15.J. Hummel, L. J. Hendren, and A. Nicolau. A general data dependence test for dynamic, pointer-based data structures. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, pages 242-256, June 1994.]] Google ScholarDigital Library
- 16.Java Grande Forum. Java Grande Forum Report: Making Java Work for High-End Computing, November 1998. Java Grande Forum Panel, SC98, Orlando, FL. URL: http : //www. j avagrande, org/reports, htm.]]Google Scholar
- 17.S. R Midkiff, J. E. Moreira, and M. Snir. Optimizing array reference checking in Java programs. IBM Systems Journal, 37(3):409-.453, August 1998.]] Google ScholarDigital Library
- 18.J. E. Moreira and S. E Midkiff. Fortran 90 in CSE: A case study. IEEE Computational Science & Engineering, 5(2):39-..49, April-June 1998.]] Google ScholarDigital Library
- 19.J. E. Moreira, S. P. Midkiff, and M. Gupta. From flop to megaflops: Java for technical computing. ACM Transactions on Programming Languages and Systems, 1999. (to appear); Also available as IBM Research Report RC 21166.]] Google ScholarDigital Library
- 20.J.E. Moreira, S. R Midkiff, M. Gupta, E V. Artigas, M. Snir, and R. D. Lawrence. Iavaptogramming for high performance numerical computing. IBM Systems Journal, 39(1):21-56, 2000. IBM Research Division report RC21481.]] Google ScholarDigital Library
- 21.D. A. Padua and M. J. Wolfe. Advanced compiler optimizations for supercomputers. Communications of the ACM, 29(12):1184-1201, December 1986.]] Google ScholarDigital Library
- 22.V. Sarkar. Automatic selection of high-order transformations in the IBM XL Fortran compilers. IBM Journal of Research and Development, 41(3):233-264, May 1997.]] Google ScholarDigital Library
- 23.M. Schwab and J. Sehroeder. Algebraic Java classes for numerical optimization. In ACM 1998 Workshop on Java for High-Performance Network Computing. ACM SIGPLAN, 1998. URL: http : //www. CS. ucsb. edu/ conferences / java98.]]Google Scholar
- 24.V. Seshadri. IBM high performance compiler for Java. AIXpert Magazine, September 1997. URL: http:// www. developer, ibm. com/l ibrary/aixpert.]]Google Scholar
- 25.R. R Wilson and M. S. Lam. Efficient context-sensitive pointer analysis for C programs. In SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 1-12, June 1995.]] Google ScholarDigital Library
- 26.M. E. Wolf and M. S, Lain. A loop transformation theory and an algorithm to maximize parallelism. IEEE Transactions on Parallel and Distributed Systems, 2(4):452-471, October 1991.]] Google ScholarDigital Library
- 27.J.-S. Yur, B. G. Ryder, and W. A. Landi. An incremental flow- and context-sensitive pointer aliasing analysis. In Proceedings of the 1999 International Conference on Software Engineering, pages 442.-451, 1999.]] Google ScholarDigital Library
Index Terms
- Automatic loop transformations and parallelization for Java
Comments