ABSTRACT
Although compile-time optimizations generally improve program performance, degradations caused by individual techniques are to be expected. One promising research direction to overcome this problem is the development of dynamic, feedback-directed optimization orchestration algorithms, which automatically search for the combination of optimization techniques that achieves the best program performance. The challenge is to develop an orchestration algorithm that finds, in an exponential search space, a solution that is close to the best, in acceptable time. In this paper, we build such a fast and effective algorithm, called Combined Elimination (CE). The key advance of CE over existing techniques is that it takes the least tuning time (57% of the closest alternative), while achieving the same program performance. We conduct the experiments on both a Pentium IV machine and a SPARC II machine, by measuring performance of SPEC CPU2000 benchmarks under a large set of 38 GCC compiler options. Furthermore, through orchestrating a small set of optimizations causing the most degradation, we show that the performance achieved by CE is close to the upper bound obtained by an exhaustive search algorithm. The gap is less than 0.2% on average.
- {1} Forte C 6 /Sun WorkShop 6 Compilers C User's Guide. http://docs.sun.com/app/docs/doc/806-3567.Google Scholar
- {2} GCC online documentation. http://gcc.gnu.org/onlinedocs/.Google Scholar
- {3} L. Almagor, K. D. Cooper, A. Grosul, T. J. Harvey, S. W. Reeves, D. Subramanian, L. Torczon and T. Waterman. Finding effective compilation sequences. In LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems , pages 231-239, New York, NY, USA, 2004. ACM Press. Google ScholarDigital Library
- {4} G. E. P. Box, W. G. Hunter and J. S. Hunter. Statistics for experimenters : an introduction to design, data analysis, and model building. John Wiley and Sons, 1978.Google Scholar
- {5} K. Chow and Y. Wu. Feedback-directed selection and characterization of compiler optimizations. In Second Workshop on Feedback Directed Optimizations, Israel, November 1999.Google Scholar
- {6} K. D. Cooper, D. Subramanian and L. Torczon. Adaptive optimizing compilers for the 21st century. The Journal of Supercomputing, 23(1):7-22, 2002. Google ScholarDigital Library
- {7} E. D. Granston and A. Holler. Automatic recommendation of compiler options. In 4th Workshop on Feedback-Directed and Dynamic Optimization (FDDO-4), December 2001.Google Scholar
- {8} M. Haneda, P. Knijnenburg and H. Wijshoff. Generating new general compiler optimization settings. In Proceedings of the 19th ACM International Conference on Supercomputing , pages 161-168, June 2005. Google ScholarDigital Library
- {9} A. Hedayat, N. Sloane and J. Stufken. Orthogonal Arrays: Theory and Applications. Springer, 1999.Google Scholar
- {10} T. Kisuki, P. M. W. Knijnenburg and M. F. P. O'Boyle. Combined selection of tile sizes and unroll factors using iterative compilation. In IEEE PACT, pages 237-248, 2000. Google ScholarDigital Library
- {11} T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, F. Bodin, and H. A. G. Wijshoff. A feasibility study in iterative compilation. In International Symposium on High Performance Computing (ISHPC'99), pages 121-132, 1999. Google ScholarDigital Library
- {12} P. Kulkarni, S. Hines, J. Hiser, D. Whalley, J. Davidson and D. Jones. Fast searches for effective optimization phase sequences. In PLDI '04: Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation , pages 171-182, New York, NY, USA, 2004. ACM Press. Google ScholarDigital Library
- {13} Z. Pan and R. Eigenmann. Compiler optimization orchestration for peak performance. Technical Report TR-ECE-04- 01, School of Electrical and Computer Engineering, Purdue University, 2004.Google Scholar
- {14} Z. Pan and R. Eigenmann. Rating compiler optimizations for automatic performance tuning. In SC2004: High Performance Computing, Networking and Storage Conference, page (10 pages), November 2004. Google ScholarDigital Library
- {15} R. P. J. Pinkers, P. M. W. Knijnenburg, M. Haneda and H. A. G. Wijshoff. Statistical selection of compiler options. In The IEEE Computer Societys 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS' 04), pages 494-501, Volendam, The Netherlands, October 2004. Google ScholarDigital Library
- {16} N. J. A. Sloane. A Library of Orthogonal Arrays. http://www.research.att.com/njas/oadir/.Google Scholar
- {17} M. Stephenson, S. Amarasinghe, M. Martin and U.-M. O'Reilly. Meta optimization: improving compiler heuristics with machine learning. In Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, pages 77-90. ACM Press, 2003. Google ScholarDigital Library
- {18} S. Triantafyllis, M. Vachharajani, N. Vachharajani and D. I. August. Compiler optimization-space exploration. In Proceedings of the international symposium on Code generation and optimization, pages 204-215, 2003. Google ScholarDigital Library
- {19} M. J. Voss and R. Eigenmann. High-level adaptive program optimization with ADAPT. In Proceedings of the eighth ACM SIGPLAN symposium on principles and practices of parallel programming, pages 93-102. ACM Press, 2001. Google ScholarDigital Library
- {20} R. C. Whaley and J. Dongarra. Automatically tuned linear algebra software. In SuperComputing 1998: High Performance Networking and Computing, 1998. Google ScholarDigital Library
Index Terms
- Fast and Effective Orchestration of Compiler Optimizations for Automatic Performance Tuning
Recommendations
Tuning Compiler Optimizations for Simultaneous Multithreading
Special issue on the 30th annual ACM/IEEE international symposium on microarchitecture, part IISimultaneous Multithreading (SMT) is a processor architectural technique that promises to significantly improve the utilization and performance of modern wide-issue superscalar processors. An SM T processor is capable of issuing multiple instructions ...
Tuning compiler optimizations for simultaneous multithreading
MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on MicroarchitectureCompiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-memory multiprocessors, parallel programs are compiled to minimize sharing, in ...
PEAK—a fast and effective performance tuning system via compiler optimization orchestration
Compile-time optimizations generally improve program performance. Nevertheless, degradations caused by individual compiler optimization techniques are to be expected. Feedback-directed optimization orchestration systems generate optimized code versions ...
Comments