Abstract
Users today have access to a large number of compiler optimizations. These optimizations interact with each other and with the input code in many complex ways, and the order in which optimization passes are applied can have a significant impact on the performance achieved. The effect of the optimizations depends on both the platform and the application. Exhaustively exploring all viable sequences of compiler optimizations for a given code fragment is not feasible. Because this exploration is complex and time-consuming, several researchers have focused on Design Space Exploration (DSE) strategies, both to select optimization sequences that improve the performance of each function of the application and to reduce the exploration time. In this article, we present a DSE scheme that clusters similar functions and then explores a reduced search space built by combining the optimizations previously suggested for the functions in each group. Similarities between functions are identified with a data mining method applied to a symbolic code representation. The data mining process combines three algorithms to generate clusters: the Normalized Compression Distance, Neighbor Joining, and a new ambiguity-based clustering algorithm. To evaluate the effectiveness of the proposed approach, our experiments explore optimization sequences in the context of the ReflectC compiler, considering 49 compilation passes, targeting a Xilinx MicroBlaze processor, and aiming at performance improvements for 51 functions and four applications. Experimental results show that our clustering-based DSE approach significantly reduces the total exploration time of the search space (20× over a Genetic Algorithm approach) while the optimized code still achieves considerable performance speedups (41% over the baseline).
Additional experiments were performed with the LLVM compiler, considering 124 compilation passes and targeting a LEON3 processor. The results show that our approach achieved geometric mean speedups of 1.49×, 1.32×, and 1.24× for the best 10, 20, and 30 functions, respectively, and a global improvement of 7% over the performance obtained when compiling with -O2.
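As an illustration of the first of the three clustering ingredients, the Normalized Compression Distance between two strings x and y is commonly defined as NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed size. The minimal sketch below computes it in Python. It assumes zlib as the compressor and uses a made-up token alphabet (`OPS`) and a hypothetical helper (`random_function`) to stand in for symbolic function encodings; none of these are taken from the article, whose symbolic representation and toolchain may differ.

```python
import random
import zlib

# Hypothetical token alphabet standing in for a symbolic code representation.
OPS = [b"ld;", b"st;", b"add;", b"mul;", b"cmp;", b"br;", b"call;", b"shl;"]


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_function(seed: int, length: int = 500) -> list:
    """Deterministic pseudo-random token sequence (illustrative only)."""
    rng = random.Random(seed)
    return [rng.choice(OPS) for _ in range(length)]


# A base "function", a near copy with every 25th token replaced,
# and an unrelated sequence generated from a different seed.
base = random_function(seed=1)
near = list(base)
for i in range(0, len(near), 25):
    near[i] = b"nop;"
far = random_function(seed=2)

d_near = ncd(b"".join(base), b"".join(near))
d_far = ncd(b"".join(base), b"".join(far))
print(f"NCD(base, near) = {d_near:.3f}")
print(f"NCD(base, far)  = {d_far:.3f}")
```

In this sketch the near copy should come out closer to the base sequence than the unrelated one, which is the property a clustering step would rely on. A pairwise matrix of such distances is the kind of input a tree-building method such as Neighbor Joining takes.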
Index Terms
- Clustering-Based Selection for the Exploration of Compiler Optimization Sequences