ABSTRACT
Programming standards such as OpenMP, OpenCL, and MPI are frequently treated as programming languages for developing parallel applications on their respective target architectures. Nevertheless, compilers handle them as ordinary APIs invoked by an otherwise sequential host language. Their parallel control flow remains hidden within opaque runtime-library calls embedded in a sequential intermediate representation that lacks any notion of parallelism. Consequently, the tuning and coordination of parallelism lie beyond the scope of conventional optimizing compilers and are left to the programmer or the runtime system.
The main objective of the Insieme compiler is to overcome this limitation by utilizing INSPIRE, a unified, parallel, high-level intermediate representation. Instead of mapping parallel constructs and APIs to external routines, their behavior is modeled explicitly using a unified, fixed set of parallel language constructs. Making the parallel control flow accessible to the compiler lays the foundation for developing reusable static and dynamic analyses and transformations that bridge the gap between a variety of parallel paradigms.
In this paper we describe the structure of INSPIRE and elaborate on the considerations that influenced its design. Furthermore, we demonstrate its expressiveness by illustrating the encoding of a variety of parallel language constructs, and we evaluate its ability to preserve performance-relevant aspects of input codes.
INSPIRE: the insieme parallel intermediate representation