Effective automatic parallelization of programs requires solving two problems. First, a compiler must discover what parallelism is available in the program, and second, it must compile the available parallelism to execute efficiently on a multiprocessor. This thesis examines the first problem for scientific and engineering applications written in FORTRAN. Since most of the parallelism in such programs occurs within loops, the techniques studied are aimed at exposing loop-level parallelism.
The roles of specific optimizations and analyses for exposing parallelism are explored in the absence of machine constraints. In particular, the roles of transformations that help parallelize loops that are not trivially parallel (that is, loops that initially contain loop-carried dependences) are explored. These include privatization and scalar expansion (which eliminate dependences involving scalars) and loop distribution (which splits loops that contain both parallel and sequential regions).
Data-manipulation algorithms used in vectorizing compilers are analyzed to demonstrate that these techniques require rethinking in the context of parallelization. Highly flexible algorithms for implementing scalar expansion and privatization are proposed, and experimental results show that they outperform previous algorithms.
In addition to applying parallelization transformations, the compiler system performs a variety of analyses and transformations, ranging from interprocedural analysis to traditional scalar optimizations. By applying different combinations of these analyses and transformations, their interactions and individual effectiveness are explored.
After investigating the parallelism that can be uncovered in a set of real programs and comparing it to the parallelism inherently available in those applications, the reasons why these transformation techniques fail to uncover all of it are examined. This investigation leads to suggestions for new transformations, and for enhancements to existing ones, that would expose more parallelism.