Exploiting instruction level parallelism in the presence of conditional branches

January 1997

Author:
Scott Alan Mahlke

Publisher:

University of Illinois at Urbana-Champaign
Champaign, IL
United States

Order Number: UMI Order No. GAX97-17305

Abstract

Wide-issue superscalar and VLIW processors utilize instruction-level parallelism (ILP) to achieve high performance. However, if insufficient ILP is found, the performance potential of these processors suffers dramatically. Branch instructions, which are one of the major limitations to exploiting ILP, enforce strict ordering conditions in programs to ensure correct execution. Therefore, it is difficult to achieve the desired overlap of instruction execution with branches in the instruction stream. Effectively exploiting ILP in the presence of branches requires efficient handling of branches and the dependences they impose.

This dissertation investigates two techniques for exposing and enhancing ILP in the presence of branches: speculative execution and predicated execution. Speculative execution enables an ILP compiler to remove dependences between instructions and prior branches. In this manner, the execution of instructions and predicted future instructions may be overlapped. Compiler-controlled speculative execution is employed using an efficient structure called the superblock. The formation and optimization of superblocks increase ILP along important execution paths by systematically removing constraints due to unimportant paths. In conjunction with superblock optimizations, speculative execution is used to remove control dependences in the superblock so that instructions can be aggressively reordered across branches, achieving a high degree of execution overlap.
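As a rough illustration of the code motion that speculative execution permits, the sketch below hoists an operation above the branch it originally depended on. It is not taken from the dissertation: the function names and the arithmetic are hypothetical, and it assumes the hoisted operation cannot raise an exception and writes only to a temporary that the other path never reads. Python runs both versions sequentially, so the example shows only that the reordering preserves results, not the overlap a wide-issue machine would gain.

```python
def original_order(a, b, take_path):
    """The multiply is control dependent on the branch and must wait for it."""
    if take_path:          # branch
        t = a * b          # may only start once the branch is resolved
        return t + 1
    return 0


def speculated_order(a, b, take_path):
    """The multiply is hoisted above the branch (speculated).

    Assumption: `a * b` cannot fault and `t` is dead on the other path,
    so executing it early is harmless when the branch goes the other way.
    """
    t = a * b              # executed before the branch outcome is known
    if take_path:          # branch
        return t + 1
    return 0               # speculated result is simply discarded
```

In a scheduler, the hoisted multiply could be issued in the same cycle as the comparison that feeds the branch, which is where the execution overlap described above would come from.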

For many applications, speculative execution alone is not sufficient to achieve high performance. The fundamental limitation is that speculation only removes dependences between branches and other instructions. The branches themselves remain in the code, which causes difficult problems. This motivates the second technique investigated in this dissertation, predicated execution, which is an architectural capability that enables the conditional execution of instructions based on the value of a Boolean source operand. Predicated execution allows a compiler to eliminate branch instructions using this conditional execution support. Additionally, predicated execution provides an efficient interface for the compiler to overlap the execution of multiple paths of control. Predicated execution is exploited in the compiler via a generalized form of a superblock, called the hyperblock. Hyperblocks provide the framework for the compiler to selectively eliminate branches using predicated execution as well as apply speculative execution to exploit ILP.
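The branch elimination described above (often called if-conversion) can be sketched as follows. This is a minimal model under stated assumptions, not the dissertation's compiler algorithm or any real instruction set: the register file is a plain dictionary, and `predicated`, `p1`, `p2`, and `r` are hypothetical names. The `if` inside `predicated` stands in for hardware nullifying an instruction whose predicate is false; it is not a branch in the modeled program.

```python
def abs_diff_branchy(x, y):
    """Original code: two control paths selected by a branch."""
    if x > y:
        r = x - y
    else:
        r = y - x
    return r


def predicated(regs, pred, dest, value):
    """Model one predicated instruction: write `dest` only if `pred` is true.

    The `if` models hardware nullification of the instruction.
    """
    if regs[pred]:
        regs[dest] = value


def abs_diff_if_converted(x, y):
    """If-converted code: both paths are issued, no branch remains."""
    regs = {"r": 0}
    # Predicate define: one comparison sets a predicate and its complement.
    regs["p1"] = x > y
    regs["p2"] = not regs["p1"]
    # Instructions from both paths, each guarded by its path's predicate.
    predicated(regs, "p1", "r", x - y)
    predicated(regs, "p2", "r", y - x)
    return regs["r"]
```

Because the two guarded instructions no longer depend on a branch, a scheduler is free to issue them together, which is one way the overlap of multiple control paths could arise.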

Contributors

Scott Alan Mahlke
University of Michigan, Ann Arbor

Index Terms

Exploiting instruction level parallelism in the presence of conditional branches
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
      1. Very long instruction word
    2. Serial architectures
      1. Complex instruction set computing
      2. Reduced instruction set computing
2. Theory of computation
  1. Models of computation
    1. Concurrency
      1. Parallel computing models

Recommendations

Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading

To achieve high performance, contemporary computer systems rely on two forms of parallelism: instruction-level parallelism (ILP) and thread-level parallelism (TLP). Wide-issue super-scalar processors exploit ILP by executing multiple instructions from a ...
Efficient Exploitation of Instruction-Level Parallelism for Superscalar Processors by the Conjugate Register File Scheme

This paper introduces a novel superscalar micro-architecture, called IAS-S, and its related software techniques. We treat two basic problems in superscalar machines. First, we seek a feasible hardware platform which allows the compiler to perform more ...
Exploiting Java instruction/thread level parallelism with horizontal multithreading

Java bytecodes can be executed with the following three methods: a Java interpretor running on a particular machine interprets bytecodes; a Just-In-Time (JIT) compiler translates bytecodes to the native primitives of the particular machine and the ...
