ABSTRACT
Pipelining allows processors to exploit parallelism. Unfortunately, critical loops (pieces of logic that must evaluate in a single cycle to meet IPC (Instructions Per Cycle) goals) prevent deeper pipelining. In today's processors, one of these loops is the instruction scheduling (wakeup and select) logic [10]. This paper describes a technique that pipelines this loop by breaking it into two smaller loops: a critical, single-cycle loop for wakeup, and a non-critical, potentially multi-cycle loop for select. For the 12 SPECint2000 benchmarks, a machine with two-cycle select logic (i.e., three-cycle scheduling logic) using this technique has an average IPC 15% greater than a machine with three-cycle pipelined conventional scheduling logic, and an IPC within 3% of a machine with the same pipeline depth and one-cycle (ideal) scheduling logic. Since select accounts for more than half the scheduling latency [10], this technique could significantly increase clock frequency while having minimal impact on IPC.
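To make the "critical loop" claim concrete, the sketch below models a conventional wakeup-and-select scheduler. It does not implement the select-free technique itself. It only shows why stretching the scheduling loop over several cycles hurts dependent instructions. The function name `schedule`, the `deps` encoding, the oldest-first select policy, and the single-cycle execution latency are all illustrative assumptions, not details taken from the paper.

```python
def schedule(deps, loop_latency, issue_width=4):
    """Return the cycle in which each instruction is selected.

    deps[i] lists the producers of instruction i (each index must be < i).
    loop_latency is the number of cycles between selecting a producer and
    its dependents becoming eligible for selection (1 = back-to-back issue).
    Assumption: every instruction executes in a single cycle.
    """
    issue_cycle = {}
    cycle = 0
    while len(issue_cycle) < len(deps):
        # Wakeup: an instruction is ready once every producer was
        # selected at least loop_latency cycles earlier.
        ready = [
            i for i in range(len(deps))
            if i not in issue_cycle
            and all(
                p in issue_cycle and issue_cycle[p] + loop_latency <= cycle
                for p in deps[i]
            )
        ]
        # Select: oldest first (assumed policy), limited by issue width.
        for i in ready[:issue_width]:
            issue_cycle[i] = cycle
        cycle += 1
    return [issue_cycle[i] for i in range(len(deps))]


# A chain of four dependent instructions: 0 -> 1 -> 2 -> 3.
chain = [[], [0], [1], [2]]
print(schedule(chain, loop_latency=1))  # back-to-back issue
print(schedule(chain, loop_latency=2))  # one bubble between dependents

# Four independent instructions are unaffected by the loop latency.
independent = [[], [], [], []]
print(schedule(independent, loop_latency=2))
```

In this model, a dependent chain of N instructions needs 1 + (N - 1) × `loop_latency` cycles to issue, while independent instructions are limited only by the issue width. That gap is the IPC cost of naively pipelining the scheduling loop.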
REFERENCES
- R. Canal and A. González. A low-complexity issue logic. In Proceedings of the 2000 International Conference on Supercomputing, pages 327-335, 2000.
- A. Chandrakasan, W. J. Bowhill, and F. Fox, editors. Design of High-Performance Microprocessor Circuits. IEEE Press, 2001.
- J. A. Farrell and T. C. Fischer. Issue logic for a 600-MHz Out-of-Order execution microprocessor. IEEE Journal of Solid-State Circuits, 33(5), 1998.
- D. S. Henry, B. C. Kuszmaul, G. H. Loh, and R. Sami. Circuits for wide-window superscalar processors. In Proceedings of the 27th Annual International Symposium on Computer Architecture, pages 236-247, 2000.
- G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The microarchitecture of the Pentium 4 processor. Intel Technical Journal, Feb. 2001. Q1 2001 Issue.
- Intel Corporation. IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic Architecture, 2001.
- P. Michaud and A. Seznec. Data-flow prescheduling for large instruction windows in out-of-order processors. In Proceedings of the Seventh IEEE International Symposium on High Performance Computer Architecture, pages 27-36, 2001.
- E. Morancho, J. M. Llabería, and À. Olivé. Recovery mechanism for latency misprediction. In Proceedings of the 2001 ACM/IEEE International Conference on Parallel Architectures and Compilation Techniques, 2001.
- S. Önder and R. Gupta. Superscalar execution with dynamic data forwarding. In Proceedings of the 1998 ACM/IEEE Conference on Parallel Architectures and Compilation Techniques, pages 130-135, 1998.
- S. Palacharla, N. P. Jouppi, and J. E. Smith. Complexity-effective superscalar processors. In Proceedings of the 24th Annual International Symposium on Computer Architecture, 1997.
- J. Stark, M. D. Brown, and Y. N. Patt. On pipelining dynamic instruction scheduling logic. In Proceedings of the 33rd Annual ACM/IEEE International Symposium on Microarchitecture, 2000.
- S. Weiss and J. E. Smith. Instruction issue logic in pipelined supercomputers. IEEE Transactions on Computers, C-33(11):1013-1022, Nov. 1984.
- K. C. Yeager. The MIPS R10000 superscalar microprocessor. IEEE Micro, 16(2):28-41, Apr. 1996.
- Select-free instruction scheduling logic