Article

Compiler-assisted leakage energy optimization for clustered VLIW architectures

Authors:
Rahul Nagpal, Indian Institute of Science, Bangalore, India
Y. N. Srikant, Indian Institute of Science, Bangalore, India

EMSOFT '06: Proceedings of the 6th ACM & IEEE International conference on Embedded software, October 2006, Pages 233–241
https://doi.org/10.1145/1176887.1176921

Published: 22 October 2006

ABSTRACT

Miniaturization of devices and the ensuing decrease in threshold voltage have led to a substantial increase in the leakage component of total processor energy consumption. Because VLIW and clustered VLIW architectures have relatively simple issue logic and a large number of functional units, a large fraction of this leakage energy is consumed in the functional units. However, functional units are not fully utilized in VLIW architectures because of the inherent variations in the ILP of programs. This underutilization is even more pronounced in clustered VLIW architectures, where contention for the limited number of slow intercluster communication channels leads to many short idle cycles.

In the past, some architectural schemes have been proposed to obtain leakage energy benefits by aggressively exploiting the idleness of functional units. However, the presence of many short idle cycles causes frequent transitions from the active mode to the sleep mode and vice versa, which adversely affects the energy benefits of a purely hardware-based scheme. In this paper, we propose and evaluate a compiler instruction scheduling algorithm that assists such a hardware-based scheme in the context of VLIW and clustered VLIW architectures. The proposed scheme exploits the scheduling slacks of instructions to orchestrate the functional unit mapping with the objective of reducing the number of transitions in functional units, thereby keeping them off for a longer duration. The proposed compiler-assisted scheme obtains a further 12% reduction in the energy consumption of functional units, with negligible performance degradation, over a hardware-only scheme for a VLIW architecture. The benefits are 15% and 17% for a 2-clustered and a 4-clustered VLIW architecture, respectively. Our test bed uses the Trimaran compiler infrastructure.
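
The idea of using scheduling slack to reduce functional unit transitions can be illustrated with a toy sketch. This is not the algorithm from the paper. It is a simplified greedy placement under stated assumptions: each operation is given a hypothetical (name, earliest cycle, latest cycle) window, dependences between operations are ignored, every unit can execute every operation, and each maximal run of busy cycles on a unit costs exactly one sleep-to-active transition. The names wakeups, place and total_wakeups are illustrative only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input format: (name, earliest cycle, latest cycle).
Op = Tuple[str, int, int]


def wakeups(busy: Set[int]) -> int:
    """Count sleep-to-active transitions: one per maximal run of busy cycles."""
    return sum(1 for c in busy if c - 1 not in busy)


def place(ops: List[Op], n_units: int, use_slack: bool) -> Dict[int, Set[int]]:
    """Greedily bind each operation to a (cycle, unit) slot.

    use_slack=False: place at the earliest cycle on the lowest-numbered free unit.
    use_slack=True:  search the whole slack window and pick the slot that adds
                     the fewest wake-ups (ties: earliest cycle, lowest unit).
    """
    busy: Dict[int, Set[int]] = {u: set() for u in range(n_units)}
    for name, early, late in sorted(ops, key=lambda op: (op[1], op[2])):
        last = late if use_slack else early
        candidates = []
        for cycle in range(early, last + 1):
            for unit in range(n_units):
                if cycle in busy[unit]:
                    continue  # slot already taken
                added = wakeups(busy[unit] | {cycle}) - wakeups(busy[unit])
                cost = added if use_slack else 0
                candidates.append((cost, cycle, unit))
        if not candidates:
            raise ValueError(f"no free slot for {name}")
        _, cycle, unit = min(candidates)
        busy[unit].add(cycle)
    return busy


def total_wakeups(busy: Dict[int, Set[int]]) -> int:
    return sum(wakeups(cycles) for cycles in busy.values())


if __name__ == "__main__":
    ops = [("a", 0, 0), ("b", 0, 1), ("c", 2, 2), ("d", 2, 3), ("e", 4, 4)]
    baseline = place(ops, n_units=2, use_slack=False)
    aware = place(ops, n_units=2, use_slack=True)
    print(total_wakeups(baseline))  # 5
    print(total_wakeups(aware))     # 1
```

In this example both placements finish in cycle 4, but the slack-aware one packs all five operations back to back on unit 0, so unit 1 never leaves sleep mode. A realistic scheduler would also have to respect data dependences, intercluster communication delays, and the break-even time of the sleep mechanism, none of which are modeled here.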

Index Terms

Compiler-assisted leakage energy optimization for clustered VLIW architectures
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Source code generation

Published in
EMSOFT '06: Proceedings of the 6th ACM & IEEE International conference on Embedded software
October 2006
346 pages
ISBN: 1595935428
DOI: 10.1145/1176887
General Chairs: Sang Lyul Min (Seoul National University), Wang Yi (Uppsala University)
Copyright © 2006 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
clustered VLIW processors
energy-aware scheduling
leakage energy
scheduling
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 60 of 203 submissions, 30%
