DOI: 10.1145/1176887.1176921
Compiler-assisted leakage energy optimization for clustered VLIW architectures

Published: 22 October 2006

ABSTRACT

Miniaturization of devices and the ensuing decrease in threshold voltage have led to a substantial increase in the leakage component of total processor energy consumption. Because VLIW and clustered VLIW architectures combine relatively simple issue logic with a large number of functional units, a large fraction of this leakage energy is consumed in the functional units. However, functional units are not fully utilized in VLIW architectures because of the inherent variations in the ILP of programs. This underutilization is even more pronounced in clustered VLIW architectures, where contention for the limited number of slow intercluster communication channels leads to many short idle cycles.

In the past, some architectural schemes have been proposed to obtain leakage energy benefits by aggressively exploiting the idleness of functional units. However, the presence of many short idle cycles causes frequent transitions from the active mode to the sleep mode and vice versa, which adversely affects the energy benefits of a purely hardware-based scheme. In this paper, we propose and evaluate a compiler instruction scheduling algorithm that assists such a hardware-based scheme in the context of VLIW and clustered VLIW architectures. The proposed scheme exploits the scheduling slack of instructions to orchestrate the functional unit mapping, with the objective of reducing the number of transitions in functional units and thereby keeping them off for longer durations. Compared with a hardware-only scheme, the proposed compiler-assisted scheme obtains a further 12% reduction in the energy consumption of functional units for a VLIW architecture, with negligible performance degradation. The benefits are 15% and 17% for a 2-clustered and a 4-clustered VLIW architecture, respectively. Our test bed uses the Trimaran compiler infrastructure.
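To make the role of functional unit mapping concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy model in which a unit sleeps as soon as it is idle, every sleep-to-active change counts as one transition, and the number of operations issued per cycle is already fixed. It does not model scheduling slack or intercluster communication. The function names (`count_wakeups`, `bind_round_robin`, `bind_sticky`) are hypothetical and chosen for illustration only.

```python
from typing import List


def count_wakeups(busy: List[List[bool]]) -> int:
    """Count sleep-to-active transitions; busy[u][t] is True if unit u works in cycle t.

    Assumption: every unit starts asleep and sleeps whenever it is idle.
    """
    total = 0
    for unit in busy:
        prev = False
        for active in unit:
            if active and not prev:
                total += 1
            prev = active
    return total


def _empty(ops_per_cycle: List[int], num_units: int) -> List[List[bool]]:
    if any(n < 0 or n > num_units for n in ops_per_cycle):
        raise ValueError("each cycle must issue between 0 and num_units operations")
    return [[False] * len(ops_per_cycle) for _ in range(num_units)]


def bind_round_robin(ops_per_cycle: List[int], num_units: int) -> List[List[bool]]:
    """Transition-oblivious baseline: hand operations to units in rotating order."""
    busy = _empty(ops_per_cycle, num_units)
    nxt = 0
    for t, n in enumerate(ops_per_cycle):
        for _ in range(n):
            busy[nxt][t] = True
            nxt = (nxt + 1) % num_units
    return busy


def bind_sticky(ops_per_cycle: List[int], num_units: int) -> List[List[bool]]:
    """Transition-aware binding: prefer units that were already active last cycle."""
    busy = _empty(ops_per_cycle, num_units)
    for t, n in enumerate(ops_per_cycle):
        # Units busy in the previous cycle sort first; ties break on unit index.
        order = sorted(
            range(num_units),
            key=lambda u: (not (t > 0 and busy[u][t - 1]), u),
        )
        for u in order[:n]:
            busy[u][t] = True
    return busy


if __name__ == "__main__":
    demand = [2, 1, 2, 1]  # operations issued in each cycle (illustrative values)
    print(count_wakeups(bind_round_robin(demand, 3)))
    print(count_wakeups(bind_sticky(demand, 3)))
```

Under these assumptions, the same per-cycle demand yields fewer wake-ups when operations are kept on units that are already active, which leaves the remaining units idle for longer stretches. This is the effect the abstract describes at a high level; the actual scheme additionally uses instruction slack to choose issue cycles.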

    • Published in

      EMSOFT '06: Proceedings of the 6th ACM & IEEE International conference on Embedded software
      October 2006
      346 pages
      ISBN: 1595935428
      DOI: 10.1145/1176887

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 60 of 203 submissions, 30%