DOI: 10.5555/1870926.1870992
research-article

Energy-performance design space exploration in SMT architectures exploiting selective load value predictions

Published: 08 March 2010

ABSTRACT

This paper presents a design space exploration of a selective load value prediction scheme suitable for energy-aware Simultaneous Multi-Threaded (SMT) architectures. A load value predictor is an architectural enhancement that speculates on the result of a microprocessor load instruction to speed up the execution of the instructions that follow it. The proposed enhancement differs from a classic predictor in its improved selection scheme, which activates the predictor only when a miss occurs in the first-level cache. We analyze the effectiveness of the selective predictor in terms of overall energy reduction and performance improvement. To this end, we show how the proposed predictor can produce benefits (in terms of overall cost) when the cache size of the SMT architecture is reduced, and we compare it with a classic non-selective load value prediction scheme. The experimental results were gathered with a state-of-the-art SMT simulator running the SPEC2000 benchmark suite, in both SMT and non-SMT mode.
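The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' design: it assumes a simple last-value predictor with saturating confidence counters, and the class name, table size, threshold, and the `lookups` counter (a rough stand-in for predictor activity, and hence energy) are all hypothetical. The only element taken from the abstract is that the predictor is consulted only when the load misses in the first-level cache.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    value: int = 0       # last value observed for this load
    confidence: int = 0  # saturating counter; higher means more trusted


class SelectiveLastValuePredictor:
    """Illustrative last-value predictor that is consulted only on L1 misses."""

    def __init__(self, entries=1024, threshold=2, max_conf=3):
        # All three parameters are assumptions chosen for illustration.
        self.entries = entries
        self.threshold = threshold
        self.max_conf = max_conf
        self.table = {}
        self.lookups = 0  # number of times the predictor table was accessed

    def _index(self, pc):
        # Direct-mapped indexing by load program counter (an assumption).
        return pc % self.entries

    def predict(self, pc, l1_hit):
        """Return a predicted value, or None if no prediction is made."""
        if l1_hit:
            # Selective scheme: an L1 hit is fast already, so skip the table.
            return None
        self.lookups += 1
        entry = self.table.get(self._index(pc))
        if entry is not None and entry.confidence >= self.threshold:
            return entry.value
        return None

    def update(self, pc, actual):
        """Train the table with the value the load actually returned."""
        idx = self._index(pc)
        entry = self.table.get(idx)
        if entry is None:
            self.table[idx] = Entry(actual, 0)
        elif entry.value == actual:
            entry.confidence = min(entry.confidence + 1, self.max_conf)
        else:
            entry.value = actual
            entry.confidence = 0


if __name__ == "__main__":
    predictor = SelectiveLastValuePredictor()
    pc = 0x400
    for _ in range(3):
        predictor.predict(pc, l1_hit=False)
        predictor.update(pc, 7)
    print(predictor.predict(pc, l1_hit=False))  # confident after three misses
    print(predictor.predict(pc, l1_hit=True))   # predictor bypassed on a hit
```

A classic non-selective predictor would perform the table lookup on every load; in this sketch, that corresponds to ignoring the `l1_hit` flag, so `lookups` would grow with every load rather than only with L1 misses.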

Published in

DATE '10: Proceedings of the Conference on Design, Automation and Test in Europe
March 2010
1868 pages
ISBN: 9783981080162

Publisher: European Design and Automation Association, Leuven, Belgium


Overall Acceptance Rate: 518 of 1,794 submissions, 29%
