DOI: 10.1145/3149412.3149419

PoLiMEr: An Energy Monitoring and Power Limiting Interface for HPC Applications

Published: 12 November 2017

ABSTRACT

Power and energy consumption are now key design concerns in HPC. To develop software that meets power and energy constraints, scientific application developers must have a reliable way to measure these values and relate them to application-specific events. Scientists face two challenges when measuring and controlling power: (1) diversity (power and energy measurement interfaces differ between vendors) and (2) distribution (power measurements of MPI simulations should be unaffected by the mapping of MPI processes to physical hardware nodes). While some prior work defines standardized software interfaces for power management, these efforts do not support distributed environments. As a result, the current state of the art requires scientists interested in power optimization to write tedious, error-prone application- and system-specific code. To make power measurement and management easier for scientists, we propose PoLiMEr, a user-space library that supports fine-grained application-level power monitoring and capping. We evaluate PoLiMEr by deploying it on Argonne National Laboratory's Theta system and using it to measure and cap power, scaling the performance and power of several applications on up to 1024 nodes. We find that PoLiMEr requires only a few additional lines of code, yet easily allows users to detect energy anomalies, apply power caps, and evaluate Theta's unique architectural features.

      • Published in

        E2SC'17: Proceedings of the 5th International Workshop on Energy Efficient Supercomputing
        November 2017
        84 pages
        ISBN: 9781450351324
        DOI: 10.1145/3149412

        Copyright © 2017 ACM

        © 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        E2SC'17 Paper Acceptance Rate: 10 of 21 submissions, 48%. Overall Acceptance Rate: 17 of 33 submissions, 52%.
