DOI: 10.1145/3295500.3356217
research-article
Open Access

Performance optimality or reproducibility: that is the question

Published: 17 November 2019

ABSTRACT

The era of extremely heterogeneous supercomputing brings with it the devil of increased performance variation and reduced reproducibility. The HPC community lacks an understanding of how the simultaneous consideration of network traffic, power limits, concurrency tuning, and interference from other jobs impacts application performance.

In this paper, we design a methodology that allows both HPC users and system administrators to understand the trade-off space between optimal and reproducible performance. We present a first-of-its-kind dataset that simultaneously varies multiple system- and user-level parameters on a production cluster, and introduce a new metric, called the desirability score, which enables comparison across different system configurations. We develop a novel, model-agnostic machine learning methodology based on graph signal theory for comparing the influence of parameters on application predictability and, using a new visualization technique, suggest best practices for multi-objective HPC environments.

Published in

SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2019
1921 pages
ISBN: 9781450362290
DOI: 10.1145/3295500

Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%
