Research Article | Public Access

Characterizing Output Bottlenecks of a Production Supercomputer: Analysis and Implications

Published: 16 January 2020

Abstract

This article studies the I/O write behaviors of the Titan supercomputer and its Lustre parallel file stores under production load. The results can inform the design, deployment, and configuration of parallel file systems, as well as the design of I/O software in applications, operating systems, and adaptive I/O libraries.

We propose a statistical benchmarking methodology to measure write performance across I/O configurations, hardware settings, and system conditions. Moreover, we introduce two relative measures to quantify the write-performance behaviors of hardware components under production load. In addition to designing experiments and benchmarking on Titan, we verify the experimental results on a real application (XGC) and a real application I/O kernel (HACC IO), both of which are widely used and representative of typical application I/O behaviors.
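
To make the statistical approach concrete, the sketch below (our illustration, not the authors' actual harness) times repeated fixed-size file writes and reports the median and tail percentiles of the observed bandwidth; with straggler-prone I/O, the shape of the distribution matters more than any single mean. The file size, block size, trial count, and output path are all illustrative.

    #!/usr/bin/env python3
    # Minimal sketch of a statistical write benchmark: repeat a
    # fixed-size write many times, then report median and tail
    # bandwidths rather than a single mean.
    import os
    import statistics
    import time

    FILE_SIZE = 256 * 1024 * 1024   # illustrative: 256 MiB per trial
    BLOCK_SIZE = 1024 * 1024        # illustrative: 1 MiB write calls
    TRIALS = 20                     # illustrative trial count

    def timed_write(path):
        """Write FILE_SIZE bytes to path and return MiB/s."""
        buf = os.urandom(BLOCK_SIZE)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        start = time.perf_counter()
        for _ in range(FILE_SIZE // BLOCK_SIZE):
            os.write(fd, buf)
        os.fsync(fd)                # include flush-to-storage in the timing
        elapsed = time.perf_counter() - start
        os.close(fd)
        os.unlink(path)
        return (FILE_SIZE / (1024 * 1024)) / elapsed

    samples = sorted(timed_write(f"/tmp/bench.{i}") for i in range(TRIALS))
    # Crude percentile by index into the sorted samples.
    print(f"median : {statistics.median(samples):8.1f} MiB/s")
    print(f"p10    : {samples[int(0.10 * TRIALS)]:8.1f} MiB/s")
    print(f"p90    : {samples[int(0.90 * TRIALS)]:8.1f} MiB/s")
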

In summary, we find that the performance of Titan's I/O system varies across the machine at fine time scales. This variability has two major implications. First, stragglers lessen the benefit of coupled I/O parallelism (striping). Peak median output bandwidths are obtained with parallel writes to many independent files, with no striping or write sharing of files across clients (compute nodes). I/O parallelism is most effective when the application, or its I/O libraries, distributes the I/O load so that each target stores files for multiple clients and each client writes files on multiple targets, in a balanced way with minimal contention. Second, our results suggest that the potential benefit of dynamic adaptation is limited. In particular, it is not fruitful to attempt to identify "good locations" in the machine or in the file system: component performance is driven by transient load conditions, and past performance is not a useful predictor of future performance. For example, we observe no predictable diurnal load patterns.
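
As a hedged illustration of the balanced, unstriped file-per-process layout described above (ours, not the paper's tooling), the sketch below confines each file to a single Lustre OST (stripe count 1) and assigns files to OSTs round-robin, so each client writes on multiple targets and each target serves multiple clients. It assumes a Lustre client with the standard lfs setstripe utility; the path, OST count, and per-client file counts are illustrative.

    #!/usr/bin/env python3
    # Sketch: balanced file-per-process placement on Lustre.
    import subprocess

    NUM_OSTS = 8          # illustrative target count
    FILES_PER_CLIENT = 4  # illustrative
    CLIENTS = 16          # illustrative

    for client in range(CLIENTS):
        for f in range(FILES_PER_CLIENT):
            # Round-robin assignment spreads each client's files
            # across several OSTs and gives each OST several clients.
            ost = (client * FILES_PER_CLIENT + f) % NUM_OSTS
            path = f"/lustre/scratch/run42/c{client:03d}.f{f}"
            # 'lfs setstripe -c 1 -i <ost>' creates an empty file
            # striped over exactly one explicit OST; the client's
            # writer process then opens and writes it independently.
            subprocess.run(
                ["lfs", "setstripe", "-c", "1", "-i", str(ost), path],
                check=True)

With this layout a slow OST delays only the clients whose files land on it, whereas a wide-striped shared file couples every writer to the slowest target.
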

Published in

ACM Transactions on Storage, Volume 15, Issue 4
USENIX FAST 2019 Special Section and Regular Papers
November 2019, 228 pages
ISSN: 1553-3077 | EISSN: 1553-3093
DOI: 10.1145/3373756
Editor: Sam H. Noh

Copyright © 2020 Public Domain

This paper is authored by employee(s) of the United States Government and is in the public domain. Non-exclusive copying or redistribution is allowed, provided that the article citation is given and the authors and agency are clearly identified as its source.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 April 2018
• Revised: 1 March 2019
• Accepted: 1 May 2019
• Published: 16 January 2020
