DOI: 10.1145/3295500.3357156
research-article

A data-centric approach to extreme-scale ab initio dissipative quantum transport simulations

Published: 17 November 2019

ABSTRACT

The computational efficiency of a state-of-the-art ab initio quantum transport (QT) solver, capable of revealing the coupled electrothermal properties of atomically resolved nano-transistors, has been improved by up to two orders of magnitude through a data-centric reorganization of the application. The approach yields coarse- and fine-grained data-movement characteristics that can be used for performance and communication modeling, communication avoidance, and dataflow transformations. The resulting code has been tuned for two top-6 hybrid supercomputers, reaching a sustained performance of 85.45 Pflop/s on 4,560 nodes of Summit (42.55% of the peak) in double precision, and 90.89 Pflop/s in mixed precision. These computational achievements enable the restructured QT simulator to treat realistic nanoelectronic devices made of more than 10,000 atoms in a time 14x shorter than the original code needs to handle a system with 1,000 atoms, on the same number of CPUs/GPUs and with the same physical accuracy.
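The performance figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it uses just the numbers quoted in the abstract, and the helper names (`implied_peak_pflops`, `per_node_tflops`) are hypothetical, not part of the simulator. The derived peak is what the stated 42.55% implies, not an official machine specification.

```python
# Figures quoted in the abstract; everything derived below is an inference.
SUSTAINED_DP_PFLOPS = 85.45   # sustained, double precision
SUSTAINED_MP_PFLOPS = 90.89   # sustained, mixed precision
PEAK_FRACTION = 0.4255        # "42.55% of the peak"
NODES = 4560                  # Summit nodes used in the run


def implied_peak_pflops(sustained: float, fraction: float) -> float:
    """Peak rate implied by a sustained rate and its stated fraction of peak."""
    return sustained / fraction


def per_node_tflops(total_pflops: float, nodes: int) -> float:
    """Convert an aggregate Pflop/s figure into Tflop/s per node."""
    return total_pflops * 1000.0 / nodes


if __name__ == "__main__":
    peak = implied_peak_pflops(SUSTAINED_DP_PFLOPS, PEAK_FRACTION)
    print(f"implied peak on {NODES} nodes: {peak:.2f} Pflop/s")
    print(f"implied peak per node: {per_node_tflops(peak, NODES):.2f} Tflop/s")
    print(f"sustained per node (DP): "
          f"{per_node_tflops(SUSTAINED_DP_PFLOPS, NODES):.2f} Tflop/s")
    print(f"mixed/double precision ratio: "
          f"{SUSTAINED_MP_PFLOPS / SUSTAINED_DP_PFLOPS:.3f}")
```

Under these assumptions the mixed-precision run is only a few percent faster than the double-precision one in aggregate flop rate.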


Published in

SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2019
1921 pages
ISBN: 9781450362290
DOI: 10.1145/3295500

Copyright © 2019 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%
