Numerical Linear Algebra for High Performance Computers
November 1998
Publisher:
  • Society for Industrial and Applied Mathematics
  • 3600 University City Science Center Philadelphia, PA
  • United States
ISBN: 978-0-89871-428-9
Published: 01 November 1998
Pages: 342
Abstract

From the Publisher:

This book presents a unified treatment of recently developed techniques and current understanding about solving systems of linear equations and large scale eigenvalue problems on high-performance computers. It provides a rapid introduction to the world of vector and parallel processing for these linear algebra applications.

Topics include major elements of advanced-architecture computers and their performance, recent algorithmic development, and software for direct solution of dense matrix problems, direct solution of sparse systems of equations, iterative solution of sparse systems of equations, and solution of large sparse eigenvalue problems.

This book supersedes the SIAM publication Solving Linear Systems on Vector and Shared Memory Computers, which appeared in 1990. The new book includes a considerable amount of new material in addition to incorporating a substantial revision of existing text.

About the Authors:

Jack J. Dongarra is a Distinguished Professor of Computer Science at the University of Tennessee and a Distinguished Scientist at Oak Ridge National Laboratory. Iain S. Duff is Group Leader of Numerical Analysis at the CCLRC Rutherford Appleton Laboratory, the Project Leader for the Parallel Algorithms Group at CERFACS in Toulouse, and a Visiting Professor of Mathematics at the University of Strathclyde. Danny C. Sorensen is a Professor of Computational and Applied Mathematics at Rice University. Henk A. van der Vorst is a Professor in Numerical Analysis at Utrecht University in the Netherlands.

Cited By

  1. Dumas J, van der Hoeven J, Pernet C and Roche D LU Factorization with Errors Proceedings of the 2019 on International Symposium on Symbolic and Algebraic Computation, (131-138)
  2. Amestoy P, Buttari A, L'Excellent J and Mary T (2019). Performance and Scalability of the Block Low-Rank Multifrontal Factorization on Multicore Architectures, ACM Transactions on Mathematical Software, 45:1, (1-26), Online publication date: 28-Mar-2019.
  3. Thomas A and Kumar A (2018). A comparative evaluation of systems for scalable linear algebra-based analytics, Proceedings of the VLDB Endowment, 11:13, (2168-2182), Online publication date: 1-Sep-2018.
  4. Thomas A and Kumar A (2019). A comparative evaluation of systems for scalable linear algebra-based analytics, Proceedings of the VLDB Endowment, 11:13, (2168-2182), Online publication date: 1-Sep-2018.
  5. Dumas J and Pernet C Symmetric Indefinite Triangular Factorization Revealing the Rank Profile Matrix Proceedings of the 2018 ACM International Symposium on Symbolic and Algebraic Computation, (151-158)
  6. Bentbib A, El Guide M, Jbilou K and Reichel L (2017). Global Golub–Kahan bidiagonalization applied to large discrete ill-posed problems, Journal of Computational and Applied Mathematics, 322:C, (46-56), Online publication date: 1-Oct-2017.
  7. Dumas J, Gautier T, Pernet C, Roch J and Sultan Z (2016). Recursion based parallelization of exact dense linear algebra routines for Gaussian elimination, Parallel Computing, 57:C, (235-249), Online publication date: 1-Sep-2016.
  8. Gangwon Jo, Jeongho Nah, Jun Lee, Jungwon Kim and Jaejin Lee (2015). Accelerating LINPACK with MPI-OpenCL on Clusters of Multi-GPU Nodes, IEEE Transactions on Parallel and Distributed Systems, 26:7, (1814-1825), Online publication date: 1-Jul-2015.
  9. Dumas J, Pernet C and Sultan Z Computing the Rank Profile Matrix Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation, (149-156)
  10. Gilli M and Schumann E (2014). Optimization cultures, WIREs Computational Statistics, 6:5, (352-358), Online publication date: 18-Aug-2014.
  11. Dumas J, Pernet C and Sultan Z Simultaneous computation of the row and column rank profiles Proceedings of the 38th International Symposium on Symbolic and Algebraic Computation, (181-188)
  12. Zhang W, Betz V and Rose J (2012). Portable and scalable FPGA-based acceleration of a direct linear system solver, ACM Transactions on Reconfigurable Technology and Systems, 5:1, (1-26), Online publication date: 1-Mar-2012.
  13. Luszczek P and Dongarra J Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modeling Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I, (730-739)
  14. Rozložník M, Shklarski G and Toledo S (2011). Partitioned Triangular Tridiagonalization, ACM Transactions on Mathematical Software, 37:4, (1-16), Online publication date: 1-Feb-2011.
  15. Michelini P (2010). Direct multi-grid methods for linear systems with harmonic aliasing patterns, IEEE Transactions on Signal Processing, 58:10, (5091-5105), Online publication date: 1-Oct-2010.
  16. Verschelde J and Yoffe G Polynomial homotopies on multicore workstations Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, (131-140)
  17. Anzt H, Heuveline V and Rocker B An error correction solver for linear systems Proceedings of the 9th international conference on High performance computing for computational science, (58-70)
  18. Anzt H, Heuveline V and Rocker B Mixed precision iterative refinement methods for linear systems Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume 2, (237-247)
  19. Gustavson F, Waśniewski J, Dongarra J and Langou J (2010). Rectangular full packed format for cholesky's algorithm, ACM Transactions on Mathematical Software, 37:2, (1-21), Online publication date: 1-Apr-2010.
  20. Emiris I, Pan V and Tsigaridas E Algebraic and numerical algorithms Algorithms and theory of computation handbook, (17-17)
  21. Soveiko N, Nakhla M and Achar R (2010). Comparison study of performance of parallel steady state solver on different computer architectures, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 29:1, (65-77), Online publication date: 1-Jan-2010.
  22. Bekas C, Curioni A and Fedulova I Low cost high performance uncertainty quantification Proceedings of the 2nd Workshop on High Performance Computational Finance, (1-8)
  23. Burckhardt K, Szczerba D, Brown J, Muralidhar K and Székely G Fast Implicit Simulation of Oscillatory Flow in Human Abdominal Bifurcation Using a Schur Complement Preconditioner Proceedings of the 15th International Euro-Par Conference on Parallel Processing, (747-759)
  24. Fatica M Accelerating linpack with CUDA on heterogenous clusters Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, (46-51)
  25. Gustavson F, Karlsson L and Kågström B (2009). Distributed SBP Cholesky factorization algorithms with near-optimal scheduling, ACM Transactions on Mathematical Software, 36:2, (1-25), Online publication date: 1-Mar-2009.
  26. Kurzak J and Dongarra J (2009). QR factorization for the Cell Broadband Engine, Scientific Programming, 17:1-2, (31-42), Online publication date: 1-Jan-2009.
  27. Volkov V and Demmel J Benchmarking GPUs to tune dense linear algebra Proceedings of the 2008 ACM/IEEE conference on Supercomputing, (1-11)
  28. Pan V, Ivolgin D, Murphy B, Rosholt R, Tang Y and Yan X Additive preconditioning for matrix computations Proceedings of the 3rd international conference on Computer science: theory and applications, (372-383)
  29. Howell G, Demmel J, Fulton C, Hammarling S and Marmol K (2008). Cache efficient bidiagonalization using BLAS 2.5 operators, ACM Transactions on Mathematical Software, 34:3, (1-33), Online publication date: 1-May-2008.
  30. Sala M, Stanley K and Heroux M (2008). On the design of interfaces to sparse direct solvers, ACM Transactions on Mathematical Software, 34:2, (1-22), Online publication date: 1-Mar-2008.
  31. Burke E, Dror M and Orlin J (2008). Scheduling malleable tasks with interdependent processing rates, Discrete Applied Mathematics, 156:5, (620-626), Online publication date: 1-Mar-2008.
  32. Coulaud O, Fortin P and Roman J (2008). High performance BLAS formulation of the multipole-to-local operator in the fast multipole method, Journal of Computational Physics, 227:3, (1836-1862), Online publication date: 1-Jan-2008.
  33. Zekri A and Sedukhin S Performance evaluation of basic linear algebra subroutines on a matrix co-processor Proceedings of the 7th international conference on Parallel processing and applied mathematics, (1190-1199)
  34. Pan V, Murphy B, Rosholt R and Tabanjeh M The schur aggregation for solving linear systems of equations Proceedings of the 2007 international workshop on Symbolic-numeric computation, (142-151)
  35. Belgin M, Ribbens C and Back G An operation stacking framework for large ensemble computations Proceedings of the 21st annual international conference on Supercomputing, (83-92)
  36. Gould N, Scott J and Hu Y (2007). A numerical evaluation of sparse direct solvers for the solution of large sparse symmetric linear systems of equations, ACM Transactions on Mathematical Software, 33:2, (10-es), Online publication date: 1-Jun-2007.
  37. Bertoldo A, Bianco M and Pucci G A static parallel multifrontal solver for finite element meshes Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications, (734-746)
  38. Dhillon I, Parlett B and Vömel C (2006). The design and implementation of the MRRR algorithm, ACM Transactions on Mathematical Software, 32:4, (533-560), Online publication date: 1-Dec-2006.
  39. Chen W and Poirier B (2006). Parallel implementation of efficient preconditioned linear solver for grid-based applications in chemical physics. II, Journal of Computational Physics, 219:1, (198-209), Online publication date: 20-Nov-2006.
  40. Gradl T, Spörl A, Huckle T, Glaser S and Schulte-Herbrüggen T Parallelising matrix operations on clusters for an optimal control-based quantum compiler Proceedings of the 12th international conference on Parallel Processing, (751-762)
  41. Acebrón J, Durán R, Rico R and Spigler R A new domain decomposition approach suited for grid computing Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, (744-753)
  42. Remón A, Quintana-Ortí E and Quintana-Ortí G Cholesky factorization of band matrices using multithreaded BLAS Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, (608-616)
  43. Kurzak J and Dongarra J Implementing linear algebra routines on multi-core processors with pipelining and a look ahead Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, (147-156)
  44. Demmel J, Dongarra J, Parlett B, Kahan W, Gu M, Bindel D, Hida Y, Li X, Marques O, Riedy E, Vömel C, Langou J, Luszczek P, Kurzak J, Buttari A, Langou J and Tomov S Prospectus for the next LAPACK and ScaLAPACK libraries Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, (11-23)
  45. Marques O and Vasconcelos P Evaluation of linear solvers for astrophysics transfer problems Proceedings of the 7th international conference on High performance computing for computational science, (466-475)
  46. Gravvanis G and Giannoutakis K Parallel exact and approximate arrow-type inverses on symmetric multiprocessor systems Proceedings of the 6th international conference on Computational Science - Volume Part I, (506-513)
  47. Gravvanis G and Giannoutakis K On the performance of parallel normalized explicit preconditioned conjugate gradient type methods Proceedings of the 20th international conference on Parallel and distributed processing, (309-309)
  48. Amestoy P, Guermouche A, L'Excellent J and Pralet S (2006). Hybrid scheduling for the parallel solution of linear systems, Parallel Computing, 32:2, (136-156), Online publication date: 1-Feb-2006.
  49. Galoppo N, Govindaraju N, Henson M and Manocha D LU-GPU Proceedings of the 2005 ACM/IEEE conference on Supercomputing
  50. Zuberek W and Perera T Performance analysis of distributed iterative linear solvers Proceedings of the 7th WSEAS International Conference on Mathematical Methods and Computational Techniques In Electrical Engineering, (194-199)
  51. Garz E and García I (2005). Approaches Based on Permutations for Partitioning Sparse Matrices on Multiprocessors, The Journal of Supercomputing, 34:1, (41-61), Online publication date: 1-Oct-2005.
  52. D'Azevedo E, Fahey M and Mills R Vectorized sparse matrix multiply for compressed row storage format Proceedings of the 5th international conference on Computational Science - Volume Part I, (99-106)
  53. Gravvanis G and Giannoutakis K (2005). Parallel preconditioned conjugate gradient square method based on normalized approximate inverses, Scientific Programming, 13:2, (79-91), Online publication date: 1-Apr-2005.
  54. D'Alberto P and Nicolau A JuliusC Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing, (117-131)
  55. Vasconcelos P and d'Almeida F Performance evaluation of a parallel algorithm for a radiative transfer problem Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing, (864-871)
  56. Ortigosa E, Romero L and Ramos J (2003). Parallel scheduling of the PCG method for banded matrices rising from FDM/FEM, Journal of Parallel and Distributed Computing, 63:12, (1243-1256), Online publication date: 1-Dec-2003.
  57. Nakajima K Parallel Iterative Solvers of GeoFEM with Selective Blocking Preconditioning for Nonlinear Contact Problems on the Earth Simulator Proceedings of the 2003 ACM/IEEE conference on Supercomputing
  58. Chen Z, Dongarra J, Luszczek P and Roche K (2003). Self-adapting software for numerical linear algebra and LAPACK for clusters, Parallel Computing, 29:11-12, (1723-1743), Online publication date: 1-Nov-2003.
  59. Amestoy P, Duff I, Pralet S and Vömel C (2003). Adapting a parallel sparse direct solver to architectures with clusters of SMPs, Parallel Computing, 29:11-12, (1645-1668), Online publication date: 1-Nov-2003.
  60. Hanson F (2003). Local supercomputing training in the computational sciences using remote national centers, Future Generation Computer Systems, 19:8, (1335-1347), Online publication date: 1-Nov-2003.
  61. Chronopoulos A, Grosu D, Wissink A, Benche M and Liu J (2003). An efficient 3D grid based scheduling for heterogeneous systems, Journal of Parallel and Distributed Computing, 63:9, (827-837), Online publication date: 1-Sep-2003.
  62. Tracy F Application of the multi-level parallelism (MLP) software to a finite element groundwater program using iterative solvers with comparison to MPI Proceedings of the 2003 international conference on Computational science: PartIII, (725-735)
  63. Benzi M and Bertaccini D (2003). Approximate Inverse Preconditioning for Shifted Linear Systems, BIT, 43:2, (231-244), Online publication date: 1-Jun-2003.
  64. Foschi P and Kontoghiorghes E (2003). Estimation of VAR Models, Computational Economics, 21:1-2, (3-22), Online publication date: 1-Feb-2003.
  65. Gryazin Y, Klibanov M and Lucas T (2003). Two numerical methods for an inverse problem for the 2-D Helmholtz equation, Journal of Computational Physics, 184:1, (122-148), Online publication date: 1-Jan-2003.
  66. Shen C and Zhang J (2002). Parallel two level block ILU Preconditioning techniques for solving large sparse linear systems, Parallel Computing, 28:10, (1451-1475), Online publication date: 1-Oct-2002.
  67. McCombs J and Stathopoulos A Multigrain Parallelism for Eigenvalue Computations on Networks of Clusters Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing
  68. Katagiri T Performance evaluation of parallel gram-schmidt re-orthogonalization methods Proceedings of the 5th international conference on High performance computing for computational science, (302-314)
  69. Schönauer W and Häfner H (2002). Numerical experiments to optimize the use of (I)LU preconditioning in the iterative linear solver package LINSOL, Applied Numerical Mathematics, 41:1, (23-37), Online publication date: 1-Apr-2002.
  70. Simoncini V and Eldén L (2002). Inexact Rayleigh Quotient-Type Methods for Eigenvalue Computations, BIT, 42:1, (159-182), Online publication date: 1-Mar-2002.
  71. Zlatev Z Massive data set issues in air pollution modelling Handbook of massive data sets, (1169-1220)
  72. Breitner M (2000). Robust Optimal Onboard Reentry Guidance of a Space Shuttle, Journal of Optimization Theory and Applications, 107:3, (481-503), Online publication date: 1-Dec-2000.
  73. Theobald K, Agrawal G, Kumar R, Heber G, Gao G, Stodghill P and Pingali K Landing CG on EARTH Proceedings of the 2000 ACM/IEEE conference on Supercomputing, (4-es)
  74. Lee H, Kim J, Hong S and Lee S Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization Proceedings of the 2000 ACM symposium on Applied computing - Volume 2, (641-648)
Contributors
  • The University of Tennessee, Knoxville
  • Rice University
  • Utrecht University
