Programming Massively Parallel Processors: A Hands-on Approach
December 2012
Publisher:
  • Morgan Kaufmann Publishers Inc.
  • 340 Pine Street, Sixth Floor
  • San Francisco
  • CA
  • United States
ISBN: 978-0-12-415992-1
Published: 28 December 2012
Pages: 514
Abstract

Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Various techniques for constructing parallel programs are explored in detail. Case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. Topics of performance, floating-point format, parallel patterns, and dynamic parallelism are covered in depth. This best-selling guide to CUDA and GPU parallel programming has been revised with more parallel programming examples, commonly used libraries such as Thrust, and explanations of the latest tools. With these improvements, the book retains its concise, intuitive, practical approach based on years of road-testing in the authors' own parallel computing courses.

Updates in this new edition include:

  • New coverage of CUDA 5.0, improved performance, enhanced development tools, increased hardware support, and more
  • Increased coverage of related technology (OpenCL) and new material on algorithm patterns, GPU clusters, host programming, and data parallelism
  • Two new case studies (on MRI reconstruction and molecular visualization) that explore the latest applications of CUDA and GPUs for scientific research and high-performance computing

Cited By

  1. Sánchez J, López M, Pastor J, Delgado A and Fernández-Caballero A (2019). Accelerating bioinspired lateral interaction in accumulative computation for real-time moving object detection with graphics processing units, Natural Computing: an international journal, 18:2, (217-227), Online publication date: 1-Jun-2019.
  2. Diéguez A, Amor M and Doallo R (2019). Tree Partitioning Reduction, ACM Transactions on Mathematical Software, 45:3, (1-26), Online publication date: 30-Sep-2019.
  3. Bermúdez A, Montero F, López M, Fernández-Caballero A and Sánchez J (2019). Optimization of lateral interaction in accumulative computation on GPU-based platform, The Journal of Supercomputing, 75:3, (1670-1685), Online publication date: 1-Mar-2019.
  4. Ha S, Park J and You D (2018). A GPU-accelerated semi-implicit fractional-step method for numerical solutions of incompressible Navier-Stokes equations, Journal of Computational Physics, 352:C, (246-264), Online publication date: 1-Jan-2018.
  5. Tran H and Cambria E (2018). A survey of graph processing on graphics processing units, The Journal of Supercomputing, 74:5, (2086-2115), Online publication date: 1-May-2018.
  6. Rios E, Ochi L, Boeres C, Coelho V, Coelho I and Farias R (2018). Exploring parallel multi-GPU local search strategies in a metaheuristic framework, Journal of Parallel and Distributed Computing, 111:C, (39-55), Online publication date: 1-Jan-2018.
  7. Zhang J and Gruenwald L. Regularizing irregularity, Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), (1-8)
  8. Shekofteh S, Noori H, Naghibzadeh M, Yazdi H and Fröning H (2019). Metric Selection for GPU Kernel Classification, ACM Transactions on Architecture and Code Optimization, 15:4, (1-27), Online publication date: 31-Dec-2019.
  9. Tan W, Chang S, Fong L, Li C, Wang Z and Cao L. Matrix Factorization on GPUs with Memory Optimization and Approximate Computing, Proceedings of the 47th International Conference on Parallel Processing, (1-10)
  10. Bondarenco M, Gamazo P and Ezzatti P (2017). A comparison of various schemes for solving the transport equation in many-core platforms, The Journal of Supercomputing, 73:1, (469-481), Online publication date: 1-Jan-2017.
  11. Wang Q, Chen D, Li S, Wu Q and Zhang Q (2017). An adaptive cartoon-like stylization for color video in real time, Multimedia Tools and Applications, 76:15, (16767-16782), Online publication date: 1-Aug-2017.
  12. Das R (2017). GPUs in subsurface simulation, Engineering with Computers, 33:4, (919-934), Online publication date: 1-Oct-2017.
  13. Xie X, Tan W, Fong L and Liang Y. CuMF_SGD, Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, (79-92)
  14. Hong C, Spence I and Nikolopoulos D (2017). GPU Virtualization and Scheduling Methods, ACM Computing Surveys, 50:3, (1-37), Online publication date: 9-Oct-2017.
  15. Hierons R and Türker U (2017). Parallel Algorithms for Generating Distinguishing Sequences for Observable Non-deterministic FSMs, ACM Transactions on Software Engineering and Methodology, 26:1, (1-34), Online publication date: 31-Jan-2017.
  16. Hierons R and Turker U (2016). Parallel Algorithms for Generating Harmonised State Identifiers and Characterising Sets, IEEE Transactions on Computers, 65:11, (3370-3383), Online publication date: 1-Nov-2016.
  17. Ates O, Keskin S and Kocak T (2016). High throughput graphics processing unit based Fano decoder, Journal of Network and Computer Applications, 75:C, (128-137), Online publication date: 1-Nov-2016.
  18. Pedemonte M, Luna F and Alba E (2016). A Systolic Genetic Search for reducing the execution cost of regression testing, Applied Soft Computing, 49:C, (1145-1161), Online publication date: 1-Dec-2016.
  19. Daoud F, Watad A and Silberstein M. GPUrdma, Proceedings of the 6th International Workshop on Runtime and Operating Systems for Supercomputers, (1-8)
  20. Shahar S and Silberstein M. Supporting data-driven I/O on GPUs using GPUfs, Proceedings of the 9th ACM International on Systems and Storage Conference, (1-11)
  21. Ben-Sasson E, Hamilis M, Silberstein M and Tromer E. Fast Multiplication in Binary Fields on GPUs via Register Cache, Proceedings of the 2016 International Conference on Supercomputing, (1-12)
  22. Sultana N, Calvert A, Overbey J and Arnold G. From OpenACC to OpenMP 4, Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale, (1-8)
  23. Benner P, Dufrechou E, Ezzatti P, Quintana-Ortí E and Remón A (2015). Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction, Cluster Computing, 18:4, (1351-1362), Online publication date: 1-Dec-2015.
  24. Ashari A, Tatikonda S, Boehm M, Reinwald B, Campbell K, Keenleyside J and Sadayappan P. On optimizing machine learning workloads via kernel fusion, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, (173-182)
  25. Liu J, Yang J and Melhem R. SAWS, Proceedings of the 48th International Symposium on Microarchitecture, (383-394)
  26. Mahani A and Sharabiani M (2015). SIMD parallel MCMC sampling with applications for big-data Bayesian analytics, Computational Statistics & Data Analysis, 88:C, (75-99), Online publication date: 1-Aug-2015.
  27. Ashari A, Tatikonda S, Boehm M, Reinwald B, Campbell K, Keenleyside J and Sadayappan P (2015). On optimizing machine learning workloads via kernel fusion, ACM SIGPLAN Notices, 50:8, (173-182), Online publication date: 18-Dec-2015.
  28. Zhang J, You S and Xia Y. Prototyping A Web-based High-Performance Visual Analytics Platform for Origin-Destination Data, Proceedings of the 1st International ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, (16-23)
  29. Zhang J, You S and Gruenwald L. Efficient Parallel Zonal Statistics on Large-Scale Global Biodiversity Data on GPUs, Proceedings of the 4th International ACM SIGSPATIAL Workshop on Analytics for Big Geospatial Data, (35-44)
  30. Angstadt K and Harcourt E. A virtual machine model for accelerating relational database joins using a general purpose GPU, Proceedings of the Symposium on High Performance Computing, (127-134)
Contributors
  • Newcastle University
