Parallel programming in OpenMP
January 2001
Publisher: Morgan Kaufmann Publishers Inc., 340 Pine Street, Sixth Floor, San Francisco, CA, United States
ISBN: 978-1-55860-671-5
Published: 01 January 2001
Pages: 230
Abstract

Aimed at the working researcher or scientific C/C++ or Fortran programmer, this text introduces the competent research programmer to a new vocabulary of idioms and techniques for parallelizing software using OpenMP.

Cited By

  1. Gray K, Li M, Ahmed R, Rahman M, Azad A, Kobourov S and Börner K (2024). A Scalable Method for Readable Tree Layouts, IEEE Transactions on Visualization and Computer Graphics, 30:2, (1564-1578), Online publication date: 1-Feb-2024.
  2. Nguyen N, Tran M and Chandra R (2024). Sequential reversible jump MCMC for dynamic Bayesian neural networks, Neurocomputing, 564:C, Online publication date: 7-Jan-2024.
  3. Liu G and Iuricich F (2024). A Task-Parallel Approach for Localized Topological Data Structures, IEEE Transactions on Visualization and Computer Graphics, 30:1, (1271-1281), Online publication date: 1-Jan-2024.
  4. Hsu K and Tseng H Simultaneous and Heterogenous Multithreading Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, (137-152)
  5. Charilogis V, Tsoulos I and Tzallas A (2023). An Improved Parallel Particle Swarm Optimization, SN Computer Science, 4:6, Online publication date: 4-Oct-2023.
  6. Gan X, Wu G, Zeng R, Si J, Liu J, Dong D, Gong C, Liu C and Li T FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph Processing Proceedings of the 37th International Conference on Supercomputing, (240-250)
  7. Neto W, Li Y, Gaillardon P and Yu C (2023). FlowTune: End-to-End Automatic Logic Optimization Exploration via Domain-Specific Multiarmed Bandit, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:6, (1912-1925), Online publication date: 1-Jun-2023.
  8. Quislant R, Fernandez I, Gutierrez E and Plata O (2023). Time series analysis acceleration with advanced vectorization extensions, The Journal of Supercomputing, 79:9, (10178-10207), Online publication date: 1-Jun-2023.
  9. de Castro M, Santamaria-Valenzuela I, Torres Y, Gonzalez-Escribano A and Llanos D (2023). EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs, The Journal of Supercomputing, 79:9, (9409-9442), Online publication date: 1-Jun-2023.
  10. Tabanelli E, Tagliavini G and Benini L (2023). DNN Is Not All You Need: Parallelizing Non-neural ML Algorithms on Ultra-low-power IoT Processors, ACM Transactions on Embedded Computing Systems, 22:3, (1-33), Online publication date: 31-May-2023.
  11. Trabes G, Wainer G and Gil-Costa V (2023). A Parallel Algorithm to Accelerate DEVS Simulations in Shared Memory Architectures, IEEE Transactions on Parallel and Distributed Systems, 34:5, (1609-1620), Online publication date: 1-May-2023.
  12. Dominico S, de Almeida E and Alves M (2023). On the performance limits of thread placement for array databases in non-uniform memory architectures, Computing, 105:5, (1059-1075), Online publication date: 1-May-2023.
  13. Li J, Agung M and Takizawa H Evaluating the Performance and Conformance of a SYCL Implementation for SX-Aurora TSUBASA Parallel and Distributed Computing, Applications and Technologies, (36-47)
  14. Gambhir G and Mandal J (2021). Shared memory implementation and performance analysis of LSB steganography based on chaotic tent map, Innovations in Systems and Software Engineering, 17:4, (333-342), Online publication date: 1-Dec-2021.
  15. Podobas A, Svedin M, Chien S, Peng I, Ravichandran N, Herman P, Lansner A and Markidis S StreamBrain Proceedings of the 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, (1-6)
  16. Brahmakshatriya A, Furst E, Ying V, Hsu C, Hong C, Ruttenberg M, Zhang Y, Jung D, Richmond D, Taylor M, Shun J, Oskin M, Sanchez D and Amarasinghe S Taming the zoo Proceedings of the 48th Annual International Symposium on Computer Architecture, (429-442)
  17. Spiliotis I, Sitaridis C and Bekakos M (2021). Parallel Computation of Discrete Orthogonal Moment on Block Represented Images Using OpenMP, International Journal of Parallel Programming, 49:3, (440-462), Online publication date: 1-Jun-2021.
  18. Yin L, Zhang Y, Zhang Z, Peng Y and Zhao P (2021). ParaX, Proceedings of the VLDB Endowment, 14:6, (864-877), Online publication date: 1-Feb-2021.
  19. Anastasopoulos N, Tsoulos I, Karvounis E and Tzallas A (2020). Locate the Bounding Box of Neural Networks with Intervals, Neural Processing Letters, 52:3, (2241-2251), Online publication date: 1-Dec-2020.
  20. Arabnejad H, Bispo J, Cardoso J and Barbosa J (2019). Source-to-source compilation targeting OpenMP-based automatic parallelization of C applications, The Journal of Supercomputing, 76:9, (6753-6785), Online publication date: 1-Sep-2020.
  21. Ren Z, Gu Y, Li C, Li F and Yu G GHSH: Dynamic Hyperspace Hashing on GPU Web and Big Data, (409-424)
  22. Magalhães T and Helio J. C. B Parallel Differential Evolution Algorithms for Stackelberg-Nash Bilevel Optimization Problems 2020 IEEE Congress on Evolutionary Computation (CEC), (1-8)
  23. Dansou A, Mouhoubi S and Chazallon C (2019). Optimizations of a fast multipole symmetric Galerkin boundary element method code, Numerical Algorithms, 84:3, (825-846), Online publication date: 1-Jul-2020.
  24. Ejjaaouani K, Aumage O, Bigot J, Méhrenberger M, Murai H, Nakao M and Sato M (2019). InKS: a programming model to decouple algorithm from optimization in HPC codes, The Journal of Supercomputing, 76:6, (4666-4681), Online publication date: 1-Jun-2020.
  25. Gowanlock M, Karsin B, Fink Z and Wright J Accelerating the Unacceleratable Proceedings of the 15th International Workshop on Data Management on New Hardware, (1-11)
  26. Reguly I, Moore B, Schmielau T, du Toit J and Mudalige G Batch Solution of Small PDEs with the OPS DSL High Performance Computing, (124-141)
  27. Gowanlock M KNN-Joins Using a Hybrid Approach Proceedings of the 12th Workshop on General Purpose Processing Using GPUs, (33-42)
  28. Fresno J, Barba D, Gonzalez-Escribano A and Llanos D (2019). HitFlow, International Journal of Parallel Programming, 47:1, (3-23), Online publication date: 1-Feb-2019.
  29. Ha O, Lee K, Kim W, Yoon K and Mateos C (2019). Effective Parallelization Method for Object Recognition in 2D Sonar Images Based on Task Partitioning, Scientific Programming, 2019, Online publication date: 1-Jan-2019.
  30. Choi Y and Cong J HLS-Based Optimization and Design Space Exploration for Applications with Variable Loop Bounds 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), (1-8)
  31. Huang Z, Li M, Chousidis C, Mousavi A and Jiang C (2018). Schema Theory-Based Data Engineering in Gene Expression Programming for Big Data Analytics, IEEE Transactions on Evolutionary Computation, 22:5, (792-804), Online publication date: 1-Oct-2018.
  32. Muñoz J, Dolz M, del Rio Astorga D, Cepeda J and García J Supporting MPI-distributed stream parallel patterns in GrPPI Proceedings of the 25th European MPI Users' Group Meeting, (1-10)
  33. Ejjaaouani K, Aumage O, Bigot J, Mehrenberger M, Murai H, Nakao M and Sato M InKS, a Programming Model to Decouple Performance from Algorithm in HPC Codes Euro-Par 2018: Parallel Processing Workshops, (757-768)
  34. Carabaş M, Drăghici A, Lupescu G, Samoilă C and Sluşanschi E Integrating Parallel Computing in the Curriculum of the University Politehnica of Bucharest Euro-Par 2018: Parallel Processing Workshops, (222-234)
  35. Bianchi F, Margara A and Pezze M (2018). A Survey of Recent Trends in Testing Concurrent Software Systems, IEEE Transactions on Software Engineering, 44:8, (747-783), Online publication date: 1-Aug-2018.
  36. Gu B, Shan Y, Geng X and Zheng G Accelerated asynchronous greedy coordinate descent algorithm for SVMs Proceedings of the 27th International Joint Conference on Artificial Intelligence, (2170-2176)
  37. Pfander D, Daiß G, Marcello D, Kaiser H and Pflüger D Accelerating Octo-Tiger Proceedings of the International Workshop on OpenCL, (1-8)
  38. Stpiczyński P (2018). Language-based vectorization and parallelization using intrinsics, OpenMP, TBB and Cilk Plus, The Journal of Supercomputing, 74:4, (1461-1472), Online publication date: 1-Apr-2018.
  39. Stpiczyński P (2018). Vectorized algorithm for multidimensional Monte Carlo integration on modern GPU, CPU and MIC architectures, The Journal of Supercomputing, 74:2, (936-952), Online publication date: 1-Feb-2018.
  40. Arabnejad H, Bispo J, Barbosa J and Cardoso J AutoPar-Clava Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, (13-19)
  41. Trinder P, Chechina N, Papaspyrou N, Sagonas K, Thompson S, Adams S, Aronis S, Baker R, Bihari E, Boudeville O, Cesarini F, Stefano M, Eriksson S, Fördős V, Ghaffari A, Giantsios A, Green R, Hoch C, Klaftenegger D, Li H, Lundin K, Mackenzie K, Roukounaki K, Tsiouris Y and Winblad K (2017). Scaling Reliably, ACM Transactions on Programming Languages and Systems, 39:4, (1-46), Online publication date: 31-Dec-2018.
  42. Fan R and Dahnoun N Real-time implementation of stereo vision based on optimised normalised cross-correlation and propagated search range on a GPU 2017 IEEE International Conference on Imaging Systems and Techniques (IST), (1-6)
  43. Saxena R, Jain M, Singh D and Kushwah A An enhanced parallel version of RSA public key crypto based algorithm using openMP Proceedings of the 10th International Conference on Security of Information and Networks, (37-42)
  44. Beard J The sparse data reduction engine Proceedings of the International Symposium on Memory Systems, (34-48)
  45. Chechina N, MacKenzie K, Thompson S, Trinder P, Boudeville O, Fördős V, Hoch C, Ghaffari A and Hernandez M (2017). Evaluating Scalable Distributed Erlang for Scalability and Reliability, IEEE Transactions on Parallel and Distributed Systems, 28:8, (2244-2257), Online publication date: 1-Aug-2017.
  46. Zahaf H, Benyamina A, Olejnik R and Lipari G (2017). Energy-efficient scheduling for moldable real-time tasks on heterogeneous computing platforms, Journal of Systems Architecture: the EUROMICRO Journal, 74:C, (46-60), Online publication date: 1-Mar-2017.
  47. Oh S and Hong J (2017). Parallelization of a finite element Fortran code using OpenMP library, Advances in Engineering Software, 104:C, (28-37), Online publication date: 1-Feb-2017.
  48. Harizanov S, Lirkov I, Georgiev K, Paprzycki M and Ganzha M (2017). Performance analysis of a parallel algorithm for restoring large-scale CT images, Journal of Computational and Applied Mathematics, 310:C, (104-114), Online publication date: 15-Jan-2017.
  49. Boratto M, Alonso P, Giménez D and Lastovetsky A (2017). Automatic tuning to performance modelling of matrix polynomials on multicore and multi-GPU systems, The Journal of Supercomputing, 73:1, (227-239), Online publication date: 1-Jan-2017.
  50. Ravi P, Syam U and Kapre N Preventive Detection of Mosquito Populations using Embedded Machine Learning on Low Power IoT Platforms Proceedings of the 7th Annual Symposium on Computing for Development, (1-10)
  51. Wang X, Leidel J and Chen Y Concurrent Dynamic Memory Coalescing on GoblinCore-64 Architecture Proceedings of the Second International Symposium on Memory Systems, (177-187)
  52. Mansouri F, Huet S and Houzet D (2016). A domain-specific high-level programming model, Concurrency and Computation: Practice & Experience, 28:3, (750-767), Online publication date: 10-Mar-2016.
  53. Shewale A, Waghmare N, Sonawane A and Teke U High Performance Computation Analysis for Medical Images using High Computational Methods Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, (1-6)
  54. Zhao J, Tao J and Streit A (2016). Enabling collaborative MapReduce on the Cloud with a single-sign-on mechanism, Computing, 98:1-2, (55-72), Online publication date: 1-Jan-2016.
  55. Kaiser H, Heller T, Bourgeois D and Fey D Higher-level parallelization for local and distributed asynchronous task-based programming Proceedings of the First International Workshop on Extreme Scale Programming Models and Middleware, (29-37)
  56. Besard T, De Sutter B, Frías-Velázquez A and Philips W (2015). Case study of multiple trace transform implementations, International Journal of High Performance Computing Applications, 29:4, (489-505), Online publication date: 1-Nov-2015.
  57. Soares T, Xavier M, Pigozzo A, Campos R, Santos R and Lobosco M Performance Evaluation of a Human Immune System Simulator on a GPU Cluster Proceedings of the 13th International Conference on Parallel Computing Technologies - Volume 9251, (458-468)
  58. Hernández M, Imbernón B, Navarro J, García J, Cebrián J and Cecilia J (2015). Evaluation of the 3-D finite difference implementation of the acoustic diffusion equation model on massively parallel architectures, Computers and Electrical Engineering, 46:C, (190-201), Online publication date: 1-Aug-2015.
  59. Bailey M Fundamentals seminar ACM SIGGRAPH 2015 Courses, (1-129)
  60. Millo J, Kofman E and Simone R (2015). Modeling and Analyzing Dataflow Applications on NoC-Based Many-Core Architectures, ACM Transactions on Embedded Computing Systems, 14:3, (1-25), Online publication date: 21-May-2015.
  61. Arbelaitz O, Martin J and Muguerza J (2015). Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course, IEEE Transactions on Education, 58:2, (110-116), Online publication date: 1-May-2015.
  62. Rodrigues A, Jorge A and Dutra I Accelerating recommender systems using GPUs Proceedings of the 30th Annual ACM Symposium on Applied Computing, (879-884)
  63. Szałkowski D and Stpiczyński P (2015). Using distributed memory parallel computers and GPU clusters for multidimensional Monte Carlo integration, Concurrency and Computation: Practice & Experience, 27:4, (923-936), Online publication date: 25-Mar-2015.
  64. Beard J, Li P and Chamberlain R RaftLib Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, (96-105)
  65. Shafi A, Akhtar A, Javed A and Carpenter B Teaching parallel programming using Java Proceedings of the Workshop on Education for High-Performance Computing, (56-63)
  66. Kaiser H, Heller T, Adelstein-Lelbach B, Serio A and Fey D HPX Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, (1-11)
  67. Fanfarillo A, Burnus T, Cardellini V, Filippone S, Nagle D and Rouson D OpenCoarrays Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, (1-11)
  68. Mansouri F, Huet S and Houzet D A Visual Programming Model to Implement Coarse-Grained DSP Applications on Parallel and Heterogeneous Clusters Revised Selected Papers, Part I, of the Euro-Par 2014 International Workshops on Parallel Processing - Volume 8805, (141-152)
  69. Sujeeth A, Gibbons A, Brown K, Lee H, Rompf T, Odersky M and Olukotun K (2013). Forge, ACM SIGPLAN Notices, 49:3, (145-154), Online publication date: 5-Mar-2014.
  70. Hawick K and Playne D Developmental directions in parallel accelerators Proceedings of the Twelfth Australasian Symposium on Parallel and Distributed Computing - Volume 152, (21-27)
  71. Zhao J, Lublinerman R, Budimlić Z, Chaudhuri S and Sarkar V (2013). Isolation for nested task parallelism, ACM SIGPLAN Notices, 48:10, (571-588), Online publication date: 12-Nov-2013.
  72. Zhao J, Lublinerman R, Budimlić Z, Chaudhuri S and Sarkar V Isolation for nested task parallelism Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications, (571-588)
  73. Sujeeth A, Gibbons A, Brown K, Lee H, Rompf T, Odersky M and Olukotun K Forge Proceedings of the 12th international conference on Generative programming: concepts & experiences, (145-154)
  74. Aljabri M, Loidl H and Trinder P The Design and Implementation of GUMSMP Proceedings of the 25th symposium on Implementation and Application of Functional Languages, (37-48)
  75. Ghoting A, Gunnels J, Kambadur P, Pednault E and Squillante M (2013). Trends and outlook for the massive-scale analytics stack, IBM Journal of Research and Development, 57:3-4, (2-2), Online publication date: 1-May-2013.
  76. Capuzzo-Dolcetta R, Spera M and Punzo D (2013). A fully parallel, high precision, N-body code running on hybrid computing platforms, Journal of Computational Physics, 236, (580-593), Online publication date: 1-Mar-2013.
  77. Zhuravlev S, Saez J, Blagodurov S, Fedorova A and Prieto M (2012). Survey of scheduling techniques for addressing shared resources in multicore processors, ACM Computing Surveys, 45:1, (1-28), Online publication date: 1-Nov-2012.
  78. Satish N, Kim C, Chhugani J, Saito H, Krishnaiyer R, Smelyanskiy M, Girkar M and Dubey P (2012). Can traditional programming bridge the Ninja performance gap for parallel computing applications?, ACM SIGARCH Computer Architecture News, 40:3, (440-451), Online publication date: 5-Sep-2012.
  79. López-Espín J, Vidal A and Giménez D (2012). Two-stage least squares and indirect least squares algorithms for simultaneous equations models, Journal of Computational and Applied Mathematics, 236:15, (3676-3684), Online publication date: 1-Sep-2012.
  80. Misztal M, Erleben K, Bargteil A, Fursund J, Christensen B, Bærentzen J and Bridson R Multiphase flow of immiscible fluids on unstructured moving meshes Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, (97-106)
  81. Misztal M, Erleben K, Bargteil A, Fursund J, Christensen B, Bærentzen J and Bridson R Multiphase flow of immiscible fluids on unstructured moving meshes Proceedings of the 11th ACM SIGGRAPH / Eurographics conference on Computer Animation, (97-106)
  82. Boudeville O, Cesarini F, Chechina N, Lundin K, Papaspyrou N, Sagonas K, Thompson S, Trinder P and Wiger U RELEASE Proceedings of the 2012 Conference on Trends in Functional Programming - Volume 7829, (263-278)
  83. Satish N, Kim C, Chhugani J, Saito H, Krishnaiyer R, Smelyanskiy M, Girkar M and Dubey P Can traditional programming bridge the Ninja performance gap for parallel computing applications? Proceedings of the 39th Annual International Symposium on Computer Architecture, (440-451)
  84. Jaros J and Pospichal P A fair comparison of modern CPUs and GPUs running the genetic algorithm under the knapsack benchmark Proceedings of the 2012 European conference on Applications of Evolutionary Computation, (426-435)
  85. Bailey M and Cunningham S Introduction to computer graphics SIGGRAPH Asia 2011 Courses, (1-58)
  86. Huerta Yero E and Lucchese F Practical experiences on the gridification of financial applications Proceedings of the fourth workshop on High performance computational finance, (39-46)
  87. Lublinerman R, Zhao J, Budimlić Z, Chaudhuri S and Sarkar V Delegated isolation Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications, (885-902)
  88. Lublinerman R, Zhao J, Budimlić Z, Chaudhuri S and Sarkar V (2011). Delegated isolation, ACM SIGPLAN Notices, 46:10, (885-902), Online publication date: 18-Oct-2011.
  89. Lins R, de F. Pereira e Silva G and de A. Formiga A HistDoc v. 2.0 Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, (169-176)
  90. Burak D and Chudzik M Parallelization of the discrete chaotic block encryption algorithm Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II, (323-332)
  91. Tibbits M, Haran M and Liechty J (2011). Parallel multivariate slice sampling, Statistics and Computing, 21:3, (415-430), Online publication date: 1-Jul-2011.
  92. Viry P Parallel and distributed programming extensions for mainstream languages based on pi-calculus Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing, (343-344)
  93. Gray I and Audsley N (2011). Targeting complex embedded architectures by combining the multicore communications API (mcapi) with compile-time virtualisation, ACM SIGPLAN Notices, 46:5, (51-60), Online publication date: 11-Apr-2011.
  94. Gray I and Audsley N Targeting complex embedded architectures by combining the multicore communications API (mcapi) with compile-time virtualisation Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems, (51-60)
  95. Chan M and Yang L Comparative analysis of OpenMP and MPI on multi-core architecture Proceedings of the 44th Annual Simulation Symposium, (18-25)
  96. Hobor A and Gherghina C Barriers in concurrent separation logic Proceedings of the 20th European conference on Programming languages and systems: part of the joint European conferences on theory and practice of software, (276-296)
  97. Zaykov P and Kuzmanov G Architectural support for multithreading on reconfigurable hardware Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications, (363-374)
  98. Lagar-Cavilla H, Whitney J, Bryant R, Patchin P, Brudno M, de Lara E, Rumble S, Satyanarayanan M and Scannell A (2011). SnowFlock, ACM Transactions on Computer Systems, 29:1, (1-45), Online publication date: 1-Feb-2011.
  99. Bailey M and Cunningham S Introduction to computer graphics ACM SIGGRAPH ASIA 2010 Courses, (1-100)
  100. Billingsley M, Tibbitts B and George A Improving UPC productivity via integrated development tools Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, (1-9)
  101. Mateos C, Zunino A and Campo M (2010). An approach for non-intrusively adding malleable fork/join parallelism into ordinary JavaBean compliant applications, Computer Languages, Systems and Structures, 36:3, (288-315), Online publication date: 1-Oct-2010.
  102. Pigozzo A, Lobosco M and Dos Santos R Parallel implementation of a computational model of the human immune system Proceedings of the 2010 conference on Parallel processing, (217-224)
  103. Mäkelä J and Leppänen V Towards programming on the moving threads architecture Proceedings of the 11th International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing on International Conference on Computer Systems and Technologies, (137-142)
  104. Iosifidis Y, Mallik A, Mamagkakis S, De Greef E, Bartzas A, Soudris D and Catthoor F A framework for automatic parallelization, static and dynamic memory optimization in MPSoC platforms Proceedings of the 47th Design Automation Conference, (549-554)
  105. Pan H, Hindman B and Asanović K (2010). Composing parallel software efficiently with lithe, ACM SIGPLAN Notices, 45:6, (376-387), Online publication date: 12-Jun-2010.
  106. Pan H, Hindman B and Asanović K Composing parallel software efficiently with lithe Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation, (376-387)
  107. Ben Asher Y, Giver D, Haber G and Kulish G HparC Proceedings of the 3rd Annual Haifa Experimental Systems Conference, (1-13)
  108. Howison M, Bethel E and Childs H MPI-hybrid parallelism for volume rendering on large, multi-core systems Proceedings of the 10th Eurographics conference on Parallel Graphics and Visualization, (1-10)
  109. Riedel M, Wolf F, Kranzlmüller D, Streit A and Lippert T (2009). Research advances by using interoperable e-science infrastructures, Cluster Computing, 12:4, (357-372), Online publication date: 1-Dec-2009.
  110. Hulette G, Sottile M, Armstrong R and Allan B OnRamp Proceedings of the 2009 Workshop on Component-Based High Performance Computing, (1-10)
  111. De Sutter B, Verkest D, Brockmeyer E, Delfosse E, Vandecappelle A and Mignolet J (2009). Design and Tool Flow of Multimedia MPSoC Platforms, Journal of Signal Processing Systems, 57:2, (229-247), Online publication date: 1-Nov-2009.
  112. Gray I and Audsley N Exposing non-standard architectures to embedded software using compile-time virtualisation Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems, (147-156)
  113. Wolffe G and Trefftz C (2009). Teaching parallel computing, Journal of Computing Sciences in Colleges, 25:1, (21-28), Online publication date: 1-Oct-2009.
  114. Stpiczyński P A parallel non-square tiled algorithm for solving a kind of BVP for second-order ODEs Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I, (87-94)
  115. Lavrentiev-Jr M, Romanenko A, Titov V and Vazhenin A High-Performance Tsunami Wave Propagation Modeling Proceedings of the 10th International Conference on Parallel Computing Technologies, (423-434)
  116. Heuveline V, Krause M and Latt J (2009). Towards a hybrid parallelization of lattice Boltzmann methods, Computers & Mathematics with Applications, 58:5, (1071-1080), Online publication date: 1-Sep-2009.
  117. Best M, Fedorova A, Dickie R, Tagliasacchi A, Couture-Beil A, Mustard C, Mottishaw S, Brown A, Huang Z, Xu X, Ghazali N and Brownsword A Searching for Concurrent Design Patterns in Video Games Proceedings of the 15th International Euro-Par Conference on Parallel Processing, (912-923)
  118. Michel L, See A and Van Hentenryck P (2009). Transparent Parallelization of Constraint Programming, INFORMS Journal on Computing, 21:3, (363-382), Online publication date: 1-Jul-2009.
  119. Spallaccini P, Iovine F and Italiano G An automatized methodology design for real-time signal processing applications in multiple multi-core platforms Proceedings of the 2009 IEEE international conference on Multimedia and Expo, (1825-1828)
  120. Sarkar A, Mueller F, Ramaprasad H and Mohan S (2009). Push-assisted migration of real-time tasks in multi-core processors, ACM SIGPLAN Notices, 44:7, (80-89), Online publication date: 28-Jun-2009.
  121. Sarkar A, Mueller F, Ramaprasad H and Mohan S Push-assisted migration of real-time tasks in multi-core processors Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, (80-89)
  122. Baert R, Brockmeyer E, Wuytack S and Ashby T Exploring parallelizations of applications for MPSoC platforms using MPA Proceedings of the Conference on Design, Automation and Test in Europe, (1148-1153)
  123. Patchin P, Lagar-Cavilla H, de Lara E and Brudno M Adding the easy button to the cloud with SnowFlock and MPI Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing, (1-8)
  124. Bücker H, Rasch A, Rath V and Wolf A Semi-automatic parallelization of direct and inverse problems for geothermal simulation Proceedings of the 2009 ACM symposium on Applied Computing, (971-975)
  125. Riedel M, Frings W, Habbinga S, Eickermann T, Mallmann D, Streit A, Wolf F, Lippert T, Ernst A and Spurzem R Extending the collaborative online visualization and steering framework for computational Grids with attribute-based authorization Proceedings of the 2008 9th IEEE/ACM International Conference on Grid Computing, (104-111)
  126. Seiler L, Carmean D, Sprangle E, Forsyth T, Abrash M, Dubey P, Junkins S, Lake A, Sugerman J, Cavin R, Espasa R, Grochowski E, Juan T and Hanrahan P Larrabee ACM SIGGRAPH 2008 papers, (1-15)
  127. Seiler L, Carmean D, Sprangle E, Forsyth T, Abrash M, Dubey P, Junkins S, Lake A, Sugerman J, Cavin R, Espasa R, Grochowski E, Juan T and Hanrahan P (2008). Larrabee, ACM Transactions on Graphics, 27:3, (1-15), Online publication date: 1-Aug-2008.
  128. Brodman J, Fraguela B, Garzarán M and Padua D Design Issues in Parallel Array Languages for Shared Memory Proceedings of the 8th international workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation, (208-217)
  129. Pankratius V, Schaefer C, Jannesari A and Tichy W Software engineering for multicore systems Proceedings of the 1st international workshop on Multicore software engineering, (53-60)
  130. Collette S, Cucu L and Goossens J (2008). Integrating job parallelism in real-time scheduling theory, Information Processing Letters, 106:5, (180-187), Online publication date: 1-May-2008.
  131. Mattos G, Lins R, de Araújo Formiga A and Junqueira Martins F BigBatch Proceedings of the 2008 ACM symposium on Applied computing, (434-441)
  132. Guo J, Bikshandi G, Fraguela B, Garzaran M and Padua D Programming with tiles Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, (111-122)
  133. Moura P, Crocker P and Nunes P High-level multi-threading programming in logtalk Proceedings of the 10th international conference on Practical aspects of declarative languages, (265-281)
  134. Trefftz C, Tao Y and Jorgensen P The numerical risks of reduction operations in OpenMP Proceedings of the 19th IASTED International Conference on Parallel and Distributed Computing and Systems, (200-203)
  135. Bodensteiner C, Darolti C, Schumacher H, Matthäus L and Schweikard A Motion and positional error correction for cone beam 3D-reconstruction with mobile C-arms Proceedings of the 10th international conference on Medical image computing and computer-assisted intervention - Volume Part I, (177-185)
  136. Brown R and Sharapov I (2007). High-scalability parallelization of a molecular modeling application, International Journal of Parallel Programming, 35:5, (441-458), Online publication date: 1-Oct-2007.
  137. Zumbusch G A container-iterator parallel programming model Proceedings of the 7th international conference on Parallel processing and applied mathematics, (1130-1139)
  138. Stpiczyński P Evaluating linear recursive filters using novel data formats for dense matrices Proceedings of the 7th international conference on Parallel processing and applied mathematics, (688-697)
  139. Moskovsky A, Roganov V, Abramov S and Kuznetsov A Variable reassignment in the T++ parallel programming language Proceedings of the 9th international conference on Parallel Computing Technologies, (579-588)
  140. Moskovsky A, Roganov V and Abramov S Parallelism granules aggregation with the T-system Proceedings of the 9th international conference on Parallel Computing Technologies, (293-302)
  141. Chamberlain B, Callahan D and Zima H (2007). Parallel Programmability and the Chapel Language, International Journal of High Performance Computing Applications, 21:3, (291-312), Online publication date: 1-Aug-2007.
  142. Ipek E, Kirman M, Kirman N and Martinez J (2007). Core fusion, ACM SIGARCH Computer Architecture News, 35:2, (186-197), Online publication date: 9-Jun-2007.
  143. Ipek E, Kirman M, Kirman N and Martinez J Core fusion Proceedings of the 34th annual international symposium on Computer architecture, (186-197)
  144. Chandraiah P and Doemer R Designer-controlled generation of parallel and flexible heterogeneous MPSoC specification Proceedings of the 44th annual Design Automation Conference, (787-790)
  145. Kaewkasi C and Gurd J A distributed dynamic aspect machine for scientific software development Proceedings of the 1st workshop on Virtual machines and intermediate languages for emerging modularization mechanisms, (3-es)
  146. Antony J, Janes P and Rendell A Exploring thread and memory placement on NUMA architectures Proceedings of the 13th international conference on High Performance Computing, (338-352)
  147. James T, Barkhi R and Johnson J (2006). Platform impact on performance of parallel genetic algorithms, Engineering Applications of Artificial Intelligence, 19:8, (843-856), Online publication date: 1-Dec-2006.
  148. Franchetti F, Voronenko Y and Püschel M FFT program generation for shared memory Proceedings of the 2006 ACM/IEEE conference on Supercomputing, (115-es)
  149. Gerndt A, Sarholz S, Wolter M, Mey D, Bischof C and Kuhlen T Nested OpenMP for efficient computation of 3D critical points in multi-block CFD datasets Proceedings of the 2006 ACM/IEEE conference on Supercomputing, (93-es)
  150. Mueller C and Lumsdaine A Expression and loop libraries for high-performance code synthesis Proceedings of the 19th international conference on Languages and compilers for parallel computing, (80-95)
  151. Flores-Becerra G, Garcia V and Vidal A Efficient parallel algorithm for constructing a unit triangular matrix with prescribed singular values Proceedings of the 7th international conference on High performance computing for computational science, (349-362)
  152. Sharapov I, Kroeger R, Delamarter G, Cheveresan R and Ramsay M A case study in top-down performance estimation for a large-scale parallel application Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (81-89)
  153. Chalabine M and Kessler C Parallelisation of sequential programs by invasive composition and aspect weaving Proceedings of the 6th international conference on Advanced Parallel Processing Technologies, (131-140)
  154. Stpiczyński P A note on the numerical inversion of the Laplace transform Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics, (551-558)
  155. Jung C, Lim D, Lee J and Han S Adaptive execution techniques for SMT multiprocessor architectures Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, (236-246)
  156. Brown R and Sharapov I Performance and programmability comparison between OpenMP and MPI implementations of a molecular modeling application Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming, (349-360)
  157. Süß M and Leopold C Common mistakes in OpenMP and how to avoid them Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming, (312-323)
  158. Cox P, Gauvin S and Rau-Chaplin A Adding parallelism to visual data flow programs Proceedings of the 2005 ACM symposium on Software visualization, (135-144)
  159. Loogen R, Ortega-Mallén Y and Peña-Marí R (2005). Parallel functional programming in Eden, Journal of Functional Programming, 15:3, (431-475), Online publication date: 1-May-2005.
  160. Basharahil R, Wims B, Xu C and Fu S (2005). Distributed Shared Arrays, The Journal of Supercomputing, 31:2, (161-184), Online publication date: 1-Feb-2005.
  161. Ino F, Ooyama K and Hagihara K (2005). A data distributed parallel algorithm for nonrigid image registration, Parallel Computing, 31:1, (19-43), Online publication date: 1-Jan-2005.
  162. Zaldívar F, Maciá A and Salvador A A parallel algorithm based on a variant of the Kalman filter for solving the RLS problem Proceedings of the 4th WSEAS International Conference on Signal Processing, Computational Geometry & Artificial Vision, (1-6)
  163. Zhang G, Unnikrishnan P and Ren J Experiments with auto-parallelizing SPEC2000FP benchmarks Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing, (348-362)
  164. Cunha M, Telles J and Coutinho A Parallel boundary elements Proceedings of the 6th international conference on High Performance Computing for Computational Science, (514-526)
  165. Martín M, Parada M and Doallo R (2004). High Performance Air Pollution Simulation Using OpenMP, The Journal of Supercomputing, 28:3, (311-321), Online publication date: 1-Jun-2004.
  166. Zhang G, Silvera R and Archambault R Structure and algorithm for implementing OpenMP workshares Proceedings of the 5th international conference on OpenMP Applications and Tools: shared Memory Parallel Programming with OpenMP, (110-120)
  167. Harbulot B and Gurd J Using AspectJ to separate concerns in parallel scientific Java code Proceedings of the 3rd international conference on Aspect-oriented software development, (122-131)
  168. Bücker H, Rasch A and Wolf A A class of OpenMP applications involving nested parallelism Proceedings of the 2004 ACM symposium on Applied computing, (220-224)
  169. Martín M, Singh D, Mouriño J, Rivera F, Doallo R and Bruguera J (2003). High performance air pollution modeling for a power plant environment, Parallel Computing, 29:11-12, (1763-1790), Online publication date: 1-Nov-2003.
  170. Liu F and Chaudhary V A practical OpenMP compiler for system on chips Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming, (54-68)
  171. Standish R, Chee C and Smeds N OpenMP in the field Proceedings of the 2003 international conference on Computational science, (637-647)
  172. Schirski M, Gerndt A, van Reimersdahl T, Kuhlen T, Adomeit P, Lang O, Pischinger S and Bischof C ViSTA FlowLib - framework for interactive visualization and exploration of unsteady flows in virtual environments Proceedings of the workshop on Virtual environments 2003, (77-85)
  173. Eberl H Simulation of chemical reaction fronts in anaerobic digestion of solid waste Proceedings of the 2003 international conference on Computational science and its applications: Part I, (503-512)
  174. Dongarra J, Foster I, Fox G, Gropp W, Kennedy K, Torczon L and White A References Sourcebook of parallel computing, (729-789)
  175. Vetter J and Yoo A An empirical performance evaluation of scalable scientific applications Proceedings of the 2002 ACM/IEEE conference on Supercomputing, (1-18)
  176. Lee J and Moonesinghe H Adaptively increasing performance and scalability of automatically parallelized programs Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing, (203-217)
  177. Momtchev M and Marquet P An Asymmetric Real-Time Scheduling for Linux Proceedings of the 16th International Parallel and Distributed Processing Symposium
  178. Chen K and Johnson J A Prototypical Self-Optimizing Package for Parallel Implementation of Fast Signal Transforms Proceedings of the 16th International Parallel and Distributed Processing Symposium
  179. Eigenmann R, Hoeflinger J, Kuhn R, Padua D, Basumallik A, Min S and Zhu J Is OpenMP for Grids? Proceedings of the 16th International Parallel and Distributed Processing Symposium
  180. Bücker H, Lang B, an Mey D and Bischof C Bringing together automatic differentiation and OpenMP Proceedings of the 15th international conference on Supercomputing, (246-251)
Contributors
  • Hewlett Packard Enterprise
  • Argonne National Laboratory
  • Tensilica Inc.

Recommendations