Aimed at the working researcher and the scientific C/C++ or Fortran programmer, this text introduces a new vocabulary of idioms and techniques for parallelizing software using OpenMP.
Cited By
- Gray K, Li M, Ahmed R, Rahman M, Azad A, Kobourov S and Börner K (2024). A Scalable Method for Readable Tree Layouts, IEEE Transactions on Visualization and Computer Graphics, 30:2, (1564-1578), Online publication date: 1-Feb-2024.
- Nguyen N, Tran M and Chandra R (2024). Sequential reversible jump MCMC for dynamic Bayesian neural networks, Neurocomputing, 564:C, Online publication date: 7-Jan-2024.
- Liu G and Iuricich F (2024). A Task-Parallel Approach for Localized Topological Data Structures, IEEE Transactions on Visualization and Computer Graphics, 30:1, (1271-1281), Online publication date: 1-Jan-2024.
- Hsu K and Tseng H Simultaneous and Heterogenous Multithreading Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, (137-152)
- Charilogis V, Tsoulos I and Tzallas A (2023). An Improved Parallel Particle Swarm Optimization, SN Computer Science, 4:6, Online publication date: 4-Oct-2023.
- Gan X, Wu G, Zeng R, Si J, Liu J, Dong D, Gong C, Liu C and Li T FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph Processing Proceedings of the 37th International Conference on Supercomputing, (240-250)
- Neto W, Li Y, Gaillardon P and Yu C (2023). FlowTune: End-to-End Automatic Logic Optimization Exploration via Domain-Specific Multiarmed Bandit, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:6, (1912-1925), Online publication date: 1-Jun-2023.
- Quislant R, Fernandez I, Gutierrez E and Plata O (2023). Time series analysis acceleration with advanced vectorization extensions, The Journal of Supercomputing, 79:9, (10178-10207), Online publication date: 1-Jun-2023.
- de Castro M, Santamaria-Valenzuela I, Torres Y, Gonzalez-Escribano A and Llanos D (2023). EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs, The Journal of Supercomputing, 79:9, (9409-9442), Online publication date: 1-Jun-2023.
- Tabanelli E, Tagliavini G and Benini L (2023). DNN Is Not All You Need: Parallelizing Non-neural ML Algorithms on Ultra-low-power IoT Processors, ACM Transactions on Embedded Computing Systems, 22:3, (1-33), Online publication date: 31-May-2023.
- Trabes G, Wainer G and Gil-Costa V (2023). A Parallel Algorithm to Accelerate DEVS Simulations in Shared Memory Architectures, IEEE Transactions on Parallel and Distributed Systems, 34:5, (1609-1620), Online publication date: 1-May-2023.
- Dominico S, de Almeida E and Alves M (2023). On the performance limits of thread placement for array databases in non-uniform memory architectures, Computing, 105:5, (1059-1075), Online publication date: 1-May-2023.
- Li J, Agung M and Takizawa H Evaluating the Performance and Conformance of a SYCL Implementation for SX-Aurora TSUBASA Parallel and Distributed Computing, Applications and Technologies, (36-47)
- Gambhir G and Mandal J (2021). Shared memory implementation and performance analysis of LSB steganography based on chaotic tent map, Innovations in Systems and Software Engineering, 17:4, (333-342), Online publication date: 1-Dec-2021.
- Podobas A, Svedin M, Chien S, Peng I, Ravichandran N, Herman P, Lansner A and Markidis S StreamBrain Proceedings of the 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, (1-6)
- Brahmakshatriya A, Furst E, Ying V, Hsu C, Hong C, Ruttenberg M, Zhang Y, Jung D, Richmond D, Taylor M, Shun J, Oskin M, Sanchez D and Amarasinghe S Taming the zoo Proceedings of the 48th Annual International Symposium on Computer Architecture, (429-442)
- Spiliotis I, Sitaridis C and Bekakos M (2021). Parallel Computation of Discrete Orthogonal Moment on Block Represented Images Using OpenMP, International Journal of Parallel Programming, 49:3, (440-462), Online publication date: 1-Jun-2021.
- Yin L, Zhang Y, Zhang Z, Peng Y and Zhao P (2021). ParaX, Proceedings of the VLDB Endowment, 14:6, (864-877), Online publication date: 1-Feb-2021.
- Anastasopoulos N, Tsoulos I, Karvounis E and Tzallas A (2020). Locate the Bounding Box of Neural Networks with Intervals, Neural Processing Letters, 52:3, (2241-2251), Online publication date: 1-Dec-2020.
- Arabnejad H, Bispo J, Cardoso J and Barbosa J (2019). Source-to-source compilation targeting OpenMP-based automatic parallelization of C applications, The Journal of Supercomputing, 76:9, (6753-6785), Online publication date: 1-Sep-2020.
- Ren Z, Gu Y, Li C, Li F and Yu G GHSH: Dynamic Hyperspace Hashing on GPU Web and Big Data, (409-424)
- Magalhães T and Helio J. C. B Parallel Differential Evolution Algorithms for Stackelberg-Nash Bilevel Optimization Problems 2020 IEEE Congress on Evolutionary Computation (CEC), (1-8)
- Dansou A, Mouhoubi S and Chazallon C (2019). Optimizations of a fast multipole symmetric Galerkin boundary element method code, Numerical Algorithms, 84:3, (825-846), Online publication date: 1-Jul-2020.
- Ejjaaouani K, Aumage O, Bigot J, Méhrenberger M, Murai H, Nakao M and Sato M (2019). InKS: a programming model to decouple algorithm from optimization in HPC codes, The Journal of Supercomputing, 76:6, (4666-4681), Online publication date: 1-Jun-2020.
- Gowanlock M, Karsin B, Fink Z and Wright J Accelerating the Unacceleratable Proceedings of the 15th International Workshop on Data Management on New Hardware, (1-11)
- Reguly I, Moore B, Schmielau T, du Toit J and Mudalige G Batch Solution of Small PDEs with the OPS DSL High Performance Computing, (124-141)
- Gowanlock M KNN-Joins Using a Hybrid Approach Proceedings of the 12th Workshop on General Purpose Processing Using GPUs, (33-42)
- Fresno J, Barba D, Gonzalez-Escribano A and Llanos D (2019). HitFlow, International Journal of Parallel Programming, 47:1, (3-23), Online publication date: 1-Feb-2019.
- Ha O, Lee K, Kim W, Yoon K and Mateos C (2019). Effective Parallelization Method for Object Recognition in 2D Sonar Images Based on Task Partitioning, Scientific Programming, 2019, Online publication date: 1-Jan-2019.
- Choi Y and Cong J HLS-Based Optimization and Design Space Exploration for Applications with Variable Loop Bounds 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), (1-8)
- Huang Z, Li M, Chousidis C, Mousavi A and Jiang C (2018). Schema Theory-Based Data Engineering in Gene Expression Programming for Big Data Analytics, IEEE Transactions on Evolutionary Computation, 22:5, (792-804), Online publication date: 1-Oct-2018.
- Muñoz J, Dolz M, del Rio Astorga D, Cepeda J and García J Supporting MPI-distributed stream parallel patterns in GrPPI Proceedings of the 25th European MPI Users' Group Meeting, (1-10)
- Ejjaaouani K, Aumage O, Bigot J, Mehrenberger M, Murai H, Nakao M and Sato M InKS, a Programming Model to Decouple Performance from Algorithm in HPC Codes Euro-Par 2018: Parallel Processing Workshops, (757-768)
- Carabaş M, Drăghici A, Lupescu G, Samoilă C and Sluşanschi E Integrating Parallel Computing in the Curriculum of the University Politehnica of Bucharest Euro-Par 2018: Parallel Processing Workshops, (222-234)
- Bianchi F, Margara A and Pezze M (2018). A Survey of Recent Trends in Testing Concurrent Software Systems, IEEE Transactions on Software Engineering, 44:8, (747-783), Online publication date: 1-Aug-2018.
- Gu B, Shan Y, Geng X and Zheng G Accelerated asynchronous greedy coordinate descent algorithm for SVMs Proceedings of the 27th International Joint Conference on Artificial Intelligence, (2170-2176)
- Pfander D, Daiß G, Marcello D, Kaiser H and Pflüger D Accelerating Octo-Tiger Proceedings of the International Workshop on OpenCL, (1-8)
- Stpiczyński P (2018). Language-based vectorization and parallelization using intrinsics, OpenMP, TBB and Cilk Plus, The Journal of Supercomputing, 74:4, (1461-1472), Online publication date: 1-Apr-2018.
- Stpiczyński P (2018). Vectorized algorithm for multidimensional Monte Carlo integration on modern GPU, CPU and MIC architectures, The Journal of Supercomputing, 74:2, (936-952), Online publication date: 1-Feb-2018.
- Arabnejad H, Bispo J, Barbosa J and Cardoso J AutoPar-Clava Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, (13-19)
- Trinder P, Chechina N, Papaspyrou N, Sagonas K, Thompson S, Adams S, Aronis S, Baker R, Bihari E, Boudeville O, Cesarini F, Stefano M, Eriksson S, Fördős V, Ghaffari A, Giantsios A, Green R, Hoch C, Klaftenegger D, Li H, Lundin K, Mackenzie K, Roukounaki K, Tsiouris Y and Winblad K (2017). Scaling Reliably, ACM Transactions on Programming Languages and Systems, 39:4, (1-46), Online publication date: 31-Dec-2018.
- Fan R and Dahnoun N Real-time implementation of stereo vision based on optimised normalised cross-correlation and propagated search range on a GPU 2017 IEEE International Conference on Imaging Systems and Techniques (IST), (1-6)
- Saxena R, Jain M, Singh D and Kushwah A An enhanced parallel version of RSA public key crypto based algorithm using openMP Proceedings of the 10th International Conference on Security of Information and Networks, (37-42)
- Beard J The sparse data reduction engine Proceedings of the International Symposium on Memory Systems, (34-48)
- Chechina N, MacKenzie K, Thompson S, Trinder P, Boudeville O, Fördős V, Hoch C, Ghaffari A and Hernandez M (2017). Evaluating Scalable Distributed Erlang for Scalability and Reliability, IEEE Transactions on Parallel and Distributed Systems, 28:8, (2244-2257), Online publication date: 1-Aug-2017.
- Zahaf H, Benyamina A, Olejnik R and Lipari G (2017). Energy-efficient scheduling for moldable real-time tasks on heterogeneous computing platforms, Journal of Systems Architecture: the EUROMICRO Journal, 74:C, (46-60), Online publication date: 1-Mar-2017.
- Oh S and Hong J (2017). Parallelization of a finite element Fortran code using OpenMP library, Advances in Engineering Software, 104:C, (28-37), Online publication date: 1-Feb-2017.
- Harizanov S, Lirkov I, Georgiev K, Paprzycki M and Ganzha M (2017). Performance analysis of a parallel algorithm for restoring large-scale CT images, Journal of Computational and Applied Mathematics, 310:C, (104-114), Online publication date: 15-Jan-2017.
- Boratto M, Alonso P, Giménez D and Lastovetsky A (2017). Automatic tuning to performance modelling of matrix polynomials on multicore and multi-GPU systems, The Journal of Supercomputing, 73:1, (227-239), Online publication date: 1-Jan-2017.
- Ravi P, Syam U and Kapre N Preventive Detection of Mosquito Populations using Embedded Machine Learning on Low Power IoT Platforms Proceedings of the 7th Annual Symposium on Computing for Development, (1-10)
- Wang X, Leidel J and Chen Y Concurrent Dynamic Memory Coalescing on GoblinCore-64 Architecture Proceedings of the Second International Symposium on Memory Systems, (177-187)
- Mansouri F, Huet S and Houzet D (2016). A domain-specific high-level programming model, Concurrency and Computation: Practice & Experience, 28:3, (750-767), Online publication date: 10-Mar-2016.
- Shewale A, Waghmare N, Sonawane A and Teke U High Performance Computation Analysis for Medical Images using High Computational Methods Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, (1-6)
- Zhao J, Tao J and Streit A (2016). Enabling collaborative MapReduce on the Cloud with a single-sign-on mechanism, Computing, 98:1-2, (55-72), Online publication date: 1-Jan-2016.
- Kaiser H, Heller T, Bourgeois D and Fey D Higher-level parallelization for local and distributed asynchronous task-based programming Proceedings of the First International Workshop on Extreme Scale Programming Models and Middleware, (29-37)
- Besard T, De Sutter B, Frías-Velázquez A and Philips W (2015). Case study of multiple trace transform implementations, International Journal of High Performance Computing Applications, 29:4, (489-505), Online publication date: 1-Nov-2015.
- Soares T, Xavier M, Pigozzo A, Campos R, Santos R and Lobosco M Performance Evaluation of a Human Immune System Simulator on a GPU Cluster Proceedings of the 13th International Conference on Parallel Computing Technologies - Volume 9251, (458-468)
- Hernández M, Imbernón B, Navarro J, García J, Cebrián J and Cecilia J (2015). Evaluation of the 3-D finite difference implementation of the acoustic diffusion equation model on massively parallel architectures, Computers and Electrical Engineering, 46:C, (190-201), Online publication date: 1-Aug-2015.
- Bailey M Fundamentals seminar ACM SIGGRAPH 2015 Courses, (1-129)
- Millo J, Kofman E and Simone R (2015). Modeling and Analyzing Dataflow Applications on NoC-Based Many-Core Architectures, ACM Transactions on Embedded Computing Systems, 14:3, (1-25), Online publication date: 21-May-2015.
- Arbelaitz O, Martin J and Muguerza J (2015). Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course, IEEE Transactions on Education, 58:2, (110-116), Online publication date: 1-May-2015.
- Rodrigues A, Jorge A and Dutra I Accelerating recommender systems using GPUs Proceedings of the 30th Annual ACM Symposium on Applied Computing, (879-884)
- Szałkowski D and Stpiczyński P (2015). Using distributed memory parallel computers and GPU clusters for multidimensional Monte Carlo integration, Concurrency and Computation: Practice & Experience, 27:4, (923-936), Online publication date: 25-Mar-2015.
- Beard J, Li P and Chamberlain R RaftLib Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, (96-105)
- Shafi A, Akhtar A, Javed A and Carpenter B Teaching parallel programming using Java Proceedings of the Workshop on Education for High-Performance Computing, (56-63)
- Kaiser H, Heller T, Adelstein-Lelbach B, Serio A and Fey D HPX Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, (1-11)
- Fanfarillo A, Burnus T, Cardellini V, Filippone S, Nagle D and Rouson D OpenCoarrays Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, (1-11)
- Mansouri F, Huet S and Houzet D A Visual Programming Model to Implement Coarse-Grained DSP Applications on Parallel and Heterogeneous Clusters Revised Selected Papers, Part I, of the Euro-Par 2014 International Workshops on Parallel Processing - Volume 8805, (141-152)
- Sujeeth A, Gibbons A, Brown K, Lee H, Rompf T, Odersky M and Olukotun K (2013). Forge, ACM SIGPLAN Notices, 49:3, (145-154), Online publication date: 5-Mar-2014.
- Hawick K and Playne D Developmental directions in parallel accelerators Proceedings of the Twelfth Australasian Symposium on Parallel and Distributed Computing - Volume 152, (21-27)
- Zhao J, Lublinerman R, Budimlić Z, Chaudhuri S and Sarkar V (2013). Isolation for nested task parallelism, ACM SIGPLAN Notices, 48:10, (571-588), Online publication date: 12-Nov-2013.
- Zhao J, Lublinerman R, Budimlić Z, Chaudhuri S and Sarkar V Isolation for nested task parallelism Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications, (571-588)
- Sujeeth A, Gibbons A, Brown K, Lee H, Rompf T, Odersky M and Olukotun K Forge Proceedings of the 12th international conference on Generative programming: concepts & experiences, (145-154)
- Aljabri M, Loidl H and Trinder P The Design and Implementation of GUMSMP Proceedings of the 25th symposium on Implementation and Application of Functional Languages, (37-48)
- Ghoting A, Gunnels J, Kambadur P, Pednault E and Squillante M (2013). Trends and outlook for the massive-scale analytics stack, IBM Journal of Research and Development, 57:3-4, (2-2), Online publication date: 1-May-2013.
- Capuzzo-Dolcetta R, Spera M and Punzo D (2013). A fully parallel, high precision, N-body code running on hybrid computing platforms, Journal of Computational Physics, 236, (580-593), Online publication date: 1-Mar-2013.
- Zhuravlev S, Saez J, Blagodurov S, Fedorova A and Prieto M (2012). Survey of scheduling techniques for addressing shared resources in multicore processors, ACM Computing Surveys, 45:1, (1-28), Online publication date: 1-Nov-2012.
- Satish N, Kim C, Chhugani J, Saito H, Krishnaiyer R, Smelyanskiy M, Girkar M and Dubey P (2012). Can traditional programming bridge the Ninja performance gap for parallel computing applications?, ACM SIGARCH Computer Architecture News, 40:3, (440-451), Online publication date: 5-Sep-2012.
- López-Espín J, Vidal A and Giménez D (2012). Two-stage least squares and indirect least squares algorithms for simultaneous equations models, Journal of Computational and Applied Mathematics, 236:15, (3676-3684), Online publication date: 1-Sep-2012.
- Misztal M, Erleben K, Bargteil A, Fursund J, Christensen B, Bærentzen J and Bridson R Multiphase flow of immiscible fluids on unstructured moving meshes Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, (97-106)
- Misztal M, Erleben K, Bargteil A, Fursund J, Christensen B, Bærentzen J and Bridson R Multiphase flow of immiscible fluids on unstructured moving meshes Proceedings of the 11th ACM SIGGRAPH / Eurographics conference on Computer Animation, (97-106)
- Boudeville O, Cesarini F, Chechina N, Lundin K, Papaspyrou N, Sagonas K, Thompson S, Trinder P and Wiger U RELEASE Proceedings of the 2012 Conference on Trends in Functional Programming - Volume 7829, (263-278)
- Satish N, Kim C, Chhugani J, Saito H, Krishnaiyer R, Smelyanskiy M, Girkar M and Dubey P Can traditional programming bridge the Ninja performance gap for parallel computing applications? Proceedings of the 39th Annual International Symposium on Computer Architecture, (440-451)
- Jaros J and Pospichal P A fair comparison of modern CPUs and GPUs running the genetic algorithm under the knapsack benchmark Proceedings of the 2012 European conference on Applications of Evolutionary Computation, (426-435)
- Bailey M and Cunningham S Introduction to computer graphics SIGGRAPH Asia 2011 Courses, (1-58)
- Huerta Yero E and Lucchese F Practical experiences on the gridification of financial applications Proceedings of the fourth workshop on High performance computational finance, (39-46)
- Lublinerman R, Zhao J, Budimlić Z, Chaudhuri S and Sarkar V Delegated isolation Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications, (885-902)
- Lublinerman R, Zhao J, Budimlić Z, Chaudhuri S and Sarkar V (2011). Delegated isolation, ACM SIGPLAN Notices, 46:10, (885-902), Online publication date: 18-Oct-2011.
- Lins R, de F. Pereira e Silva G and de A. Formiga A HistDoc v. 2.0 Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, (169-176)
- Burak D and Chudzik M Parallelization of the discrete chaotic block encryption algorithm Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II, (323-332)
- Tibbits M, Haran M and Liechty J (2011). Parallel multivariate slice sampling, Statistics and Computing, 21:3, (415-430), Online publication date: 1-Jul-2011.
- Viry P Parallel and distributed programming extensions for mainstream languages based on pi-calculus Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing, (343-344)
- Gray I and Audsley N (2011). Targeting complex embedded architectures by combining the multicore communications API (mcapi) with compile-time virtualisation, ACM SIGPLAN Notices, 46:5, (51-60), Online publication date: 11-Apr-2011.
- Gray I and Audsley N Targeting complex embedded architectures by combining the multicore communications API (mcapi) with compile-time virtualisation Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems, (51-60)
- Chan M and Yang L Comparative analysis of OpenMP and MPI on multi-core architecture Proceedings of the 44th Annual Simulation Symposium, (18-25)
- Hobor A and Gherghina C Barriers in concurrent separation logic Proceedings of the 20th European conference on Programming languages and systems: part of the joint European conferences on theory and practice of software, (276-296)
- Zaykov P and Kuzmanov G Architectural support for multithreading on reconfigurable hardware Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications, (363-374)
- Lagar-Cavilla H, Whitney J, Bryant R, Patchin P, Brudno M, de Lara E, Rumble S, Satyanarayanan M and Scannell A (2011). SnowFlock, ACM Transactions on Computer Systems, 29:1, (1-45), Online publication date: 1-Feb-2011.
- Bailey M and Cunningham S Introduction to computer graphics ACM SIGGRAPH ASIA 2010 Courses, (1-100)
- Billingsley M, Tibbitts B and George A Improving UPC productivity via integrated development tools Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, (1-9)
- Mateos C, Zunino A and Campo M (2010). An approach for non-intrusively adding malleable fork/join parallelism into ordinary JavaBean compliant applications, Computer Languages, Systems and Structures, 36:3, (288-315), Online publication date: 1-Oct-2010.
- Pigozzo A, Lobosco M and Dos Santos R Parallel implementation of a computational model of the human immune system Proceedings of the 2010 conference on Parallel processing, (217-224)
- Mäkelä J and Leppänen V Towards programming on the moving threads architecture Proceedings of the 11th International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing on International Conference on Computer Systems and Technologies, (137-142)
- Iosifidis Y, Mallik A, Mamagkakis S, De Greef E, Bartzas A, Soudris D and Catthoor F A framework for automatic parallelization, static and dynamic memory optimization in MPSoC platforms Proceedings of the 47th Design Automation Conference, (549-554)
- Pan H, Hindman B and Asanović K (2010). Composing parallel software efficiently with lithe, ACM SIGPLAN Notices, 45:6, (376-387), Online publication date: 12-Jun-2010.
- Pan H, Hindman B and Asanović K Composing parallel software efficiently with lithe Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation, (376-387)
- Ben Asher Y, Giver D, Haber G and Kulish G HparC Proceedings of the 3rd Annual Haifa Experimental Systems Conference, (1-13)
- Howison M, Bethel E and Childs H MPI-hybrid parallelism for volume rendering on large, multi-core systems Proceedings of the 10th Eurographics conference on Parallel Graphics and Visualization, (1-10)
- Riedel M, Wolf F, Kranzlmüller D, Streit A and Lippert T (2009). Research advances by using interoperable e-science infrastructures, Cluster Computing, 12:4, (357-372), Online publication date: 1-Dec-2009.
- Hulette G, Sottile M, Armstrong R and Allan B OnRamp Proceedings of the 2009 Workshop on Component-Based High Performance Computing, (1-10)
- De Sutter B, Verkest D, Brockmeyer E, Delfosse E, Vandecappelle A and Mignolet J (2009). Design and Tool Flow of Multimedia MPSoC Platforms, Journal of Signal Processing Systems, 57:2, (229-247), Online publication date: 1-Nov-2009.
- Gray I and Audsley N Exposing non-standard architectures to embedded software using compile-time virtualisation Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems, (147-156)
- Wolffe G and Trefftz C (2009). Teaching parallel computing, Journal of Computing Sciences in Colleges, 25:1, (21-28), Online publication date: 1-Oct-2009.
- Stpiczyński P A parallel non-square tiled algorithm for solving a kind of BVP for second-order ODEs Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I, (87-94)
- Lavrentiev-Jr M, Romanenko A, Titov V and Vazhenin A High-Performance Tsunami Wave Propagation Modeling Proceedings of the 10th International Conference on Parallel Computing Technologies, (423-434)
- Heuveline V, Krause M and Latt J (2009). Towards a hybrid parallelization of lattice Boltzmann methods, Computers & Mathematics with Applications, 58:5, (1071-1080), Online publication date: 1-Sep-2009.
- Best M, Fedorova A, Dickie R, Tagliasacchi A, Couture-Beil A, Mustard C, Mottishaw S, Brown A, Huang Z, Xu X, Ghazali N and Brownsword A Searching for Concurrent Design Patterns in Video Games Proceedings of the 15th International Euro-Par Conference on Parallel Processing, (912-923)
- Michel L, See A and Van Hentenryck P (2009). Transparent Parallelization of Constraint Programming, INFORMS Journal on Computing, 21:3, (363-382), Online publication date: 1-Jul-2009.
- Spallaccini P, Iovine F and Italiano G An automatized methodology design for real-time signal processing applications in multiple multi-core platforms Proceedings of the 2009 IEEE international conference on Multimedia and Expo, (1825-1828)
- Sarkar A, Mueller F, Ramaprasad H and Mohan S (2009). Push-assisted migration of real-time tasks in multi-core processors, ACM SIGPLAN Notices, 44:7, (80-89), Online publication date: 28-Jun-2009.
- Sarkar A, Mueller F, Ramaprasad H and Mohan S Push-assisted migration of real-time tasks in multi-core processors Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, (80-89)
- Baert R, Brockmeyer E, Wuytack S and Ashby T Exploring parallelizations of applications for MPSoC platforms using MPA Proceedings of the Conference on Design, Automation and Test in Europe, (1148-1153)
- Patchin P, Lagar-Cavilla H, de Lara E and Brudno M Adding the easy button to the cloud with SnowFlock and MPI Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing, (1-8)
- Bücker H, Rasch A, Rath V and Wolf A Semi-automatic parallelization of direct and inverse problems for geothermal simulation Proceedings of the 2009 ACM symposium on Applied Computing, (971-975)
- Riedel M, Frings W, Habbinga S, Eickermann T, Mallmann D, Streit A, Wolf F, Lippert T, Ernst A and Spurzem R Extending the collaborative online visualization and steering framework for computational Grids with attribute-based authorization Proceedings of the 2008 9th IEEE/ACM International Conference on Grid Computing, (104-111)
- Seiler L, Carmean D, Sprangle E, Forsyth T, Abrash M, Dubey P, Junkins S, Lake A, Sugerman J, Cavin R, Espasa R, Grochowski E, Juan T and Hanrahan P Larrabee ACM SIGGRAPH 2008 papers, (1-15)
- Seiler L, Carmean D, Sprangle E, Forsyth T, Abrash M, Dubey P, Junkins S, Lake A, Sugerman J, Cavin R, Espasa R, Grochowski E, Juan T and Hanrahan P (2008). Larrabee, ACM Transactions on Graphics, 27:3, (1-15), Online publication date: 1-Aug-2008.
- Brodman J, Fraguela B, Garzarán M and Padua D Design Issues in Parallel Array Languages for Shared Memory Proceedings of the 8th international workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation, (208-217)
- Pankratius V, Schaefer C, Jannesari A and Tichy W Software engineering for multicore systems Proceedings of the 1st international workshop on Multicore software engineering, (53-60)
- Collette S, Cucu L and Goossens J (2008). Integrating job parallelism in real-time scheduling theory, Information Processing Letters, 106:5, (180-187), Online publication date: 1-May-2008.
- Mattos G, Lins R, de Araújo Formiga A and Junqueira Martins F BigBatch Proceedings of the 2008 ACM symposium on Applied computing, (434-441)
- Guo J, Bikshandi G, Fraguela B, Garzaran M and Padua D Programming with tiles Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, (111-122)
- Moura P, Crocker P and Nunes P High-level multi-threading programming in logtalk Proceedings of the 10th international conference on Practical aspects of declarative languages, (265-281)
- Trefftz C, Tao Y and Jorgensen P The numerical risks of reduction operations in OpenMP Proceedings of the 19th IASTED International Conference on Parallel and Distributed Computing and Systems, (200-203)
- Bodensteiner C, Darolti C, Schumacher H, Matthäus L and Schweikard A Motion and positional error correction for cone beam 3D-reconstruction with mobile C-arms Proceedings of the 10th international conference on Medical image computing and computer-assisted intervention - Volume Part I, (177-185)
- Brown R and Sharapov I (2007). High-scalability parallelization of a molecular modeling application, International Journal of Parallel Programming, 35:5, (441-458), Online publication date: 1-Oct-2007.
- Zumbusch G A container-iterator parallel programming model Proceedings of the 7th international conference on Parallel processing and applied mathematics, (1130-1139)
- Stpiczyński P Evaluating linear recursive filters using novel data formats for dense matrices Proceedings of the 7th international conference on Parallel processing and applied mathematics, (688-697)
- Moskovsky A, Roganov V, Abramov S and Kuznetsov A Variable reassignment in the T++ parallel programming language Proceedings of the 9th international conference on Parallel Computing Technologies, (579-588)
- Moskovsky A, Roganov V and Abramov S Parallelism granules aggregation with the T-system Proceedings of the 9th international conference on Parallel Computing Technologies, (293-302)
- Chamberlain B, Callahan D and Zima H (2007). Parallel Programmability and the Chapel Language, International Journal of High Performance Computing Applications, 21:3, (291-312), Online publication date: 1-Aug-2007.
- Ipek E, Kirman M, Kirman N and Martinez J (2007). Core fusion, ACM SIGARCH Computer Architecture News, 35:2, (186-197), Online publication date: 9-Jun-2007.
- Ipek E, Kirman M, Kirman N and Martinez J Core fusion Proceedings of the 34th annual international symposium on Computer architecture, (186-197)
- Chandraiah P and Doemer R Designer-controlled generation of parallel and flexible heterogeneous MPSoC specification Proceedings of the 44th annual Design Automation Conference, (787-790)
- Kaewkasi C and Gurd J A distributed dynamic aspect machine for scientific software development Proceedings of the 1st workshop on Virtual machines and intermediate languages for emerging modularization mechanisms, (3-es)
- Antony J, Janes P and Rendell A Exploring thread and memory placement on NUMA architectures Proceedings of the 13th international conference on High Performance Computing, (338-352)
- James T, Barkhi R and Johnson J (2006). Platform impact on performance of parallel genetic algorithms, Engineering Applications of Artificial Intelligence, 19:8, (843-856), Online publication date: 1-Dec-2006.
- Franchetti F, Voronenko Y and Püschel M FFT program generation for shared memory Proceedings of the 2006 ACM/IEEE conference on Supercomputing, (115-es)
- Gerndt A, Sarholz S, Wolter M, Mey D, Bischof C and Kuhlen T Nested OpenMP for efficient computation of 3D critical points in multi-block CFD datasets Proceedings of the 2006 ACM/IEEE conference on Supercomputing, (93-es)
- Mueller C and Lumsdaine A Expression and loop libraries for high-performance code synthesis Proceedings of the 19th international conference on Languages and compilers for parallel computing, (80-95)
- Flores-Becerra G, Garcia V and Vidal A Efficient parallel algorithm for constructing a unit triangular matrix with prescribed singular values Proceedings of the 7th international conference on High performance computing for computational science, (349-362)
- Sharapov I, Kroeger R, Delamarter G, Cheveresan R and Ramsay M A case study in top-down performance estimation for a large-scale parallel application Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (81-89)
- Chalabine M and Kessler C Parallelisation of sequential programs by invasive composition and aspect weaving Proceedings of the 6th international conference on Advanced Parallel Processing Technologies, (131-140)
- Stpiczyński P A note on the numerical inversion of the Laplace transform Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics, (551-558)
- Jung C, Lim D, Lee J and Han S Adaptive execution techniques for SMT multiprocessor architectures Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, (236-246)
- Brown R and Sharapov I Performance and programmability comparison between OpenMP and MPI implementations of a molecular modeling application Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming, (349-360)
- Süß M and Leopold C Common mistakes in OpenMP and how to avoid them Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming, (312-323)
- Cox P, Gauvin S and Rau-Chaplin A Adding parallelism to visual data flow programs Proceedings of the 2005 ACM symposium on Software visualization, (135-144)
- Loogen R, Ortega-Mallén Y and Peña-Marí R (2005). Parallel functional programming in Eden, Journal of Functional Programming, 15:3, (431-475), Online publication date: 1-May-2005.
- Basharahil R, Wims B, Xu C and Fu S (2005). Distributed Shared Arrays, The Journal of Supercomputing, 31:2, (161-184), Online publication date: 1-Feb-2005.
- Ino F, Ooyama K and Hagihara K (2005). A data distributed parallel algorithm for nonrigid image registration, Parallel Computing, 31:1, (19-43), Online publication date: 1-Jan-2005.
- Zaldívar F, Maciá A and Salvador A A parallel algorithm based on a variant of the Kalman filter for solving the RLS problem Proceedings of the 4th WSEAS International Conference on Signal Processing, Computational Geometry & Artificial Vision, (1-6)
- Zhang G, Unnikrishnan P and Ren J Experiments with auto-parallelizing SPEC2000FP benchmarks Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing, (348-362)
- Cunha M, Telles J and Coutinho A Parallel boundary elements Proceedings of the 6th international conference on High Performance Computing for Computational Science, (514-526)
- Martín M, Parada M and Doallo R (2004). High Performance Air Pollution Simulation Using OpenMP, The Journal of Supercomputing, 28:3, (311-321), Online publication date: 1-Jun-2004.
- Zhang G, Silvera R and Archambault R Structure and algorithm for implementing OpenMP workshares Proceedings of the 5th international conference on OpenMP Applications and Tools: shared Memory Parallel Programming with OpenMP, (110-120)
- Harbulot B and Gurd J Using AspectJ to separate concerns in parallel scientific Java code Proceedings of the 3rd international conference on Aspect-oriented software development, (122-131)
- Bücker H, Rasch A and Wolf A A class of OpenMP applications involving nested parallelism Proceedings of the 2004 ACM symposium on Applied computing, (220-224)
- Martín M, Singh D, Mouriño J, Rivera F, Doallo R and Bruguera J (2003). High performance air pollution modeling for a power plant environment, Parallel Computing, 29:11-12, (1763-1790), Online publication date: 1-Nov-2003.
- Liu F and Chaudhary V A practical OpenMP compiler for system on chips Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming, (54-68)
- Standish R, Chee C and Smeds N OpenMP in the field Proceedings of the 2003 international conference on Computational science, (637-647)
- Schirski M, Gerndt A, van Reimersdahl T, Kuhlen T, Adomeit P, Lang O, Pischinger S and Bischof C ViSTA FlowLib - framework for interactive visualization and exploration of unsteady flows in virtual environments Proceedings of the workshop on Virtual environments 2003, (77-85)
- Eberl H Simulation of chemical reaction fronts in anaerobic digestion of solid waste Proceedings of the 2003 international conference on Computational science and its applications: PartI, (503-512)
- Dongarra J, Foster I, Fox G, Gropp W, Kennedy K, Torczon L and White A References Sourcebook of parallel computing, (729-789)
- Vetter J and Yoo A An empirical performance evaluation of scalable scientific applications Proceedings of the 2002 ACM/IEEE conference on Supercomputing, (1-18)
- Lee J and Moonesinghe H Adaptively increasing performance and scalability of automatically parallelized programs Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing, (203-217)
- Momtchev M and Marquet P An Asymmetric Real-Time Scheduling for Linux Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Chen K and Johnson J A Prototypical Self-Optimizing Package for Parallel Implementation of Fast Signal Transforms Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Eigenmann R, Hoeflinger J, Kuhn R, Padua D, Basumallik A, Min S and Zhu J Is OpenMP for Grids? Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Bücker H, Lang B, an Mey D and Bischof C Bringing together automatic differentiation and OpenMP Proceedings of the 15th international conference on Supercomputing, (246-251)
Index Terms
- Parallel programming in OpenMP
Recommendations
Hybrid parallel programming on SMP clusters using XPFortran and OpenMP
IWOMP'10: Proceedings of the 6th international conference on Beyond Loop Level Parallelism in OpenMP: accelerators, Tasking and more
Process-thread hybrid programming paradigm is commonly employed in SMP clusters. XPFortran, a parallel programming language that specifies a set of compiler directives and library routines, can be used to realize process-level parallelism in distributed ...
OpenMP for Networks of SMPs
In this paper, we present the first system that implements OpenMP on a network of shared-memory multiprocessors. This system enables the programmer to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between ...
Teaching Parallel Computing Concepts with OpenMP (Abstract Only)
SIGCSE '16: Proceedings of the 47th ACM Technical Symposium on Computing Science Education
OpenMP is an industry-standard, platform-independent parallel programming library built into all modern C and C++ compilers. Unlike complex parallel platforms, OpenMP is designed to make it relatively easy to add parallelism to existing sequential ...