This GASNet specification describes a network-independent and language-independent high-performance communication interface intended for use in implementing the runtime system for global address space languages (such as UPC or Titanium).