DOI: 10.1145/1094811.1094852

X10: an object-oriented approach to non-uniform cluster computing

Published: 12 October 2005

ABSTRACT

It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during the last two decades. As a result, future systems are rapidly moving from uniprocessor to multiprocessor configurations, so as to use parallelism instead of frequency scaling as the foundation for increased compute capacity. The dominant emerging multiprocessor structure for the future is a Non-Uniform Cluster Computing (NUCC) system with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in horizontally scalable cluster configurations such as blade servers. Unlike previous generations of hardware evolution, this shift will have a major impact on existing software. Current OO language facilities for concurrent and distributed programming are inadequate for addressing the needs of NUCC systems because they do not support the notions of non-uniform data access within a node, or of tight coupling of distributed nodes. We have designed a modern object-oriented programming language, X10, for high performance, high productivity programming of NUCC systems. A member of the partitioned global address space family of languages, X10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; a construct for termination detection (finish); the use of lock-free synchronization (atomic blocks); and the manipulation of cluster-wide global data structures. We present an overview of the X10 programming model and language, experience with our reference implementation, and results from some initial productivity comparisons between the X10 and Java™ languages.
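
For orientation, here is a minimal sketch of the constructs named above (places, async, foreach, ateach, finish, atomic), written in the early (circa 2005) X10 syntax. The region and distribution factory calls, the array initializer, and the class name are illustrative approximations rather than code taken from the paper, and may not match later revisions of the language.

    // Approximate early-X10 sketch; factory calls and initializer syntax are assumptions.
    public class PlacesSketch {
        public static void main(String[] args) {
            final region R = [0:99];                  // a one-dimensional index space
            final dist D = dist.factory.block(R);     // block-distribute R over all places
            final int[.] a = new int[D] (point [i]) { return i; };  // distributed array

            finish {                                  // wait for every activity spawned below
                ateach (point [i] : D) {              // one lightweight activity per element,
                    atomic { a[i] += 1; }             //   running at the place that owns a[i]
                }
                async (here) {                        // one extra activity at the current place
                    foreach (point [j] : [0:9]) {     // place-local parallel loop
                        // place-local work would go here; futures (future/force) are omitted
                    }
                }
            }
        }
    }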

Reviews

Henk Sips

The authors present X10, an object-oriented language designed and implemented to address the requirements of non-uniform cluster computing (NUCC) platforms: tile-based architectures with multicore symmetric multiprocessing (SMP) tiles (nodes) and non-uniform memory hierarchies. X10 increases NUCC programming productivity by facilitating the design, implementation, and deployment of safe, analyzable, scalable, and flexible parallel high-performance computing (HPC) applications. X10 is based on a novel combination of design decisions: a new programming model, reflected in a Java-based programming language, using a partitioned global address space (PGAS) model with explicit locality; specific concurrency constructs for task parallelism; and dedicated array constructs for data parallelism.

The prototype implementation is entirely based on Java: X10 is translated to Java code and executed either on a single unmodified Java Virtual Machine (JVM) (current version) or in a multi-JVM environment (under development). The productivity analysis compares parallelizations of eight benchmarks from the Java Grande Benchmark Suite in both X10 and Java; for the code size metrics considered, X10 gives better results. Benchmark performance is not analyzed, probably because the prototype implementation (the full chain is due in 2010) is currently aimed at evaluating the language design rather than performance.

The main contribution of the paper is the presentation of X10 as a distinctive option among HPC languages, owing to its attempt at a coherent design (not building on legacy code), its object-oriented approach, and its novel constructs for locality, synchronization, and data and task parallelism. Despite the numerous technical details, which may hinder readability, the authors succeed in showing that X10 is a productive solution for programming NUCC platforms. Of course, they still have to show that productivity and performance can go together, because that will ultimately determine whether new paradigms and languages are accepted by the HPC community.
