Portable parallel performance from sequential, productive, embedded domain-specific languages (PPoPP '12)

ABSTRACT
Domain-expert productivity programmers desire scalable application performance, but usually must rely on efficiency programmers, experts in explicit parallel programming, to achieve it. Since such programmers are rare, to maximize reuse of their work we propose encapsulating their strategies in mini-compilers for domain-specific embedded languages (DSELs) glued together by a common high-level host language familiar to productivity programmers. Nontrivial applications built on these DSELs achieve up to 98% of attainable peak performance, comparable to or better than existing hand-coded implementations. Our approach is unique in that each mini-compiler not only performs conventional compiler transformations and optimizations, but also includes imperative procedural code that captures an efficiency expert's strategy for mapping a narrow domain onto a specific type of hardware. The result is source- and performance-portability for productivity programmers, with parallel performance that rivals hand-coded efficiency-language implementations of the same applications. We describe a framework that supports our methodology and five implemented DSELs supporting common computation kernels.
Our results demonstrate that for several interesting classes of problems, efficiency-level parallel performance can be achieved by packaging efficiency programmers' expertise in a reusable framework that is easy to use for both productivity programmers and efficiency programmers.
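The abstract's core mechanism can be sketched concretely. Below is a minimal, hypothetical illustration (all names invented, not from the paper) of a "mini-compiler" for a tiny elementwise-map DSEL embedded in Python: the productivity programmer writes a plain sequential kernel, and the mini-compiler generates and caches a specialized variant at call time. A real mini-compiler of the kind the paper describes would instead emit and compile efficiency-language code (e.g., C or CUDA) chosen by an expert's mapping strategy; here the "generated code" is just fused Python, to keep the sketch self-contained.

```python
# Hypothetical sketch of the DSEL mini-compiler idea (names invented
# for illustration; this is NOT the paper's actual framework or API).

class MapSpecializer:
    """Mini-compiler for a toy elementwise 'map' DSEL.

    Holds productivity-level kernel source, generates a specialized
    function the first time it is invoked, and reuses it thereafter.
    """

    def __init__(self, kernel_src, kernel_name):
        self.kernel_src = kernel_src    # sequential kernel, plain Python
        self.kernel_name = kernel_name  # name of the kernel function
        self._compiled = None           # cache for the specialized code

    def _specialize(self):
        # "Code generation": wrap the kernel body in a fused loop.
        # A real mini-compiler would emit C/CUDA here and apply an
        # expert's hardware-mapping strategy before binding it back.
        src = (
            self.kernel_src + "\n"
            "def _specialized(data):\n"
            f"    return [{self.kernel_name}(x) for x in data]\n"
        )
        ns = {}
        exec(src, ns)  # compile the generated source into a namespace
        return ns["_specialized"]

    def __call__(self, data):
        if self._compiled is None:
            self._compiled = self._specialize()  # specialize once
        return self._compiled(data)


# Productivity-level code: a sequential kernel, no parallel constructs.
scale_shift = MapSpecializer("def kern(x):\n    return 2.0 * x + 1.0", "kern")
print(scale_shift([1.0, 2.0, 3.0]))  # [3.0, 5.0, 7.0]
```

The design point this illustrates is the separation of roles: the kernel source is all the productivity programmer writes, while everything inside `_specialize` is the reusable artifact contributed once by an efficiency expert.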