research-article

Building Efficient Query Engines in a High-Level Language

Authors:
Amir Shaikhha, EPFL, Lausanne, Switzerland
Yannis Klonatos, EPFL, Lausanne, Switzerland
Christoph Koch, EPFL, Lausanne, Switzerland

ACM Transactions on Database Systems, Volume 43, Issue 1, Article No. 4, pp. 1–45. https://doi.org/10.1145/3183653

Published: 11 April 2018

Abstract

Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend.

In this article, we realize this vision in the domain of analytical query processing. We present LegoBase, a query engine written in the high-level programming language Scala. The key technique for regaining efficiency is generative programming: LegoBase performs source-to-source compilation and optimizes database systems code by converting the high-level Scala code to specialized, low-level C code. We show how generative programming makes it easy to implement a wide spectrum of optimizations, such as introducing data partitioning or switching from a row to a column data layout, which are difficult to achieve with existing low-level query compilers that handle only queries. We demonstrate that sufficiently powerful abstractions are essential for dealing with the complexity of the optimization effort, shielding developers from compiler internals and decoupling individual optimizations from each other.

We evaluate our approach with the TPC-H benchmark and show that (a) with all optimizations enabled, our architecture significantly outperforms a commercial in-memory database as well as an existing query compiler; (b) programmers need to provide just a few hundred lines of high-level code to implement the optimizations, instead of the complicated low-level code required by existing query compilation approaches; (c) these optimizations may come at the cost of using more system memory for improved performance; and (d) the compilation overhead is low compared to the overall execution time, thus making our approach usable in practice for compiling query engines.
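
To make the generative-programming idea in the abstract more concrete, the following is a minimal sketch and not LegoBase's actual code. It is written in Python for brevity, whereas LegoBase itself is written in Scala. A high-level description of one query is turned into a string of specialized C code, and the same description can be compiled for either a row or a column data layout. The names `SumWhere`, `generate_c`, `run_query`, `struct row`, and the column names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumWhere:
    """Hypothetical query: SELECT SUM(agg_col) FROM t WHERE pred_col < bound."""
    pred_col: str
    bound: int
    agg_col: str


def generate_c(query: SumWhere, layout: str = "row") -> str:
    """Return C source for a loop specialized to one query and one data layout."""
    if layout == "row":
        # Row layout: one array of structs. `struct row` is assumed to be
        # declared elsewhere with one long field per column.
        params = "const struct row *rows, long n"
        pred = f"rows[i].{query.pred_col}"
        agg = f"rows[i].{query.agg_col}"
    elif layout == "column":
        # Column layout: one array per column that the query touches.
        cols = dict.fromkeys([query.pred_col, query.agg_col])  # dedupe, keep order
        params = ", ".join(f"const long *{c}" for c in cols) + ", long n"
        pred = f"{query.pred_col}[i]"
        agg = f"{query.agg_col}[i]"
    else:
        raise ValueError(f"unknown layout: {layout}")
    # The comparison bound is emitted as a literal constant, so the
    # generated loop does not look it up at run time.
    return (
        f"long run_query({params}) {{\n"
        "    long sum = 0;\n"
        "    for (long i = 0; i < n; i++)\n"
        f"        if ({pred} < {query.bound}) sum += {agg};\n"
        "    return sum;\n"
        "}\n"
    )


query = SumWhere(pred_col="quantity", bound=24, agg_col="price")
row_code = generate_c(query, "row")
column_code = generate_c(query, "column")
```

In this sketch, switching the data layout changes only an argument to the generator; the query description itself stays untouched. That separation is the kind of decoupling the abstract attributes to high-level abstractions, although a real system would operate on a typed intermediate representation rather than on strings.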

Supplemental Material

shaikhha.zip (358 KB): supplemental movie, appendix, image, and software files for "Building Efficient Query Engines in a High-Level Language"

Published in
ACM Transactions on Database Systems, Volume 43, Issue 1
Best of SIGMOD 2016 Papers and Regular Papers
March 2018
227 pages
ISSN: 0362-5915
EISSN: 1557-4644
DOI: 10.1145/3194314
Editor: Christian S. Jensen, Aalborg University, Denmark
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 11 April 2018
- Revised: 1 December 2017
- Accepted: 1 December 2017
- Received: 1 November 2016

Author Tags
High-level programming languages
abstraction without regret
code generation
optimizing compilers
query compilation
query processing
Qualifiers
- research-article
- Research
- Refereed