research-article

Building Efficient Query Engines in a High-Level Language

Published: 11 April 2018

Abstract

Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend.

In this article, we realize this vision in the domain of analytical query processing. We present LegoBase, a query engine written in the high-level programming language Scala. The key technique for regaining efficiency is generative programming: LegoBase performs source-to-source compilation and optimizes database systems code by converting the high-level Scala code to specialized, low-level C code. We show how generative programming makes it easy to implement a wide spectrum of optimizations, such as introducing data partitioning or switching from a row to a column data layout, which are difficult to achieve with existing low-level query compilers that handle only queries. We demonstrate that sufficiently powerful abstractions are essential for dealing with the complexity of the optimization effort, shielding developers from compiler internals and decoupling individual optimizations from each other.
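As a rough illustration of the generative idea described above, the following toy sketch emits specialized source code for one fixed query and lets the data layout (row or column) be chosen at generation time. It is written in Python and generates Python rather than C, and every name in it (`generate_sum_query`, `compile_query`, the sample tables) is hypothetical; it is not LegoBase's Scala-to-C pipeline, only a minimal sketch of why a layout switch becomes a small change inside a code generator.

```python
# Toy illustration of generative programming for a query engine.
# All names are hypothetical; LegoBase itself compiles Scala to C.

def generate_sum_query(column, threshold, layout):
    """Emit Python source for: SELECT SUM(column) WHERE column > threshold."""
    if layout == "row":
        # Row layout: a list of dicts, one dict per record.
        access, loop = f"rec[{column!r}]", "for rec in table:"
    elif layout == "column":
        # Column layout: a dict mapping each column name to a list of values.
        access, loop = "v", f"for v in table[{column!r}]:"
    else:
        raise ValueError(f"unknown layout: {layout}")
    return "\n".join([
        "def query(table):",
        "    total = 0",
        f"    {loop}",
        f"        if {access} > {threshold!r}:",
        f"            total += {access}",
        "    return total",
    ])

def compile_query(source):
    """Turn the generated source into a callable function."""
    namespace = {}
    exec(source, namespace)
    return namespace["query"]

# The same three records, stored in the two layouts.
rows = [{"price": 5, "qty": 1}, {"price": 20, "qty": 2}, {"price": 30, "qty": 3}]
columns = {"price": [5, 20, 30], "qty": [1, 2, 3]}

row_query = compile_query(generate_sum_query("price", 10, "row"))
col_query = compile_query(generate_sum_query("price", 10, "column"))

print(row_query(rows), col_query(columns))  # prints "50 50"
```

In this sketch the column name and the threshold are baked into the generated function as constants, so the generic query description is no longer interpreted at run time, and the row and column variants differ only in the two strings chosen by the generator.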

We evaluate our approach with the TPC-H benchmark and show that (a) with all optimizations enabled, our architecture significantly outperforms a commercial in-memory database as well as an existing query compiler; (b) programmers need to provide just a few hundred lines of high-level code to implement the optimizations, instead of the complicated low-level code required by existing query compilation approaches; (c) these optimizations may come at the cost of using more system memory in exchange for improved performance; and (d) the compilation overhead is low compared to the overall execution time, making our approach usable in practice for compiling query engines.

Published in

ACM Transactions on Database Systems, Volume 43, Issue 1
Best of SIGMOD 2016 Papers and Regular Papers
March 2018
227 pages
ISSN: 0362-5915
EISSN: 1557-4644
DOI: 10.1145/3194314

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

• Published: 11 April 2018
• Revised: 1 December 2017
• Accepted: 1 December 2017
• Received: 1 November 2016

Qualifiers

• research-article
• Research
• Refereed
