research-article

Plan Bouquets: A Fragrant Approach to Robust Query Processing

Authors:
Anshuman Dutt

Indian Institute of Science, Bangalore, India

Indian Institute of Science, Bangalore, India

0000-0002-9435-105X
View Profile

,
Jayant R. Haritsa

Indian Institute of Science, Bangalore, India

Indian Institute of Science, Bangalore, India
View Profile

Authors Info & Claims

ACM Transactions on Database Systems Volume 41 Issue 2Article No.: 11pp 1–37https://doi.org/10.1145/2901738

Published:11 May 2016Publication History

ACM Transactions on Database Systems

Abstract

Identifying efficient execution plans for declarative OLAP queries typically entails estimation of several predicate selectivities. In practice, these estimates often differ significantly from the values actually encountered during query execution, leading to poor plan choices and grossly inflated response times. We propose here a conceptually new approach to address this classical problem, wherein the compile-time estimation process is completely eschewed for error-prone selectivities. Instead, from the set of optimal plans in the query’s selectivity error space, a limited subset, called the “plan bouquet,” is selected such that at least one of the bouquet plans is 2-optimal at each location in the space. Then, at run time, a sequence of cost-budgeted executions from the plan bouquet is carried out, eventually finding a plan that executes to completion within its assigned budget. The duration and switching of these executions is controlled by a graded progression of isosurfaces projected onto the optimal performance profile. We prove that this construction results, for the first time, in guarantees on worst-case performance sub-optimality. Moreover, it ensures repeatable execution strategies across different invocations of a query.

We then present a suite of enhancements to the basic plan bouquet algorithm, including randomized variants, that result in significantly stronger performance guarantees. An efficient isosurface identification algorithm is also introduced to curtail the bouquet construction overheads.

The plan bouquet approach has been empirically evaluated on both PostgreSQL and a commercial DBMS, over the TPC-H and TPC-DS benchmark environments. Our experimental results indicate that it delivers substantial improvements in the worst-case behavior, without impairing the average-case performance, as compared to the native optimizers of these systems. Moreover, it can be implemented using existing optimizer infrastructure, making it relatively easy to incorporate in current database engines.

Overall, the plan bouquet approach provides novel performance guarantees that open up new possibilities for robust query processing.

Supplemental Material

Available for Download

zip

dutt.zip (229.5 KB)

Supplemental movie, appendix, image and software files for, Plan Bouquets: A Fragrant Approach to Robust Query Processing

References

Ashraf Aboulnaga and Surajit Chaudhuri. 1999. Self-tuning histograms: Building histograms without looking at data. In Proc. of the 1999 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’99). 181--192. Google ScholarDigital Library
Ron Avnur and Joseph M. Hellerstein. 2000. Eddies: Continuously adaptive query processing. In Proc. of the 2000 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’00). 261--272. Google ScholarDigital Library
Brian Babcock and Surajit Chaudhuri. 2005. Towards a robust query optimizer: A principled and practical approach. In Proc. of the 2005 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’05). 119--130. Google ScholarDigital Library
Shivnath Babu, Pedro Bizarro, and David DeWitt. 2005. Proactive re-optimization. In Proc. of the 2005 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’05). 107--118. Google ScholarDigital Library
Pedro Bizarro, Nicolas Bruno, and David J. DeWitt. 2009. Progressive parametric query optimization. IEEE Trans. Knowl. Data Eng. 21, 4 (2009), 582--594. Google ScholarDigital Library
Renata Borovica-Gajic, Stratos Idreos, Anastasia Ailamaki, Marcin Zukowski, and Campbell Fraser. 2015. Smooth scan: Statistics-oblivious access paths. In Proc. of the 31st IEEE Intl. Conf. on Data Engg. (ICDE’15). 315--326.Google ScholarCross Ref
Surajit Chaudhuri, Hongrae Lee, and Vivek R. Narasayya. 2010. Variance aware optimization of parameterized queries. In Proc. of the 2010 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’10). 531--542. Google ScholarDigital Library
Surajit Chaudhuri, Vivek Narasayya, and Ravi Ramamurthy. 2008. A pay-as-you-go framework for query execution feedback. In Proc. VLDB 1, 1 (2008), 1141--1152. Google ScholarDigital Library
Marek Chrobak, Claire Kenyon, John Noga, and Neal E. Young. 2008. Incremental medians via online bidding. Algorithmica 50, 4 (2008), 455--478.Google ScholarDigital Library
Francis Chu, Joseph Halpern, and Johannes Gehrke. 2002. Least expected cost query optimization: What can we expect? In Proc. of the 21st Symposium on Principles of Database Systems (PODS’02). 293--302. Google ScholarDigital Library
Richard L. Cole and Goetz Graefe. 1994. Optimization of dynamic query evaluation plans. In Proc. of the 1994 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’94). 150--160. Google ScholarDigital Library
Amol Deshpande, Zachary Ives, and Vijayshankar Raman. 2007. Adaptive query processing. Foundations and Trends in Databases 1, 1 (2007), 1--140. Google ScholarDigital Library
Anshuman Dutt and Jayant R. Haritsa. 2014a. Plan bouquets: Query processing without selectivity estimation. In Proc. of the 2014 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’14). 1039--1050. Google ScholarDigital Library
Anshuman Dutt and Jayant R. Haritsa. 2014b. Query Processing without Estimation. Technical Report TR-2014-01. Database Systems Lab, SERC/CSA, Indian Institute of Science. Retrieved from http://dsl.serc.iisc.ernet.in/publications/TR/TR-2014-01.pdf.Google Scholar
Fedor V. Fomin and Alexey A. Stepanov. 2007. Counting minimum weighted dominating sets. In Computing and Combinatorics. Lecture Notes in Computer Science, Vol. 4598. Springer, Berlin, 165--175. Google ScholarDigital Library
Goetz Graefe. 2012. New algorithms for join and grouping operations. Comput. Sci. 27, 1 (Feb. 2012), 3--27. Google ScholarDigital Library
Goetz Graefe, Wey Guy, Harumi A. Kuno, and Glenn N. Paulley. 2012. Robust query processing (Dagstuhl seminar 12321). Dagstuhl Rep. 2, 8 (2012), 1--15.Google Scholar
D. Harish, Pooja N. Darera, and Jayant R. Haritsa. 2007. On the production of anorexic plan diagrams. In Proc. of the 33rd Intl. Conf. on Very Large Data Bases (VLDB’07). 1081--1092. Google ScholarDigital Library
D. Harish, Pooja N. Darera, and Jayant R. Haritsa. 2008. Identifying robust plans through plan diagram reduction. In Proc. VLDB 1, 1 (2008), 1124--1140.Google ScholarDigital Library
IBM. 2003. Using a SELECTIVITY clause to influence the optimizer. Retrieved from www.ibm.com/developerworks/data/library/tips/dm-0312yip/.Google Scholar
Yannis E. Ioannidis and Stavros Christodoulakis. 1991. On the propagation of errors in the size of join results. In Proc. of the 1991 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’91). 268--277. Google ScholarDigital Library
Navin Kabra and David J. DeWitt. 1998. Efficient mid-query re-optimization of sub-optimal query execution plans. In Proc. of the 1998 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’98). 106--117. Google ScholarDigital Library
Donald E. Knuth. 1997. The Art of Computer Programming, Volume 2 (3rd ed.): Seminumerical Algorithms. Addison-Wesley Professional, Boston, MA, 145--146. Google ScholarDigital Library
Guy Lohman. 2014. Is Query Optimization a Solved Problem? Retrieved from http://wp.sigmod.org/?author=20.Google Scholar
Zvi Lotker, Boaz Patt-Shamir, and Dror Rawitz. 2008. Rent, lease or buy: Randomized algorithms for multislope ski rental. In Proc. of the 25th Annual Symposium on Theoretical Aspects of Computer Science. 503--514.Google Scholar
Volker Markl, Vijayshankar Raman, David Simmen, Guy Lohman, Hamid Pirahesh, and Miso Cilimdzic. 2004. Robust query processing through progressive optimization. In Proc. of the 2004 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’04). 659--670. Google ScholarDigital Library
Guido Moerkotte, Thomas Neumann, and Gabriele Steidl. 2009. Preventing bad plans by bounding the impact of cardinality estimation errors. In Proc. VLDB 2, 1 (Aug. 2009), 982--993. Google ScholarDigital Library
Thomas Neumann and César A. Galindo-Legaria. 2013. Taking the edge off cardinality estimation errors using incremental execution. In Datenbanksysteme für Business, Technologie und Web (BTW), 15. Fachtagung des GI-Fachbereichs “Datenbanken und Informationssysteme” (DBIS), 11.-15.3.2013 in Magdeburg, Germany. Proceedings. 73--92.Google Scholar
Hung Q. Ngo, Ely Porat, Christopher Ré, and Atri Rudra. 2012. Worst-case optimal join algorithms: {Extended abstract}. In Proc. of the 31st Symposium on Principles of Database Systems (PODS’12). 37--48. Google ScholarDigital Library
Oracle. 2013. Optimizer with Oracle Database 12c. Retrieved from www.oracle.com/ocom/groups/public/@otn/documents/webcontent/1963236.pdf.Google Scholar
Neoklis Polyzotis. 2005. Selectivity-based partitioning: A divide-and-union paradigm for effective query optimization. In Proc. of the 14th ACM Intl. Conf. on Information and Knowledge Management (CIKM’05). ACM, New York, NY, 720--727. Google ScholarDigital Library
PostgreSQL. 2009. PostgreSQL 8.4. www.postgresql.org/docs/8.4/static/release.html. (2009).Google Scholar
Naveen Reddy and Jayant R. Haritsa. 2005. Analyzing plan diagrams of database query optimizers. In Proc. of the 31st Intl. Conf. Very Large Data Bases (VLDB’05). 1228--1239. Google ScholarDigital Library
P. Griffiths Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. 1979. Access path selection in a relational database management system. In Proc. of the 1979 ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’79). 23--34. Google ScholarDigital Library
SQLSERVER. 2010. Using the USE PLAN Query Hint. Retrieved from technet.microsoft.com/en-us/library/ms186954(v=sql.105).aspx.Google Scholar
Michael Stillger, Guy M. Lohman, Volker Markl, and Mokhtar Kandil. 2001. LEO - DB2’s learning optimizer. In Proc. of the 27th Intl. Conf. on Very Large Data Bases (VLDB’01). 19--28. Google ScholarDigital Library
Kostas Tzoumas, Amol Deshpande, and Christian S. Jensen. 2013. Efficiently adapting graphical models for selectivity estimation. VLDB J. 22, 1 (2013), 3--27. Google ScholarDigital Library
Todd L. Veldhuizen. 2014. Triejoin: A simple, worst-case optimal join algorithm. In Proc. of the 17th Intl. Conf. on Database Theory (ICDT’14). 96--106.Google Scholar
Wentao Wu, Yun Chi, Shenghuo Zhu, Jun’ichi Tatemura, Hakan Hacigümüs, and Jeffrey F. Naughton. 2013. Predicting query execution time: Are optimizer cost models really unusable? In Proc. of the 29th IEEE Intl. Conf. on Data Engg. (ICDE’13). 1081--1092. Google ScholarDigital Library

Index Terms

Plan Bouquets: A Fragrant Approach to Robust Query Processing
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
        Query optimization

Recommendations

Plan bouquets: query processing without selectivity estimation
SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data

Selectivity estimates for optimizing OLAP queries often differ significantly from those actually encountered during query execution, leading to poor plan choices and inflated response times. We propose here a conceptually new approach to address this ...
Read More
Fast and scalable inequality joins

Inequality joins, which is to join relations with inequality conditions, are used in various applications. Optimizing joins has been the subject of intensive research ranging from efficient join algorithms such as sort-merge join, to the use of ...
Read More
Selectivity estimation for hybrid queries over text-rich data graphs
EDBT '13: Proceedings of the 16th International Conference on Extending Database Technology

Many databases today are text-rich, comprising not only structured, but also textual data. Querying such databases involves predicates matching structured data combined with string predicates featuring textual constraints. Based on selectivity estimates ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Database Systems Volume 41, Issue 2
Invited Paper from SIGMOD 2014 and Regular Papers
June 2016
271 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/2936309
Editor:
Christian S. Jensen
Aalborg University, Denmark
Issue’s Table of Contents
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 11 May 2016
- Accepted: 1 January 2016
- Revised: 1 September 2015
- Received: 1 March 2015
Published in tods Volume 41, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Selectivity estimation
plan bouquets
robust query processing
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 8
  Total Citations
  View Citations
- 396
  Total Downloads
- Downloads (Last 12 months)18
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Plan Bouquets: A Fragrant Approach to Robust Query Processing

ACM Transactions on Database Systems

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

Plan bouquets: query processing without selectivity estimation

Fast and scalable inequality joins

Selectivity estimation for hybrid queries over text-rich data graphs

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Plan Bouquets: A Fragrant Approach to Robust Query Processing

ACM Transactions on Database Systems

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

Plan bouquets: query processing without selectivity estimation

Fast and scalable inequality joins

Selectivity estimation for hybrid queries over text-rich data graphs

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media