Abstract
Estimates of predicate selectivities by database query optimizers often differ significantly from those actually encountered during query execution, leading to poor plan choices and inflated response times. In this paper, we investigate mitigating this problem by replacing selectivity error-sensitive plan choices with alternative plans that provide robust performance. Our approach is based on the recent observation that even the complex and dense "plan diagrams" associated with industrial-strength optimizers can be efficiently reduced to "anorexic" equivalents featuring only a few plans, without materially impacting query processing quality.
Extensive experimentation with a rich set of TPC-H and TPC-DS-based query templates in a variety of database environments indicate that plan diagram reduction typically retains plans that are substantially resistant to selectivity errors on the base relations. However, it can sometimes also be severely counter-productive, with the replacements performing much worse. We address this problem through a generalized mathematical characterization of plan cost behavior over the parameter space, which lends itself to efficient criteria of when it is safe to reduce. Our strategies are fully non-invasive and have been implemented in the Picasso optimizer visualization tool.
- A. Aboulnaga and S. Chaudhuri, "Self-tuning Histograms: Building Histograms without Looking at Data", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1999. Google ScholarDigital Library
- B. Babcock and S. Chaudhuri, "Towards a Robust Query Optimizer: A Principled and Practical Approach", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 2005. Google ScholarDigital Library
- S. Babu, P. Bizarro and D. DeWitt, "Proactive Re-Optimization", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 2005. Google ScholarDigital Library
- S. Babu, P. Bizarro and D. DeWitt, "Proactive Re-Optimization with Rio", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 2005. Google ScholarDigital Library
- N. Bruno, "A Critical Look at the TAB Benchmark for Physical Design Tools", SIGMOD Record, 36(4), 2007. Google ScholarDigital Library
- F. Chu, J. Halpern and P. Seshadri, "Least Expected Cost Query Optimization: An Exercise in Utility", Proc. of ACM Symp. on Principles of Database Systems (PODS), May 1999. Google ScholarDigital Library
- A. Dey, S. Bhaumik, Harish D. and J. Haritsa, "Efficiently Approximating Query Optimizer Plan Diagrams", Proc. of 34th Intl. Conf. on Very Large Data Bases (VLDB), August 2008. Google ScholarDigital Library
- F. Chu, J. Halpern and J. Gehrke, "Least Expected Cost Query Optimization: What Can We Expect", Proc. of ACM Symp. on Principles of Database Systems (PODS), May 2002. Google ScholarDigital Library
- A. Deshpande, Z. Ives and V. Raman "Adaptive Query Processing", Foundations and Trends in Databases, 2007. Google ScholarDigital Library
- U. Feige, "A threshold of In n for approximating set cover", Journal of ACM, 45(4), 1998. Google ScholarDigital Library
- Harish D., P. Darera and J. Haritsa, "On the Production of Anorexic Plan Diagrams", Proc. of 33rd Intl. Conf. on Very Large Data Bases (VLDB), September 2007. Google ScholarDigital Library
- Harish D., P. Darera and J. Haritsa, "Robust Plans through Plan Diagram Reduction", Tech. Rep. TR-2007-02, DSL/SERC, Indian Inst. of Science, 2007. http://dsl.serc.iisc.ernet.in/publications/report/TR/TR-2007-02.pdfGoogle Scholar
- A. Hulgeri and S. Sudarshan, "Parametric Query Optimization for Linear and Piecewise Linear Cost Functions", Proc. of 28th Intl. Conf. on Very Large Data Bases (VLDB), August 2002. Google ScholarDigital Library
- A. Hulgeri and S. Sudarshan, "AniPQO: Almost Non-intrusive Parametric Query Optimization for Nonlinear Cost Functions", Proc. of 29th Intl. Conf. on Very Large Data Bases (VLDB), September 2003. Google ScholarDigital Library
- Y. Ioannidis and S. Christodoulakis, "On the Propagation of Errors in the Size of Join Results", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1991. Google ScholarDigital Library
- N. Kabra and D. DeWitt, "Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1998. Google ScholarDigital Library
- E. Kreyszig, Advanced Engineering Mathematics, New Age International, 5th ed, 1997.Google Scholar
- L. Mackert and G. Lohman, "R* Optimizer Validation and Performance Evaluation for Local Queries", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1986. Google ScholarDigital Library
- V. Markl, V. Raman, D. Simmen, G. Lohman, H. Pirahesh and M. Cilimdzic, "Robust Query Processing through Progressive Optimization", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 2004. Google ScholarDigital Library
- J. Patel, M. Carey and M. Vernon, "Accurate Modeling of the Hybrid Hash Join Algorithm", Proc. of ACM SIGMETRICS Intl. Conf. on Measurement and Modeling of Computer Systems, May 1994. Google ScholarDigital Library
- Picasso Database Query Optimizer Visualizer, http://dsl.serc.iise.ernet.in/projects/PICASSO/picasso.htmlGoogle Scholar
- N. Reddy and J. Haritsa, "Analyzing Plan Diagrams of Database Query Optimizers", Proc. of 31st Intl. Conf. on Very Large Data Bases (VLDB), August 2005. Google ScholarDigital Library
- P. Slavik, "A tight analysis of the greedy algorithm for set cover", Proc. of 28th ACM Symp. on Theory of Computing, 1996. Google ScholarDigital Library
- M. Stillger, G. Lohman, V. Markl and M. Kandil, "LEO -- DB2's LEarning Optimizer", Proc. of 27th Intl. Conf. on Very Large Data Bases (VLDB), September 2001. Google ScholarDigital Library
- MATLAB, http://www.mathworks.comGoogle Scholar
- http://www.tpc.org/tpchGoogle Scholar
- http://www.tpc.org/tpcdsGoogle Scholar
- http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com. ibm.db2.udb.admin.doc/doc/t0024533.htmGoogle Scholar
- http://msdn2.microsoft.com/en-us/library/ms189298.aspxGoogle Scholar
- http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase. dc34982_1500/html/mig_gde/BABIFCAF.htmGoogle Scholar
Index Terms
- Identifying robust plans through plan diagram reduction
Recommendations
Plan bouquets: query processing without selectivity estimation
SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of DataSelectivity estimates for optimizing OLAP queries often differ significantly from those actually encountered during query execution, leading to poor plan choices and inflated response times. We propose here a conceptually new approach to address this ...
Plan Bouquets: A Fragrant Approach to Robust Query Processing
Invited Paper from SIGMOD 2014 and Regular PapersIdentifying efficient execution plans for declarative OLAP queries typically entails estimation of several predicate selectivities. In practice, these estimates often differ significantly from the values actually encountered during query execution, ...
Looking ahead makes query plans robust: making the initial case with in-memory star schema data warehouse workloads
Query optimizers and query execution engines cooperate to deliver high performance on complex analytic queries. Typically, the optimizer searches through the plan space and sends a selected plan to the execution engine. However, optimizers may at times ...
Comments