ABSTRACT
A "plan diagram" is a pictorial enumeration of the execution plan choices of a database query optimizer over the relational selectivity space. We have shown recently that, for industrial-strength database engines, these diagrams are often remarkably complex and dense, with a large number of plans covering the space. However, they can often be reduced to much simpler pictures, featuring significantly fewer plans, without materially affecting the query processing quality. Plan reduction has useful implications for the design and usage of query optimizers, including quantifying redundancy in the plan search space, enhancing useability of parametric query optimization, identifying error-resistant and least-expected-cost plans, and minimizing the overheads of multi-plan approaches.
We investigate here the plan reduction issue from theoretical, statistical and empirical perspectives. Our analysis shows that optimal plan reduction, w.r.t. minimizing the number of plans, is an NP-hard problem in general, and remains so even for a storage-constrained variant. We then present a greedy reduction algorithm with tight and optimal performance guarantees, whose complexity scales linearly with the number of plans in the diagram for a given resolution. Next, we devise fast estimators for locating the best tradeoff between the reduction in plan cardinality and the impact on query processing quality. Finally, extensive experimentation with a suite of multi-dimensional TPCH-based query templates on industrial-strength optimizers demonstrates that complex plan diagrams easily reduce to "anorexic" (small absolute number of plans) levels incurring only marginal increases in the estimated query processing costs.
- G. Antonshenkov, "Dynamic Query Optimization in Rdb/VMS", Proc. of 9th IEEE Intl. Conf. on Data Engineering (ICDE), April 1993. Google ScholarDigital Library
- A. Betawadkar, "Query Optimization with One Parameter", Master's Thesis, Dept. of Computer Science & Engineering, IIT Kanpur, February 1999.Google Scholar
- F. Chu, J. Halpern and P. Seshadri, "Least Expected Cost Query Optimization: An Exercise in Utility", Proc. of ACM Symp. on Principles of Database Systems (PODS), May 1999. Google ScholarDigital Library
- F. Chu, J. Halpern and J. Gehrke, "Least Expected Cost Query Optimization: What Can We Expect", Proc. of ACM Symp. on Principles of Database Systems (PODS), May 2002. Google ScholarDigital Library
- R. Cole and G. Graefe, "Optimization of Dynamic Query Evaluation Plans", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1994. Google ScholarDigital Library
- U. Feige, "A threshold of in n for approximating set cover", Journal of ACM, 45(4), 1998. Google ScholarDigital Library
- S. Ganguly, "Design and Analysis of Parametric Query Optimization Algorithms", Proc. of 24th Intl. Conf. on Very Large Data Bases (VLDB), August 1998. Google ScholarDigital Library
- M. Garey and D. Johnson, "Computers and Intractability: A Guide to the Theory of NP-Completeness", W. H. Freeman & Co, 1979. Google ScholarDigital Library
- Harish D., P. Darera and J. Haritsa, "Reduction of Query Optimizer Plan Diagrams", Tech. Rep. TR-2007-01, DSL/SERC, Indian Inst. of Science, 2007. http://dsl.serc.iisc.ernet.in/publications/report/TR/TR-2007-01.pdfGoogle Scholar
- A. Hulgeri and S. Sudarshan, "Parametric Query Optimization for Linear and Piecewise Linear Cost Functions", Proc. of 28th Intl. Conf. on Very Large Data Bases (VLDB), August 2002. Google ScholarDigital Library
- A. Hulgeri and S. Sudarshan, "AniPQO: Almost Non-intrusive Parametric Query Optimization for Nonlinear Cost Functions", Proc. of 29th Intl. Conf. on Very Large Data Bases (VLDB), September 2003. Google ScholarDigital Library
- N. Kabra and D. DeWitt, "Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1998. Google ScholarDigital Library
- Y. Ioannidis, R. Ng, K. Shim and T. Sellis, "Parametric Query Optimization", Proc. of 18th Intl. Conf. on Very Large Data Bases (VLDB), August 1992. Google ScholarDigital Library
- L. Mackert and G. Lohman, "R* Optimizer Validation and Performance Evaluation for Local Queries", Proc. of ACM Sigmod Intl. Conf. on Management of Data, May 1986. Google ScholarDigital Library
- Picasso Database Query Optimizer Visualizer, http://dsl.serc.iisc.ernet.in/projects/PICASSO/picasso.htmlGoogle Scholar
- N. Reddy and J. Haritsa, "Analyzing Plan Diagrams of Database Query Optimizers", Proc. of 31st Intl. Conf. on Very Large Data Bases (VLDB), August 2005. Google ScholarDigital Library
- F. Reiss and T. Kanungo, "A Characterization of the Sensitivity of Query Optimization to Storage Access Cost Parameters", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 2003. Google ScholarDigital Library
- P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie and T. Price, "Access Path Selection in a Relational Database System", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 1979. Google ScholarDigital Library
- P. Slavik, "A tight analysis of the greedy algorithm for set cover", Proc. of 28th ACM Symp. on Theory of Computing, 1996. Google ScholarDigital Library
- M. Stillger, G. Lohman, V. Markl and M. Kandil, "LEO -- DB2's LEarning Optimizer", Proc. of 27th Intl. Conf. on Very Large Data Bases (VLDB), September 2001. Google ScholarDigital Library
- http://www.tpc.org/tpchGoogle Scholar
- http://www.tpc.org/tpcdsGoogle Scholar
Recommendations
Analyzing plan diagrams of database query optimizers
VLDB '05: Proceedings of the 31st international conference on Very large data basesA "plan diagram" is a pictorial enumeration of the execution plan choices of a database query optimizer over the relational selectivity space. In this paper, we present and analyze representative plan diagrams on a suite of popular commercial query ...
Plan bouquets: query processing without selectivity estimation
SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of DataSelectivity estimates for optimizing OLAP queries often differ significantly from those actually encountered during query execution, leading to poor plan choices and inflated response times. We propose here a conceptually new approach to address this ...
Plan Bouquets: A Fragrant Approach to Robust Query Processing
Invited Paper from SIGMOD 2014 and Regular PapersIdentifying efficient execution plans for declarative OLAP queries typically entails estimation of several predicate selectivities. In practice, these estimates often differ significantly from the values actually encountered during query execution, ...
Comments