skip to main content
research-article

Efficiently approximating query optimizer plan diagrams

Published:01 August 2008Publication History
Skip Abstract Section

Abstract

Given a parametrized n-dimensional SQL query template and a choice of query optimizer, a plan diagram is a color-coded pictorial enumeration of the execution plan choices of the optimizer over the query parameter space. These diagrams have proved to be a powerful metaphor for the analysis and redesign of modern optimizers, and are gaining currency in diverse industrial and academic institutions. However, their utility is adversely impacted by the impractically large computational overheads incurred when standard brute-force exhaustive approaches are used for producing fine-grained diagrams on high-dimensional query templates.

In this paper, we investigate strategies for efficiently producing close approximations to complex plan diagrams. Our techniques are customized to the features available in the optimizer's API, ranging from the generic optimizers that provide only the optimal plan for a query, to those that also support costing of sub-optimal plans and enumerating rank-ordered lists of plans. The techniques collectively feature both random and grid sampling, as well as inference techniques based on nearest-neighbor classifiers, parametric query optimization and plan cost monotonicity.

Extensive experimentation with a representative set of TPC-H and TPC-DS-based query templates on industrial-strength optimizers indicates that our techniques are capable of delivering 90% accurate diagrams while incurring less than 15% of the computational overheads of the exhaustive approach. In fact, for full-featured optimizers, we can guarantee zero error with less than 10% overheads. These approximation techniques have been implemented in the publicly available Picasso optimizer visualization tool.

References

  1. G. Antonshenkov, "Dynamic Query Optimization in Rdb/VMS", Proc. of 9th IEEE Intl. Conf. on Data Engineering (ICDE), April 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. M. Charikar, S. Chaudhuri, R. Motwani and V. Narasayya,"Towards Estimation Error Guarantees for Distinct Values", Proc. of ACM Symp. on Principles of Database Systems (PODS), 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. F. Chu, J. Halpern and P. Seshadri, "Least Expected Cost Query Optimization: An Exercise in Utility", Proc. of ACM Symp. on Principles of Database Systems (PODS), May 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. E Chu, J. Halpern and J. Gehrke, "Least Expected Cost Query Optimization: What Can We Expect", Proc. of ACM Symp. on Principles of Database Systems (PODS), May 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. Cole and G. Graefe, "Optimization of Dynamic Query Evaluation Plans", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. A. Deshpande, Z. Ives and V. Raman, "Adaptive Query Processing", Foundations and Trends in Databases, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Dey, S. Bhaumik, Harish D. and J. Haritsa "Efficient Generation of Approximate Plan Diagrams", Tech. Rep. TR-2008-01, DSL/SERC, Indian Inst. of Science, 2008. http://dsl.serc.iisc.ernet.in/publications/report/TR/TR-2008-01.pdfGoogle ScholarGoogle Scholar
  8. A. Ghosh, J. Parikh, V. Sengar and J. Haritsa, "Plan Selection based on Query Clustering", Proc. of 28th Intl. Conf. on Very Large Data Bases (VLDB), August 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. R. Gonzalez and R. Woods, Digital Image Processing, Pearson Prentice Hall, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. P. Haas and L. Stokes. "Estimating the number of classes in a finite population". In Journal of the American Statistical Association, 93, 1998.Google ScholarGoogle Scholar
  11. P. Haas, J. Naughton, S. Seshadri and L. Stokes,"Sampling-Based Estimation of the Number of Distinct Values of an Attribute", Proc. of 21st Intl. Conf. on Very Large Databases (VLDB), 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Harish D., P. Darera and J. Haritsa, "On the Production of Anorexic Plan Diagrams", Proc. of 33rd Intl. Conf. on Very Large Data Bases (VLDB), September 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Harish D., P. Darera and J. Haritsa, "Identifying Robust Plans through Plan Diagram Reduction", Proc. of 34th Intl. Conf. on Very Large Data Bases (VLDB), August 2008.Google ScholarGoogle Scholar
  14. A. Hulgeri and S. Sudarshan, "Parametric Query Optimization for Linear and Piecewise Linear Cost Functions", Proc. of 28th Intl. Conf. on Very Large Data Bases (VLDB), August 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. Hulgeri and S. Sudarshan, "AniPQO: Almost Non-intrusive Parametric Query Optimization for Nonlinear Cost Functions", Proc. of 29th Intl. Conf. on Very Large Data Bases (VLDB), September 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. N. Kabra and D. DeWitt, "Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, May 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. V. Markl, V. Raman, D. Simmen, G. Lohman, H. Pirahesh and M. Cilimdzic, "Robust Query Processing through Progressive Optimization", Proc. of ACM SIGMOD Intl. Conf. on Management of Data, June 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. V. Prasad, "Parametric Query Optimization: A Geometric Approach", Master's Thesis, Dept. of Computer Science & Engineering, IIT Kanpur, April 1999.Google ScholarGoogle Scholar
  19. S. Rao, "Parametric Query Optimization: A Non-Geometric Approach", Master's Thesis, Dept. of Computer Science & Engineering, IIT Kanpur, March 1999.Google ScholarGoogle Scholar
  20. N. Reddy and J. Haritsa, "Analyzing Plan Diagrams of Database Query Optimizers", Proc. of 31st Intl. Conf. on Very Large Data Bases (VLDB), August 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. N. Reddy, "Next-Generation Relational Query Optimizers", Master's Thesis, Dept. of CSA, Indian Institute of Science, June 2005, http://dsl.serc.iisc.ernet.in/publications/thesis/naveen.pdf.Google ScholarGoogle Scholar
  22. P. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Addison-Wesley, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Picasso Database Query Optimizer Visualizer, http://dsl.serc.iisc.ernet.in/projects/PICASSO/picasso.htmlGoogle ScholarGoogle Scholar
  24. http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/t0024533.htmGoogle ScholarGoogle Scholar
  25. http://msdn2.microsoft.com/en-us/library/ms189298.aspxGoogle ScholarGoogle Scholar
  26. http://infocenter.sybase.com/help/index. jsp?topic=/com.sybase.dc34982\_1500/html/mig\_gde/BABIFCAF.htmGoogle ScholarGoogle Scholar
  27. http://postgresql.orgGoogle ScholarGoogle Scholar
  28. http://www.tpc.org/tpchGoogle ScholarGoogle Scholar
  29. http://www.tpc.org/tpcdsGoogle ScholarGoogle Scholar

Index Terms

  1. Efficiently approximating query optimizer plan diagrams

                  Recommendations

                  Comments

                  Login options

                  Check if you have access through your login credentials or your institution to get full access on this article.

                  Sign in

                  Full Access

                  PDF Format

                  View or Download as a PDF file.

                  PDF

                  eReader

                  View online with eReader.

                  eReader