ABSTRACT
Given a set of Datalog rules, facts, and a query, answers to the query can be inferred bottom-up starting with the facts or top-down starting with the query. The dominant strategies to improve the performance of answering queries are reusing answers to subqueries for top-down methods, and transforming rules based on demand from the query, such as the well-known magic sets transformation, for bottom-up methods. However, the performance of these strategies vary drastically, and the most effective method has remained unknown.
This paper describes precise time and space complexity analysis for efficient implementation of Datalog queries using subsumptive tabling, a top-down evaluation method with more reuse of answers than the dominant tabling strategy, and shows that subsumptive tabling beats bottom-up evaluation of rules after magic sets transformation in both time and space complexities. It also describes subsumptive demand transformation, a novel method for transforming the rules so that bottom-up evaluation of the transformed rules mimics subsumptive tabling; we show that the time complexity of bottom-up evaluation after this transformation is equal to the the time complexity of top-down evaluation with subsumptive tabling. The paper further describes subsumption optimization, an optimization to increase the use of subsumption in subsumptive methods, and shows its application in the derivation of a well-known demand-driven pointer analysis algorithm. We support our analyses and comparisons through experiments with applications in ontology queries and program analysis.
- S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google ScholarDigital Library
- L. O. Andersen. Program analysis and specialization for the C programming language. Technical report, DIKU, Department of Computer Science, University of Copenhagen, 1994.Google Scholar
- C. Beeri and R. Ramakrishnan. On the power of magic. J. Logic Programming, 10(1/2/3&4):255--299, 1991. Google ScholarDigital Library
- F. Bry. Query evaluation in deductive databases: Bottom-up and top-down reconciled. Data Knowledge Engineering, 5:289--312, 1990. Google ScholarDigital Library
- S. Ceri, G. Gottlob, and L. Tanca. Logic Programming and Databases. Springer, 1990. Google ScholarDigital Library
- W. Chen and D. S. Warren. Tabled evaluation with delaying for general logic programs. J. ACM, 43(1):20--74, 1996. Google ScholarDigital Library
- S. S. Cosmadakis, H. Gaifman, P. C. Kanellakis, and M. Y. Vardi. Decidable optimization problems for database logic programs (preliminary report). In Proc. of the 20th Annual ACM Symp. on Theory of Computing, pages 477--490, 1988. Google ScholarDigital Library
- A. F. da Silva and V. S. Costa. The design of the YAP compiler: An optimizing compiler for logic programming languages. J. of Universal Computer Science, 12(7):764--787, 2006.Google Scholar
- J. DeTreville. Binder, a logic-based security language. In Proc. of the 2002 IEEE Symp. on Security and Privacy (S&P), pages 105--113, 2002. Google ScholarDigital Library
- E. Hajiyev, M. Verbaere, and O. de Moor. CodeQuest: Scalable source code queries with Datalog. In Proc. of the 20th European Conf. on Object-Oriented Programming (ECOOP), pages 2--27, 2006. Google ScholarDigital Library
- N. Heintze and O. Tardieu. Demand-driven pointer analysis. In Proc. of the 2001 ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI), pages 24--34, 2001. Google ScholarDigital Library
- P. G. Kolaitis and C. H. Papadimitriou. Why not negation by fixpoint? J. Comput. Syst. Sci., 43(1):125--144, 1991. Google ScholarDigital Library
- A. Y. Levy and Y. Sagiv. Semantic query optimization in Datalog programs. In Proc. of the 14th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems (PODS), pages 163--173, 1995. Google ScholarDigital Library
- N. Li and J. C. Mitchell. Datalog with constraints: A foundation for trust management languages. In Proc. of the 5th Intl. Symp. on Practical Aspects of Declarative Languages (PADL), pages 58--73, 2003. Google ScholarDigital Library
- S. Liang, P. Fodor, H. Wan, and M. Kifer. OpenRuleBench: An analysis of the performance of rule engines. In Proc. of the 18th Intl. Conf. on World Wide Web (WWW), pages 601--610, 2009. Google ScholarDigital Library
- Y. A. Liu and S. D. Stoller. From Datalog rules to efficient programs with time and space guarantees. ACM Trans. Programming Languages and Systems, 31(6):21:1--21:38, August 2009. Google ScholarDigital Library
- B. T. Loo, T. Condie, M. N. Garofalakis, D. E. Gay, J. M. Hellerstein, P. Maniatis, R. Ramakrishnan, T. Roscoe, and I. Stoica. Declarative networking. Communications of the ACM, 52(11):87--95, 2009. Google ScholarDigital Library
- R. Ramakrishnan. Magic templates: A spellbinding approach to logic programs. In Proc. of the 5th Intl. Conf. and Symp. on Logic Programming, pages 140--159, 1988.Google Scholar
- R. Ramakrishnan and S. Sudarshan. Top-down versus bottom-up revisited. In Proc. of the 1991 Intl. Symp. on Logic Programming (ISLP), pages 321--336, 1991.Google Scholar
- R. Ramakrishnan and J. D. Ullman. A survey of deductive database systems. J. Logic Programming, 23(2):125--149, 1995.Google ScholarCross Ref
- P. Rao, C. R. Ramakrishnan, and I. V. Ramakrishnan. A thread in time saves tabling time. In Proc. of the 1996 Joint Intl. Conf. and Symp. on Logic Programming, pages 112--126, 1996.Google Scholar
- Y. Sagiv. Optimizing Datalog programs. In Foundations of Deductive Databases and Logic Programming, pages 659--698. Morgan Kaufmann, 1988. Google ScholarDigital Library
- D. Sereni, P. Avgustinov, and O. de Moor. Adding magic to an optimising datalog compiler. In Proc. of the ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD), pages 553--566, 2008. Google ScholarDigital Library
- W. Shen, A. Doan, J. F. Naughton, and R. Ramakrishnan. Declarative information extraction using Datalog with embedded extraction predicates. In Proc. of the 33rd Intl. Conf. on Very Large Data Bases (VLDB), pages 1033--1044, 2007. Google ScholarDigital Library
- H. Tamaki and T. Sato. OLD resolution with tabulation. In Proc. of the 3rd Intl. Conf. on Logic Programming (ICLP), pages 84--98, 1986. Google ScholarDigital Library
- K. T. Tekle, M. Gorbovitski, and Y. A. Liu. Graph queries through Datalog optimizations. In Proc. of the 12th Intl. ACM SIGPLAN Conf. on Principles and Practice of Declarative Programming (PPDP), pages 25--34, 2010. Google ScholarDigital Library
- K. T. Tekle and Y. A. Liu. Precise complexity analysis for efficient Datalog queries. In Proc. of the 12th Intl. ACM SIGPLAN Conf. on Principles and Practice of Declarative Programming (PPDP), pages 35--44, 2010. Google ScholarDigital Library
- J. D. Ullman. Bottom-up beats top-down for Datalog. In Proc. of the 8th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems (PODS), pages 140--149, 1989. Google ScholarDigital Library
- D. S. Warren. Programming in tabled Prolog. Available at \\http://www.cs.sunysb.edu/~warren/xsbbook/, 1999.Google Scholar
- J. Whaley, D. Avots, M. Carbin, and M. S. Lam. Using Datalog with binary decision diagrams for program analysis. In Proc. of the 3rd Asian Symp. on Programming Languages and Systems (APLAS), pages 97--118, 2005. Google ScholarDigital Library
Index Terms
- More efficient datalog queries: subsumptive tabling beats magic sets
Recommendations
Precise complexity analysis for efficient datalog queries
PPDP '10: Proceedings of the 12th international ACM SIGPLAN symposium on Principles and practice of declarative programmingGiven a set of Datalog rules, facts, and a query, answers to the query can be inferred bottom-up starting with the facts or top-down starting with the query. For efficiently answering the query, top-down evaluation is extended with tabling that stores ...
Graph queries through datalog optimizations
PPDP '10: Proceedings of the 12th international ACM SIGPLAN symposium on Principles and practice of declarative programmingThis paper describes the use of a powerful graph query language for querying programs, and a novel combination of transformations for generating efficient implementations of the queries. The language supports graph path expressions that allow convenient ...
Decidable containment of recursive queries
Database theoryOne of the most important reasoning tasks on queries is checking containment, i.e., verifying whether one query yields necessarily a subset of the result of another one. Query containment is crucial in several contexts, such as query optimization, query ...
Comments