ABSTRACT
Static analysis is a fundamental task in query optimization. In this paper we study static analysis and optimization techniques for SPARQL, which is the standard language for querying Semantic Web data. Of particular interest for us is the optionality feature in SPARQL. It is crucial in Semantic Web data management, where data sources are inherently incomplete and the user is usually interested in partial answers to queries. This feature is one of the most complicated constructors in SPARQL and also the one that makes this language depart from classical query languages such as relational conjunctive queries. We focus on the class of well-designed SPARQL queries, which has been proposed in the literature as a fragment of the language with good properties regarding query evaluation. We first propose a tree representation for SPARQL queries, called pattern trees, which captures the class of well-designed SPARQL graph patterns and which can be considered as a query execution plan. Among other results, we propose several transformation rules for pattern trees, a simple normal form, and study equivalence and containment. We also study the enumeration and counting problems for this class of queries.
- D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach. Scalable semantic web data management using vertical partitioning. In VLDB, pages 411--422. ACM, 2007. Google ScholarDigital Library
- R. Angles and C. Gutierrez. The expressive power of SPARQL. In ISWC, pages 114--129. Springer, 2008. Google ScholarDigital Library
- M. Arenas and J. Pérez. Querying Semantic Web data with SPARQL. In PODS, pages 305--316. ACM, 2011. Google ScholarDigital Library
- G. Bagan, A. Durand, and E. Grandjean. On acyclic conjunctive queries and constant delay enumeration. In CSL, pages 208--222. Springer, 2007. Google ScholarDigital Library
- M. Bauland, P. Chapdelaine, N. Creignou, M. Hermann, and H. Vollmer. An algebraic approach to the complexity of generalized conjunctive queries. In SAT 2004 - Revised Selected Papers, pages 30--45. Springer, 2005. Google ScholarDigital Library
- T. Berners-Lee. Linked data -- design issues. http://www.w3.org/DesignIssues/LinkedData.html, 2006.Google Scholar
- C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. Int. J. Semantic Web Inf. Syst., 5(3):1--22, 2009.Google ScholarCross Ref
- A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In STOC, pages 77--90. ACM, 1977. Google ScholarDigital Library
- M. Chekol, J. Euzenat, P. Genevès, and N. Layaïda. PSPARQL query containment. In DBPL, 2011.Google Scholar
- C. Chekuri and A. Rajaraman. Conjunctive query containment revisited. Theor. Comput. Sci., 239(2):211--229, 2000. Google ScholarDigital Library
- S. Cohen and I. Fadida and Y. Kanza and B. Kimelfeld and Y. Sagiv. Full Disjunctions: Polynomial-Delay Iterators in Action. In VLDB, pages 739--750. ACM, 2006. Google ScholarDigital Library
- A. Durand, M. Hermann, and P. G. Kolaitis. Subtractive reductions and complete problems for counting complexity classes. Theor. Comput. Sci., 340(3):496--513, 2005. Google ScholarDigital Library
- J. Flum, M. Frick, and M. Grohe. Query evaluation via tree-decompositions. J. ACM, 49(6):716--752, 2002. Google ScholarDigital Library
- J. Flum and M. Grohe. The parameterized complexity of counting problems. SIAM J. Comput., 33(4):892--922, 2004. Google ScholarDigital Library
- M. A. Gallego, J. D. Fernández, M. A. Martínez-Prieto, and P. de la Fuente. An empirical study of real-world SPARQL queries. In USEWOD, 2011.Google Scholar
- G. Gottlob, N. Leone, and F. Scarcello. A comparison of structural CSP decomposition methods. Artif. Intell., 124(2):243--282, 2000. Google ScholarDigital Library
- G. Gottlob, N. Leone, and F. Scarcello. Hypertree decompositions and tractable queries. J. Comput. Syst. Sci., 64(3):579--627, 2002.Google ScholarDigital Library
- G. Greco and F. Scarcello. The power of tree projections: local consistency, greedy algorithms, and larger islands of tractability. In PODS 2010, pages 327--338. ACM, 2010. Google ScholarDigital Library
- C. Gutierrez, C. A. Hurtado, A. O. Mendelzon, and J. Pérez. Foundations of semantic web databases. J. Comput. Syst. Sci., 77(3):520--541, 2011. Google ScholarDigital Library
- L. A. Hemaspaandra and H. Vollmer. The satanic notations: Counting classes beyond \#P and other definitional adventures. SIGACT News, Complexity Theory Column 8, 26(1):2--13, 1995. Google ScholarDigital Library
- Y. Kanza and W. Nutt and Y. Sagiv. Querying Incomplete Information in Semistructured Data. J. Comput. Syst. Sci., 64(3):655--693, 2002.Google ScholarDigital Library
- P.-Å. Larson and J. Zhou. View matching for outer-join views. In VLDB, pages 445--456. ACM, 2005. Google ScholarDigital Library
- O. Lassila and R. Swick. Resource description framework (RDF) model and syntax. W3C Recommnedation, January 1999.Google Scholar
- T. Neumann and G. Weikum. The rdf-3x engine for scalable management of rdf data. VLDB J., 19(1):91--113, 2010. Google ScholarDigital Library
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. In ISWC, pages 30--43. Springer, 2006. Google ScholarDigital Library
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics of SPARQL. Technical Report, Universidad de Chile TR/DCC-2006--17, October 2006.Google Scholar
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. ACM Trans. Database Syst., 34(3), 2009. Google ScholarDigital Library
- F. Picalausa and S. Vansummeren. What are real SPARQL queries like? In SWIM, pages 7:1--7:6. ACM, 2011. Google ScholarDigital Library
- R. Pichler and S. Skritek. Tractable counting of the answers to conjunctive queries. In AMW. CEUR-WS.org, 2011.Google Scholar
- A. Polleres. From SPARQL to rules (and back). In WWW, pages 787--796. ACM, 2007. Google ScholarDigital Library
- E. Prud'Hommeaux and A. Seaborne. SPARQL query language for RDF. W3C Recommnedation, January 2008.Google Scholar
- M. Schmidt, T. Hornung, N. Küchlin, G. Lausen, and C. Pinkel. An experimental comparison of RDF data management approaches in a SPARQL benchmark scenario. In ISWC, pages 82--97. Springer, 2008. Google ScholarDigital Library
- M. Schmidt, T. Hornung, G. Lausen, and C. Pinkel. SP$^2$Bench: A SPARQL performance benchmark. In ICDE, pages 222--233. IEEE, 2009. Google ScholarDigital Library
- M. Schmidt, M. Meier, and G. Lausen. Foundations of SPARQL query optimization. In ICDT, pages 4--33. ACM, 2010. Google ScholarDigital Library
- G. Serfiotis, I. Koffina, V. Christophides, and V. Tannen. Containment and minimization of RDF/S query patterns. In ISWC, pages 607--623. Springer, 2005. Google ScholarDigital Library
- L. Sidirourgos, R. Goncalves, M. L. Kersten, N. Nes, and S. Manegold. Column-store support for RDF data management: not all swans are white. PVLDB, 1(2):1553--1563, 2008. Google ScholarDigital Library
- M. Stocker, A. Seaborne, A. Bernstein, C. Kiefer, and D. Reynolds. Sparql basic graph pattern optimization using selectivity estimation. In WWW, pages 595--604. ACM, 2008. Google ScholarDigital Library
- J. D. Ullman. Information integration using logical views. In ICDT, pages 19--40. Springer, 1997. Google ScholarDigital Library
- L. G. Valiant. The complexity of enumeration and reliability problems. SIAM J. Comput., 8(3):410--421, 1979.Google ScholarDigital Library
- C. Weiss, P. Karras, and A. Bernstein. Hexastore: sextuple indexing for semantic web data management. PVLDB, 1(1):1008--1019, 2008. Google ScholarDigital Library
- M. Yannakakis. Algorithms for acyclic database schemes. In VLDB, pages 82--94. IEEE, 1981. Google ScholarDigital Library
- http://data.gov.uk.Google Scholar
- http://www.data.gov.Google Scholar
- http://DBpedia.org/sparql.Google Scholar
- ARQ. http://sourceforge.net/projects/jena/files/ARQ/.Google Scholar
Index Terms
- Static analysis and optimization of semantic web queries
Recommendations
Static analysis and optimization of semantic web queries
Invited papers issueStatic analysis is a fundamental task in query optimization. In this article we study static analysis and optimization techniques for SPARQL, which is the standard language for querying Semantic Web data. Of particular interest for us is the optionality ...
RDF, Jena, SparQL and the 'Semantic Web'
SIGUCCS '09: Proceedings of the 37th annual ACM SIGUCCS fall conference: communication and collaborationThe Resource Description Format (RDF) is used to represent information modeled as a "graph": a set of individual objects, along with a set of connections among those objects. In that role, RDF is one of the pillars of the so-called Semantic Web. This ...
Expressive Languages for Querying the Semantic Web
Best of PODS 2017, Best of ICDT 2017 and Regular PapersThe problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF since its W3C standardization in 2008. However, the 2008 version of this language ...
Comments