ABSTRACT
Processing SPARQL queries involves the construction of an efficient query plan to guide query execution. Alternative plans can differ by orders of magnitude in the resources and time they require, making planning crucial for efficiency. On the other hand, constructing optimal plans can become computationally intensive, and it also relies on detailed, difficult-to-obtain metadata. In this paper we present Semagrow, a federated SPARQL querying system that uses metadata about the federated data sources in order to optimize query execution. We strike a balance with a query optimizer that introduces little overhead and has appropriate fallbacks in the absence of metadata, but at the same time produces optimal plans in as many situations as possible. Semagrow also exploits non-blocking and asynchronous stream processing technologies to achieve query execution efficiency and robustness. We also present and analyse empirical results using the FedBench benchmark to compare Semagrow against FedX and SPLENDID. Semagrow clearly outperforms SPLENDID and is either on a par with or much faster than FedX.
Index Terms
- SemaGrow: optimizing federated SPARQL queries