DOI: 10.1145/2814864.2814886
research-article

SemaGrow: optimizing federated SPARQL queries

Published: 16 September 2015

ABSTRACT

Processing SPARQL queries involves the construction of an efficient query plan to guide query execution. Alternative plans can vary by orders of magnitude in the resources and the amount of time that they need, making planning crucial for efficiency. On the other hand, the construction of optimal plans can become computationally intensive, and it relies on detailed, difficult-to-obtain metadata. In this paper we present Semagrow, a federated SPARQL querying system that uses metadata about the federated data sources in order to optimize query execution. We aim for a query optimizer that introduces little overhead and has appropriate fallbacks in the absence of metadata, but at the same time produces optimal plans in as many situations as possible. Semagrow also exploits non-blocking and asynchronous stream processing technologies to achieve query execution efficiency and robustness. We also present and analyse empirical results using the FedBench benchmark to compare Semagrow against FedX and SPLENDID. Semagrow clearly outperforms SPLENDID and is either on a par with or much faster than FedX.
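To make the idea of metadata-driven planning with fallbacks concrete, the following is a minimal sketch of a greedy join-ordering heuristic. It is not Semagrow's optimizer, which the abstract does not detail. The names `TriplePattern`, `estimate`, `greedy_join_order` and the constant `DEFAULT_CARDINALITY` are all hypothetical, and the cardinality figures are made up for illustration. The sketch assumes each triple pattern may or may not come with a cardinality estimate taken from source metadata, and that a pessimistic default is used when none is available.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Hypothetical fallback estimate for patterns whose source publishes no statistics.
DEFAULT_CARDINALITY = 1_000_000


@dataclass(frozen=True)
class TriplePattern:
    name: str                  # label for the pattern (illustrative only)
    variables: FrozenSet[str]  # variables the pattern binds


def estimate(pattern: TriplePattern, stats: Dict[str, int]) -> int:
    """Return the known cardinality, or the pessimistic default if metadata is missing."""
    return stats.get(pattern.name, DEFAULT_CARDINALITY)


def greedy_join_order(patterns: List[TriplePattern],
                      stats: Dict[str, int]) -> List[TriplePattern]:
    """Order patterns greedily: start with the smallest estimate, then repeatedly
    pick the smallest pattern that shares a variable with those already joined."""
    remaining = sorted(patterns, key=lambda p: estimate(p, stats))
    order = [remaining.pop(0)]
    bound = set(order[0].variables)
    while remaining:
        # False sorts before True, so connected patterns are preferred over
        # cross products; ties are broken by the cardinality estimate.
        nxt = min(remaining,
                  key=lambda p: (not (p.variables & bound), estimate(p, stats)))
        remaining.remove(nxt)
        order.append(nxt)
        bound |= nxt.variables
    return order


if __name__ == "__main__":
    patterns = [
        TriplePattern("b", frozenset({"s", "t"})),
        TriplePattern("c", frozenset({"t", "u"})),  # no statistics available
        TriplePattern("a", frozenset({"s", "n"})),
    ]
    stats = {"a": 50, "b": 2000}
    print([p.name for p in greedy_join_order(patterns, stats)])
```

A greedy heuristic like this keeps planning overhead low but does not guarantee an optimal plan; an exhaustive or dynamic-programming search over join orders would trade more planning time for better plans, which is the balance the abstract refers to.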


          • Published in

            SEMANTICS '15: Proceedings of the 11th International Conference on Semantic Systems
            September 2015
            220 pages
            ISBN:9781450334624
            DOI:10.1145/2814864

            Copyright © 2015 ACM


            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            SEMANTICS '15 paper acceptance rate: 22 of 97 submissions, 23%. Overall acceptance rate: 40 of 182 submissions, 22%.
