DOI: 10.1145/3019612.3019833
research-article

A semantic federated search engine for domain-specific document retrieval

Published: 03 April 2017

ABSTRACT

Retrieval of domain-specific documents has become attractive to the Semantic Web community because it allows classic Information Retrieval (IR) techniques to be integrated with semantic knowledge. Unfortunately, the gap between building a full semantic search engine and being able to exploit a repository of ontologies covering all possible domains is far from being filled. Recent solutions have focused on aggregating different domain-specific repositories managed by third parties. In this paper, we present a semantic federated search engine developed in the context of the EEXCESS EU project. Through the platform, users can perform federated queries over repositories transparently, i.e., without knowing how their original queries are transformed before being submitted. The platform provides a facility for plugging in new repositories and for creating, with the support of general-purpose knowledge bases, knowledge graphs describing the content of each connected repository. These knowledge graphs are then exploited to enrich the queries performed by users.
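
As a rough illustration of the idea summarized above, the following Python sketch shows one way a federated query could be enriched per repository before being submitted. It is not the EEXCESS implementation: `Repository`, `related_terms`, `enrich`, and `federated_search` are hypothetical names introduced here, each repository's knowledge graph is reduced to a plain dictionary of related terms, and matching is simple substring search.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Repository:
    """A hypothetical connected repository (assumption, not from the paper)."""
    name: str
    documents: List[str]
    # Toy stand-in for a knowledge graph: term -> related terms.
    related_terms: Dict[str, List[str]] = field(default_factory=dict)

    def enrich(self, query: str) -> List[str]:
        """Expand the query terms with related terms known to this repository."""
        terms = query.lower().split()
        expanded = list(terms)
        for term in terms:
            for related in self.related_terms.get(term, []):
                if related not in expanded:
                    expanded.append(related)
        return expanded

    def search(self, terms: List[str]) -> List[str]:
        """Return documents that contain at least one term (substring match)."""
        return [d for d in self.documents if any(t in d.lower() for t in terms)]


def federated_search(query: str, repositories: List[Repository]) -> Dict[str, List[str]]:
    """Send the user's query to every repository, enriched per repository.

    The caller never sees the transformed queries, only the grouped results.
    """
    return {r.name: r.search(r.enrich(query)) for r in repositories}
```

In this sketch the user submits a single query string, and each repository applies its own expansion, so the same query may be transformed differently for each source. Ranking and merging of the per-repository results are left out.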

Published in

SAC '17: Proceedings of the Symposium on Applied Computing
April 2017
2004 pages
ISBN: 9781450344869
DOI: 10.1145/3019612

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,650 of 6,669 submissions, 25%
