DOI: 10.5555/1690219.1690289
research-article

Unsupervised relation extraction by mining Wikipedia texts using information from the web

Published: 02 August 2009

ABSTRACT

This paper presents an unsupervised relation extraction method for discovering and enhancing relations in which a specified concept in Wikipedia participates. Exploiting the respective characteristics of Wikipedia articles and a Web corpus, we develop a clustering approach based on combinations of two pattern types: dependency patterns obtained from dependency analysis of Wikipedia texts, and surface patterns generated from highly redundant Web information. Evaluations of the proposed approach on two different domains demonstrate the superiority of the pattern combination over existing approaches. Fundamentally, our method demonstrates how deep linguistic patterns and Web surface patterns complement each other in generating various relations.
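
The idea of clustering on a combination of pattern types can be illustrated with a minimal sketch. It is not the paper's algorithm: the Jaccard overlap, the mixing `weight`, the `threshold`, the single-link merging, and the toy entity pairs and pattern strings below are all assumptions chosen for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two pattern sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(p, q, weight=0.5):
    """Weighted mix of dependency-pattern and surface-pattern overlap.

    `weight` is a hypothetical mixing parameter: 1.0 uses dependency
    patterns only, 0.0 uses surface patterns only.
    """
    return (weight * jaccard(p["dep"], q["dep"])
            + (1 - weight) * jaccard(p["surface"], q["surface"]))


def cluster_pairs(pairs, threshold=0.4, weight=0.5):
    """Single-link clustering: link entity pairs whose similarity >= threshold."""
    parent = {name: name for name in pairs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(pairs, 2):
        if combined_similarity(pairs[a], pairs[b], weight) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in pairs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Toy input (invented for illustration): each entity pair carries the
# dependency patterns and the surface patterns observed for it.
pairs = {
    ("Microsoft", "Bill Gates"): {
        "dep": {"nsubj<-found->dobj"},
        "surface": {"X was founded by Y", "Y, founder of X"},
    },
    ("Apple", "Steve Jobs"): {
        "dep": {"nsubj<-found->dobj"},
        "surface": {"X was founded by Y"},
    },
    ("Microsoft", "Redmond"): {
        "dep": {"nsubj<-base->prep_in"},
        "surface": {"X is headquartered in Y"},
    },
    ("Apple", "Cupertino"): {
        "dep": {"nsubj<-locate->prep_in"},
        "surface": {"X is headquartered in Y"},
    },
}

combined = cluster_pairs(pairs, weight=0.5)
dependency_only = cluster_pairs(pairs, weight=1.0)
```

In this toy data the two headquarters pairs share no dependency pattern, so clustering on dependency patterns alone (`weight=1.0`) keeps them apart, while the shared surface pattern lets the combined similarity group them. This is only meant to convey why the two pattern types might complement each other.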

Published in

ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
August 2009
595 pages
ISBN: 9781932432466
General Chair: Keh-Yih Su

Publisher

Association for Computational Linguistics, United States

Acceptance Rates

Overall acceptance rate: 85 of 443 submissions, 19%
