ABSTRACT
This paper presents an unsupervised relation extraction method for discovering and enhancing the relations in which a specified Wikipedia concept participates. Exploiting the respective characteristics of Wikipedia articles and the Web corpus, we develop a clustering approach that combines two kinds of patterns: dependency patterns obtained from dependency analysis of Wikipedia text, and surface patterns generated from highly redundant Web information. Evaluations on two different domains show that the pattern combination outperforms existing approaches. More fundamentally, the method shows how deep linguistic patterns and Web surface patterns complement each other in generating various relations.
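The idea of clustering concept pairs by a combination of the two pattern types can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the clustering algorithm, or the pattern representation, so everything here is an assumption made for illustration: patterns are plain strings, overlap is measured with Jaccard similarity, the two scores are mixed with a hypothetical `weight` parameter, and pairs are grouped by single-link merging above a hypothetical `threshold`. The toy data in `PAIRS` is invented.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two pattern sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(p, q, weight=0.5):
    """Weighted mix of dependency-pattern and surface-pattern overlap.

    `weight` is an illustrative parameter, not taken from the paper.
    """
    return (weight * jaccard(p["dep"], q["dep"])
            + (1.0 - weight) * jaccard(p["surface"], q["surface"]))


def cluster_pairs(pairs, threshold=0.3, weight=0.5):
    """Single-link grouping: pairs at or above `threshold` share a cluster."""
    names = list(pairs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if combined_similarity(pairs[a], pairs[b], weight) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Invented toy data: each concept pair carries dependency patterns
# (as if from parsed Wikipedia sentences) and surface patterns
# (as if from Web snippets).
PAIRS = {
    "Microsoft|Bill Gates": {
        "dep": {"found:nsubj>Y,dobj>X"},
        "surface": {"X was founded by Y", "Y, founder of X"},
    },
    "Google|Larry Page": {
        "dep": {"found:nsubj>Y,dobj>X"},
        "surface": {"X was founded by Y"},
    },
    "Microsoft|Redmond": {
        "dep": {"headquarter:nsubjpass>X,prep_in>Y"},
        "surface": {"X is based in Y"},
    },
    "Google|Mountain View": {
        "dep": set(),  # no usable dependency pattern was found
        "surface": {"X is based in Y", "X headquarters in Y"},
    },
}

if __name__ == "__main__":
    for cluster in cluster_pairs(PAIRS, threshold=0.2):
        print(cluster)
```

In this toy setting the last pair has no dependency pattern, so it joins the "based in" cluster only because of shared surface patterns. Setting `weight=1.0` (dependency patterns alone) leaves it unclustered, which is one simple way to see why the two pattern types might complement each other.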
Index Terms
- Unsupervised relation extraction by mining Wikipedia texts using information from the web