ABSTRACT
This paper proposes a novel hierarchical learning strategy that addresses the data sparseness problem in relation extraction by modeling the commonality among related classes. For each class in the hierarchy, whether manually predefined or automatically clustered, a linear discriminative function is determined in a top-down way using a perceptron algorithm, with the lower-level weight vector derived from the upper-level weight vector. Because an upper-level class normally has many more positive training examples than a lower-level class, its linear discriminative function can be determined more reliably. The upper-level discriminative function can then effectively guide the learning of the lower-level discriminative function, which might otherwise suffer from limited training data. Evaluation on the ACE RDC 2003 corpus shows that the hierarchical strategy improves F-measure by 5.6 and 5.1 on the least- and medium-frequent relations, respectively. It also shows that our system outperforms the previous best-reported system by 2.7 in F-measure on the 24 subtypes using the same feature set.
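The top-down idea can be illustrated with a minimal sketch. The abstract does not say exactly how the lower-level weight vector is derived from the upper-level one, so this sketch assumes the simplest reading: each child class's perceptron is initialized with its parent's learned weights and then refined on the child's own examples. The function names, the `children` dictionary, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def train_perceptron(X, y, w_init=None, epochs=50):
    """Binary perceptron with labels in {+1, -1}; starts from w_init if given."""
    w = np.zeros(X.shape[1]) if w_init is None else np.array(w_init, dtype=float)
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            # Update only on a mistake (or when the example lies on the boundary).
            if y_i * np.dot(w, x_i) <= 0:
                w = w + y_i * x_i
    return w

def leaves_under(cls, children):
    """Return the set of leaf classes below cls (cls itself if it is a leaf)."""
    kids = children.get(cls, [])
    if not kids:
        return {cls}
    leaves = set()
    for kid in kids:
        leaves |= leaves_under(kid, children)
    return leaves

def train_top_down(X, leaf_labels, children, root, epochs=50):
    """Train one-vs-rest perceptrons top-down through the class hierarchy.

    Assumption: a child starts from its parent's weight vector (warm start).
    """
    weights = {}
    stack = [(cls, None) for cls in children[root]]
    while stack:
        cls, parent_w = stack.pop()
        members = leaves_under(cls, children)
        y = np.array([1.0 if label in members else -1.0 for label in leaf_labels])
        weights[cls] = train_perceptron(X, y, w_init=parent_w, epochs=epochs)
        stack.extend((kid, weights[cls]) for kid in children.get(cls, []))
    return weights

# Toy data: the first column is a constant bias feature.
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 2.0, -1.0],
              [1.0, -2.0, 1.0],
              [1.0, -2.0, -1.0]])
leaf_labels = ["A1", "A2", "B", "B"]
children = {"ROOT": ["A", "B"], "A": ["A1", "A2"]}

weights = train_top_down(X, leaf_labels, children, root="ROOT")
for cls in sorted(weights):
    print(cls, weights[cls])
```

Here the sparse subtypes `A1` and `A2` each have a single positive example, but they begin from the weights of their parent `A`, which was trained on both. Other derivations, such as regularizing the child toward the parent, would fit the same skeleton.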
- Modeling commonality among related classes in relation extraction