Chinese Open Relation Extraction and Knowledge Base Establishment

Authors:
Shengbin Jia, Tongji University, Shanghai, P.R. China
Shijia E, Tongji University, Shanghai, P.R. China
Maozhen Li, Brunel University, London, the United Kingdom
Yang Xiang, Tongji University, Shanghai, P.R. China

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 17, Issue 3, Article No. 15, pp. 1–22. https://doi.org/10.1145/3162077

Published: 14 February 2018

Abstract

Named entity relation extraction is an important subject in the field of information extraction. Although many English extractors achieve reasonable performance, an effective system for Chinese relation extraction has yet to be developed, owing to the lack of annotated Chinese corpora and the distinctive characteristics of the Chinese language. We summarize three kinds of phenomena that are unique to Chinese yet common in it. In this article, we investigate unsupervised, linguistics-based Chinese open relation extraction (ORE), which automatically discovers arbitrary relations without any manually labeled datasets, and we study the construction of a large-scale corpus. By mapping entity relations onto dependency trees and taking these Chinese linguistic characteristics into account, we propose a novel unsupervised Chinese ORE model based on Dependency Semantic Normal Forms (DSNFs). The model imposes no restrictions on the relative positions of entities and relation words, and it achieves a high yield by extracting relations mediated by verbs or nouns and by processing parallel clauses. Empirical results demonstrate the effectiveness of the method: it performs stably on four heterogeneous datasets and achieves better precision and recall than several existing Chinese ORE systems. Furthermore, by applying our method to web text, we build and publish a large-scale knowledge base of entities and relations, called COER, which helps address the shortage of Chinese corpora.
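To make the idea of mapping relations onto a dependency tree concrete, the following is a minimal sketch, not the DSNF model itself. It assumes a sentence that has already been parsed, uses LTP-style dependency labels (`SBV` for subject-verb, `VOB` for verb-object, `COO` for coordination) as an assumption, and works on a hand-written parse rather than the output of a real parser. The `Token` class and the `extract_triples` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    word: str    # surface form
    head: int    # index of the governing token, 0 for the root
    deprel: str  # dependency label (LTP-style labels are assumed here)


def children(tokens: List[Token], head_idx: int, deprel: str) -> List[Token]:
    """Return the dependents of head_idx that carry the given label."""
    return [t for t in tokens if t.head == head_idx and t.deprel == deprel]


def with_conjuncts(tokens: List[Token], token: Token) -> List[Token]:
    """Return the token plus everything coordinated with it via COO."""
    group = [token]
    for conj in children(tokens, token.idx, "COO"):
        group.extend(with_conjuncts(tokens, conj))
    return group


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Emit (subject, relation, object) for every verb-mediated pattern."""
    triples = []
    for verb in tokens:
        subjects = [s for t in children(tokens, verb.idx, "SBV")
                    for s in with_conjuncts(tokens, t)]
        objects = [o for t in children(tokens, verb.idx, "VOB")
                   for o in with_conjuncts(tokens, t)]
        for s in subjects:
            for o in objects:
                triples.append((s.word, verb.word, o.word))
    return triples


# Hand-written parse of "马云 和 蔡崇信 创办 了 阿里巴巴"
# (Jack Ma and Joseph Tsai founded Alibaba); illustrative only.
SENTENCE = [
    Token(1, "马云", 4, "SBV"),
    Token(2, "和", 3, "LAD"),
    Token(3, "蔡崇信", 1, "COO"),
    Token(4, "创办", 0, "HED"),
    Token(5, "了", 4, "RAD"),
    Token(6, "阿里巴巴", 4, "VOB"),
]

if __name__ == "__main__":
    for triple in extract_triples(SENTENCE):
        print(triple)
```

Because the pattern is matched on tree edges rather than on word order, the subject and object may appear anywhere in the sentence, and the coordinated subject yields a second triple. Noun-mediated relations and the other normal forms described in the article would need additional patterns that this sketch does not attempt.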

Supplemental Material

jia.zip (871.2 KB): supplemental movie, appendix, image, and software files for Chinese Open Relation Extraction and Knowledge Base Establishment


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. World Wide Web
    1. Web mining
      1. Data extraction and integration


Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 17, Issue 3
September 2018
196 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3184403
Editor: Nianwen Xue, Brandeis University, Waltham, USA
Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 14 February 2018
- Accepted: 1 November 2017
- Revised: 1 July 2017
- Received: 1 April 2017

Author Tags
Chinese entity relation extraction
knowledge base
dependency parsing
linguistics
open
unsupervised
Qualifiers
- research-article
- Research
- Refereed
