ABSTRACT
Domain-specific relation extraction requires training data for supervised learning models and thus significant labeling effort. Distant supervision is often leveraged to create large annotated corpora; however, these methods require handling the inherent noise. Active learning approaches, on the other hand, can reduce the annotation cost by selecting the most beneficial examples to label in order to learn a good model. Examples can be chosen sequentially, i.e., one example per iteration, or in batches, i.e., a set of examples per iteration. Optimizing the batch size is a practical problem faced in every real-world application of active learning; however, it is often treated as a parameter decided in advance. In this work, we study the trade-off between model performance, the number of requested labels in a batch, and the time spent in each round for real-time, domain-specific relation extraction. Our results show that an appropriate batch size produces competitive performance, even compared to a fully sequential strategy, while reducing the training time dramatically.
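To make the sequential-versus-batch distinction concrete, the following is a minimal Python sketch. It is not the authors' implementation. It assumes a binary relation classifier that outputs a positive-class probability per unlabeled example, and it uses entropy-based uncertainty sampling as one possible selection criterion. The names `uncertainty`, `select_batch`, and `retraining_rounds` are hypothetical helpers introduced only for illustration.

```python
import math


def uncertainty(p):
    """Binary entropy of a predicted positive-class probability p (assumed criterion)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_batch(probs, batch_size):
    """Return the indices of the batch_size most uncertain pool examples.

    batch_size == 1 corresponds to the fully sequential strategy.
    """
    ranked = sorted(range(len(probs)),
                    key=lambda i: uncertainty(probs[i]),
                    reverse=True)
    return ranked[:batch_size]


def retraining_rounds(label_budget, batch_size):
    """Number of select-label-retrain rounds needed to spend a labeling budget."""
    return math.ceil(label_budget / batch_size)


# Hypothetical model outputs for four unlabeled candidate sentences.
pool_probs = [0.9, 0.5, 0.1, 0.45]
sequential_pick = select_batch(pool_probs, batch_size=1)
batch_pick = select_batch(pool_probs, batch_size=2)
```

Under these assumptions, a budget of 100 labels costs 100 retraining rounds sequentially but only 5 rounds with a batch size of 20, which is the kind of time saving the trade-off above refers to.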
Exploring the Efficiency of Batch Active Learning for Human-in-the-Loop Relation Extraction