ABSTRACT
We introduce a novel method for grammatical error correction that draws on a number of small corpora. To make the best use of several corpora with different characteristics, we employ meta-learning with several base classifiers trained on different corpora. This research focuses on the correction of article errors. A series of experiments shows the effectiveness of the proposed approach on two different grammatical-error-tagged corpora.
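The idea of combining base classifiers trained on different corpora through a meta-learner can be illustrated with a minimal sketch. The abstract does not specify the features, the base classifiers, or the combination scheme, so everything below is an assumption made for illustration: the class names `BaseClassifier` and `MetaClassifier`, the single head-noun feature, the stacking-style lookup table, and the toy data are all hypothetical and are not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical label set for article choice; "" stands for "no article".
ARTICLES = ("a", "the", "")


class BaseClassifier:
    """Toy base classifier: predicts the most frequent article per head noun
    observed in ONE corpus (an illustrative stand-in for a real classifier)."""

    def __init__(self, corpus):
        counts = defaultdict(Counter)
        for noun, article in corpus:
            counts[noun][article] += 1
        self.table = {n: c.most_common(1)[0][0] for n, c in counts.items()}
        # Fallback for unseen nouns: the corpus-wide most frequent article.
        self.default = Counter(a for _, a in corpus).most_common(1)[0][0]

    def predict(self, noun):
        return self.table.get(noun, self.default)


class MetaClassifier:
    """Toy meta-learner (stacking-style): maps the tuple of base predictions
    to the gold label most often seen with that tuple on held-out data."""

    def __init__(self, bases, held_out):
        self.bases = bases
        counts = defaultdict(Counter)
        for noun, gold in held_out:
            key = tuple(b.predict(noun) for b in bases)
            counts[key][gold] += 1
        self.table = {k: c.most_common(1)[0][0] for k, c in counts.items()}

    def predict(self, noun):
        key = tuple(b.predict(noun) for b in self.bases)
        if key in self.table:
            return self.table[key]
        # Unseen combination of base outputs: fall back to a majority vote.
        return Counter(key).most_common(1)[0][0]


# Two small, made-up corpora of (head noun, correct article) pairs.
corpus_a = [("sun", "the"), ("sun", "the"), ("dog", "a"), ("water", "")]
corpus_b = [("sun", "a"), ("dog", "the"), ("dog", "the"), ("water", "")]
held_out = [("sun", "the"), ("dog", "a"), ("water", ""), ("sun", "the")]

bases = [BaseClassifier(corpus_a), BaseClassifier(corpus_b)]
meta = MetaClassifier(bases, held_out)

for noun in ("sun", "dog", "water", "moon"):
    print(noun, "->", repr(meta.predict(noun)))
```

In this toy setup the two base classifiers disagree on "sun" and "dog", and the meta-learner resolves the disagreement using held-out data rather than trusting either corpus outright. A real system would replace the lookup tables with trained classifiers over richer contextual features.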