Abstract
Some natural languages belong to the same family or share similar syntactic and/or semantic regularities. This property motivates researchers to share computational models across languages, using high-quality models to improve their low-performing counterparts. In this article, we follow a similar idea: we develop statistical and neural machine translation (MT) engines that are trained on one language pair but used to translate another. First, we train a reliable model for a high-resource language; then we exploit cross-lingual similarities to adapt the model to a closely related language with almost no resources. We chose Turkish (Tr) and Azeri, or Azerbaijani (Az), as the language pair for our experiments. Azeri suffers from a lack of resources, as almost no bilingual corpora exist for it. With our techniques, we are able to train an engine for the Az → English (En) direction that outperforms all other existing models.
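The abstract does not spell out how the adaptation is carried out, so the following is only an illustrative sketch of one way a vocabulary could be adapted between close languages, not the method used in the article. It assumes that each Azeri token absent from the Turkish vocabulary can be replaced by its nearest Turkish entry under a string-similarity score, so that a Tr → En engine can then process the rewritten sentence. The function name `adapt_vocabulary`, the `cutoff` value, and the toy word list are all hypothetical.

```python
from difflib import get_close_matches


def adapt_vocabulary(source_tokens, target_vocab, cutoff=0.6):
    """Map each source token to its closest entry in target_vocab.

    Tokens already in the vocabulary are kept as they are. Tokens with no
    entry scoring at least `cutoff` are also left unchanged, so unknown
    words pass through to the downstream engine untouched.
    """
    vocab = sorted(target_vocab)   # fixed order for reproducible results
    known = set(vocab)
    adapted = []
    for token in source_tokens:
        if token in known:
            adapted.append(token)
            continue
        # difflib scores candidates with SequenceMatcher.ratio(); this is a
        # stand-in similarity measure, not the one used in the article.
        matches = get_close_matches(token, vocab, n=1, cutoff=cutoff)
        adapted.append(matches[0] if matches else token)
    return adapted


if __name__ == "__main__":
    turkish_vocab = ["kitap", "gelir", "okuyor"]   # toy Turkish vocabulary
    azeri_tokens = ["kitab", "gəlir"]              # toy Azeri input
    print(adapt_vocabulary(azeri_tokens, turkish_vocab))
```

A surface-similarity lookup of this kind only covers cognates that are spelled alike; words that differ substantially between the two languages would need a different source of correspondence, which this sketch does not attempt.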
Index Terms
- Translating Low-Resource Languages by Vocabulary Adaptation from Close Counterparts