Translating Low-Resource Languages by Vocabulary Adaptation from Close Counterparts

Published: 08 September 2017

Abstract

Some natural languages belong to the same family or share similar syntactic and/or semantic regularities. This property motivates researchers to share computational models across languages and to use high-quality models to boost low-performance counterparts. In this article, we follow a similar idea: we develop statistical and neural machine translation (MT) engines that are trained on one language pair but used to translate another. First we train a reliable model for a high-resource language, and then we exploit cross-lingual similarities to adapt the model to a closely related language with almost zero resources. We chose Turkish (Tr) and Azeri, or Azerbaijani (Az), as the language pair in our experiments. Azeri suffers from a lack of resources, as there is almost no bilingual corpus for this language. Using our techniques, we are able to train an engine for the Az → English (En) direction that outperforms all other existing models.

      • Published in

        ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 16, Issue 4
        December 2017
        146 pages
        ISSN: 2375-4699
        EISSN: 2375-4702
        DOI: 10.1145/3097269

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 8 September 2017
        • Accepted: 1 May 2017
        • Revised: 1 March 2017
        • Received: 1 November 2016

        Qualifiers

        • note
        • Research
        • Refereed
