Distortion Model Based on Word Sequence Labeling for Statistical Machine Translation

Published: 01 February 2014

Abstract

This article proposes a new distortion model for phrase-based statistical machine translation. During decoding, a distortion model estimates the source word position to be translated next (subsequent position; SP) given the last translated source word position (current position; CP). We propose a distortion model that can simultaneously consider the word at the CP, the word at an SP candidate, the context of the CP and of an SP candidate, the relative word order among the SP candidates, and the words between the CP and an SP candidate. We call these elements rich context. Our model considers rich context by discriminating label sequences that specify spans from the CP to each SP candidate. This enables the model to learn from the training data both the effect of relative word order among SP candidates and the effect of distance. In contrast to the learning strategy of existing methods, our model learns preference relations among the SP candidates in each sentence of the training data. This learning strategy enables all of the rich context to be considered simultaneously. In our experiments, our model achieved higher BLEU and RIBES scores than the lexical reordering models for Japanese-English, Chinese-English, and German-English translation.
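To make the idea of "label sequences that specify spans from the CP to each SP candidate" concrete, the following is a minimal sketch rather than the authors' implementation. The label names `C` (the CP), `I` (a word between the CP and the SP candidate), and `S` (the SP candidate), the 0-based positions, and the helper names `span_labels` and `candidate_sequences` are all assumptions made for illustration. Scoring the sequences and learning preferences among candidates are not shown.

```python
from typing import Dict, List, Set, Tuple


def span_labels(cp: int, sp: int) -> List[Tuple[int, str]]:
    """Label every source position on the span from the CP to one SP candidate.

    Assumed labels (illustrative only): "C" marks the CP, "S" marks the SP
    candidate, and "I" marks each word lying between them.
    """
    if cp == sp:
        raise ValueError("an SP candidate must differ from the CP")
    step = 1 if sp > cp else -1  # walk rightward or leftward from the CP
    positions = list(range(cp, sp + step, step))
    labels = ["C"] + ["I"] * (len(positions) - 2) + ["S"]
    return list(zip(positions, labels))


def candidate_sequences(
    n_words: int, cp: int, covered: Set[int]
) -> Dict[int, List[Tuple[int, str]]]:
    """Build one label sequence per SP candidate.

    A candidate is any source position that is not the CP and has not been
    translated yet (`covered` holds the already translated positions).
    """
    return {
        sp: span_labels(cp, sp)
        for sp in range(n_words)
        if sp != cp and sp not in covered
    }


if __name__ == "__main__":
    # Five source words, CP at position 2, positions 2 and 3 already translated.
    for sp, seq in candidate_sequences(5, cp=2, covered={2, 3}).items():
        print(sp, seq)
```

Because each candidate yields its own sequence over the same sentence, a sequence model scoring these label sequences can see the words between the CP and the candidate, and candidates in the same sentence can be compared with one another.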

    • Published in

      ACM Transactions on Asian Language Information Processing  Volume 13, Issue 1
      February 2014
      93 pages
      ISSN:1530-0226
      EISSN:1558-3430
      DOI:10.1145/2590408

      Copyright © 2014 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 February 2014
      • Accepted: 1 October 2013
      • Revised: 1 September 2013
      • Received: 1 May 2013

      Qualifiers

      • research-article
      • Research
      • Refereed
