research-article

Free Access

Statistical machine translation with local language models

Author:
Christof Monz

University of Amsterdam, Amsterdam, The Netherlands

University of Amsterdam, Amsterdam, The Netherlands
View Profile

Authors Info & Claims

EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language ProcessingJuly 2011Pages 869–879

Published:27 July 2011Publication History

EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing

Pages 869–879

ABSTRACT

Part-of-speech language modeling is commonly used as a component in statistical machine translation systems, but there is mixed evidence that its usage leads to significant improvements. We argue that its limited effectiveness is due to the lack of lexicalization. We introduce a new approach that builds a separate local language model for each word and part-of-speech pair. The resulting models lead to more context-sensitive probability distributions and we also exploit the fact that different local models are used to estimate the language model probability of each word during decoding. Our approach is evaluated for Arabic- and Chinese-to-English translation. We show that it leads to statistically significant improvements for multiple test sets and also across different genres, when compared against a competitive baseline and a system using a part-of-speech model.

References

Yaser Al-Onaizan and Kishore Papineni. 2006. Distortion models for statistical machine translation. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pages 529--536. Google ScholarDigital Library
Jeff A. Bilmes and Katrin Kirchhoff. 2003. Factored language models and generalized parallel backoff. In Proceedings of the the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 4--6. Google ScholarDigital Library
Alexandra Birch, Miles Osborne, and Philipp Koehn. 2007. CCG supertags in factored statistical machine translation. In Proceedings of the Second Workshop on Statistical Machine Translation, pages 9--16. Google ScholarDigital Library
Hélène Bonneau-Maynard, Alexandre Allauzen, Daniel Déchelotte, and Holger Schwenk. 2007. Combining morphosyntactic enriched representation with n-best reranking in statistical translation. In Proceedings of the NAACL-HLT Workshop on Syntax and Structure in Statistical Translation, pages 65--71. Google ScholarDigital Library
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, and Jeffrey Dean. 2007. Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 858--867.Google Scholar
Eugene Charniak. 2000. A maximum-entropy-inspired parser. In Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference, pages 132--139. Google ScholarDigital Library
Stanley F. Chen and Joshua Goodman. 1999. An empirical study of smoothing techniques for language modeling. Computer Speech and Language, 13(4):359--393.Google ScholarDigital Library
David Chiang, Kevin Knight, and Wei Wang. 2009. 11,001 new features for statistical machine translation. In Proceedings of the North American Chapter of the Association for Computational Linguistics, pages 218--226. Google ScholarDigital Library
Michael Collins, Brian Roark, and Murat Saraclar. 2005. Discriminative syntactic language modeling for speech recognition. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 507--514. Google ScholarDigital Library
Saša Hasan, Oliver Bender, and Hermann Ney. 2006. Reranking translation hypotheses using structural properties. In Proceedings of the EACL Workshop on Learning Structured Information in Natural Language Applications, pages 41--48.Google Scholar
Peter Heeman. 1998. POS tagging versus classes in language modeling. In Proceedings of the Sixth Workshop on Very Large Corpora, pages 179--187.Google Scholar
Katrin Kirchhoff and Mei Yang. 2005. Improved language modeling for statistical machine translation. In Proceedings of the ACL Workshop on Building and Using Parallel Texts, pages 125--128. Google ScholarDigital Library
Philipp Koehn and Hieu Hoang. 2007. Factored translation models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 868--876.Google Scholar
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 48--54. Google ScholarDigital Library
Philipp Koehn, Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Miles Osborne, and David Talbot. 2005. Edinburgh system description for the 2005 IWSLT speech translation evaluation. In Proceedings of the International Workshop on Spoken Language Translation.Google Scholar
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, pages 177--180. Google ScholarDigital Library
Philipp Koehn, Abhishek Arun, and Hieu Hoang. 2008. Towards better machine translation quality for the german--english language pairs. In Proceedings of the Third Workshop on Statistical Machine Translation, pages 139--142. Google ScholarDigital Library
Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pages 388--395.Google Scholar
Roland Kuhn. 1988. Speech recognition and the frequency of recently used words: a modified Markov model for natural language. In Proceedings of the 12th conference on Computational Linguistics, pages 348--350. Google ScholarDigital Library
Zhifei Li and Sanjeev Khudanpur. 2008. Large-scale discriminative n-gram language models for statisti- cal machine translation. In Proceedings of AMTA, pages 133--142.Google Scholar
Christopher D. Manning. 1993. Automatic acquisition of a large subcategorization dictionary from corpora. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 235--242. Google ScholarDigital Library
Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a large annotated corpus of english: the penn treebank. Computational Linguistics, 19:313--330. Google ScholarDigital Library
Eric W. Noreen. 1989. Computer Intensive Methods for Testing Hypotheses. An Introduction. Wiley-Interscience.Google Scholar
Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Anoop Sarkar, Kenji Yamada, ALex Fraser, Shankar Kumar, Libin Shen, David Smith, Katherine Eng, Viren Jain, Zhen Jin, and Dragomir Radev. 2004. A smorgasbord of features for statistical machine translation. In Proceedings of the 2004 Meeting of the North American chapter of the Association for Computational Linguistics, pages 161--168.Google Scholar
Franz-Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL), pages 160--167. Google ScholarDigital Library
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2001. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL 2001), pages 311--318. Google ScholarDigital Library
Matt Post and Daniel Gildea. 2008. Parsers as language models for statistical machine translation. In Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, pages 172--181.Google Scholar
Stefan Riezler and John T. Maxwell. 2005. On some pitfalls in automatic evaluation and significance testing for MT. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 57--64.Google Scholar
Holger Schwenk. 2007. Continuous space language models. Computer Speech and Language, 21:492--518. Google ScholarDigital Library
Libin Shen, Jinxi Xu, Bing Zhang, Spyros Matsoukas, and Ralph Weischedel. 2009. Effective use of linguistic and contextual information for statistical machine translation. In Proceedings of Empirical Methods in Natural Language Processing, pages 72--80. Google ScholarDigital Library
Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of Association for Machine Translation in the Americas, pages 223--231.Google Scholar
Andreas Stolcke. 2002. SRILM---an extensible language modeling toolkit. In Proceedings of the International Conference on Spoken Language Processing, pages 901--904.Google Scholar
Christoph Tillmann. 2004. A unigram orientation model for statistical machine translation. In Proceedings of the Human Language Technology and North American Association for Computational Linguistics Conference (HLT/NAACL-04), pages 101--104. Google ScholarDigital Library
Wen Wang and Mary P. Harper. 2002. The Super-ARV language model: investigating the effectiveness of tightly integrating multiple knowledge sources. In Proceedings of Empirical Methods in Natural Language Processing, pages 238--247. Google ScholarDigital Library
Wen Wang, Andreas Stolcke, and Jing Zheng. 2007. Reranking machine translation hypotheses with structured and web-based language models. In IEEE Workshop on Automatic Speech Recognition & Understanding, pages 159--164.Google ScholarCross Ref
Jing Zheng, Necip Fazil Ayan, Wen Wang, Dimitra Vergyri, Nicolas Scheffer, and Andreas Stolcke. 2008. SRI systems in the NIST MT08 Evaluation. In Proceedings of the NIST 2008 Open MT Evaluation Workshop.Google Scholar

Statistical machine translation with local language models
1. Computing methodologies
  1. Artificial intelligence
2. Hardware
  1. Power and energy
    1. Power estimation and optimization

Recommendations

Factored bilingual n-gram language models for statistical machine translation

In this work, we present an extension of n-gram-based translation models based on factored language models (FLMs). Translation units employed in the n-gram-based approach to statistical machine translation (SMT) are based on mappings of sequences of raw ...
Read More
An incremental syntactic language model for statistical phrase-based machine translation
Read More
Preordering using a Target-Language Parser via Cross-Language Syntactic Projection for Statistical Machine Translation

When translating between languages with widely different word orders, word reordering can present a major challenge. Although some word reordering methods do not employ source-language syntactic structures, such structures are inherently useful for word ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing
July 2011
1647 pages
ISBN:9781937284114
General Chair:
Paola Merlo
University of Geneva
,
Program Chairs:
Regina Barzilay
Massachusetts Institute of Technology
,
Mark Johnson
Macquarie University
Sponsors
In-Cooperation
Publisher
Association for Computational Linguistics
United States
Publication History
- Published: 27 July 2011
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate73of234submissions,31%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 4
  Total Citations
  View Citations
- 162
  Total Downloads
- Downloads (Last 12 months)5
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Statistical machine translation with local language models

EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing

ABSTRACT

References

Cited By

Recommendations

Factored bilingual n-gram language models for statistical machine translation

An incremental syntactic language model for statistical phrase-based machine translation

Preordering using a Target-Language Parser via Cross-Language Syntactic Projection for Statistical Machine Translation

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Statistical machine translation with local language models

EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing

ABSTRACT

References

Cited By

Recommendations

Factored bilingual n-gram language models for statistical machine translation

An incremental syntactic language model for statistical phrase-based machine translation

Preordering using a Target-Language Parser via Cross-Language Syntactic Projection for Statistical Machine Translation

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media