DOI: 10.5555/1614108.1614114

Translation model pruning via usage statistics for statistical machine translation

Published: 22 April 2007

ABSTRACT

We describe a new pruning approach that removes phrase pairs from the translation models of statistical machine translation systems. The approach applies the original translation system to a large amount of text and calculates usage statistics for the phrase pairs. Using these statistics, the relevance of each phrase pair can be estimated. The approach is tested against a strong baseline based on previous work and shows significant improvements.
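
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how relevance is computed from the usage statistics, so this example assumes a raw usage count as the relevance score. The names `count_usage` and `prune`, and the input format (one list of used phrase pairs per translated sentence), are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# A phrase pair is (source phrase, target phrase).
PhrasePair = Tuple[str, str]


def count_usage(decoded: Iterable[List[PhrasePair]]) -> Counter:
    """Count how often each phrase pair was used by the decoder.

    `decoded` holds, for each translated sentence, the phrase pairs
    the decoder used (assumed input format).
    """
    usage: Counter = Counter()
    for pairs in decoded:
        usage.update(pairs)
    return usage


def prune(phrase_table: Set[PhrasePair], usage: Counter, keep: int) -> Set[PhrasePair]:
    """Keep the `keep` most frequently used phrase pairs.

    Assumption: relevance equals the raw usage count. Ties are broken
    alphabetically so the result is deterministic.
    """
    ranked = sorted(phrase_table, key=lambda pair: (-usage[pair], pair))
    return set(ranked[:keep])


if __name__ == "__main__":
    table = {("das haus", "the house"), ("ist", "is"), ("klein", "small")}
    decoded = [
        [("das haus", "the house"), ("ist", "is")],
        [("das haus", "the house")],
    ]
    usage = count_usage(decoded)
    print(prune(table, usage, keep=2))
```

In this toy run, the pair that the decoder never used is the one dropped from the table. A real system would likely combine such counts with other signals, which this sketch does not attempt.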

Published in: NAACL-Short '07: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers, April 2007, 228 pages

Publisher: Association for Computational Linguistics, United States

Qualifiers: research-article

Overall acceptance rate: 21 of 29 submissions (72%)
