DOI: 10.5555/1626394.1626419
research-article

TectoMT: highly modular MT system with tectogrammatics used as transfer layer

Published: 19 June 2008

ABSTRACT

We present a new English→Czech machine translation system combining linguistically motivated layers of language description (as defined in the Prague Dependency Treebank annotation scenario) with statistical NLP approaches.

Published in

StatMT '08: Proceedings of the Third Workshop on Statistical Machine Translation
June 2008
248 pages
ISBN: 9781932432091

Publisher: Association for Computational Linguistics, United States



Acceptance Rates

Overall acceptance rate: 24 of 59 submissions, 41%
