DOI: 10.3115/1220175.1220307

Learning to predict case markers in Japanese

Published: 17 July 2006

ABSTRACT

Japanese case markers, which indicate the grammatical relation of the complement NP to the predicate, often pose challenges to the generation of Japanese text, be it done by a foreign language learner, or by a machine translation (MT) system. In this paper, we describe the task of predicting Japanese case markers and propose machine learning methods for solving it in two settings: (i) monolingual, when given information only from the Japanese sentence; and (ii) bilingual, when also given information from a corresponding English source sentence in an MT context. We formulate the task after the well-studied task of English semantic role labelling, and explore features from a syntactic dependency structure of the sentence. For the monolingual task, we evaluated our models on the Kyoto Corpus and achieved over 84% accuracy in assigning correct case markers for each phrase. For the bilingual task, we achieved an accuracy of 92% per phrase using a bilingual dataset from a technical domain. We show that in both settings, features that exploit dependency information, whether derived from gold-standard annotations or automatically assigned, contribute significantly to the prediction of case markers.
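
To make the task formulation concrete, the following is a minimal sketch of per-phrase case marker prediction from dependency-style features. It is not the authors' model: the abstract does not specify the learner, so a simple count-based backoff classifier is swapped in here. The `Phrase` fields, the feature templates, the romanized toy data, and the class name `BackoffCaseMarkerModel` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    """A complement phrase with its case marker removed (hypothetical schema)."""
    head_word: str   # content head of the phrase, romanized for readability
    head_pos: str    # coarse part of speech of the head
    predicate: str   # head word of the phrase this one depends on


def extract_features(p: Phrase) -> list:
    """Feature templates, ordered from most to least specific."""
    return [
        ("word+pred", p.head_word, p.predicate),
        ("pos+pred", p.head_pos, p.predicate),
        ("pred", p.predicate),
        ("pos", p.head_pos),
    ]


class BackoffCaseMarkerModel:
    """Predicts the marker seen most often with the most specific known feature."""

    def __init__(self) -> None:
        self.counts = defaultdict(Counter)  # feature -> marker counts
        self.prior = Counter()              # overall marker counts

    def fit(self, examples):
        for phrase, marker in examples:
            self.prior[marker] += 1
            for feature in extract_features(phrase):
                self.counts[feature][marker] += 1
        return self

    def predict(self, phrase: Phrase) -> str:
        for feature in extract_features(phrase):
            if feature in self.counts:
                return self.counts[feature].most_common(1)[0][0]
        # No feature was seen in training: fall back to the most frequent marker.
        return self.prior.most_common(1)[0][0]


# Toy training data (invented for this sketch, not taken from the Kyoto Corpus).
train = [
    (Phrase("hon", "NOUN", "yomu"), "wo"),
    (Phrase("gakusei", "NOUN", "yomu"), "ga"),
    (Phrase("hon", "NOUN", "kau"), "wo"),
    (Phrase("gakkou", "NOUN", "iku"), "ni"),
    (Phrase("tomodachi", "NOUN", "iku"), "to"),
    (Phrase("toshokan", "NOUN", "iku"), "ni"),
]

model = BackoffCaseMarkerModel().fit(train)
seen = model.predict(Phrase("hon", "NOUN", "yomu"))    # exact word+predicate match
unseen = model.predict(Phrase("eki", "NOUN", "iku"))   # backs off to pos+predicate
```

In this sketch, per-phrase accuracy would be the fraction of held-out phrases whose predicted marker matches the reference. A bilingual variant could add features drawn from the aligned English source sentence to `extract_features`; that extension is likewise only an assumption about how such information might be encoded.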

Published in

ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
July 2006
1214 pages

Publisher: Association for Computational Linguistics, United States

Overall acceptance rate: 85 of 443 submissions, 19%