Using Short Dependency Relations from Auto-Parsed Data for Chinese Dependency Parsing

Published: 01 August 2009

Abstract

Dependency parsing has attracted a surge of interest lately for applications such as machine translation and question answering. Currently, several supervised learning methods can be used for training high-performance dependency parsers if sufficient labeled data are available.

However, currently used statistical dependency parsers perform poorly on dependencies between words separated by long distances. To address this problem, this article presents an effective dependency parsing approach that incorporates short dependency information from unlabeled data. The unlabeled data is automatically parsed by a deterministic dependency parser, which exhibits relatively high performance for short dependencies between words. We then train another parser that uses the information on short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52%, an absolute 1.24% improvement over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves parsing performance for words separated by longer distances.
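
As a rough illustration of the two quantities the abstract mentions, the following minimal sketch counts short dependency relations in auto-parsed sentences and computes an unlabeled attachment score. It is not the authors' implementation: the data layout (1-based head indices, 0 for the root), the function names, and the `max_dist` threshold are all assumptions made for this example, and the abstract does not say how the extracted relations are encoded as features for the second parser.

```python
from collections import Counter
from typing import List, Tuple

# Assumed layout: a sentence is a list of (word, head_index) pairs,
# with 1-based head indices and 0 marking the root.
Sentence = List[Tuple[str, int]]


def short_dependency_counts(parsed: List[Sentence], max_dist: int = 2) -> Counter:
    """Count (dependent, head) word pairs whose linear distance is <= max_dist.

    max_dist is a hypothetical threshold; the abstract does not state
    what distance counts as "short".
    """
    counts: Counter = Counter()
    for sent in parsed:
        for i, (word, head) in enumerate(sent, start=1):
            if head == 0:
                continue  # the root has no head word
            if abs(head - i) <= max_dist:
                counts[(word, sent[head - 1][0])] += 1
    return counts


def unlabeled_attachment_score(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Percentage of tokens whose predicted head index matches the gold head."""
    correct = 0
    total = 0
    for gold_heads, pred_heads in zip(gold, pred):
        for g, p in zip(gold_heads, pred_heads):
            correct += int(g == p)
            total += 1
    return 100.0 * correct / total if total else 0.0


# Toy auto-parsed data (invented for illustration only).
auto_parsed = [
    [("我", 2), ("喜欢", 0), ("苹果", 2)],
    [("a", 4), ("b", 3), ("c", 4), ("d", 0)],
]
short_counts = short_dependency_counts(auto_parsed, max_dist=2)
uas = unlabeled_attachment_score([[2, 0, 2]], [[2, 0, 1]])
```

In a setup like the one described, counts of this kind could be turned into additional features when training the second parser; how that is done is left open here because the abstract does not specify it.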

    • Published in

      ACM Transactions on Asian Language Information Processing, Volume 8, Issue 3
      August 2009
      81 pages
      ISSN:1530-0226
      EISSN:1558-3430
      DOI:10.1145/1568292

      Copyright © 2009 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 August 2009
      • Accepted: 1 January 2009
      • Revised: 1 December 2008
      • Received: 1 April 2008

      Qualifiers

      • research-article
      • Research
      • Refereed
