Abstract
Parsing algorithms that process the input from left to right and construct a single derivation have often been considered inadequate for natural language parsing because of the massive ambiguity typically found in natural language grammars. Nevertheless, it has been shown that such algorithms, combined with treebank-induced classifiers, can be used to build highly accurate disambiguating parsers, in particular for dependency-based syntactic representations. In this article, we first present a general framework for describing and analyzing algorithms for deterministic incremental dependency parsing, formalized as transition systems. We then describe and analyze two families of such algorithms: stack-based and list-based algorithms. In the former family, which is restricted to projective dependency structures, we describe an arc-eager and an arc-standard variant; in the latter family, we present a projective and a non-projective variant. For each of the four algorithms, we give proofs of correctness and complexity. In addition, we perform an experimental evaluation of all algorithms in combination with SVM classifiers for predicting the next parsing action, using data from thirteen languages. We show that all four algorithms give competitive accuracy, although the non-projective list-based algorithm generally outperforms the projective algorithms for languages with a non-negligible proportion of non-projective constructions. However, the projective algorithms often produce comparable results when combined with the technique known as pseudo-projective parsing. The linear time complexity of the stack-based algorithms gives them an advantage with respect to efficiency both in learning and in parsing, but the projective list-based algorithm turns out to be equally efficient in practice. Moreover, when the projective algorithms are used to implement pseudo-projective parsing, they sometimes become less efficient in parsing (but not in learning) than the non-projective list-based algorithm. Although most of the algorithms have been partially described in the literature before, this is the first comprehensive analysis and evaluation of the algorithms within a unified framework.
- Abney, Steven and Mark Johnson. 1991. Memory requirements and local ambiguities of parsing strategies. Journal of Psycholinguistic Research, 20:233-250.Google Scholar
- Aho, Alfred V., Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA. Google Scholar
- Attardi, Giuseppe. 2006. Experiments with a multilanguage nonprojective dependency parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pages 166-170, New York. Google Scholar
- Böhmová, Alena, Jan Haji¿, Eva Haji¿ová, and Barbora Hladká. 2003. The Prague Dependency Treebank: A three-level annotation scenario. In Anne Abeillé, editor, Treebanks: Building and Using Parsed Corpora. Kluwer Academic Publishers, Dordrecht, pages 103-127.Google Scholar
- Buchholz, Sabine and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning, pages 149-164, New York. Google Scholar
- Chang, Chih-Chung and Chih-Jen Lin, 2001. LIBSVM: A Library for Support Vector Machines. Software available at http://www.csie.ntu.edu.tw/ ~cjlin/libsvm.Google Scholar
- Cheng, Yuchang, Masayuki Asahara, and Yuji Matsumoto. 2005. Machine learning-based dependency analyzer for Chinese. In Proceedings of International Conference on Chinese Computing (ICCC), pages 66-73, Singapore.Google Scholar
- Cormen, Thomas H., Charles E. Leiserson, and Ronald L. Rivest. 1990. Introduction to Algorithms. MIT Press, Cambridge, MA. Google Scholar
- Covington, Michael A. 2001. A fundamental algorithm for dependency parsing. In Proceedings of the 39th Annual ACM Southeast Conference, pages 95-102, Athens, GA.Google Scholar
- Eisner, Jason M. 1996. Three new probabilistic models for dependency parsing: An exploration. In Proceedings of the 16th International Conference on Computational Linguistics (COLING), pages 340-345, Copenhagen. Google Scholar
- Haji¿, Jan, Barbora Vidova Hladka, Jarmila Panevová, Eva Haji¿ová, Petr Sgall, and Petr Pajas. 2001. Prague Dependency Treebank 1.0. LDC, 2001T10.Google Scholar
- Hall, J., J. Nilsson, J. Nivre, G. Eryi¿it, B. Megyesi, M. Nilsson, and M. Saers. 2007. Single malt or blended? A study in multilingual parser optimization. In Proceedings of the CoNLL shared task of EMNLP-CoNLL 2007, pages 933-939, Prague.Google Scholar
- Hall, Johan, Joakim Nivre, and Jens Nilsson. 2006. Discriminative classifiers for deterministic dependency parsing. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 316-323, Sydney. Google Scholar
- Hudson, Richard A. 1990. English Word Grammar. Blackwell, Oxford.Google Scholar
- Johansson, Richard and Pierre Nugues. 2006. Investigating multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pages 206-210, New York. Google Scholar
- Kalt, Tom. 2004. Induction of greedy controllers for deterministic treebank parsers. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 17-24, Barcelona.Google Scholar
- Kudo, Taku and Yuji Matsumoto. 2002. Japanese dependency analysis using cascaded chunking. In Proceedings of the Sixth Workshop on Computational Language Learning (CoNLL), pages 63-69, Taipei. Google Scholar
- Marcus, Mitchell P. 1980. A Theory of Syntactic Recognition for Natural Language. MIT Press, Cambridge, MA. Google Scholar
- Marcus, Mitchell P., Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19:313-330. Google Scholar
- Marcus, Mitchell P., Beatrice Santorini, Mary Ann Marcinkiewicz, Robert MacIntyre, Ann Bies, Mark Ferguson, Karen Katz, and Britta Schasberger. 1994. The Penn Treebank: Annotating predicate-argument structure. In Proceedings of the ARPA Human Language Technology Workshop, pages 114-119, Plainsboro, NJ. Google Scholar
- Marinov, S. 2007. Covington variations. In Proceedings of the CoNLL Shared Task of EMNLP-CoNLL 2007, pages 1144-1148, Prague.Google Scholar
- McDonald, Ryan, Koby Crammer, and Fernando Pereira. 2005. Online large-margin training of dependency parsers. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 91-98, Ann Arbor, MI. Google Scholar
- McDonald, Ryan, Kevin Lerman, and Fernando Pereira. 2006. Multilingual dependency analysis with a two-stage discriminative parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), pages 216-220. Google Scholar
- McDonald, Ryan and Joakim Nivre. 2007. Characterizing the errors of data-driven dependency parsing models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 122-131, Prague.Google Scholar
- McDonald, Ryan and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 81-88, Trento.Google Scholar
- McDonald, Ryan, Fernando Pereira, Kiril Ribarov, and Jan Haji¿. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 523-530, Vancouver. Google Scholar
- Mel'¿uk, Igor. 1988. Dependency Syntax: Theory and Practice. State University of NewYork Press, New York.Google Scholar
- Nilsson, Jens, Joakim Nivre, and Johan Hall. 2007. Generalizing tree transformations for inductive dependency parsing. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pages 968-975, Prague.Google Scholar
- Nivre, Joakim. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 149-160, Nancy.Google Scholar
- Nivre, Joakim. 2004. Incrementality in deterministic dependency parsing. In Proceedings of the Workshop on Incremental Parsing: Bringing Engineering and Cognition Together (ACL), pages 50-57, Barcelona. Google Scholar
- Nivre, Joakim. 2006a. Constraints on non-projective dependency graphs. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 73-80, Trento. Google Scholar
- Nivre, Joakim. 2006b. Inductive Dependency Parsing. Springer, Dordrecht. Google Scholar
- Nivre, Joakim. 2007. Incremental non-projective dependency parsing. In Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT), pages 396-403, Rochester, NY. Google Scholar
- Nivre, Joakim, Johan Hall, and Jens Nilsson. 2004. Memory-based dependency parsing. In Proceedings of the 8th Conference on Computational Natural Language Learning (CoNLL), pages 49-56, Boston, MA.Google Scholar
- Nivre, Joakim, Johan Hall, Jens Nilsson, Atanas Chanev, Gülsen Eryi¿it, Sandra Kübler, Svetoslav Marinov, and Erwin Marsi. 2007. MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13:95-135.Google Scholar
- Nivre, Joakim, Johan Hall, Jens Nilsson, Gülsen Eryi¿it, and Svetoslav Marinov. 2006. Labeled pseudoprojective dependency parsing with support vector machines. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), pages 221-225, New York, NY. Google Scholar
- Nivre, Joakim and Jens Nilsson. 2005. Pseudo-projective dependency parsing. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 99-106, Ann Arbor, MI. Google Scholar
- Nivre, Joakim and Mario Scholz. 2004. Deterministic dependency parsing of English text. In Proceedings of the 20th International Conference on Computational Linguistics (COLING), pages 64-70, Geneva. Google Scholar
- Ratnaparkhi, Adwait. 1997. A linear observed time statistical parser based on maximum entropy models. In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1-10, Providence, RI.Google Scholar
- Ratnaparkhi, Adwait. 1999. Learning to parse natural language with maximum entropy models. Machine Learning, 34:151-175. Google Scholar
- Sagae, Kenji and Alon Lavie. 2005. A classifier-based parser with linear run-time complexity. In Proceedings of the 9th International Workshop on Parsing Technologies (IWPT), pages 125-132, Vancouver. Google Scholar
- Sagae, Kenji and Alon Lavie. 2006. A best-first probabilistic shift-reduce parser. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 691-698, Sydney. Google Scholar
- Sgall, Petr, Eva Haji¿ová, and Jarmila Panevová. 1986. The Meaning of the Sentence in Its Pragmatic Aspects. Reidel, Dordrecht.Google Scholar
- Shieber, Stuart M. 1983. Sentence disambiguation by a shift-reduce parsing technique. In Proceedings of the 21st Conference on Association for Computational Linguistics (ACL), pages 113-118, Cambridge, MA. Google Scholar
- Shieber, Stuart M., Yves Schabes, and Fernando C. N. Pereira. 1995. Principles and implementation of deductive parsing. Journal of Logic Programming, 24:3-36.Google Scholar
- Tesnière, Lucien. 1959. Éléments de syntaxe structurale. Editions Klincksieck, Paris.Google Scholar
- Yamada, Hiroyasu and Yuji Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 195-206, Nancy.Google Scholar
Index Terms
- Algorithms for deterministic incremental dependency parsing
Recommendations
Deterministic dependency parsing of English text
COLING '04: Proceedings of the 20th international conference on Computational LinguisticsThis paper presents a deterministic dependency parser based on memory-based learning, which parses English text in linear time. When trained and evaluated on the Wall Street Journal section of the Penn Treebank, the parser achieves a maximum attachment ...
Deterministic parsing of ambiguous grammars
Methods of describing the syntax of programming languages in ways that are more flexible and natural than conventional BNF descriptions are considered. These methods involve the use of ambiguous context-free grammars together with rules to resolve ...
Comments