ABSTRACT
We present a classifier-based parser that produces constituent trees in linear time. The parser uses a basic bottom-up shift-reduce algorithm, but employs a classifier to determine parser actions instead of a grammar. This can be seen as an extension of the deterministic dependency parser of Nivre and Scholz (2004) to full constituent parsing. We show that, with an appropriate feature set used in classification, a very simple one-path greedy parser can perform at the same level of accuracy as more complex parsers. We evaluate our parser on section 23 of the WSJ section of the Penn Treebank, and obtain precision and recall of 87.54% and 87.61%, respectively.
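The control loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Node`, `parse`, and `toy_classifier` are hypothetical, and the hand-written rules in `toy_classifier` merely stand in for the trained classifier and feature set that the paper relies on. The point it shows is that each step consults a classifier (not a grammar) to pick one shift or reduce action, and the parser commits to it greedily without backtracking.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass
class Node:
    """A tree node: a POS-tagged word (leaf) or a labeled constituent."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: str = ""

    def bracket(self) -> str:
        """Render the subtree in Penn Treebank style bracketing."""
        if not self.children:
            return f"({self.label} {self.word})"
        inner = " ".join(child.bracket() for child in self.children)
        return f"({self.label} {inner})"


# An action is ("shift", 0, "") or ("reduce", arity, label).
Action = Tuple[str, int, str]
Classifier = Callable[[Sequence[Node], Sequence[Node]], Action]


def parse(tagged: Sequence[Tuple[str, str]], choose_action: Classifier) -> Node:
    """Greedy bottom-up shift-reduce parsing driven by a classifier."""
    if not tagged:
        raise ValueError("cannot parse an empty sentence")
    queue = [Node(tag, word=word) for word, tag in tagged]
    stack: List[Node] = []
    pos = 0  # index of the next unread item in the queue
    while pos < len(queue) or len(stack) > 1:
        kind, arity, label = choose_action(stack, queue[pos:])
        if kind == "shift":
            if pos == len(queue):
                raise ValueError("shift requested on empty queue")
            stack.append(queue[pos])
            pos += 1
        elif kind == "reduce":
            if not 1 <= arity <= len(stack):
                raise ValueError("invalid reduce arity")
            children = stack[-arity:]
            del stack[-arity:]
            stack.append(Node(label, children))
        else:
            raise ValueError(f"unknown action: {kind}")
    return stack[0]


def toy_classifier(stack: Sequence[Node], queue: Sequence[Node]) -> Action:
    """Hypothetical stand-in for a trained classifier (hand-written rules)."""
    top_two = [node.label for node in stack[-2:]]
    if top_two == ["NP", "VP"]:
        return ("reduce", 2, "S")
    if top_two == ["DT", "NN"]:
        return ("reduce", 2, "NP")
    if stack and stack[-1].label == "VBZ" and not queue:
        return ("reduce", 1, "VP")
    return ("shift", 0, "")


tree = parse([("the", "DT"), ("dog", "NN"), ("barks", "VBZ")], toy_classifier)
print(tree.bracket())  # (S (NP (DT the) (NN dog)) (VP (VBZ barks)))
```

Each word is shifted exactly once and each reduce adds one tree node, so the number of actions grows linearly with sentence length as long as chains of unary reductions are bounded. This sketch does not enforce such a bound, so a misbehaving classifier could loop on unary reduces.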
REFERENCES
- Charniak, E. 2000. A maximum-entropy-inspired parser. Proceedings of the First Annual Meeting of the North American Chapter of the Association for Computational Linguistics. Seattle, WA.
- Collins, M. 1997. Three generative, lexicalized models for statistical parsing. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (pp. 16--23). Madrid, Spain.
- Daelemans, W., Zavrel, J., van der Sloot, K., and van den Bosch, A. 2004. TiMBL: Tilburg Memory Based Learner, version 5.1, reference guide. ILK Research Group Technical Report Series no. 04--02, 2004.
- Gildea, D., and Palmer, M. 2002. The necessity of syntactic parsing for predicate argument recognition. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 239--246). Philadelphia, PA.
- Kalt, T. 2004. Induction of greedy controllers for deterministic treebank parsers. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. Barcelona, Spain.
- Kudo, T., and Matsumoto, Y. 2004. A boosting algorithm for classification of semi-structured text. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. Barcelona, Spain.
- Kudo, T., and Matsumoto, Y. 2001. Chunking with support vector machines. Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics. Pittsburgh, PA.
- Johnson, M. 1998. PCFG models of linguistic tree representations. Computational Linguistics, 24:613--632.
- Marcus, M. P., Santorini, B., and Marcinkiewicz, M. A. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19.
- Nivre, J., and Scholz, M. 2004. Deterministic dependency parsing of English text. Proceedings of the 20th International Conference on Computational Linguistics (pp. 64--70). Geneva, Switzerland.
- Ratnaparkhi, A. 1997. A linear observed time statistical parser based on maximum entropy models. Proceedings of the Second Conference on Empirical Methods in Natural Language Processing. Providence, Rhode Island.
- Veenstra, J., and van den Bosch, A. 2000. Single-classifier memory-based phrase chunking. Proceedings of the Fourth Workshop on Computational Natural Language Learning (CoNLL 2000). Lisbon, Portugal.
- Wong, A., and Wu, D. 1999. Learning a lightweight robust deterministic parser. Proceedings of the Sixth European Conference on Speech Communication and Technology. Budapest.
- Yamada, H., and Matsumoto, Y. 2003. Statistical dependency analysis with support vector machines. Proceedings of the Eighth International Workshop on Parsing Technologies. Nancy, France.
- Zhang, T., Damerau, F., and Johnson, D. 2002. Text chunking using regularized winnow. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics. Toulouse, France.
A classifier-based parser with linear run-time complexity