DOI: 10.3115/977035.977059

Representing text chunks

Published: 8 June 1999

ABSTRACT

Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we examine seven different data representations for the problem of recognizing noun phrase chunks. We show that the choice of data representation has only a minor influence on chunking performance. However, equipped with the most suitable data representation, our memory-based learning chunker was able to improve on the best published chunking results for a standard data set.
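
To make "converting chunking to a tagging task" concrete, the following is a minimal sketch of two tag encodings from the inside/outside/begin family. The abstract does not list the seven representations it compares, so the names IOB1 and IOB2, the function names, and the example sentence are all assumptions made for illustration, not the paper's own code or data.

```python
from typing import List, Optional, Tuple

# A chunk is a (start, end) pair of token indices, end exclusive.
Span = Tuple[int, int]


def encode_iob2(n_tokens: int, chunks: List[Span]) -> List[str]:
    """B on every chunk-initial token, I on the rest of a chunk, O outside."""
    tags = ["O"] * n_tokens
    for start, end in chunks:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def encode_iob1(n_tokens: int, chunks: List[Span]) -> List[str]:
    """I inside chunks; B only where a chunk directly follows another chunk."""
    tags = ["O"] * n_tokens
    prev_end: Optional[int] = None
    for start, end in sorted(chunks):
        for i in range(start, end):
            tags[i] = "I"
        if prev_end == start:  # adjacent chunks need an explicit boundary
            tags[start] = "B"
        prev_end = end
    return tags


def decode_iob(tags: List[str]) -> List[Span]:
    """Recover chunk spans from either an IOB1 or an IOB2 tag sequence."""
    chunks: List[Span] = []
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        # B and O both close any chunk that is currently open.
        if tag in ("B", "O") and start is not None:
            chunks.append((start, i))
            start = None
        # B always opens a chunk; I opens one only if none is open.
        if tag == "B" or (tag == "I" and start is None):
            start = i
    if start is not None:
        chunks.append((start, len(tags)))
    return chunks


# Hypothetical example sentence with three noun phrase chunks:
# [He] gave [the student] [a book] .
tokens = ["He", "gave", "the", "student", "a", "book", "."]
np_chunks = [(0, 1), (2, 4), (4, 6)]

print(encode_iob2(len(tokens), np_chunks))  # ['B', 'O', 'B', 'I', 'B', 'I', 'O']
print(encode_iob1(len(tokens), np_chunks))  # ['I', 'O', 'I', 'I', 'B', 'I', 'O']
```

Both encodings describe the same chunks and decode to the same spans; they differ only in which tokens carry the boundary information, which is the kind of choice a comparison of data representations can vary while the underlying chunking task stays fixed.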

References

  1. Steven Abney. 1991. Parsing by chunks. In Principle-Based Parsing. Kluwer Academic Publishers.
  2. Shlomo Argamon, Ido Dagan, and Yuval Krymolowski. 1998. A memory-based approach to learning shallow natural language patterns. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL '98).
  3. Claire Cardie and David Pierce. 1998. Error-driven pruning of treebank grammars for base noun phrase identification. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL '98).
  4. Walter Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den Bosch. 1998. TiMBL: Tilburg Memory Based Learner - version 1.0 - Reference Guide. ILK, Tilburg University, The Netherlands. http://ilk.kub.nl/~ilk/papers/ilk9803.ps.gz.
  5. Walter Daelemans, Antal van den Bosch, and Jakub Zavrel. 1999. Forgetting exceptions is harmful in language learning. Machine Learning, 11.
  6. Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. In Proceedings of the Third ACL Workshop on Very Large Corpora.
  7. Adwait Ratnaparkhi. 1998. Maximum Entropy Models for Natural Language Ambiguity Resolution. PhD thesis, Computer and Information Science, University of Pennsylvania.
  8. Jorn Veenstra. 1998. Fast NP chunking using memory-based learning techniques. In BENELEARN-98: Proceedings of the Eighth Belgian-Dutch Conference on Machine Learning. ATO-DLO, Wageningen, report 352.
Published in

EACL '99: Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
June 1999, 310 pages

Publisher: Association for Computational Linguistics, United States

Overall acceptance rate: 100 of 360 submissions (28%)