DOI: 10.3115/974499.974526

A simple rule-based part of speech tagger

Published: 31 March 1992

ABSTRACT

Automatic part of speech tagging is an area of natural language processing where statistical techniques have been more successful than rule-based methods. In this paper, we present a simple rule-based part of speech tagger which automatically acquires its rules and tags with accuracy comparable to stochastic taggers. The rule-based tagger has many advantages over these taggers, including: a vast reduction in stored information required, the perspicuity of a small set of meaningful rules, ease of finding and implementing improvements to the tagger, and better portability from one tag set, corpus genre or language to another. Perhaps the biggest contribution of this work is in demonstrating that the stochastic method is not the only viable method for part of speech tagging. The fact that a simple rule-based tagger that automatically learns its rules can perform so well should offer encouragement for researchers to further explore rule-based tagging, searching for a better and more expressive set of rule templates and other variations on the simple but effective theme described below.
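The abstract does not spell out how the rules are acquired, so the following is only a minimal sketch of one way a tagger could learn rules automatically from a tagged corpus. It assumes an initial tagging by each word's most frequent tag and a single hypothetical rule template, "change tag A to tag B when the previous tag is Z", with rules picked greedily by how many errors they fix. The function names, the template, the `NN` default for unknown words, and the toy corpus are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter, defaultdict

def most_frequent_tags(tagged_sents):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def apply_rule(tags, rule):
    """Apply one rule (from_tag, to_tag, prev_tag) to a list of tags.

    Conditions are checked against the tags as they were before the rule ran.
    """
    from_tag, to_tag, prev_tag = rule
    return [
        to_tag if i > 0 and t == from_tag and tags[i - 1] == prev_tag else t
        for i, t in enumerate(tags)
    ]

def learn_rules(tagged_sents, max_rules=10):
    """Greedily learn rules that reduce errors of the initial tagging."""
    lexicon = most_frequent_tags(tagged_sents)
    gold = [[tag for _, tag in sent] for sent in tagged_sents]
    current = [[lexicon[word] for word, _ in sent] for sent in tagged_sents]
    rules = []
    for _ in range(max_rules):
        # Propose one candidate rule per observed error (assumed template).
        candidates = set()
        for cur, ref in zip(current, gold):
            for i in range(1, len(cur)):
                if cur[i] != ref[i]:
                    candidates.add((cur[i], ref[i], cur[i - 1]))
        # Keep the candidate with the largest net gain in correct tags.
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = 0
            for cur, ref in zip(current, gold):
                new = apply_rule(cur, rule)
                gain += sum((n == r) - (c == r) for n, c, r in zip(new, cur, ref))
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:
            break
        rules.append(best_rule)
        current = [apply_rule(cur, best_rule) for cur in current]
    return lexicon, rules

def tag(words, lexicon, rules, default="NN"):
    """Tag words with their most frequent tag, then apply the rules in order."""
    tags = [lexicon.get(word, default) for word in words]
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags

# Toy corpus (invented for illustration): "can" is a modal or a noun.
TOY_CORPUS = [
    [("the", "DT"), ("can", "NN"), ("rusted", "VBD")],
    [("the", "DT"), ("can", "NN"), ("fell", "VBD")],
    [("we", "PRP"), ("can", "MD"), ("run", "VB")],
    [("they", "PRP"), ("can", "MD"), ("go", "VB")],
    [("you", "PRP"), ("can", "MD"), ("see", "VB")],
]

lexicon, rules = learn_rules(TOY_CORPUS)
print(rules)
print(tag(["the", "can", "fell"], lexicon, rules))
```

On this toy corpus the learner keeps a single rule, "change MD to NN after DT", which is the kind of small, readable rule set the abstract contrasts with the stored tables of a stochastic tagger. A fuller version would use more templates and real training data.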

Published in

ANLC '92: Proceedings of the third conference on Applied natural language processing
March 1992, 273 pages

Publisher: Association for Computational Linguistics, United States
