DOI: 10.3115/992133.992154

Automatic acquisition of hyponyms from large text corpora

Published: 23 August 1992

ABSTRACT

We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidance of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identify a set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest. We describe a method for discovering these patterns and suggest that other lexical relations will also be acquirable in this way. A subset of the acquisition algorithm is implemented and the results are used to augment and critique the structure of a large hand-built thesaurus. Extensions and applications to areas such as information retrieval are suggested.
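
The abstract does not list the lexico-syntactic patterns themselves, so the following is only an illustrative sketch of the general idea, assuming a single "X such as A, B, and C" construction. The names `WORD`, `SUCH_AS`, and `extract_hyponyms` are hypothetical, and treating every noun phrase as a single word is a simplification made for this example; a realistic system would rely on a part-of-speech tagger or noun-phrase chunker.

```python
import re

# Assumption: each noun phrase is a single alphabetic word that is not
# itself the conjunction "and" or "or". Real noun phrases need a chunker.
WORD = r"\b(?!(?:and|or)\b)[A-Za-z]+"

# Illustrative pattern: HYPERNYM such as HYPONYM {, HYPONYM}* {(and|or) HYPONYM}
SUCH_AS = re.compile(
    rf"(?P<hypernym>{WORD}),? such as "
    rf"(?P<hyponyms>{WORD}(?:, {WORD})*(?:,? (?:and|or) {WORD})?)"
)


def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs suggested by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hypernym").lower()
        # Split the matched list on commas and on "and"/"or".
        for hyponym in re.split(r",? (?:and|or) |, ", match.group("hyponyms")):
            pairs.append((hyponym.lower(), hypernym))
    return pairs


if __name__ == "__main__":
    text = "The garden attracts insects such as bees, wasps, and beetles."
    for hyponym, hypernym in extract_hyponyms(text):
        print(f"hyponym({hyponym}, {hypernym})")
```

A surface pattern like this needs no pre-encoded knowledge, which matches goal (i) in the abstract, but it will also produce spurious pairs whenever the single-word assumption fails.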

Published in: COLING '92: Proceedings of the 14th conference on Computational linguistics - Volume 2, August 1992, 433 pages
Publisher: Association for Computational Linguistics, United States
