DOI: 10.3115/1072228.1072235

The computation of word associations: comparing syntagmatic and paradigmatic approaches

Published: 24 August 2002

ABSTRACT

It is shown that basic language processes such as the production of free word associations and the generation of synonyms can be simulated using statistical models that analyze the distribution of words in large text corpora. According to the law of association by contiguity, the acquisition of word associations can be explained by Hebbian learning. The free word associations as produced by subjects on presentation of single stimulus words can thus be predicted by applying first-order statistics to the frequencies of word co-occurrences as observed in texts. The generation of synonyms can also be conducted on co-occurrence data but requires second-order statistics. The reason is that synonyms rarely occur together but appear in similar lexical neighborhoods. Both approaches are systematically compared and are validated on empirical data. It turns out that for both tasks the performance of the statistical system is comparable to the performance of human subjects.
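
The contrast between the two kinds of statistics can be illustrated with a minimal sketch. It is not the system described in the paper: the toy corpus, the window size, the use of raw co-occurrence counts as the first-order score, and cosine similarity as the second-order measure are all assumptions chosen for brevity, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus; the paper works with large text corpora.
CORPUS = [
    "doctor treats patient",
    "physician treats patient",
    "doctor visits hospital",
    "physician visits hospital",
    "bread with butter",
    "butter on bread",
]
WINDOW = 2  # assumed window size, in tokens on either side


def cooccurrence_counts(sentences, window=WINDOW):
    """Count how often each word appears within `window` tokens of another."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return dict(counts)


def syntagmatic_associates(word, counts, top=3):
    """First-order: rank words by their co-occurrence count with `word`."""
    row = counts.get(word, Counter())
    return sorted(row, key=lambda w: (-row[w], w))[:top]


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def paradigmatic_neighbours(word, counts, top=3):
    """Second-order: rank words by the similarity of their co-occurrence vectors."""
    row = counts.get(word, Counter())
    scores = {other: cosine(row, vec) for other, vec in counts.items() if other != word}
    return sorted(scores, key=lambda w: (-scores[w], w))[:top]


if __name__ == "__main__":
    counts = cooccurrence_counts(CORPUS)
    print(syntagmatic_associates("bread", counts, top=1))    # ['butter']
    print(counts["doctor"]["physician"])                     # 0: never co-occur
    print(paradigmatic_neighbours("doctor", counts, top=1))  # ['physician']
```

In this toy data "doctor" and "physician" never appear in the same sentence, so a first-order count cannot relate them, yet their co-occurrence vectors are identical and the second-order comparison ranks them as nearest neighbours. A realistic setup would likely replace the raw counts with a statistical association measure and filter function words.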

Published in: COLING '02: Proceedings of the 19th international conference on Computational linguistics - Volume 1, August 2002, 1184 pages
Publisher: Association for Computational Linguistics, United States

Overall acceptance rate: 1,537 of 1,537 submissions (100%)
