DOI: 10.5555/1870658.1870731
research-article

Word sense induction & disambiguation using hierarchical random graphs

Published: 9 October 2010

ABSTRACT

Graph-based methods have gained attention in many areas of Natural Language Processing (NLP), including Word Sense Disambiguation (WSD), text summarization, and keyword extraction. Most work in these areas formulates the problem in a graph-based setting and applies unsupervised graph clustering to obtain a set of clusters. Recent studies suggest that graphs often exhibit a hierarchical structure that goes beyond simple flat clustering. This paper presents an unsupervised method for inferring the hierarchical grouping of the senses of a polysemous word. The inferred hierarchical structures are applied to the problem of word sense disambiguation, where we show that our method performs significantly better than traditional graph-based methods and agglomerative clustering, yielding improvements over state-of-the-art WSD systems based on sense induction.

Published in

EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
October 2010
1332 pages

Publisher

Association for Computational Linguistics, United States