ABSTRACT
This article reports the results of a systematic, in-depth study of the criteria used for word sense disambiguation. The study is based on 60 target words: 20 nouns, 20 adjectives and 20 verbs. Our results do not always agree with some practices in the field. For example, we show that omitting non-content words decreases performance and that bigrams yield better results than unigrams.
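To make the two criteria named in the abstract concrete, the following minimal Python sketch extracts unigram and bigram context features around a target word, with and without function words. It is an illustration only, not the authors' implementation: the function `context_ngrams`, the window size, the tiny `FUNCTION_WORDS` list and the example sentence are all assumptions introduced here.

```python
from typing import List, Set

# Hypothetical, deliberately tiny function-word list (assumption for this sketch).
FUNCTION_WORDS: Set[str] = {"the", "a", "an", "of", "to", "in", "on", "at", "and"}


def context_ngrams(tokens: List[str], target_index: int, n: int,
                   window: int = 3, keep_function_words: bool = True) -> List[str]:
    """Return the n-grams in a +/- `window` token span around the target word.

    The span includes the target word itself. When `keep_function_words`
    is False, function words are dropped before the n-grams are built.
    """
    start = max(0, target_index - window)
    end = min(len(tokens), target_index + window + 1)
    span = [t.lower() for t in tokens[start:end]]
    if not keep_function_words:
        span = [t for t in span if t not in FUNCTION_WORDS]
    return [" ".join(span[i:i + n]) for i in range(len(span) - n + 1)]


tokens = "He sat on the bank of the river".split()
target = tokens.index("bank")

unigrams = context_ngrams(tokens, target, n=1)
bigrams = context_ngrams(tokens, target, n=2)
content_only = context_ngrams(tokens, target, n=1, keep_function_words=False)

print(unigrams)
print(bigrams)
print(content_only)
```

In this toy sentence, filtering function words removes cues such as "on the" and "bank of", which is the kind of local information the abstract suggests is useful to keep. A real study would feed such features to a classifier and compare accuracy across settings.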
- Word sense disambiguation criteria: a systematic study