ABSTRACT
It is well known that there are polysemous words like sentence whose "meaning" or "sense" depends on the context of use. We have recently reported on two new word-sense disambiguation systems, one trained on bilingual material (the Canadian Hansards) and the other trained on monolingual material (Roget's Thesaurus and Grolier's Encyclopedia). As this work was nearing completion, we observed a very strong discourse effect: if a polysemous word such as sentence appears two or more times in a well-written discourse, it is extremely likely that all of its occurrences will share the same sense. This paper describes an experiment that confirmed this hypothesis and found that the tendency to share a sense within the same discourse is extremely strong (98%). This result can be used as an additional source of constraint for improving the performance of word-sense disambiguation algorithms. It could also be used to help evaluate disambiguation algorithms that do not make use of the discourse constraint.
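As an illustration of how the discourse constraint might be applied, the following is a minimal sketch, not the authors' implementation. It assumes some disambiguation system has already assigned a sense to each occurrence of a word, and it relabels every occurrence in a discourse with the majority sense for that word in that discourse. The function name `apply_discourse_constraint`, the triple format, and the sense labels are hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def apply_discourse_constraint(predictions):
    """Relabel each occurrence with the majority sense of its discourse.

    predictions: list of (discourse_id, word, sense) triples, one per
    occurrence, as produced by some per-occurrence disambiguator
    (hypothetical input format).
    Returns a new list of triples in the same order.
    """
    # Count how often each sense was predicted for a word in a discourse.
    votes = defaultdict(Counter)
    for discourse_id, word, sense in predictions:
        votes[(discourse_id, word)][sense] += 1

    # Replace each prediction with the most frequent sense in its group.
    # Ties go to the sense that was seen first in the discourse.
    return [
        (discourse_id, word, votes[(discourse_id, word)].most_common(1)[0][0])
        for discourse_id, word, _ in predictions
    ]


if __name__ == "__main__":
    example = [
        ("doc1", "sentence", "judicial"),
        ("doc1", "sentence", "judicial"),
        ("doc1", "sentence", "grammatical"),  # outlier within doc1
        ("doc2", "sentence", "grammatical"),
    ]
    for triple in apply_discourse_constraint(example):
        print(triple)
```

Majority voting is only one way to use the constraint. A system that outputs per-sense scores could instead combine those scores across all occurrences in the discourse before choosing a sense.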