ABSTRACT
We describe the University of Heidelberg (UHD) system for the Cross-Lingual Word Sense Disambiguation SemEval-2010 task (CL-WSD). The system performs CL-WSD by applying graph algorithms previously developed for monolingual Word Sense Disambiguation to multilingual co-occurrence graphs. UHD has participated in the Best and out-of-five (OOF) evaluations and ranked among the most competitive systems for this task, thus indicating that graph-based approaches represent a powerful alternative for this task.
- }}Eneko Agirre, David Martínez, Oier López de Lacalle, and Aitor Soroa. 2006. Two graph-based algorithms for state-of-the-art WSD. In Proc. of EMNLP-06, pages 585--593. Google ScholarDigital Library
- }}Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1--7): 107--117. Google ScholarDigital Library
- }}Christiane Fellbaum, editor. 1998. WordNet: An Electronic Database. MIT Press, Cambridge, MA.Google Scholar
- }}Philipp Koehn. 2005. Europarl: A parallel corpus for statistical machine translation. In Proceedings of Machine Translation Summit X.Google Scholar
- }}Els Lefever and Veronique Hoste. 2010. SemEval-2010 Task 3: Cross-lingual Word Sense Disambiguation. In Proc. of SemEval-2010. Google ScholarDigital Library
- }}Mausam, Stephen Soderland, Oren Etzioni, Daniel Weld, Michael Skinner, and Jeff Bilmes. 2009. Compiling a massive, multilingual dictionary via probabilistic inference. In Proc. of ACL-IJCNLP-09, pages 262--270. Google ScholarDigital Library
- }}Roberto Navigli and Simone Paolo Ponzetto. 2010. BabelNet: Building a very large multilingual semantic network. In Proc. of ACL-10. Google ScholarDigital Library
- }}Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1): 19--51. Google ScholarDigital Library
- }}Helmut Schmid. 1994. Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing (NeMLaP '94), pages 44--49.Google Scholar
- }}Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaž Erjavec, Dan Tufiş, and Dániel Varga. 2006. The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages. In Proc. of LREC '06.Google Scholar
- }}Jean Véronis. 2004. Hyperlex: lexical cartography for information retrieval. Computer Speech & Language, 18(3):223--252.Google Scholar
- }}Piek Vossen, editor. 1998. EuroWordNet: A Multilingual Database with Lexical Semantic Networks. Kluwer, Dordrecht, The Netherlands. Google ScholarDigital Library
- }}Andrea Zielinski, Christian Simon, and Tilman Wittl. 2009. Morphisto: Service-oriented open source morphology for German. In State of the Art in Computational Morphology, volume 41 of Communications in Computer and Information Science, pages 64--75. Springer.Google Scholar
Index Terms
- UHD: Cross-lingual word sense disambiguation using multilingual co-occurrence graphs
Recommendations
Cross-lingual word sense disambiguation for languages with scarce resources
Canadian AI'11: Proceedings of the 24th Canadian conference on Advances in artificial intelligenceWord Sense Disambiguation has long been a central problem in computational linguistics. Word Sense Disambiguation is the ability to identify the meaning of words in context in a computational manner. Statistical and supervised approaches require a large ...
Choosing the best dictionary for Cross-Lingual Word Sense Disambiguation
Selection of the best dictionary for Cross-Lingual Word Sense Disambiguation tasks.Potential improvements offered by automatically built dictionaries in ideal systems.Performance of different dictionaries on a particular unsupervised CLWSD ...
Unsupervised Word-Sense Disambiguation Using Bilingual Comparable Corpora
An unsupervised method for word-sense disambiguation using bilingual comparable corpora was developed. First, it extracts word associations, i.e., statistically significant pairs of associated words, from the corpus of each language. Then, it aligns ...
Comments