ABSTRACT
Several kinds of information can be used to align parallel texts at the word level: co-occurrence frequencies, position differences, part of speech, graphic resemblance, etc. This paper proposes a simple method for combining these clues efficiently. The association score is computed from the probability of pairing two units under the null hypothesis, that is, assuming that the association is fortuitous. The approach has been applied to a literary English-French parallel text with good results.
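The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the paper's actual method. It assumes that the co-occurrence clue is tested with a hypergeometric tail probability, that each clue yields a probability of the observation arising by chance, and that clues are independent so their negative log-probabilities can be summed. All function names, parameters, and numbers are hypothetical.

```python
from math import comb, log

def cooccurrence_pvalue(n_src, n_tgt, n_both, n_total):
    """Probability, under the null hypothesis of independence, that a source
    word found in n_src aligned segments and a target word found in n_tgt
    segments co-occur in at least n_both of the n_total segments.
    (Assumption: hypergeometric model; the paper may use another test.)"""
    upper = min(n_src, n_tgt)
    total = comb(n_total, n_tgt)
    tail = sum(
        comb(n_src, k) * comb(n_total - n_src, n_tgt - k)
        for k in range(n_both, upper + 1)
    )
    return tail / total

def association_score(pvalues):
    """Combine per-clue chance probabilities into one score.
    (Assumption: clues are independent, so the joint chance probability is
    their product; the score is its negative logarithm.)"""
    return -sum(log(p) for p in pvalues)

# Hypothetical counts: two words that each occur in 5 of 100 segment pairs
# and co-occur in 4 of them.
p_cooc = cooccurrence_pvalue(n_src=5, n_tgt=5, n_both=4, n_total=100)
# Hypothetical chance probability for a second clue (e.g. position difference).
p_position = 0.2
score = association_score([p_cooc, p_position])
```

Under these assumptions, a less probable chance pairing gives a higher score, and each additional clue can only leave the score unchanged or raise it, since every probability is at most 1.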