ABSTRACT
This paper reports on an exploratory investigation into whether classes of Urdu N-V complex predicates can be identified on the basis of the syntactic patterns and lexical choices associated with them. Working with data from a POS-annotated corpus, we show that the number of arguments, the case marking on subjects, and the choice of which light verbs are felicitous with which nouns all depend heavily on the semantics of the noun in the N-V complex predicate. This initial work represents an important step towards identifying the semantic criteria relevant for complex predicate formation. Identifying these criteria and being able to code them systematically in turn represents a first step towards building a lexical resource for nouns as part of developing natural language processing tools for the under-resourced South Asian language Urdu.
Index Terms
- Discovering semantic classes for Urdu N-V complex predicates