ABSTRACT
Schwa deletion is an important problem in grapheme-to-phoneme conversion for Indo-Aryan languages (IAL). In this paper, we describe a syllable-minimization-based algorithm for schwa deletion that outperforms existing methods in both efficiency and accuracy. The algorithm is motivated by the fact that schwa deletion is a diachronic and sociolinguistic phenomenon that facilitates faster communication through syllable economy. The contribution of the paper is not just a better algorithm for schwa deletion; rather, we describe a constrained-optimization-based framework that can partly model the evolution of languages and hence can be used to solve many problems in computational linguistics that call for diachronic explanations.
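The idea of syllable minimization under constraints can be illustrated with a small sketch. This is not the paper's algorithm, whose constraints and search procedure are not given in the abstract. It is a brute-force toy that assumes one phoneme per character, counts syllables as the number of vowels, and uses three made-up phonotactic constraints (no word-initial cluster, no word-final cluster, no run of three consonants). All names (`SCHWA`, `VOWELS`, `syllable_count`, `satisfies_constraints`, `candidates`, `minimize_syllables`) are hypothetical.

```python
from itertools import combinations

SCHWA = "ə"
VOWELS = set("əaiueo")  # assumption: toy vowel inventory, one phoneme per character


def syllable_count(phonemes):
    """Approximate the syllable count by the number of vowel nuclei."""
    return sum(1 for p in phonemes if p in VOWELS)


def satisfies_constraints(phonemes):
    """Toy phonotactic constraints (assumptions, not taken from the paper)."""
    is_cons = [p not in VOWELS for p in phonemes]
    if not phonemes or all(is_cons):
        return False  # a word needs at least one vowel
    if len(phonemes) >= 2 and (all(is_cons[:2]) or all(is_cons[-2:])):
        return False  # no word-initial or word-final consonant cluster
    # no run of three consecutive consonants anywhere in the word
    return not any(all(is_cons[i:i + 3]) for i in range(len(is_cons) - 2))


def candidates(phonemes):
    """Yield every form obtainable by deleting some subset of the schwas."""
    positions = [i for i, p in enumerate(phonemes) if p == SCHWA]
    for k in range(len(positions) + 1):
        for drop in combinations(positions, k):
            dropped = set(drop)
            yield tuple(p for i, p in enumerate(phonemes) if i not in dropped)


def minimize_syllables(phonemes):
    """Return all constraint-satisfying candidates with the fewest syllables."""
    valid = [c for c in candidates(phonemes) if satisfies_constraints(c)]
    if not valid:
        return [tuple(phonemes)]  # nothing admissible: keep the input unchanged
    best = min(syllable_count(c) for c in valid)
    return sorted({c for c in valid if syllable_count(c) == best})


if __name__ == "__main__":
    # Illustrative input: a three-consonant word with an inherent schwa after each.
    for form in minimize_syllables(list("kəmələ")):
        print("".join(form))
```

With these toy constraints the search can return more than one minimal form for the same input, which suggests that a practical system would need additional, language-specific constraints to pick a single pronunciation. The exhaustive search is also exponential in the number of schwas, so it serves only to make the optimization framing concrete.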
- A diachronic approach for schwa deletion in Indo Aryan languages