ABSTRACT
Automatic speech recognition (ASR) systems are increasingly being developed for under-resourced languages, especially for use in multilingual spoken dialogue systems. We investigate different approaches to the acoustic modelling of Sepedi affricates for ASR. We find that several of these complex consonants can be modelled as sequences of much simpler sounds. This approach reduces the Sepedi phoneme inventory from 45 to 32, which simplifies dictionary development and transcription and yields more accurate acoustic models.
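The idea of modelling an affricate as a sequence of simpler sounds can be illustrated with a small pronunciation-dictionary rewrite. The following is a minimal sketch, not the authors' implementation: the phone symbols, the `AFFRICATE_SPLITS` mapping, and the toy lexicon entries are all hypothetical placeholders and are not taken from the Sepedi phoneme set used in the paper.

```python
from typing import Dict, List, Set

# Hypothetical mapping from an affricate symbol to a sequence of simpler
# phones. These symbols are illustrative only, not the paper's inventory.
AFFRICATE_SPLITS: Dict[str, List[str]] = {
    "ts": ["t", "s"],
    "tS": ["t", "S"],
    "dZ": ["d", "Z"],
}


def split_affricates(pron: List[str],
                     splits: Dict[str, List[str]] = AFFRICATE_SPLITS) -> List[str]:
    """Replace each affricate in a pronunciation by its component phones."""
    out: List[str] = []
    for phone in pron:
        # Phones without an entry in the mapping are kept unchanged.
        out.extend(splits.get(phone, [phone]))
    return out


def rewrite_lexicon(lexicon: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Apply the affricate split to every entry of a pronunciation lexicon."""
    return {word: split_affricates(pron) for word, pron in lexicon.items()}


def inventory(lexicon: Dict[str, List[str]]) -> Set[str]:
    """Collect the set of distinct phones used anywhere in the lexicon."""
    return {phone for pron in lexicon.values() for phone in pron}


# Toy lexicon with placeholder words (hypothetical, for illustration only).
lexicon = {
    "word_a": ["ts", "a"],
    "word_b": ["tS", "a", "s"],
    "word_c": ["t", "a", "S", "a"],
}

simplified = rewrite_lexicon(lexicon)
print(len(inventory(lexicon)), "->", len(inventory(simplified)))
```

In this toy case the inventory shrinks because the component phones already occur elsewhere in the lexicon; an acoustic model trained on the rewritten dictionary would then need fewer distinct phone models, which is the kind of reduction the abstract reports.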
Index Terms
- Acoustic modelling of Sepedi affricates for ASR