This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. To model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (by a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus.
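As a rough illustration of how approach (1) differs from approach (3), the sketch below builds a pronunciation for a single word under each strategy. Everything in it is an assumption made for illustration: the function names, the one-symbol-per-letter "rules", the `en:` prefix marking foreign phonemes, and the mapping table are hypothetical placeholders, not the rule sets or phoneme mappings evaluated in the study.

```python
from typing import Callable, Dict, List


def sepedi_g2p(word: str) -> List[str]:
    """Toy stand-in for Sepedi letter-to-sound rules: one symbol per letter."""
    return [ch for ch in word.lower() if ch.isalpha()]


def english_g2p(word: str) -> List[str]:
    """Toy stand-in for English letter-to-sound rules.

    Symbols carry an 'en:' prefix to mark them as foreign phonemes.
    """
    return ["en:" + ch for ch in word.lower() if ch.isalpha()]


# Hypothetical foreign-to-Sepedi phoneme mapping (placeholder symbols only).
FOREIGN_TO_SEPEDI: Dict[str, str] = {"en:c": "k", "en:a": "a", "en:t": "t"}

G2P: Dict[str, Callable[[str], List[str]]] = {
    "sepedi": sepedi_g2p,
    "english": english_g2p,
}


def map_phonemes(phones: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each foreign phoneme with its mapped target phoneme."""
    missing = [p for p in phones if p not in mapping]
    if missing:
        raise KeyError(f"no mapping for phonemes: {missing}")
    return [mapping[p] for p in phones]


def pronounce(word: str, language: str, strategy: str) -> List[str]:
    """Predict a pronunciation under one of two dictionary strategies."""
    if strategy == "language_specific":
        # Approach (1): rules of the word's own language, then map to Sepedi.
        phones = G2P[language](word)
        if language != "sepedi":
            phones = map_phonemes(phones, FOREIGN_TO_SEPEDI)
        return phones
    if strategy == "sepedi_only":
        # Approach (3): Sepedi rules for every word, whatever its language.
        return G2P["sepedi"](word)
    raise ValueError(f"unknown strategy: {strategy}")
```

Calling `pronounce("cat", "english", "language_specific")` routes the word through the English rules and the mapping table, whereas `pronounce("cat", "english", "sepedi_only")` ignores the language label entirely. Approach (2) would keep the same structure and only swap the hand-written mapping table for one derived from data.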
Reference:
Modipa, T and Davel, MH. 2010. Pronunciation modelling of foreign words for Sepedi ASR. 21st Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, 22-23 November 2010, pp 185-189. http://hdl.handle.net/10204/4715