Modipa, T
Davel, MH
2010-12-23
2010-12-23
2010-11

Modipa, T and Davel, MH. 2010. Pronunciation modelling of foreign words for Sepedi ASR. 21st Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, 22-23 November 2010, pp 185-189
978-0-7992-2470-2
http://hdl.handle.net/10204/4715

This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. In order to model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (with a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus.

en
Sepedi
Automatic speech recognition
Pronunciation modelling
Pattern recognition
PRASA 2010
Pronunciation modelling of foreign words for Sepedi ASR
Conference Presentation

Modipa, T., & Davel, M. (2010). Pronunciation modelling of foreign words for Sepedi ASR. PRASA 2010. http://hdl.handle.net/10204/4715

Modipa, T, and MH Davel. "Pronunciation modelling of foreign words for Sepedi ASR." (2010): http://hdl.handle.net/10204/4715

Modipa T, Davel M, Pronunciation modelling of foreign words for Sepedi ASR; PRASA 2010; 2010. http://hdl.handle.net/10204/4715

TY - Conference Presentation
AU - Modipa, T
AU - Davel, MH
AB - This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. In order to model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (with a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus.
DA - 2010-11
DB - ResearchSpace
DP - CSIR
KW - Sepedi
KW - Automatic speech recognition
KW - Pronunciation modelling
KW - Pattern recognition
KW - PRASA 2010
LK - https://researchspace.csir.co.za
PY - 2010
SM - 978-0-7992-2470-2
T1 - Pronunciation modelling of foreign words for Sepedi ASR
TI - Pronunciation modelling of foreign words for Sepedi ASR
UR - http://hdl.handle.net/10204/4715
ER -