Pronunciation modelling and bootstrapping

Author: Davel, MH
Date: 2005-08 (repository record dated 2012-02-23)
Type: Report
Language: en
Citation: Davel, MH. Pronunciation modelling and bootstrapping. 2005. Submitted in partial fulfilment of the requirements for the degree Philosophiae Doctor (Electronic Engineering), Faculty of Engineering, the Built Environment and Information Technology, University of Pretoria
Full text: http://www.meraka.org.za/pubs/davel05pronunciation.pdf
Handle: http://hdl.handle.net/10204/5595
Source: ResearchSpace (CSIR), https://researchspace.csir.co.za

Abstract

Bootstrapping techniques have the potential to accelerate the development of language technology resources. This is of specific importance in the developing world, where language technology resources are scarce and linguistic diversity is high. In this thesis we analyse the pronunciation modelling task within a bootstrapping framework, as a case study in the bootstrapping of language technology resources.

We analyse the grapheme-to-phoneme conversion task in search of a conversion algorithm that can be utilised during bootstrapping. We experiment with enhancements to the Dynamically Expanding Context algorithm and develop a new algorithm for grapheme-to-phoneme rule extraction (Default&Refine) that utilises the concept of a ‘default phoneme’ to create a cascade of increasingly specialised rules. This algorithm displays a number of attractive properties, including rapid learning, language independence, good asymptotic accuracy, robustness to noise, and the production of a compact rule set. In order to have greater flexibility with regard to the various heuristic choices made during rewrite rule extraction, we define a new theoretical framework for analysing instance-based learning of rewrite rule sets. We define the concept of minimal representation graphs, and discuss the utility of these graphs in obtaining the smallest possible rule set describing a given set of discrete training data.

We develop an approach for the interactive creation of pronunciation models via bootstrapping, and implement this approach in a system that integrates several of the analysed grapheme-to-phoneme alignment and conversion algorithms. The focus of this work is on combining machine learning and human intervention in such a way as to minimise the amount of human effort required during bootstrapping, and a generic framework for the analysis of this process is defined. Practical tools that support the bootstrapping process are developed, and the efficiency of the process is analysed from both a machine learning and a human factors perspective. We find that even linguistically untrained users can use the system to create electronic pronunciation dictionaries accurately, in a fraction of the time the traditional approach requires. We create new dictionaries in a number of languages (isiZulu, Afrikaans and Sepedi) and demonstrate the utility of these dictionaries by incorporating them in speech technology systems.

Keywords: Bootstrapping; Grapheme-to-phoneme conversion; Grapheme-to-phoneme alignment; Letter-to-sound; Pronunciation modelling; Pronunciation prediction; Pronunciation rules; Pronunciation dictionary; Language technology resource development

Cite as:
Davel, M. (2005). Pronunciation modelling and bootstrapping. Retrieved from http://hdl.handle.net/10204/5595
Davel, MH. Pronunciation modelling and bootstrapping. 2005. http://hdl.handle.net/10204/5595
Davel M. Pronunciation modelling and bootstrapping. 2005 [cited yyyy month dd]. Available from: http://hdl.handle.net/10204/5595
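
Illustrative sketch (not from the thesis). The abstract describes Default&Refine only as using a ‘default phoneme’ to create a cascade of increasingly specialised rules. The Python below is a toy illustration of that general idea under simplifying assumptions: one grapheme at a time, pre-aligned training data, symmetric context windows, and a majority vote per context. The function names (`contexts`, `extract_rules`, `predict`), the `#` word-boundary padding, and the five-word training set are all hypothetical and are not taken from the thesis or its tools.

```python
from collections import Counter


def contexts(word, i, width):
    """Return the (left, right) context of `width` characters around word[i].

    Word boundaries are padded with '#'. With width 0 both contexts are empty.
    """
    padded = "#" * width + word + "#" * width
    j = i + width
    return padded[j - width:j], padded[j + 1:j + 1 + width]


def predict(rules, word, i):
    """Apply the most specific matching rule (rules are ordered general -> specific)."""
    for left, right, phoneme in reversed(rules):
        if contexts(word, i, len(left)) == (left, right):
            return phoneme
    return None


def extract_rules(instances, max_width=2):
    """Build a rule cascade for one grapheme.

    instances: list of (word, index_of_grapheme, phoneme), assumed pre-aligned.
    The first rule is the context-free default (the most frequent phoneme).
    Each pass widens the context and adds rules only for contexts that the
    current cascade still gets wrong.
    """
    default = Counter(ph for _, _, ph in instances).most_common(1)[0][0]
    rules = [("", "", default)]
    for width in range(1, max_width + 1):
        by_context = {}
        for word, i, ph in instances:
            by_context.setdefault(contexts(word, i, width), []).append((word, i, ph))
        new_rules = []
        for (left, right), group in by_context.items():
            if any(predict(rules, w, i) != ph for w, i, ph in group):
                majority = Counter(ph for _, _, ph in group).most_common(1)[0][0]
                new_rules.append((left, right, majority))
        rules.extend(new_rules)
    return rules


# Hypothetical toy data: the grapheme "c" at the start of five English words.
TRAINING = [
    ("cat", 0, "k"),
    ("cot", 0, "k"),
    ("cup", 0, "k"),
    ("city", 0, "s"),
    ("cell", 0, "s"),
]

RULES = extract_rules(TRAINING)
```

On this toy data the default rule maps "c" to /k/, and refinement rules are added only for the contexts the default gets wrong ("c" before "i" or "e"), so the rule set stays small. A call such as `predict(RULES, "cent", 0)` falls through to one of those refinements, while `predict(RULES, "coat", 0)` is handled by the default.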