The authors focus on factors related to the underlying rule-extraction algorithms and demonstrate variants of the Dynamically Expanding Context algorithm that are beneficial for bootstrapping pronunciation dictionaries. They show that continuous updating of the learned rules, coupled with a new approach to grapheme-to-phoneme alignment and a sliding-window approach to choosing the context window, leads to an efficient and accurate bootstrapping mechanism. The paper describes the techniques implemented to optimise the process from a machine learning perspective and reports the results achieved.
Reference:
Davel, MH and Barnard, E. 2004. Efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping. 8th International Conference on Spoken Language Processing, Jeju Island, Korea, 4-8 October 2004. http://hdl.handle.net/10204/5502