Data requirements for speaker independent acoustic models

Authors: Badenhorst, JAC; Davel, M
Date issued: 2008-11
Repository record date: 2009-06-17
Type: Conference Presentation
Language: en
Citation: Badenhorst, JAC and Davel, M. 2008. Data requirements for speaker independent acoustic models. 19th Annual Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa, 27-28 November 2008, pp 147-152
URI: http://hdl.handle.net/10204/3439
Repository: ResearchSpace, CSIR (https://researchspace.csir.co.za)

Abstract: When developing speech recognition systems in resource-constrained environments, careful design of the training corpus can play an important role in compensating for data scarcity. One factor to consider is the speaker composition of a corpus: finding the appropriate balance between the number of speakers and the number of speaker-specific utterances. The authors define a model stability measure based on the Bhattacharyya bound and apply it to analyse the intra- and inter-speaker variability of a training corpus. They find that behaviour differs significantly between phone groups, while similar trends are observed within each group. They demonstrate that, at a predictable point, additional data from one speaker no longer contributes to modelling accuracy, and they show the trends that can be expected when additional speakers are added.

Keywords: Speech recognition; Acoustic models; Training corpus; Bhattacharyya; PRASA 2008; Nineteenth Annual Symposium of the Pattern Recognition Association of South Africa
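
The abstract refers to a model stability measure based on the Bhattacharyya bound but this record does not give its definition. As a minimal sketch only, the following Python shows the standard Bhattacharyya distance between two Gaussian densities and the resulting upper bound on the two-class Bayes error. Modelling each phone with a single full-covariance Gaussian, and the function names `bhattacharyya_distance` and `bhattacharyya_bound`, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def bhattacharyya_distance(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussians N(mu1, cov1) and N(mu2, cov2).

    Covariances must be full square matrices (use [[v]] for a 1-D variance v).
    """
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    cov1 = np.asarray(cov1, dtype=float)
    cov2 = np.asarray(cov2, dtype=float)

    cov = 0.5 * (cov1 + cov2)  # averaged covariance
    diff = mu2 - mu1

    # Mean term: (1/8) * diff^T * cov^{-1} * diff
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)

    # Covariance term: (1/2) * ln(det(cov) / sqrt(det(cov1) * det(cov2))),
    # computed with log-determinants for numerical stability.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))

    return float(mean_term + cov_term)


def bhattacharyya_bound(mu1, cov1, mu2, cov2, prior1=0.5):
    """Upper bound on the two-class Bayes error: sqrt(P1 * P2) * exp(-D_B)."""
    distance = bhattacharyya_distance(mu1, cov1, mu2, cov2)
    return float(np.sqrt(prior1 * (1.0 - prior1)) * np.exp(-distance))
```

With equal priors, two identical Gaussians give a distance of 0 and a bound of 0.5, and the bound shrinks as the two densities separate. How the authors turn this quantity into a stability measure (for example, which pairs of models are compared) is not stated in this record.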