A speech processing system is often required to operate in an environment different from the one for which it was initially developed. In such cases, data from the new environment may be more limited in quantity and of poorer quality than the carefully selected training data used to construct the system initially. The authors investigate the process of porting a Spoken Language Identification (S-LID) system to a new environment and describe methods to prepare it for more effective use. Specifically, they demonstrate that retraining only the classifier component of the system provides a significant improvement over an initial system developed using acoustic models channel-normalized to the new environment. They also find that the most accurate system requires retraining both the acoustic models and the final classifier.
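The idea of retraining only the classifier while leaving the acoustic front-end untouched can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `frontend` function is a hypothetical stand-in for the frozen acoustic models, the nearest-centroid classifier is a simple substitute rather than the classifier used by the authors, and the score vectors are invented toy values, not data or results from the paper.

```python
def frontend(utterance):
    """Hypothetical frozen front-end: maps an utterance to a score vector.

    In this toy, utterances are already 2-D score vectors, so it is the identity.
    """
    return utterance


def train_centroids(examples):
    """Fit a nearest-centroid back-end: one mean score vector per language."""
    by_label = {}
    for utterance, label in examples:
        by_label.setdefault(label, []).append(frontend(utterance))
    return {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in by_label.items()
    }


def classify(centroids, features):
    """Return the language whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, examples):
    """Fraction of examples whose predicted language matches the label."""
    hits = sum(classify(centroids, frontend(u)) == label for u, label in examples)
    return hits / len(examples)


# Invented toy data: the new environment shifts every score vector.
original_train = [((1.0, 0.0), "A"), ((1.2, 0.1), "A"),
                  ((0.0, 1.0), "B"), ((0.1, 1.2), "B")]
new_train = [((1.0, 1.5), "A"), ((1.2, 1.6), "A"),
             ((0.0, 2.5), "B"), ((0.1, 2.7), "B")]
new_test = [((1.1, 1.4), "A"), ((0.9, 1.6), "A"),
            ((0.1, 2.4), "B"), ((0.0, 2.6), "B")]

# Back-end trained in the original environment, applied unchanged.
original_backend = train_centroids(original_train)
# Same front-end, back-end retrained on the small new-environment set.
retrained_backend = train_centroids(new_train)

acc_before = accuracy(original_backend, new_test)
acc_after = accuracy(retrained_backend, new_test)
print(f"original back-end on new environment:  {acc_before:.2f}")
print(f"retrained back-end on new environment: {acc_after:.2f}")
```

On this constructed data the retrained back-end classifies the held-out new-environment utterances more accurately than the original one. The gap is a property of the toy values only and says nothing about the size of the improvement reported in the paper.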
Reference:
Peche, M., Davel, M. and Barnard, E. 2008. Porting a spoken language identification system to a new environment. Nineteenth Annual Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa, 27-28 November, pp 103-107. http://hdl.handle.net/10204/3020