dc.contributor.author |
Badenhorst, J
|
|
dc.contributor.author |
De Waal, A
|
|
dc.contributor.author |
De Wet, Febe
|
|
dc.date.accessioned |
2012-06-29T08:26:12Z |
|
dc.date.available |
2012-06-29T08:26:12Z |
|
dc.date.issued |
2012-05 |
|
dc.identifier.citation |
Badenhorst, J, De Waal, A and De Wet, F. Quality measurements for mobile data collection in the developing world. Third International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU'12), Monkey Valley, Cape Town, South Africa, 7-9 May 2012 |
en_US |
dc.identifier.uri |
http://hdl.handle.net/10204/5954
|
|
dc.description |
Third International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU'12), Monkey Valley, Cape Town, South Africa, 7-9 May 2012 |
en_US |
dc.description.abstract |
The collection of speech data suitable for speech technology development is a challenge for under-resourced languages. Factors such as cost, availability of mother-tongue speakers and vast geographic distances call for techniques to optimise the data collection process in order to reduce re-collection of data. The use of mobile devices facilitates remote speech data collection. Although mobile (and remote) data collection addresses the challenging factors mentioned above, the environment is still less controlled than in the case of laboratory or studio-based recordings. In this paper we first revisit semi-realtime, basic quality control checks as implemented in available mobile-based speech data collection software (Woefzela). In addition, we introduce a quality control technique that uses speech duration estimation to validate the acoustic quality of the speech samples. We compare both techniques with manual verification. |
en_US |
dc.language.iso |
en |
en_US |
dc.relation.ispartofseries |
Workflow;9020 |
|
dc.subject |
Speech data collection |
en_US |
dc.subject |
Resource-scarce environment |
en_US |
dc.subject |
Under-resourced languages |
en_US |
dc.subject |
Automatic speech recognition |
en_US |
dc.subject |
Mobile data collection |
en_US |
dc.subject |
Android |
en_US |
dc.subject |
Woefzela |
en_US |
dc.subject |
Speech technology development |
en_US |
dc.title |
Quality measurements for mobile data collection in the developing world |
en_US |
dc.type |
Conference Presentation |
en_US |
dc.identifier.apacitation |
Badenhorst, J., De Waal, A., & De Wet, F. (2012). Quality measurements for mobile data collection in the developing world. http://hdl.handle.net/10204/5954 |
en_ZA |
dc.identifier.chicagocitation |
Badenhorst, J, A De Waal, and Febe De Wet. "Quality measurements for mobile data collection in the developing world." (2012): http://hdl.handle.net/10204/5954 |
en_ZA |
dc.identifier.vancouvercitation |
Badenhorst J, De Waal A, De Wet F, Quality measurements for mobile data collection in the developing world; 2012. http://hdl.handle.net/10204/5954 . |
en_ZA |
dc.identifier.ris |
TY - Conference Presentation
AU - Badenhorst, J
AU - De Waal, A
AU - De Wet, Febe
AB - The collection of speech data suitable for speech technology development is a challenge for under-resourced languages. Factors such as cost, availability of mother-tongue speakers and vast geographic distances call for techniques to optimise the data collection process in order to reduce re-collection of data. The use of mobile devices facilitates remote speech data collection. Although mobile (and remote) data collection addresses the challenging factors mentioned above, the environment is still less controlled than in the case of laboratory or studio-based recordings. In this paper we first revisit semi-realtime, basic quality control checks as implemented in available mobile-based speech data collection software (Woefzela). In addition, we introduce a quality control technique that uses speech duration estimation to validate the acoustic quality of the speech samples. We compare both techniques with manual verification.
DA - 2012-05
DB - ResearchSpace
DP - CSIR
KW - Speech data collection
KW - Resource-scarce environment
KW - Under-resourced languages
KW - Automatic speech recognition
KW - Mobile data collection
KW - Android
KW - Woefzela
KW - Speech technology development
LK - https://researchspace.csir.co.za
PY - 2012
T1 - Quality measurements for mobile data collection in the developing world
TI - Quality measurements for mobile data collection in the developing world
UR - http://hdl.handle.net/10204/5954
ER -
|
en_ZA |