Please use this identifier to cite or link to this item: http://hdl.handle.net/10204/5954

Title: Quality measurements for mobile data collection in the developing world
Authors: Badenhorst, J
De Waal, A
De Wet, F
Keywords: Speech data collection
Resource-scarce environment
Under-resourced languages
Automatic speech recognition
Mobile data collection
Android
Woefzela
Speech technology development
Issue Date: May-2012
Citation: Badenhorst, J, De Waal, A and De Wet, F. Quality measurements for mobile data collection in the developing world. Third International Workshop on Spoken Languages Technologies for Under-resourced Languages (SLTU'12), Monkey Valley, Cape Town, South Africa, 7-9 May 2012
Series/Report no.: Workflow;9020
Abstract: The collection of speech data suitable for speech technology development is a challenge for under-resourced languages. Factors such as cost, availability of mother-tongue speakers and vast geographic distances call for techniques to optimise the data collection process in order to reduce re-collection of data. The use of mobile devices facilitates remote speech data collection. Although mobile (and remote) data collection addresses the challenging factors mentioned above, the environment is still less controlled than in the case of laboratory or studio-based recordings. In this paper we first revisit semi-realtime, basic quality control checks as implemented on available mobile-based speech data collection software (Woefzela). In addition, we introduce a quality control technique that uses speech duration estimation to validate the acoustic quality of the speech samples. We compare both techniques with manual verifications.
Description: Third International Workshop on Spoken Languages Technologies for Under-resourced Languages (SLTU'12), Monkey Valley, Cape Town, South Africa, 7-9 May 2012
URI: http://hdl.handle.net/10204/5954
Appears in Collections: Human language technologies
General science, engineering & technology

Files in This Item:

File: Badenhorst_2012.pdf
Size: 4.31 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
