Please use this identifier to cite or link to this item: http://hdl.handle.net/10204/4128

Title: Collecting and evaluating speech recognition corpora for nine Southern Bantu languages
Authors: Badenhorst, JAC; Van Heerden, C; Davel, M; Barnard, E
Keywords: Automated telephony systems; Lwazi corpus; Automatic speech recognition system; ASR; Rural areas; African languages; Speech corpus; Southern Bantu languages; Language technologies; Computational linguistics
Issue Date: Mar-2009
Publisher: Association for Computational Linguistics
Citation: Badenhorst, JAC, Van Heerden, C, Davel, M et al. 2009. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009, pp 1-8
Abstract: The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, so the authors investigate the stability of the ASR models derived from the corpus. They also report on phoneme distance measures across languages, and describe initial phone recognisers that were developed using this data.
Description: EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009
URI: http://hdl.handle.net/10204/4128
ISBN: 1932432256
Appears in Collections: Human language technologies; General science, engineering & technology

Files in This Item:

File: Badenhorst_2009.pdf
Size: 594.95 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
