ResearchSpace

Pooling ASR data for closely related languages

dc.contributor.author Van Heerden, C
dc.contributor.author Kleynhans, N
dc.contributor.author Barnard, E
dc.contributor.author Davel, M
dc.date.accessioned 2012-07-03T15:01:49Z
dc.date.available 2012-07-03T15:01:49Z
dc.date.issued 2010-05
dc.identifier.citation Van Heerden, C, Kleynhans, N, Barnard, E and Davel, M. Pooling ASR data for closely related languages. Proceedings of the Workshop on Spoken Languages Technologies for Under-Resourced Languages (SLTU 2010), Penang, Malaysia, May 2010 en_US
dc.identifier.isbn 978-967-5417-75-7
dc.identifier.uri http://www.mica.edu.vn/sltu-2010/proceedings/Proceedings%20of%20the%202nd%20International%20Workshop%20on%20Spoken%20Languages%20Technologies%20for%20Under-resourced%20Languages.pdf
dc.identifier.uri http://hdl.handle.net/10204/5974
dc.description Proceedings of the Workshop on Spoken Languages Technologies for Under-Resourced Languages (SLTU 2010), Penang, Malaysia, May 2010 en_US
dc.description.abstract We describe several experiments that were conducted to assess the viability of data pooling as a means to improve speech-recognition performance for under-resourced languages. Two groups of closely related languages from the Southern Bantu language family were studied, and our tests involved phoneme recognition on telephone speech using standard tied-triphone Hidden Markov Models. Approximately 6 to 11 hours of speech from around 170 speakers was available for training in each language. We find that useful improvements in recognition accuracy can be achieved when pooling data from languages that are highly similar, with two hours of data from a closely related language being approximately equivalent to one hour of data from the target language in the best case. However, the benefit decreases rapidly as languages become slightly more distant, and is also expected to decrease when larger corpora are available. Our results suggest that similarities in triphone frequencies are the most accurate predictor of the performance of language pooling in the conditions studied here. en_US
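Illustration (not part of the original record): the abstract reports that similarity in triphone frequencies was the most accurate predictor of pooling benefit. The Python sketch below is a rough, hypothetical example of how such a similarity score could be computed from phonetically transcribed corpora; the paper's exact metric is not stated in this record, so plain cosine similarity over relative triphone frequencies is assumed, and all function names and example data are invented for illustration.

    import math
    from collections import Counter

    def triphone_frequencies(utterances):
        """Relative frequencies of triphones (overlapping windows of three
        phonemes) in a corpus given as a list of phoneme-label lists."""
        counts = Counter()
        for phones in utterances:
            for i in range(len(phones) - 2):
                counts[tuple(phones[i:i + 3])] += 1
        total = sum(counts.values()) or 1
        return {tri: c / total for tri, c in counts.items()}

    def cosine_similarity(freq_a, freq_b):
        """Cosine similarity between two triphone frequency distributions."""
        shared = set(freq_a) & set(freq_b)
        dot = sum(freq_a[t] * freq_b[t] for t in shared)
        norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
        norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
        if norm_a == 0 or norm_b == 0:
            return 0.0
        return dot / (norm_a * norm_b)

    # Hypothetical usage: compare a target language against a candidate
    # donor language; a higher score would suggest a better pooling candidate.
    target = triphone_frequencies([["z", "u", "l", "u"], ["u", "l", "a"]])
    donor = triphone_frequencies([["x", "o", "s", "a"], ["u", "l", "a"]])
    print(cosine_similarity(target, donor))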
dc.language.iso en en_US
dc.publisher School of Computer Sciences, Universiti Sains Malaysia en_US
dc.subject Speech recognition en_US
dc.subject Data pooling en_US
dc.subject Under-resourced languages en_US
dc.title Pooling ASR data for closely related languages en_US
dc.type Conference Presentation en_US
dc.identifier.apacitation Van Heerden, C., Kleynhans, N., Barnard, E., & Davel, M. (2010). Pooling ASR data for closely related languages. School of Computer Sciences, Universiti Sains Malaysia. http://hdl.handle.net/10204/5974 en_ZA
dc.identifier.chicagocitation Van Heerden, C, N Kleynhans, E Barnard, and M Davel. "Pooling ASR data for closely related languages." (2010): http://hdl.handle.net/10204/5974 en_ZA
dc.identifier.vancouvercitation Van Heerden C, Kleynhans N, Barnard E, Davel M, Pooling ASR data for closely related languages; School of Computer Sciences, Universiti Sains Malaysia; 2010. http://hdl.handle.net/10204/5974 . en_ZA
dc.identifier.ris TY - Conference Presentation
AU - Van Heerden, C
AU - Kleynhans, N
AU - Barnard, E
AU - Davel, M
AB - We describe several experiments that were conducted to assess the viability of data pooling as a means to improve speech-recognition performance for under-resourced languages. Two groups of closely related languages from the Southern Bantu language family were studied, and our tests involved phoneme recognition on telephone speech using standard tied-triphone Hidden Markov Models. Approximately 6 to 11 hours of speech from around 170 speakers was available for training in each language. We find that useful improvements in recognition accuracy can be achieved when pooling data from languages that are highly similar, with two hours of data from a closely related language being approximately equivalent to one hour of data from the target language in the best case. However, the benefit decreases rapidly as languages become slightly more distant, and is also expected to decrease when larger corpora are available. Our results suggest that similarities in triphone frequencies are the most accurate predictor of the performance of language pooling in the conditions studied here.
DA - 2010-05
DB - ResearchSpace
DP - CSIR
KW - Speech recognition
KW - Data pooling
KW - Under-resourced languages
LK - https://researchspace.csir.co.za
PY - 2010
SM - 978-967-5417-75-7
T1 - Pooling ASR data for closely related languages
TI - Pooling ASR data for closely related languages
UR - http://hdl.handle.net/10204/5974
ER - en_ZA

