Basic speech recognition for spoken dialogues

Van Heerden, C; Barnard, E; Davel, M

dc.contributor.author	Van Heerden, C
dc.contributor.author	Barnard, E
dc.contributor.author	Davel, M
dc.date.accessioned	2009-10-12T08:32:14Z
dc.date.available	2009-10-12T08:32:14Z
dc.date.issued	2009-09
dc.identifier.citation	Van Heerden, C, Barnard, E and Davel, M. 2009. Basic speech recognition for spoken dialogues. 10th Annual Conference of the International Speech Communication Association (Interspeech 2009). Brighton, UK, 6-10 September, 2009. pp 3003-3006	en
dc.identifier.issn	1990-9772
dc.identifier.uri	http://hdl.handle.net/10204/3649
dc.description	10th Annual Conference of the International Speech Communication Association (Interspeech 2009). Brighton, UK, 6-10 September 2009	en
dc.description.abstract	Spoken dialogue systems (SDSs) have great potential for information access in the developing world. However, the realisation of that potential requires the solution of several challenging problems, including the development of sufficiently accurate speech recognisers for a diverse multitude of languages. The paper investigates the feasibility of developing small-vocabulary speaker-independent ASR systems designed for use in a telephone-based information system, using ten resource-scarce languages spoken in South Africa as a case study. The researchers contrast a cross-language transfer approach (using a well-trained system from a different language) with the development of new language-specific corpora and systems, and evaluate the effectiveness of both approaches. It was found that limited speech corpora (3 to 8 hours of data from around 200 speakers) are sufficient for the development of reasonably accurate recognisers. Error rates are in the range 2% to 12% for a ten-word task, where vocabulary words are excluded from training to simulate vocabulary-independent performance. This approach is substantially more accurate than cross-language transfer, and sufficient for the development of basic spoken dialogue systems.	en
dc.language.iso	en	en
dc.publisher	International Speech Communication Association	en
dc.subject	Speech recognition	en
dc.subject	Spoken dialogue systems	en
dc.subject	SDS	en
dc.subject	Accurate speech recognisers	en
dc.subject	ASR	en
dc.subject	Resource scarce languages	en
dc.subject	Human language technologies	en
dc.subject	Interspeech 2009	en
dc.subject	Speech communication	en
dc.subject	Small-vocabulary speaker-independent ASR systems	en
dc.subject	Cross-language transfer	en
dc.title	Basic speech recognition for spoken dialogues	en
dc.type	Conference Presentation	en
dc.identifier.apacitation	Van Heerden, C., Barnard, E., & Davel, M. (2009). Basic speech recognition for spoken dialogues. International Speech Communication Association. http://hdl.handle.net/10204/3649	en_ZA
dc.identifier.chicagocitation	Van Heerden, C, E Barnard, and M Davel. "Basic speech recognition for spoken dialogues." (2009): http://hdl.handle.net/10204/3649	en_ZA
dc.identifier.vancouvercitation	Van Heerden C, Barnard E, Davel M, Basic speech recognition for spoken dialogues; International Speech Communication Association; 2009. http://hdl.handle.net/10204/3649 .	en_ZA
dc.identifier.ris	TY - Conference Presentation AU - Van Heerden, C AU - Barnard, E AU - Davel, M AB - Spoken dialogue systems (SDSs) have great potential for information access in the developing world. However, the realisation of that potential requires the solution of several challenging problems, including the development of sufficiently accurate speech recognisers for a diverse multitude of languages. The paper investigates the feasibility of developing small-vocabulary speaker-independent ASR systems designed for use in a telephone-based information system, using ten resource-scarce languages spoken in South Africa as a case study. The researchers contrast a cross-language transfer approach (using a well-trained system from a different language) with the development of new language-specific corpora and systems, and evaluate the effectiveness of both approaches. It was found that limited speech corpora (3 to 8 hours of data from around 200 speakers) are sufficient for the development of reasonably accurate recognisers. Error rates are in the range 2% to 12% for a ten-word task, where vocabulary words are excluded from training to simulate vocabulary-independent performance. This approach is substantially more accurate than cross-language transfer, and sufficient for the development of basic spoken dialogue systems. DA - 2009-09 DB - ResearchSpace DP - CSIR KW - Speech recognition KW - Spoken dialogue systems KW - SDS KW - Accurate speech recognisers KW - ASR KW - Resource scarce languages KW - Human language technologies KW - Interspeech 2009 KW - Speech communication KW - Small-vocabulary speaker-independent ASR systems KW - Cross-language transfer LK - https://researchspace.csir.co.za PY - 2009 SM - 1990-9772 T1 - Basic speech recognition for spoken dialogues TI - Basic speech recognition for spoken dialogues UR - http://hdl.handle.net/10204/3649 ER -	en_ZA

Files in this item

Name: Van Heerden_d2_20 ...

Size: 231.6Kb

Format: PDF

This item appears in the following Collection(s)

Conference Publications
