ResearchSpace

Continuous speech recognition with sparse coding


dc.contributor.author Smit, WJ
dc.contributor.author Barnard, E
dc.date.accessioned 2012-02-14T10:15:14Z
dc.date.available 2012-02-14T10:15:14Z
dc.date.issued 2009-04
dc.identifier.citation Smit, WJ and Barnard, E. 2009. Continuous speech recognition with sparse coding. Computer Speech and Language, vol. 23(2), pp. 200-219. en_US
dc.identifier.issn 0885-2308
dc.identifier.uri http://www.sciencedirect.com/science/article/pii/S0885230808000375
dc.identifier.uri http://hdl.handle.net/10204/5565
dc.description Copyright: Elsevier 2009. This is an ABSTRACT ONLY. en_US
dc.description.abstract Sparse coding is an efficient way of coding information. In a sparse code, most of the code elements are zero; only a few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to perform continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First, a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The resulting spike train is then classified using a spike train model and dynamic programming. Finding a sparse code is computationally expensive; we use an iterative subset selection algorithm with quadratic programming for this step. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on hidden Markov models achieves a word error rate of 15% at the same resolution. en_US
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.subject Sparse coding en_US
dc.subject Spike train en_US
dc.subject Speech recognition en_US
dc.subject Linear generative model en_US
dc.title Continuous speech recognition with sparse coding en_US
dc.type Article en_US
dc.identifier.apacitation Smit, W., & Barnard, E. (2009). Continuous speech recognition with sparse coding. http://hdl.handle.net/10204/5565 en_ZA
dc.identifier.chicagocitation Smit, WJ, and E Barnard. "Continuous speech recognition with sparse coding." (2009) http://hdl.handle.net/10204/5565 en_ZA
dc.identifier.vancouvercitation Smit W, Barnard E. Continuous speech recognition with sparse coding. 2009; http://hdl.handle.net/10204/5565. en_ZA
dc.identifier.ris TY  - JOUR
AU  - Smit, WJ
AU  - Barnard, E
AB  - Sparse coding is an efficient way of coding information. In a sparse code, most of the code elements are zero; only a few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to perform continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First, a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The resulting spike train is then classified using a spike train model and dynamic programming. Finding a sparse code is computationally expensive; we use an iterative subset selection algorithm with quadratic programming for this step. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on hidden Markov models achieves a word error rate of 15% at the same resolution.
DA  - 2009-04
DB  - ResearchSpace
DP  - CSIR
KW  - Sparse coding
KW  - Spike train
KW  - Speech recognition
KW  - Linear generative model
LK  - https://researchspace.csir.co.za
PY  - 2009
SM  - 0885-2308
T1  - Continuous speech recognition with sparse coding
TI  - Continuous speech recognition with sparse coding
UR  - http://hdl.handle.net/10204/5565
ER  - en_ZA
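The abstract above models a spectrogram as a sparse linear combination of dictionary atoms (a linear generative model) and finds the code by iterative subset selection with quadratic programming. The Python sketch below illustrates the general idea only, not the authors' implementation: it uses a simple greedy subset selection (orthogonal matching pursuit) with a least-squares refit standing in for the paper's quadratic-programming step, and all names, shapes, and data are hypothetical.

import numpy as np

def sparse_code(x, D, k):
    """Greedy subset selection: approximate x with at most k atoms of D.

    x : (m,) spectrogram frame (or flattened spectrogram patch)
    D : (m, n) dictionary with unit-norm columns
    k : maximum number of active (nonzero) code elements
    """
    residual = x.copy()
    support = []                        # indices of selected atoms
    s = np.zeros(D.shape[1])
    for _ in range(k):
        corr = D.T @ residual           # correlation of each atom with residual
        corr[support] = 0.0             # never reselect an atom
        j = int(np.argmax(np.abs(corr)))
        if abs(corr[j]) < 1e-10:        # residual already fully explained
            break
        support.append(j)
        # Refit coefficients on the selected subset by least squares
        # (a stand-in for the paper's quadratic-programming step).
        Dk = D[:, support]
        coef, *_ = np.linalg.lstsq(Dk, x, rcond=None)
        s[:] = 0.0
        s[support] = coef
        residual = x - Dk @ coef
    return s

# Illustrative usage with random data standing in for TIDIGITS spectrograms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)          # normalise atoms to unit length
frame = rng.standard_normal(64)
code = sparse_code(frame, D, k=8)
print("active elements:", np.count_nonzero(code), "of", code.size)

The few nonzero coefficients produced this way, tracked frame by frame, play the role of the spike train that the paper then classifies with a spike train model and dynamic programming.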

