dc.contributor.author | Smit, WJ |
dc.contributor.author | Barnard, E |
dc.date.accessioned | 2012-02-14T10:15:14Z |
dc.date.available | 2012-02-14T10:15:14Z |
dc.date.issued | 2009-04 |
dc.identifier.citation | Smit, WJ and Barnard, E. 2009. Continuous speech recognition with sparse coding. Computer Speech and Language, vol. 23(2), pp. 200-219 | en_US
dc.identifier.issn | 0885-2308 |
dc.identifier.uri | http://www.sciencedirect.com/science/article/pii/S0885230808000375 |
dc.identifier.uri | http://hdl.handle.net/10204/5565 |
dc.description | Copyright: Elsevier 2009. This is an ABSTRACT ONLY. | en_US
dc.description.abstract | Sparse coding is an efficient way of coding information. In a sparse code most of the code elements are zero; very few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to do continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The spike train is classified by making use of a spike train model and dynamic programming. It is computationally expensive to find a sparse code. We use an iterative subset selection algorithm with quadratic programming for this process. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on Hidden Markov Models achieves a word error rate of 15% at the same resolution. | en_US
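The abstract's core step — finding a sparse code for a spectrogram under a linear generative model x ≈ Ds — can be sketched with a generic L1-penalised solver. Note the paper itself uses an iterative subset selection algorithm with quadratic programming; the ISTA (proximal gradient) iteration below is a simpler, standard stand-in for illustration only, and the dictionary and signal are random placeholders rather than the paper's spectrogram data.

```python
import numpy as np

def sparse_code(x, D, lam=0.05, steps=500):
    """Minimise 0.5*||x - D s||^2 + lam*||s||_1 by ISTA (proximal gradient)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the smooth part
    s = np.zeros(D.shape[1])
    for _ in range(steps):
        g = s - D.T @ (D @ s - x) / L      # gradient step on the squared error
        s = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-threshold
    return s

# Toy dictionary with unit-norm atoms and a 3-sparse signal (placeholders,
# not the paper's data): most recovered coefficients should come out zero.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 64))
D /= np.linalg.norm(D, axis=0)
s_true = np.zeros(64)
s_true[[3, 17, 40]] = [1.5, -2.0, 1.0]
x = D @ s_true

s_hat = sparse_code(x, D)
print(np.mean(np.abs(s_hat) < 1e-3))      # fraction of (near-)zero coefficients
```

The zero-dominated coefficient vector is what the abstract calls the sparse code; in the paper the active elements are then interpreted as a spike train and classified with a spike train model and dynamic programming.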
dc.language.iso | en | en_US
dc.publisher | Elsevier | en_US
dc.subject | Sparse coding | en_US
dc.subject | Spike train | en_US
dc.subject | Speech recognition | en_US
dc.subject | Linear generative model | en_US
dc.title | Continuous speech recognition with sparse coding | en_US
dc.type | Article | en_US
dc.identifier.apacitation | Smit, W., & Barnard, E. (2009). Continuous speech recognition with sparse coding. http://hdl.handle.net/10204/5565 | en_ZA
dc.identifier.chicagocitation | Smit, WJ, and E Barnard. "Continuous speech recognition with sparse coding." (2009) http://hdl.handle.net/10204/5565 | en_ZA
dc.identifier.vancouvercitation | Smit W, Barnard E. Continuous speech recognition with sparse coding. 2009; http://hdl.handle.net/10204/5565. | en_ZA
dc.identifier.ris |
TY - Article
AU - Smit, WJ
AU - Barnard, E
AB - Sparse coding is an efficient way of coding information. In a sparse code most of the code elements are zero; very few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to do continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The spike train is classified by making use of a spike train model and dynamic programming. It is computationally expensive to find a sparse code. We use an iterative subset selection algorithm with quadratic programming for this process. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on Hidden Markov Models achieves a word error rate of 15% at the same resolution.
DA - 2009-04
DB - ResearchSpace
DP - CSIR
KW - Sparse coding
KW - Spike train
KW - Speech recognition
KW - Linear generative model
LK - https://researchspace.csir.co.za
PY - 2009
SM - 0885-2308
T1 - Continuous speech recognition with sparse coding
TI - Continuous speech recognition with sparse coding
UR - http://hdl.handle.net/10204/5565
ER -
| en_ZA