ResearchSpace

Continuous speech recognition with sparse coding


dc.contributor.author Smit, WJ
dc.contributor.author Barnard, E
dc.date.accessioned 2012-02-14T10:15:14Z
dc.date.available 2012-02-14T10:15:14Z
dc.date.issued 2009-04
dc.identifier.citation Smit, WJ and Barnard, E. 2009. Continuous speech recognition with sparse coding. Computer Speech and Language, vol. 23(2), pp. 200-219. en_US
dc.identifier.issn 0885-2308
dc.identifier.uri http://www.sciencedirect.com/science/article/pii/S0885230808000375
dc.identifier.uri http://hdl.handle.net/10204/5565
dc.description Copyright: Elsevier 2009. This is an ABSTRACT ONLY. en_US
dc.description.abstract Sparse coding is an efficient way of coding information. In a sparse code, most of the code elements are zero; only a few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to perform continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First, a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The resulting spike train is then classified using a spike train model and dynamic programming. Finding a sparse code is computationally expensive; we use an iterative subset selection algorithm with quadratic programming for this step. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on hidden Markov models achieves a word error rate of 15% at the same resolution. en_US
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.subject Sparse coding en_US
dc.subject Spike train en_US
dc.subject Speech recognition en_US
dc.subject Linear generative model en_US
dc.title Continuous speech recognition with sparse coding en_US
dc.type Article en_US
dc.identifier.apacitation Smit, W., & Barnard, E. (2009). Continuous speech recognition with sparse coding. http://hdl.handle.net/10204/5565 en_ZA
dc.identifier.chicagocitation Smit, WJ, and E Barnard. "Continuous speech recognition with sparse coding." (2009) http://hdl.handle.net/10204/5565 en_ZA
dc.identifier.vancouvercitation Smit W, Barnard E. Continuous speech recognition with sparse coding. 2009; http://hdl.handle.net/10204/5565. en_ZA
dc.identifier.ris TY  - JOUR
AU  - Smit, WJ
AU  - Barnard, E
AB  - Sparse coding is an efficient way of coding information. In a sparse code, most of the code elements are zero; only a few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to perform continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First, a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The resulting spike train is then classified using a spike train model and dynamic programming. Finding a sparse code is computationally expensive; we use an iterative subset selection algorithm with quadratic programming for this step. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on hidden Markov models achieves a word error rate of 15% at the same resolution.
DA  - 2009-04
DB  - ResearchSpace
DP  - CSIR
KW  - Sparse coding
KW  - Spike train
KW  - Speech recognition
KW  - Linear generative model
LK  - https://researchspace.csir.co.za
PY  - 2009
SM  - 0885-2308
T1  - Continuous speech recognition with sparse coding
TI  - Continuous speech recognition with sparse coding
UR  - http://hdl.handle.net/10204/5565
ER  - en_ZA
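The abstract above models a spectrogram as a sparse linear combination of dictionary atoms (a linear generative model) and finds the code by iterative subset selection with quadratic programming. The Python sketch below illustrates the general idea only, not the authors' implementation: it uses a simple greedy subset selection (orthogonal matching pursuit) with a least-squares refit standing in for the paper's quadratic-programming step, and all names, shapes, and data are hypothetical.

import numpy as np

def sparse_code(x, D, k):
    """Greedy subset selection: approximate x with at most k atoms of D.

    x : (m,) spectrogram frame (or flattened spectrogram patch)
    D : (m, n) dictionary with unit-norm columns
    k : maximum number of active (nonzero) code elements
    """
    residual = x.copy()
    support = []                        # indices of selected atoms
    s = np.zeros(D.shape[1])
    for _ in range(k):
        corr = D.T @ residual           # correlation of each atom with residual
        corr[support] = 0.0             # never reselect an atom
        j = int(np.argmax(np.abs(corr)))
        if abs(corr[j]) < 1e-10:        # residual already fully explained
            break
        support.append(j)
        # Refit coefficients on the selected subset by least squares
        # (a stand-in for the paper's quadratic-programming step).
        Dk = D[:, support]
        coef, *_ = np.linalg.lstsq(Dk, x, rcond=None)
        s[:] = 0.0
        s[support] = coef
        residual = x - Dk @ coef
    return s

# Illustrative usage with random data standing in for TIDIGITS spectrograms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)          # normalise atoms to unit length
frame = rng.standard_normal(64)
code = sparse_code(frame, D, k=8)
print("active elements:", np.count_nonzero(code), "of", code.size)

The few nonzero coefficients produced this way, tracked frame by frame, play the role of the spike train that the paper then classifies with a spike train model and dynamic programming.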

