Sparse coding for speech recognition

Thesis (PhD)--University of Pretoria, 2008.

Bibliographic Details
Other Authors: Barnard, E.
Format: Thesis
Published: University of Pretoria 2013
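Subjects: Mathematical optimization; Spike train classification; Spike train; Speech recognition; Sparse code; Linear generative model; Sparse code measurement; Dictionary training; Overcomplete dictionary; Spectrogram; UCTD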
_version_ 1867613612309217280
access_status_str Open Access
author2 Barnard, E.
author_browse Barnard, E.
author_facet Barnard, E.
collection Thesis
dc_rights_str_mv © University of Pretoria 2008 D535/
description Thesis (PhD)--University of Pretoria, 2008.
format Thesis
id oai:repository.up.ac.za:2263/29409
institution University of Pretoria (South Africa)
last_indexed 2026-06-10T12:38:54.752Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository
publishDate 2013
publishDateRange 2013
publishDateSort 2013
publisher University of Pretoria
publisherStr University of Pretoria
record_format dspace
source_str UPSpace — University of Pretoria Institutional Repository
spelling oai:repository.up.ac.za:2263/29409 Sparse coding for speech recognition Barnard, E. willie.smit@gmail.com Smit, Willem Jacobus Mathematical optimization Spike train classification Spike train Speech recognition Sparse code Linear generative model Sparse code measurement Dictionary training Overcomplete dictionary Spectrogram UCTD Thesis (PhD)--University of Pretoria, 2008. The brain is a complex organ that is computationally strong. Recent research in the field of neurobiology helps scientists to better understand the working of the brain, especially how the brain represents or codes external signals. The research shows that the neural code is sparse. A sparse code is a code in which few neurons participate in the representation of a signal. Neurons communicate with each other by sending pulses or spikes at certain times. The spikes sent between several neurons over time are called a spike train. A spike train contains all the important information about the signal that it codes. This thesis shows how sparse coding can be used to do speech recognition. The recognition process consists of three parts. First, the speech signal is transformed into a spectrogram. Thereafter, a sparse code to represent the spectrogram is found: the spectrogram serves as the input to a linear generative model, and the output of the model is a sparse code that can be interpreted as a spike train. Lastly, a spike train model recognises the words that are encoded in the spike train. The algorithms that search for sparse codes to represent signals require many computations. We therefore propose an algorithm that is more efficient than current algorithms. The algorithm makes it possible to find sparse codes in reasonable time if the spectrogram is fairly coarse. The system achieves a word error rate of 19% with a coarse spectrogram, while a system based on Hidden Markov Models achieves a word error rate of 15% on the same spectrograms.
Electrical, Electronic and Computer Engineering unrestricted 2013-09-07T15:36:20Z 2008-12-11 2013-09-07T15:36:20Z 2008-09-02 2008-12-11 2008-11-11 Thesis a 2008 D535/gm http://hdl.handle.net/2263/29409 http://upetd.up.ac.za/thesis/available/etd-11112008-151309/ © University of Pretoria 2008 D535/ application/pdf application/pdf application/pdf application/pdf application/pdf University of Pretoria
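
The abstract above describes representing a spectrogram with a sparse code under a linear generative model, that is, approximating each input as a weighted sum of a few atoms from an overcomplete dictionary. The sketch below illustrates that idea with plain matching pursuit, a standard greedy method. It is not the algorithm proposed in the thesis, which this record does not specify. The function names, the toy dictionary and the three-value "frame" are all hypothetical and chosen only for illustration.

```python
import math


def normalise(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def matching_pursuit(signal, dictionary, max_atoms=3, tol=1e-9):
    """Greedy sparse coding: approximate `signal` with a few dictionary atoms.

    `dictionary` is a list of unit-norm atoms (lists of floats).
    Returns (coefficients, residual); most coefficients stay at zero.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(max_atoms):
        # Correlate the current residual with every atom.
        scores = [sum(r * d for r, d in zip(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < tol:
            break  # nothing left to explain
        # Add the best atom's contribution and remove it from the residual.
        coeffs[best] += scores[best]
        residual = [r - scores[best] * d for r, d in zip(residual, dictionary[best])]
    return coeffs, residual


# Hypothetical overcomplete dictionary: 4 atoms for a 3-dimensional input.
dictionary = [normalise(v) for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0])]
frame = [2.0, 2.0, 0.0]  # stand-in for one coarse spectrogram frame

coeffs, residual = matching_pursuit(frame, dictionary)
active = [k for k, c in enumerate(coeffs) if abs(c) > 1e-9]
print("active atoms:", active)
```

In this toy case a single atom explains the whole frame, so only one coefficient is non-zero. Reading each non-zero coefficient as one unit firing is the sense in which a sparse code can be interpreted as a spike train.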
spellingShingle Mathematical optimization
Spike train classification
Spike train
Speech recognition
Sparse code
Linear generative model
Sparse code measurement
Dictionary training
Overcomplete dictionary
Spectrogram
UCTD
Sparse coding for speech recognition
title Sparse coding for speech recognition
title_full Sparse coding for speech recognition
title_fullStr Sparse coding for speech recognition
title_full_unstemmed Sparse coding for speech recognition
title_short Sparse coding for speech recognition
title_sort sparse coding for speech recognition
topic Mathematical optimization
Spike train classification
Spike train
Speech recognition
Sparse code
Linear generative model
Sparse code measurement
Dictionary training
Overcomplete dictionary
Spectrogram
UCTD
url http://hdl.handle.net/2263/29409
http://upetd.up.ac.za/thesis/available/etd-11112008-151309/