A comparison of features for large population speaker identification

Bibliography: leaves 95-104.

Bibliographic Details
Main Author: Baloyi, Norman Tinyiko
Other Authors: Mashao, Daniel
Format: Thesis
Language: English
Published: Department of Electrical Engineering 2015
Subjects: Communications Engineering
_version_ 1867613237341585409
access_status_str Open Access
author Baloyi, Norman Tinyiko
author2 Mashao, Daniel
author_browse Baloyi, Norman Tinyiko
Mashao, Daniel
author_facet Mashao, Daniel
Baloyi, Norman Tinyiko
author_sort Baloyi, Norman Tinyiko
collection Thesis
description Bibliography: leaves 95-104.
format Thesis
id oai:open.uct.ac.za:11427/13875
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:32:57.328Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2015
publishDateRange 2015
publishDateSort 2015
publisher Department of Electrical Engineering
publisherStr Department of Electrical Engineering
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/13875 A comparison of features for large population speaker identification Baloyi, Norman Tinyiko Mashao, Daniel Communications Engineering Bibliography: leaves 95-104. Speech recognition systems all have one characteristic in common: they perform better in a controlled environment using clean speech. Though performance can be excellent, even exceeding human capabilities for clean speech, systems fail when presented with speech data from more realistic environments such as telephone channels. The differences in a recognizer's performance between clean and noisy environments are extreme, and this is one of the major obstacles to producing commercial recognition systems for use in normal environments. It is the lack of performance of speaker recognition systems with telephone channels that this work addresses. The human auditory system is a speech recognizer with excellent performance, especially in noisy environments. Since humans are better at ignoring noise than any machine, auditory-based methods are promising approaches, as they attempt to model the workings of the human auditory system. These methods have been shown to outperform more conventional signal processing schemes for speech recognition, speech coding, word-recognition and phone classification tasks. Since speaker identification has received a lot of attention in speech processing because of its waiting real-world applications, it is attractive to evaluate its performance using auditory models as features. Firstly, this study aims at improving the results for speaker identification. The improvements were made through the use of parameterized feature-sets together with the application of cepstral mean removal for channel equalization. The study is further extended to compare an auditory-based model, the Ensemble Interval Histogram (EIH), with mel-scale features, which were shown to perform almost error-free in clean speech. 
The previous studies that showed EIH to be more robust to noise were conducted on speaker-dependent, small-population, isolated-word tasks and are now extended to speaker-independent, larger-population, continuous speech. This study investigates whether the EIH representation is more resistant to telephone noise than mel-cepstrum, as was shown in the previous studies, when, for the first time, it is applied to a speaker identification task using the state-of-the-art Gaussian mixture model system. 2015-09-14T18:01:45Z 2015-09-14T18:01:45Z 2000 Master Thesis Masters MSc http://hdl.handle.net/11427/13875 eng application/pdf Department of Electrical Engineering Faculty of Engineering and the Built Environment University of Cape Town
spellingShingle Communications Engineering
Baloyi, Norman Tinyiko
A comparison of features for large population speaker identification
thesis_degree_str Master's
title A comparison of features for large population speaker identification
title_full A comparison of features for large population speaker identification
title_fullStr A comparison of features for large population speaker identification
title_full_unstemmed A comparison of features for large population speaker identification
title_short A comparison of features for large population speaker identification
title_sort comparison of features for large population speaker identification
topic Communications Engineering
url http://hdl.handle.net/11427/13875
work_keys_str_mv AT baloyinormantinyiko acomparisonoffeaturesforlargepopulationspeakeridentification
AT baloyinormantinyiko comparisonoffeaturesforlargepopulationspeakeridentification