Language modelling for code-switched automatic speech recognition in five South African languages

Thesis (PhD)--Stellenbosch University, 2018.

Bibliographic Details
Main Author:	Van der Westhuizen, Ewald
Other Authors:	Niesler, T. R.
Format:	Thesis
Language:	en_ZA
Published:	Stellenbosch : Stellenbosch University 2018
Subjects:	UCTD; Code switching (Linguistics); Automatic speech recognition; Diglossia (Linguistics); Acoustic models; Grammar, Comparative and general > Augmentatives

_version_	1867614096298344448
access_status_str	Open Access
author	Van der Westhuizen, Ewald
author2	Niesler, T. R.
author_browse	Niesler, T. R. Van der Westhuizen, Ewald
author_facet	Niesler, T. R. Van der Westhuizen, Ewald
author_sort	Van der Westhuizen, Ewald
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Thesis (PhD)--Stellenbosch University, 2018.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/104997
institution	Stellenbosch University (South Africa)
language	en_ZA
last_indexed	2026-06-10T12:46:36.532Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2018
publishDateRange	2018
publishDateSort	2018
publisher	Stellenbosch : Stellenbosch University
publisherStr	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/104997 Language modelling for code-switched automatic speech recognition in five South African languages Van der Westhuizen, Ewald Niesler, T. R. Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. UCTD Code switching (Linguistics) Automatic speech recognition Diglossia (Linguistics) Acoustic models Grammar, Comparative and general -- Augmentatives Thesis (PhD)--Stellenbosch University, 2018. ENGLISH ABSTRACT: Code-switching refers to natural, spontaneous language alternation by multilingual speakers during a conversation or utterance, and is prevalent in everyday conversations by multilingual South Africans. Automatic speech recognition systems are generally highly optimised for monolingual input and performance deteriorates when presented with mixed-language speech. This thesis addresses the automatic recognition of speech containing code-switching between English and four South African Bantu languages, focussing specifically on the language modelling of English-isiZulu, English-isiXhosa, English-Setswana and English-Sesotho. Due to the severe scarcity of code-switched speech data in South African languages, it was necessary to first develop a representative corpus. This new and unique 35-hour corpus contains segmented and transcribed code-switched speech from conversations in South African soap operas, which exhibit spontaneous utterances with regular code-switching in the target languages. Insertional, alternational, and intraword intrasentential code-switching are all represented in the data, as are some other special characteristics of fast, spontaneous Bantu speech such as postlexical deletion. The distribution of language switches is extremely sparse, however. In this thesis, a number of data-driven modelling approaches were investigated and applied to address the sparsity by augmenting the training data with synthetically generated data.
Postlexical deletion was successfully modelled statistically with joint-sequence models, and these models were used to generate synthetic pronunciations which were demonstrated to lead to improved automatic speech recognition performance. Two new code-switched language modelling approaches were proposed to address data sparsity. First, parallel language-dependent language modelling (PLDLM), which consists of two monolingual language models with explicit language transitions, was demonstrated to outperform a conventional language-independent language model in terms of recognition word error rate. Second, language models in which word embeddings were used to synthesise probable unseen code-switched bigrams were considered. It was possible to achieve a reduction of up to 31% in language model perplexity across a language switch boundary by including such synthesised code-switch bigrams. Although smaller, improvements in the recognition word error rate were also observed. AFRIKAANS ABSTRACT (translated): Code-switching is the natural, spontaneous switching between languages by multilingual speakers during a conversation or utterance, and occurs every day in the conversations of multilingual South Africans. Automatic speech recognition systems are generally optimised specifically for monolingual speech and perform poorly on multilingual speech. This thesis addresses the automatic recognition of speech with code-switching between English and four South African Bantu languages. It specifically addresses the language modelling of English-isiZulu, English-isiXhosa, English-Setswana and English-Sesotho speech with code-switching. Owing to the scarcity of speech data in South African languages that contains code-switching, it was necessary to compile a representative speech corpus. This new and unique corpus consists of 35 hours of segmented and transcribed speech containing code-switching.
The data were extracted from conversations in South African television soap operas, which exhibit spontaneous speech with regular code-switching in the aforementioned languages. Several forms of code-switching occur in the data, including intrasentential code-switching as an insertion (insertional), as an alternating sentence constituent (alternational), or within a word (intraword). The distribution of code-switching examples in the data is, however, particularly sparse. A number of data-driven modelling techniques were investigated to supplement sparse training data with synthetic data. Vowel deletion, a characteristic phenomenon of fast spontaneous speech, is also observed in the African languages. Vowel deletion was successfully modelled with joint-sequence models. These models were used to create synthetic pronunciations, which led to an improved word error rate for the automatic speech recogniser. Two new approaches to the language modelling of code-switching were investigated. The first is a parallel language-dependent language model that connects two monolingual language models with explicit language transition links. This approach was shown to deliver a better word error rate than the conventional language-independent language model. The second approach investigated language models in which word embeddings were applied to synthesise probable code-switched bigrams. A reduction of up to 31% in perplexity at a language switch point can be achieved by including the synthetic code-switched bigrams in the language models. An improvement in word error rate was also observed, although smaller. Doctoral 2018-11-22T09:08:36Z 2018-12-07T06:54:28Z 2018-11-22T09:08:36Z 2018-12-07T06:54:28Z 2018-12 Thesis http://hdl.handle.net/10019.1/104997 en_ZA Stellenbosch University 209 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle	UCTD Code switching (Linguistics) Automatic speech recognition Diglossia (Linguistics) Acoustic models Grammar, Comparative and general -- Augmentatives Van der Westhuizen, Ewald Language modelling for code-switched automatic speech recognition in five South African languages
title	Language modelling for code-switched automatic speech recognition in five South African languages
title_full	Language modelling for code-switched automatic speech recognition in five South African languages
title_fullStr	Language modelling for code-switched automatic speech recognition in five South African languages
title_full_unstemmed	Language modelling for code-switched automatic speech recognition in five South African languages
title_short	Language modelling for code-switched automatic speech recognition in five South African languages
title_sort	language modelling for code switched automatic speech recognition in five south african languages
topic	UCTD Code switching (Linguistics) Automatic speech recognition Diglossia (Linguistics) Acoustic models Grammar, Comparative and general -- Augmentatives
url	http://hdl.handle.net/10019.1/104997
work_keys_str_mv	AT vanderwesthuizenewald languagemodellingforcodeswitchedautomaticspeechrecognitioninfivesouthafricanlanguages

Similar Items