Data-driven augmentation of pronunciation dictionaries

Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010.

Bibliographic Details
Main Author:	Loots, Linsen
Other Authors:	Niesler, T. R.
Format:	Thesis
Language:	English
Published:	Stellenbosch : University of Stellenbosch, 2010
Subjects:	English accents; Pronunciation dictionaries; Grapheme-to-phoneme (G2P); Decision trees; Dissertations > Electronic engineering; Theses > Electronic engineering; Electronic dictionaries > Pronunciation

_version_	1867613805574356992
access_status_str	Open Access
author	Loots, Linsen
author2	Niesler, T. R.
author_browse	Loots, Linsen Niesler, T. R.
author_facet	Niesler, T. R. Loots, Linsen
author_sort	Loots, Linsen
collection	Thesis
dc_rights_str_mv	University of Stellenbosch
description	Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/4212
institution	Stellenbosch University (South Africa)
language	English
last_indexed	2026-06-10T12:41:59.323Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2010
publishDateRange	2010
publishDateSort	2010
publisher	Stellenbosch : University of Stellenbosch
publisherStr	Stellenbosch : University of Stellenbosch
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/4212 Data-driven augmentation of pronunciation dictionaries Loots, Linsen Niesler, T. R. University of Stellenbosch. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. English accents Pronunciation dictionaries Grapheme-to-phoneme (G2P) Decision trees Dissertations -- Electronic engineering Theses -- Electronic engineering Electronic dictionaries -- Pronunciation Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010. ENGLISH ABSTRACT: This thesis investigates various data-driven techniques by which pronunciation dictionaries can be automatically augmented. First, well-established grapheme-to-phoneme (G2P) conversion techniques are evaluated for Standard South African English (SSAE), British English (RP) and American English (GenAm) by means of four appropriate dictionaries: SAEDICT, BEEP, CMUDICT and PRONLEX. Next, the decision tree algorithm is extended to allow the conversion of pronunciations between different accents by means of phoneme-to-phoneme (P2P) and grapheme-and-phoneme-to-phoneme (GP2P) conversion. P2P conversion uses the phonemes of the source accent as input to the decision trees. GP2P conversion further incorporates the graphemes into the decision tree input. Both P2P and GP2P conversion are evaluated using the four dictionaries. It is found that, when the pronunciation is needed for a word not present in the target accent, it is substantially more accurate to modify an existing pronunciation from a different accent, than to derive it from the word’s spelling using G2P conversion. When converting between accents, GP2P conversion provides a significant further increase in performance above P2P. Finally, experiments are performed to determine how large a training dictionary is required in a target accent for G2P, P2P and GP2P conversion. It is found that GP2P conversion requires less training data than P2P and substantially less than G2P conversion. 
Furthermore, it is found that very little training data is needed for GP2P to perform at almost maximum accuracy. The bulk of the accuracy is achieved within the initial 500 words, and after 3000 words there is almost no further improvement. Some specific approaches to compiling the best training set are also considered. By means of an iterative greedy algorithm an optimal ranking of words to be included in the training set is discovered. Using this set is shown to lead to substantially better GP2P performance for the same training set size in comparison with alternative approaches such as the use of phonetically rich words or random selections. A mere 25 words of training data from this optimal set already achieve an accuracy within 1% of that of the full training dictionary. AFRIKAANSE OPSOMMING: Hierdie tesis ondersoek verskeie data-gedrewe tegnieke waarmee uitspraakwoordeboeke outomaties aangevul kan word. Eerstens word gevestigde grafeem-na-foneem (G2P) omskakelingstegnieke geëvalueer vir Standaard Suid-Afrikaanse Engels (SSAE), Britse Engels (RP) en Amerikaanse Engels (GenAm) deur middel van vier geskikte woordeboeke: SAEDICT, BEEP, CMUDICT en PRONLEX. Voorts word die beslissingsboomalgoritme uitgebrei om die omskakeling van uitsprake tussen verskillende aksente moontlik te maak, deur middel van foneem-na-foneem (P2P) en grafeem-en-foneem-na-foneem (GP2P) omskakeling. P2P omskakeling gebruik die foneme van die bronaksent as inset vir die beslissingsbome. GP2P omskakeling inkorporeer verder die grafeme by die inset. Beide P2P en GP2P omskakeling word evalueer deur middel van die vier woordeboeke. Daar word bevind dat wanneer die uitspraak benodig word vir ’n woord wat nie in die teikenaksent teenwoordig is nie, dit bepaald meer akkuraat is om ’n bestaande uitspraak van ’n ander aksent aan te pas, as om dit af te lei vanuit die woord se spelling met G2P omskakeling. 
Wanneer daar tussen aksente omgeskakel word, gee GP2P omskakeling ’n verdere beduidende verbetering in akkuraatheid bo P2P. Laastens word eksperimente uitgevoer om die grootte te bepaal van die afrigtingswoordeboek wat benodig word in ’n teikenaksent vir G2P, P2P en GP2P omskakeling. Daar word bevind dat GP2P omskakeling minder afrigtingsdata as P2P en substansieel minder as G2P benodig. Verder word dit bevind dat baie min afrigtingsdata benodig word vir GP2P om teen bykans maksimum akkuraatheid te funksioneer. Die oorwig van die akkuraatheid word binne die eerste 500 woorde bereik, en ná 3000 woorde is daar amper geen verdere verbetering nie. ’n Aantal spesifieke benaderings word ook oorweeg om die beste afrigtingstel saam te stel. Deur middel van ’n iteratiewe, gulsige algoritme word ’n optimale rangskikking van woorde bepaal vir insluiting by die afrigtingstel. Daar word getoon dat deur hierdie stel te gebruik, substansieel beter GP2P gedrag verkry word vir dieselfde grootte afrigtingstel in vergelyking met alternatiewe benaderings soos die gebruik van foneties-ryke woorde of lukrake seleksies. ’n Skamele 25 woorde uit hierdie optimale stel gee reeds ’n akkuraatheid binne 1% van dié van die volle afrigtingswoordeboek. 2010-02-16T10:11:27Z 2010-08-13T15:00:13Z 2010-02-16T10:11:27Z 2010-08-13T15:00:13Z 2010-03 Thesis http://hdl.handle.net/10019.1/4212 en University of Stellenbosch 79 p. : ill. application/pdf Stellenbosch : University of Stellenbosch
spellingShingle	English accents Pronunciation dictionaries Grapheme-to-phoneme (G2P) Decision trees Dissertations -- Electronic engineering Theses -- Electronic engineering Electronic dictionaries -- Pronunciation Loots, Linsen Data-driven augmentation of pronunciation dictionaries
title	Data-driven augmentation of pronunciation dictionaries
title_full	Data-driven augmentation of pronunciation dictionaries
title_fullStr	Data-driven augmentation of pronunciation dictionaries
title_full_unstemmed	Data-driven augmentation of pronunciation dictionaries
title_short	Data-driven augmentation of pronunciation dictionaries
title_sort	data driven augmentation of pronunciation dictionaries
topic	English accents Pronunciation dictionaries Grapheme-to-phoneme (G2P) Decision trees Dissertations -- Electronic engineering Theses -- Electronic engineering Electronic dictionaries -- Pronunciation
url	http://hdl.handle.net/10019.1/4212
work_keys_str_mv	AT lootslinsen datadrivenaugmentationofpronunciationdictionaries

Similar Items