Full Text Available

Speech generation in a spoken dialogue system

Thesis (MScIng)--University of Stellenbosch, 2004.

Bibliographic Details
Main Author:	Visagie, Albertus Sybrand
Other Authors:	Du Preez, J. A.
Format:	Thesis
Language:	en_ZA
Published:	Stellenbosch : University of Stellenbosch 2011
Subjects:	Speech processing systems Speech synthesis Theses > Electronic engineering Dissertations > Electronic engineering

_version_	1867614018384953344
access_status_str	Open Access
author	Visagie, Albertus Sybrand
author2	Du Preez, J. A.
author_browse	Du Preez, J. A. Visagie, Albertus Sybrand
author_facet	Du Preez, J. A. Visagie, Albertus Sybrand
author_sort	Visagie, Albertus Sybrand
collection	Thesis
dc_rights_str_mv	University of Stellenbosch
description	Thesis (MScIng)--University of Stellenbosch, 2004.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/16460
institution	Stellenbosch University (South Africa)
language	en_ZA
last_indexed	2026-06-10T12:45:21.489Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2011
publishDateRange	2011
publishDateSort	2011
publisher	Stellenbosch : University of Stellenbosch
publisherStr	Stellenbosch : University of Stellenbosch
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/16460 Speech generation in a spoken dialogue system Visagie, Albertus Sybrand Du Preez, J. A. University of Stellenbosch. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. Speech processing systems Speech synthesis Theses -- Electronic engineering Dissertations -- Electronic engineering Thesis (MScIng)--University of Stellenbosch, 2004. ENGLISH ABSTRACT: Spoken dialogue systems accessed over the telephone network are rapidly becoming more popular as a means to reduce call-centre costs and improve customer experience. It is now technologically feasible to delegate repetitive and relatively simple tasks conducted in most telephone calls to automatic systems. Such a system uses speech recognition to take input from users. This work focuses on the speech generation component that a specific prototype system uses to convey audible speech output back to the user. Many commercial systems contain general text-to-speech synthesisers. Text-to-speech synthesis is a very active branch of speech processing. It aims to build machines that read text aloud. In some languages this has been a reality for almost two decades. While these synthesisers are often very understandable, they almost never sound natural. The output quality of synthetic speech is considered to be a very important factor in the user’s perception of the quality and usability of spoken dialogue systems. The static nature of the spoken dialogue system is exploited to produce a custom speech synthesis component that provides very high quality output speech for the particular application. To this end the current state of the art in speech synthesis is surveyed and summarised. A unit-selection synthesiser is produced that functions in Afrikaans, English and Xhosa. The unit-selection synthesiser selects short waveforms from a recorded speech corpus, and concatenates them to produce the required utterances. 
Techniques are developed for designing a compact corpus and processing it to produce a unit-selection database. Speech modification methods were researched to build a framework for natural-sounding speech concatenation. This framework also provides pitch and duration modification capabilities that will enable research in languages such as Afrikaans and Xhosa where text-to-speech capabilities are relatively immature. AFRIKAANSE OPSOMMING: Telefoniese, spraakgebaseerde dialoogstelsels word steeds meer algemeen, en is ’n doeltreffende metode om oproepsentrumkostes te verlaag. Dit is tans tegnologies moontlik om ’n groot aantal eenvoudige transaksies met automatiese stelsels te hanteer. Sulke stelsels gebruik spraakherkenning om intree van die gebruiker te ontvang. Hierdie werk fokus op die spraakgenerasiekomponent wat ’n spesifieke prototipestelsel gebruik om afvoer aan die gebruiker terug te speel. Vele kommersiële stelsels gebruik generiese teks-na-spraak sintetiseerders. Sulke teks-na-spraak sintetiseerders is steeds ’n baie aktiewe veld in spraaknavorsing. In die algemeen poog navorsing om teks te kan lees en om te sit in verstaanbare spraak. Sulke stelsels bestaan nou al vir ten minste twee dekades. Alhoewel heeltemal verstaanbaar, klink hierdie stelsels onnatuurlik. In telefoniese spraakgebaseerde dialoogstelsels is kwaliteit van die sintetiese spraak belangrik vir die gebruiker se persepsie van die stelsel se kwaliteit en bruikbaarheid. Die dialoog is meestal staties van aard en hierdie eienskap word benut om hoë kwaliteit spraak in ’n bepaalde toepassing te sintetiseer. Om dit reg te kry is die huidige stand van sake in hierdie veld bestudeer en opgesom. ’n Knip-en-plak sintetiseerder is gebou wat werk in Afrikaans, Engels en Xhosa. Die sintetiseerder selekteer kort stukkies spraakgolfvorms vanuit ’n spraakkorpus, en las dit aanmekaar om die vereiste spraak te produseer. 
Outomatiese tegnieke is ontwikkel om ’n kompakte korpus te ontwerp wat steeds alles bevat wat die sintetiseerder sal nodig hê om sy taak te verrig. Verdere tegnieke prosesseer die korpus tot ’n bruikbare vorm vir sintese. Metodes van spraakmodifikasie is ondersoek ten einde die aanmekaargelaste stukkies spraak meer natuurlik te laat klink en die intonasie en tempo daarvan te korrigeer. Dit verskaf infrastruktuur vir navorsing in tale soos Afrikaans en Xhosa waar teks-na-spraak vermoëns nog onvolwasse is. 2011-09-28T07:35:39Z 2011-09-28T07:35:39Z 2004-12 Thesis http://hdl.handle.net/10019.1/16460 en_ZA University of Stellenbosch xv, 144 leaves : ill. application/pdf Stellenbosch : University of Stellenbosch
spellingShingle	Speech processing systems Speech synthesis Theses -- Electronic engineering Dissertations -- Electronic engineering Visagie, Albertus Sybrand Speech generation in a spoken dialogue system
title	Speech generation in a spoken dialogue system
title_full	Speech generation in a spoken dialogue system
title_fullStr	Speech generation in a spoken dialogue system
title_full_unstemmed	Speech generation in a spoken dialogue system
title_short	Speech generation in a spoken dialogue system
title_sort	speech generation in a spoken dialogue system
topic	Speech processing systems Speech synthesis Theses -- Electronic engineering Dissertations -- Electronic engineering
url	http://hdl.handle.net/10019.1/16460
work_keys_str_mv	AT visagiealbertussybrand speechgenerationinaspokendialoguesystem

Similar Items