Speech generation in a spoken dialogue system

Thesis (MScIng)--University of Stellenbosch, 2004.

Bibliographic Details
Main Author: Visagie, Albertus Sybrand
Other Authors: Du Preez, J. A.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : University of Stellenbosch 2011
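Subjects: Speech processing systems; Speech synthesis; Theses -- Electronic engineering; Dissertations -- Electronic engineering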
_version_ 1867614018384953344
access_status_str Open Access
author Visagie, Albertus Sybrand
author2 Du Preez, J. A.
author_browse Du Preez, J. A.
Visagie, Albertus Sybrand
author_facet Du Preez, J. A.
Visagie, Albertus Sybrand
author_sort Visagie, Albertus Sybrand
collection Thesis
dc_rights_str_mv University of Stellenbosch
description Thesis (MScIng)--University of Stellenbosch, 2004.
format Thesis
id oai:scholar.sun.ac.za:10019.1/16460
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:45:21.489Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2011
publishDateRange 2011
publishDateSort 2011
publisher Stellenbosch : University of Stellenbosch
publisherStr Stellenbosch : University of Stellenbosch
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/16460 Speech generation in a spoken dialogue system Visagie, Albertus Sybrand Du Preez, J. A. University of Stellenbosch. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. Speech processing systems Speech synthesis Theses -- Electronic engineering Dissertations -- Electronic engineering Thesis (MScIng)--University of Stellenbosch, 2004. ENGLISH ABSTRACT: Spoken dialogue systems accessed over the telephone network are rapidly becoming more popular as a means to reduce call-centre costs and improve customer experience. It is now technologically feasible to delegate repetitive and relatively simple tasks conducted in most telephone calls to automatic systems. Such a system uses speech recognition to take input from users. This work focuses on the speech generation component that a specific prototype system uses to convey audible speech output back to the user. Many commercial systems contain general text-to-speech synthesisers. Text-to-speech synthesis is a very active branch of speech processing. It aims to build machines that read text aloud. In some languages this has been a reality for almost two decades. While these synthesisers are often very understandable, they almost never sound natural. The output quality of synthetic speech is considered to be a very important factor in the user’s perception of the quality and usability of spoken dialogue systems. The static nature of the spoken dialogue system is exploited to produce a custom speech synthesis component that provides very high quality output speech for the particular application. To this end the current state of the art in speech synthesis is surveyed and summarised. A unit-selection synthesiser is produced that functions in Afrikaans, English and Xhosa. The unit-selection synthesiser selects short waveforms from a recorded speech corpus, and concatenates them to produce the required utterances. 
Techniques are developed for designing a compact corpus and processing it to produce a unit-selection database. Speech modification methods were researched to build a framework for natural-sounding speech concatenation. This framework also provides pitch and duration modification capabilities that will enable research in languages such as Afrikaans and Xhosa where text-to-speech capabilities are relatively immature. AFRIKAANSE OPSOMMING: Telefoniese, spraakgebaseerde dialoogstelsels word steeds meer algemeen, en is ’n doeltreffende metode om oproepsentrumkostes te verlaag. Dit is tans tegnologies moontlik om ’n groot aantal eenvoudige transaksies met outomatiese stelsels te hanteer. Sulke stelsels gebruik spraakherkenning om intree van die gebruiker te ontvang. Hierdie werk fokus op die spraakgenerasiekomponent wat ’n spesifieke prototipestelsel gebruik om afvoer aan die gebruiker terug te speel. Vele kommersiële stelsels gebruik generiese teks-na-spraak sintetiseerders. Sulke teks-na-spraak sintetiseerders is steeds ’n baie aktiewe veld in spraaknavorsing. In die algemeen poog navorsing om teks te kan lees en om te sit in verstaanbare spraak. Sulke stelsels bestaan nou al vir ten minste twee dekades. Alhoewel heeltemal verstaanbaar, klink hierdie stelsels onnatuurlik. In telefoniese spraakgebaseerde dialoogstelsels is kwaliteit van die sintetiese spraak belangrik vir die gebruiker se persepsie van die stelsel se kwaliteit en bruikbaarheid. Die dialoog is meestal staties van aard en hierdie eienskap word benut om hoë kwaliteit spraak in ’n bepaalde toepassing te sintetiseer. Om dit reg te kry is die huidige stand van sake in hierdie veld bestudeer en opgesom. ’n Knip-en-plak sintetiseerder is gebou wat werk in Afrikaans, Engels en Xhosa. Die sintetiseerder selekteer kort stukkies spraakgolfvorms vanuit ’n spraakkorpus, en las dit aanmekaar om die vereiste spraak te produseer. 
Outomatiese tegnieke is ontwikkel om ’n kompakte korpus te ontwerp wat steeds alles bevat wat die sintetiseerder sal nodig hê om sy taak te verrig. Verdere tegnieke prosesseer die korpus tot ’n bruikbare vorm vir sintese. Metodes van spraakmodifikasie is ondersoek ten einde die aanmekaargelaste stukkies spraak meer natuurlik te laat klink en die intonasie en tempo daarvan te korrigeer. Dit verskaf infrastruktuur vir navorsing in tale soos Afrikaans en Xhosa waar teks-na-spraak vermoëns nog onvolwasse is. 2011-09-28T07:35:39Z 2011-09-28T07:35:39Z 2004-12 Thesis http://hdl.handle.net/10019.1/16460 en_ZA University of Stellenbosch xv, 144 leaves : ill. application/pdf Stellenbosch : University of Stellenbosch
spellingShingle Speech processing systems
Speech synthesis
Theses -- Electronic engineering
Dissertations -- Electronic engineering
Visagie, Albertus Sybrand
Speech generation in a spoken dialogue system
title Speech generation in a spoken dialogue system
title_full Speech generation in a spoken dialogue system
title_fullStr Speech generation in a spoken dialogue system
title_full_unstemmed Speech generation in a spoken dialogue system
title_short Speech generation in a spoken dialogue system
title_sort speech generation in a spoken dialogue system
topic Speech processing systems
Speech synthesis
Theses -- Electronic engineering
Dissertations -- Electronic engineering
url http://hdl.handle.net/10019.1/16460
work_keys_str_mv AT visagiealbertussybrand speechgenerationinaspokendialoguesystem