Keywords
voice conversion, text to speech, speech synthesis, gaussian mixture modeling, voice, speech processing, digital signal processing
Abstract
Today's world consists of many ways to communicate information. One of the most effective ways to communicate is through the use of speech. Unfortunately many lose the ability to converse. This in turn leads to a large negative psychological impact. In addition, skills such as lecturing and singing must now be restored via other methods. The usage of text-to-speech synthesis has been a popular resolution of restoring the capability to use oral speech. Text to speech synthesizers convert text into speech. Although text to speech systems are useful, they only allow for few default voice selections that do not represent that of the user. In order to achieve total restoration, voice conversion must be introduced. Voice conversion is a method that adjusts a source voice to sound like a target voice. Voice conversion consists of a training and converting process. The training process is conducted by composing a speech corpus to be spoken by both source and target voice. The speech corpus should encompass a variety of speech sounds. Once training is finished, the conversion function is employed to transform the source voice into the target voice. Effectively, voice conversion allows for a speaker to sound like any other person. Therefore, voice conversion can be applied to alter the voice output of a text to speech system to produce the target voice. The thesis investigates how one approach, specifically the usage of voice conversion using Gaussian mixture modeling, can be applied to alter the voice output of a text to speech synthesis system. Researchers found that acceptable results can be obtained from using these methods. Although voice conversion and text to speech synthesis are effective in restoring voice, a sample of the speaker before voice loss must be used during the training process. Therefore it is vital that voice samples are made to combat voice loss.
Notes
If this is your thesis or dissertation, and want to learn how to access it or for more information about readership statistics, contact us at STARS@ucf.edu
Graduation Date
2007
Semester
Summer
Advisor
Mikhael, Wasfy
Degree
Master of Science in Electrical Engineering (M.S.E.E.)
College
College of Engineering and Computer Science
Department
Electrical Engineering and Computer Science
Degree Program
Electrical Engineering
Format
application/pdf
Identifier
CFE0001793
URL
http://purl.fcla.edu/fcla/etd/CFE0001793
Language
English
Release Date
September 2007
Length of Campus-only Access
None
Access Status
Masters Thesis (Open Access)
STARS Citation
Alverio, Gustavo, "Discussion On Effective Restoration Of Oral Speech Using Voice Conversion Techniques Based On Gaussian Mixture Modeling" (2007). Electronic Theses and Dissertations. 3060.
https://stars.library.ucf.edu/etd/3060