As the field of affect recognition has progressed, many researchers have shifted from unimodal to multimodal approaches. In particular, the trend in the paralinguistic speech affect recognition domain has been to integrate other modalities such as facial expression, body posture, gait, and linguistic speech. Our work instead focuses on integrating contextual knowledge into paralinguistic speech affect recognition. We hypothesize that a framework that recognizes affect through paralinguistic features of speech can improve its performance by integrating relevant contextual knowledge. This dissertation describes our research on integrating contextual knowledge into the process of recognizing affect from the acoustic features of speech. We conceived, built, and tested a two-phase system called the Context-Based Paralinguistic Affect Recognition System (CxBPARS). The first phase is context-free: an AdaBoost classifier uses acoustic pitch, jitter, shimmer, Harmonics-to-Noise Ratio (HNR), and Noise-to-Harmonics Ratio (NHR) data to make an initial judgment about the emotion most likely exhibited by the human elicitor. The second phase adds context modeling to improve upon the context-free classifications from the first phase. CxBPARS was inspired by a human subject study performed as part of this work, in which test subjects were asked to classify an elicitor's emotion strictly from paralinguistic sounds and were then given contextual information so they could revise their selections. In rigorous testing, CxBPARS improved the success rate from the state of the art's 42% to 53% in the worst case.
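The two-phase idea can be illustrated with a minimal sketch. It is not the CxBPARS implementation. The first phase is a small pure-Python multi-class AdaBoost (the SAMME variant, chosen here as an assumption) over decision stumps on the five acoustic measures. The second phase is a hypothetical stand-in that reweights the first-phase scores by a context prior, since the abstract does not specify how the context is modeled. The feature order, emotion labels, toy values, prior, and all function names are invented for illustration.

```python
import math

# Hypothetical feature order; the abstract names these measures but not their layout.
FEATURES = ["pitch_hz", "jitter", "shimmer", "hnr_db", "nhr"]

def fit_stump(X, y, w, classes):
    """Return (error, feature, threshold, left_label, right_label) with lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = dict.fromkeys(classes, 0.0)
            right = dict.fromkeys(classes, 0.0)
            for row, label, wi in zip(X, y, w):
                side = left if row[f] <= t else right
                side[label] += wi
            lo = max(left, key=left.get)    # heaviest class at or below the threshold
            hi = max(right, key=right.get)  # heaviest class above the threshold
            err = 1.0 - left[lo] - right[hi]  # weights sum to 1
            if best is None or err < best[0]:
                best = (err, f, t, lo, hi)
    return best

def stump_predict(stump, row):
    _, f, t, lo, hi = stump
    return lo if row[f] <= t else hi

def fit_adaboost(X, y, rounds=10):
    """Phase I sketch: multi-class AdaBoost (SAMME) over decision stumps."""
    classes = sorted(set(y))
    k = len(classes)
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w, classes)
        err = max(stump[0], 1e-10)   # clamp so the logarithm stays finite
        if err >= 1.0 - 1.0 / k:     # no better than chance: stop boosting
            break
        alpha = math.log((1.0 - err) / err) + math.log(k - 1)
        ensemble.append((alpha, stump))
        # Up-weight the samples this stump got wrong, then renormalize.
        w = [wi * math.exp(alpha) if stump_predict(stump, row) != label else wi
             for row, label, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return classes, ensemble

def phase_one_scores(model, row):
    """Context-free scores: normalized weighted votes per emotion."""
    classes, ensemble = model
    votes = dict.fromkeys(classes, 0.0)
    for alpha, stump in ensemble:
        votes[stump_predict(stump, row)] += alpha
    total = sum(votes.values()) or 1.0
    return {c: v / total for c, v in votes.items()}

def apply_context(scores, context_prior, eps=1e-6):
    """Phase II stand-in (an assumption): multiply by a context prior, renormalize."""
    combined = {c: (s + eps) * context_prior.get(c, eps) for c, s in scores.items()}
    total = sum(combined.values())
    return {c: v / total for c, v in combined.items()}

# Invented toy data, not measurements from the dissertation.
TOY_X = [
    [260, 0.020, 0.09, 8, 0.30], [250, 0.018, 0.08, 9, 0.28], [240, 0.022, 0.10, 7, 0.32],
    [140, 0.012, 0.06, 12, 0.15], [150, 0.011, 0.05, 13, 0.14], [145, 0.013, 0.07, 11, 0.16],
    [190, 0.006, 0.03, 20, 0.05], [200, 0.005, 0.02, 21, 0.04], [195, 0.007, 0.03, 19, 0.06],
]
TOY_Y = ["angry"] * 3 + ["sad"] * 3 + ["neutral"] * 3

model = fit_adaboost(TOY_X, TOY_Y)
initial = phase_one_scores(model, [255, 0.021, 0.09, 8, 0.30])

# An ambiguous phase I result, revised by an invented context prior.
ambiguous = {"angry": 0.45, "sad": 0.40, "neutral": 0.15}
funeral_prior = {"angry": 0.1, "sad": 0.7, "neutral": 0.2}
revised = apply_context(ambiguous, funeral_prior)
```

In this sketch, `initial` holds the context-free judgment for one utterance, and `revised` shows how a context prior can overturn a close call between two emotions. A multiplicative prior is only one possible design; the dissertation's own context model may differ substantially.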
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Electrical and Computer Engineering
Doctoral Dissertation (Campus-only Access)
Marpaung, Andreas, "Context-Centric Affect Recognition From Paralinguistic Features of Speech" (2019). Electronic Theses and Dissertations, 2004-2019. 6802.