Linguistic cues and memory for synthetic and natural speech
INTELLIGIBILITY; RULE; COMPREHENSION; PERCEPTION; Behavioral Sciences; Engineering, Industrial; Ergonomics; Psychology, Applied; Psychology
Past research has demonstrated that there are cognitive processing costs associated with comprehension of speech generated by text-to-speech synthesizers, relative to comprehension of natural speech. This finding has important performance implications for the many applications that use such systems. The purpose of this study was to ascertain whether certain characteristics of synthetic speech slow on-line, real-time cognitive processing. Whereas past research has focused on the phonemic acoustic structure of synthetic speech, we manipulated prosodic, syntactic, and semantic cues in a task requiring participants to recall sentences spoken either by a human or by one of two speech synthesizers. The findings were interpreted to suggest that inappropriate prosodic modeling in synthetic speech was the major source of a performance differential between natural and synthetic speech. Prosodic cues, along with others, guide the parsing of speech and provide redundancy. When these cues are absent or inaccurate, the additional burden placed on working memory may exceed its capacity, particularly in time-limited, demanding tasks. Actual or potential applications of this research include improvement of text-to-speech output systems in warning systems, feedback devices in aerospace vehicles, educational and training modules, aids for the handicapped, consumer products, and technologies designed to increase the functional independence of older adults.
"Linguistic cues and memory for synthetic and natural speech" (2000). Faculty Bibliography 2000s. 2733.