spatial audio, human system integration, training, interactive interfaces, evaluation


Designing auditory interfaces is a challenge for current human-systems developers. This is largely due to a lack of theoretical guidance on how best to use sounds in today's visually rich graphical user interfaces. This dissertation provided a framework for guiding the design of audio interfaces to enhance human-systems performance. This doctoral research involved reviewing the literature on conveying temporal and spatial information using audio, using this knowledge to build three theoretical models to aid the design of auditory interfaces, and empirically validating select components of the models. The three models included an audio integration model that outlines an end-to-end process for adding sounds to interactive interfaces, a temporal audio model that provides a framework for guiding the timing of these sounds to meet human performance objectives, and a spatial audio model that provides a framework for adding spatialization cues to interface sounds. Each model is coupled with a set of design guidelines theorized from the literature; combined, the developed models put forward a structured process for integrating sounds in interactive interfaces. The developed models were subjected to a three-phase validation process that included a review by Subject Matter Experts (SMEs) to assess the face validity of the developed models, followed by two empirical studies. For the SME review, which assessed the utility of the developed models and identified opportunities for improvement, a panel of three audio experts was selected to respond to a Strengths, Weaknesses, Opportunities, and Threats (SWOT) validation questionnaire. Based on the SWOT analysis, the main strengths of the models were that they provide a systematic approach to auditory display design and that they integrate a wide variety of knowledge sources in a concise manner.
The main weaknesses of the models were the lack of a structured process for amending the models with new principles, the fact that some branches were not considered parallel or completely distinct, and the lack of guidance on selecting interface sounds. The main opportunity identified by the experts was the ability of the models to provide a seminal body of knowledge that can be used for building and validating auditory display designs. The main threats identified by the experts were that users may not know where to start and end with each model, that the models may not provide comprehensive coverage of all uses of auditory displays, and that the models may act as a restrictive influence on designers or be used inappropriately. Based on the SWOT analysis results, several changes were made to the models prior to the empirical studies. Two empirical evaluation studies were conducted to test the theorized design principles derived from the revised models. The first study focused on assessing the utility of audio cues to train a temporal pacing task, and the second study combined both temporal (i.e., pace) and spatial audio information, with a focus on examining integration issues. In the pace study, four different auditory conditions were used for training pace: 1) a metronome, 2) non-spatial auditory earcons, 3) a spatialized auditory earcon, and 4) no audio cues for pace training. Sixty-eight people participated in the study. A pre-post, between-subjects experimental design was used, with eight training trials. The measure used for assessing pace performance was the average deviation from a predetermined desired pace. The results demonstrated that a metronome was not effective in training participants to maintain a desired pace, whereas spatial and non-spatial earcons were effective strategies for pace training. Moreover, a comparison of post-training performance with pre-training performance suggested some transfer of learning.
Design guidelines were extracted for integrating auditory cues for pace training tasks in virtual environments. In the second empirical study, combined temporal (pacing) and spatial (location of entities within the environment) information was presented. Three different spatialization conditions were used: 1) high fidelity using subjective selection of a "best-fit" head-related transfer function, 2) low fidelity using a generalized head-related transfer function, and 3) no spatialization. A pre-post, between-subjects experimental design was used, with eight training trials. The performance measures were average deviation from the desired pace, task completion time, and task accuracy. The results of the second study demonstrated that temporal, non-spatial auditory cues were effective in influencing pace while other cues were present. On the other hand, spatialized auditory cues did not result in significantly faster task completion. Based on these results, a set of design guidelines was proposed that can be used to direct the integration of spatial and temporal auditory cues for supporting training tasks in virtual environments. Taken together, the developed models and the associated guidelines provided a theoretical foundation from which to direct user-centered design of auditory interfaces.
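
As an illustration of the pace measure named in the abstract (average deviation from a predetermined desired pace), the following is a minimal sketch only. The abstract does not say whether the deviation was signed or absolute, or in what units pace was recorded; the sketch assumes absolute deviation, and the function name and sample values are hypothetical, not data from the study.

```python
from statistics import mean


def average_pace_deviation(observed_paces, desired_pace):
    """Mean absolute deviation of observed paces from a desired pace.

    Assumption: absolute (unsigned) deviation; the abstract does not
    specify how the deviation was computed.
    """
    if not observed_paces:
        raise ValueError("observed_paces must not be empty")
    return mean(abs(pace - desired_pace) for pace in observed_paces)


# Hypothetical pre- and post-training samples for one participant
# (arbitrary pace units; invented values for illustration only).
desired = 1.0
pre_training = [1.0, 1.4, 0.8, 1.3]
post_training = [1.1, 1.2, 0.9, 1.0]

pre_dev = average_pace_deviation(pre_training, desired)
post_dev = average_pace_deviation(post_training, desired)

# A positive difference means the participant ended closer to the desired pace.
improvement = pre_dev - post_dev
print(f"pre={pre_dev:.3f} post={post_dev:.3f} improvement={improvement:.3f}")
```

Under these assumptions, a smaller value means the participant stayed closer to the desired pace, so a pre-post comparison reduces to the difference between the two averages.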



Graduation Date





Stanney, Kay


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science


Industrial Engineering and Management Systems

Degree Program

Industrial Engineering








Release Date

July 2008

Length of Campus-only Access


Access Status

Doctoral Dissertation (Open Access)