augmented reality, color constancy, shape tracking, face tracking, active appearance model
The need for real-time video analysis is rapidly increasing in today's world. The decreasing cost of powerful processors and the proliferation of affordable cameras, combined with needs for security, methods for searching the growing collection of video data, and an appetite for high-tech entertainment, have produced an environment where video processing is utilized for a wide variety of applications. Tracking is an element in many of these applications, for purposes like detecting anomalous behavior, classifying video clips, and measuring athletic performance. In this dissertation we focus on augmented reality, but the methods and conclusions are applicable to a wide variety of other areas. In particular, our work deals with achieving real-time performance while tracking with augmented reality systems using a minimum set of commercial hardware. We have built prototypes that use both existing technologies and new algorithms we have developed. While performance improvements would be possible with additional hardware, such as multiple cameras or parallel processors, we have concentrated on getting the most performance with the least equipment. Tracking is a broad research area, but an essential component of an augmented reality system. Tracking of some sort is needed to determine the location of scene augmentation. First, we investigated the effects of illumination on the pixel values recorded by a color video camera. We used the results to track a simple solid-colored object in our first augmented reality application. Our second augmented reality application tracks complex non-rigid objects, namely human faces. In the color experiment, we studied the effects of illumination on the color values recorded by a real camera. Human perception is important for many applications, but our focus is on the RGB values available to tracking algorithms. Since the lighting in most environments where video monitoring is done is close to white, (e.g., fluorescent lights in an office, incandescent lights in a home, or direct and indirect sunlight outside,) we looked at the response to "white" light sources as the intensity varied. The red, green, and blue values recorded by the camera can be converted to a number of other color spaces which have been shown to be invariant to various lighting conditions, including view angle, light angle, light intensity, or light color, using models of the physical properties of reflection. Our experiments show how well these derived quantities actually remained constant with real materials, real lights, and real cameras, while still retaining the ability to discriminate between different colors. This color experiment enabled us to find color spaces that were more invariant to changes in illumination intensity than the ones traditionally used. The first augmented reality application tracks a solid colored rectangle and replaces the rectangle with an image, so it appears that the subject is holding a picture instead. Tracking this simple shape is both easy and hard; easy because of the single color and the shape that can be represented by four points or four lines, and hard because there are fewer features available and the color is affected by illumination changes. Many algorithms for tracking fixed shapes do not run in real time or require rich feature sets. We have created a tracking method for simple solid colored objects that uses color and edge information and is fast enough for real-time operation. We also demonstrate a fast deinterlacing method to avoid "tearing" of fast moving edges when recorded by an interlaced camera, and optimization techniques that usually achieved a speedup of about 10 from an implementation that already used optimized image processing library routines. Human faces are complex objects that differ between individuals and undergo non-rigid transformations. Our second augmented reality application detects faces, determines their initial pose, and then tracks changes in real time. The results are displayed as virtual objects overlaid on the real video image. We used existing algorithms for motion detection and face detection. We present a novel method for determining the initial face pose in real time using symmetry. Our face tracking uses existing point tracking methods as well as extensions to Active Appearance Models (AAMs). We also give a new method for integrating detection and tracking data and leveraging the temporal coherence in video data to mitigate the false positive detections. While many face tracking applications assume exactly one face is in the image, our techniques can handle any number of faces. The color experiment along with the two augmented reality applications provide improvements in understanding the effects of illumination intensity changes on recorded colors, as well as better real-time methods for detection and tracking of solid shapes and human faces for augmented reality. These techniques can be applied to other real-time video analysis tasks, such as surveillance and video analysis.
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Length of Campus-only Access
Doctoral Dissertation (Open Access)
Spencer, Lisa, "Real-time Monocular Vision-based Tracking For Interactive Augmented Reality" (2006). Electronic Theses and Dissertations. 975.