Mixed Reality, Augmentation in Visual Reality


Human eyes, as the organs that sense light and process visual information, enable us to see the real world. Though invaluable, they give us no way to "edit" the received visual stream or to "switch" to a different channel. The invention of motion pictures and computer technologies in the last century lets us add an extra layer of modification between the real world and our eyes. We consider two major approaches to such modification: offline augmentation and online augmentation. The movie industry has pushed offline augmentation to an extreme level; audiences can experience visual surprises that they have never seen in real life, even though producing the special visual effects may take months or years. Online augmentation, by contrast, requires that modifications be performed in real time. This dissertation addresses problems in both offline and online augmentation.

The first offline problem addressed here is the generation of plausible video sequences after relatively large objects have been removed from the original videos. To maintain temporal coherence among the frames, a motion layer segmentation method is applied. From its output, a set of synthesized layers is generated by applying motion compensation and a region completion algorithm. Finally, a plausibly realistic new video, in which the selected object is removed, is rendered from the synthesized layers and the motion parameters.

The second problem we address is the construction of a blue screen key for video synthesis or blending in Mixed Reality (MR) applications. Blue screen keying, a well-researched area, extracts a range of colors, typically in the blue spectrum, from a captured video sequence to enable the compositing of multiple image sources. Under ideal conditions with uniform lighting and background color, a high-quality key can be generated by commercial products, even in real time. However, an MR application typically involves a head-mounted display (HMD) with poor camera quality, which requires the keying algorithm to be robust in the presence of noise. We propose a three-stage keying algorithm to reduce the noise in the key output. First, a standard blue screen keying algorithm is applied to the input to obtain a noisy key. Second, the image gradient information and the corresponding region are compared with the result of the first stage to remove noise in the blue screen area. Finally, a matting approach is applied on the boundary of the key to improve the key quality.

Another offline problem we address in this dissertation is the acquisition of the correct transformation between the different coordinate frames in an MR application. Typically an MR system includes at least one tracking system, so the 3D coordinate frames that need to be considered include those of the cameras, the tracker, the tracking system, and the world. Accurately deriving the transformation between the HMD camera and the affixed 6-DOF tracker is critical for MR applications. This transformation brings the HMD cameras into the tracking coordinate frame, which in turn overlaps with a virtual coordinate frame to create a plausible mixed visual experience. We use a non-linear optimization method to recover the camera-tracker transformation by minimizing the image reprojection error.

For online applications, we address the problem of extending the luminance range in MR environments. We achieve this by introducing Enhanced Dynamic Range Video, a technique based on differing brightness settings for each eye of a video see-through HMD. We first construct a Video-Driven Time-Stamped Ball Cloud (VDTSBC), which serves as a guideline and a means to store temporal color information for stereo image registration. With the assistance of the VDTSBC, we register each pair of stereo images, taking into account the confounding issue of occlusion occurring within one eye but not the other. Finally, we apply luminance enhancement to the registered image pairs to generate an Enhanced Dynamic Range Video.
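
As an illustration of the kind of standard blue screen keying that the abstract names as the first stage of the keying algorithm, the following is a minimal sketch only, not the dissertation's method. It assumes a simple "blue dominance" test on 8-bit RGB pixels; the function names `blue_screen_key` and `composite`, the pixel layout, and the threshold value are all hypothetical choices made for this example. The gradient-based noise removal and boundary matting stages are not shown.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0..255 (assumed layout)
Image = List[List[Pixel]]


def blue_screen_key(image: Image, threshold: int = 40) -> List[List[int]]:
    """Return a binary key: 1 = foreground (keep), 0 = blue screen (replace).

    Assumption: a pixel belongs to the blue screen when its blue channel
    exceeds the stronger of red and green by more than `threshold`.
    """
    key = []
    for row in image:
        key_row = []
        for r, g, b in row:
            dominance = b - max(r, g)  # how strongly blue dominates this pixel
            key_row.append(0 if dominance > threshold else 1)
        key.append(key_row)
    return key


def composite(foreground: Image, background: Image, key: List[List[int]]) -> Image:
    """Keep foreground pixels where key is 1, take background pixels elsewhere."""
    return [
        [fg if k else bg for fg, bg, k in zip(fg_row, bg_row, key_row)]
        for fg_row, bg_row, key_row in zip(foreground, background, key)
    ]


if __name__ == "__main__":
    captured = [[(10, 20, 200), (200, 180, 160)],
                [(30, 30, 90), (0, 0, 255)]]
    virtual = [[(1, 1, 1), (1, 1, 1)],
               [(1, 1, 1), (1, 1, 1)]]
    key = blue_screen_key(captured)
    print(key)
    print(composite(captured, virtual, key))
```

A hard threshold like this is exactly what becomes unreliable with noisy HMD cameras: isolated pixels in the blue area can fall below the threshold and appear as false foreground, which is the motivation the abstract gives for the later noise-removal and matting stages.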



Graduation Date





Advisor

Hughes, Charles


Degree

Doctor of Philosophy (Ph.D.)


College

College of Engineering and Computer Science


Department

Electrical Engineering and Computer Science

Degree Program

Computer Science








Release Date

July 2008

Length of Campus-only Access


Access Status

Doctoral Dissertation (Open Access)