A great deal of research has been done to improve the performance of Facial Expression Recognition (FER) algorithms, but extracting optimal features to represent expressions remains a challenging task. The biggest drawback is that most work on FER ignores the inter-subject variations in the facial attributes of the individuals present in the data. Hence, the representation extracted for expression recognition is polluted by identity-related features, which degrade the ability of a FER technique to generalize to unseen identities. Previous research shows that extracting identity-invariant expression features is effective at overcoming this subject-identity bias in FER. However, most of those identity-invariant expression representation learning methods rely on hand-engineered feature extraction techniques. Apart from inter-subject variation, other challenges in learning an optimal FER representation are the illumination and head-pose variations present in the data. We believe the key to dealing with these problems in facial expression datasets lies in FER techniques that disentangle the expression representation from the identity features. Therefore, in this dissertation, we first discuss our Reenactment-based Expression-Representation Learning Generative Adversarial Network (REL-GAN), which disentangles expression features from identity information by transferring the expression of one image to the identity of another image (known as face reenactment). Second, we present our Human-to-Animation conditional Generative Adversarial Network (HA-GAN), which overcomes the challenges posed by the illumination and identity variations present in these datasets by estimating a many-to-one identity mapping function using adversarial learning. Third, we present a Transfer-based Expression Recognition Generative Adversarial Network (TER-GAN) that learns an identity-invariant expression representation without requiring any hand-engineered identity-invariant feature extraction technique. Fourth, we discuss the effectiveness of using 3D expression parameters in optimal expression feature learning algorithms. Finally, we present our Action Unit-based Attention Net (AUA-Net), which is trained in a weakly supervised manner to generate expression attention maps for FER.


If this is your thesis or dissertation and you want to learn how to access it, or would like more information about readership statistics, contact us at STARS@ucf.edu.

Hughes, Charlie


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science


Electrical and Computer Engineering

Degree Program

Computer Engineering







Release Date

December 2021

Access Status

Doctoral Dissertation (Open Access)