Abstract

Still-image emotion recognition (ER) has received increasing attention in recent years owing to the tremendous amount of social media content on the Web. Many works offer both categorical and dimensional methods to detect image sentiments, while others focus on extracting the true social signals, such as happiness and anger. Deep learning architectures have delivered great success; however, their dependency on large-scale datasets labeled with (1) emotion in the categorical domain and (2) valence, arousal, and dominance in the dimensional domain introduces challenges that the community is trying to tackle. Emotions carry different semantics when aroused in different contexts, yet "context-sensitive" ER has been largely overlooked in the literature so far. Moreover, while dimensional methods deliver higher accuracy, they have received less attention due to (1) the lack of reliable large-scale labeled datasets and (2) the challenges involved in architecting unsupervised solutions to the problem. Owing to the success of multi-modal ER, single-modal still-image ER, i.e., ER using only still images, remains less explored.
In this work, (1) we first architect a novel, fully automated dataset collection pipeline equipped with a built-in semantic sanitizer; (2) we then build UCF-ER, with 50K images, and LUCFER, the largest labeled ER dataset in the literature, with more than 3.6M images, both labeled with emotion and context; (3) next, we build a single-modal, context-sensitive ER CNN model, fine-tuned on UCF-ER and LUCFER; (4) we then claim and show empirically that infusing context into the unified training process helps achieve a more balanced precision and recall while boosting performance, yielding an overall classification accuracy of 73.12% compared to the state-of-the-art 58.3%; (5) next, we propose an unsupervised approach for ranking continuous emotions in images using canonical polyadic (CP) decomposition, providing theoretical proof that rank-1 CP decomposition can be used as a ranking machine; (6) finally, we show empirically that, when applied to valence rank estimation, our method yields a Pearson correlation coefficient that outperforms the state of the art by a large margin, i.e., by 65.13% (difference) in one experiment and 104.08% (difference) in another.
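As a rough illustration of the idea in item (5), the sketch below fits a rank-1 CP model to a small 3-way tensor with alternating power-style updates and uses the first-mode factor as a score for ordering items. It is not the dissertation's implementation: the toy tensor, the choice of the first mode as the "item" mode, the function name `rank1_cp`, and the synthetic `true_scores` are all assumptions made here for the example, since the abstract does not describe how the tensor is built.

```python
import numpy as np

def rank1_cp(tensor, n_iter=100, tol=1e-10, seed=0):
    """Fit tensor ~ lam * (a outer b outer c) with alternating updates.

    Hypothetical helper for illustration; assumes a 3-way array.
    """
    rng = np.random.default_rng(seed)
    _, J, K = tensor.shape
    # Positive random starting vectors, normalized to unit length.
    b = rng.random(J) + 0.1
    c = rng.random(K) + 0.1
    b /= np.linalg.norm(b)
    c /= np.linalg.norm(c)
    lam = 0.0
    for _ in range(n_iter):
        # Update each factor by contracting the tensor with the other two.
        a = np.einsum("ijk,j,k->i", tensor, b, c)
        a /= np.linalg.norm(a)
        b = np.einsum("ijk,i,k->j", tensor, a, c)
        b /= np.linalg.norm(b)
        c = np.einsum("ijk,i,j->k", tensor, a, b)
        new_lam = np.linalg.norm(c)
        c /= new_lam
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return lam, a, b, c

# Toy data (assumed): five items whose latent scores drive a noisy rank-1 tensor.
true_scores = np.array([0.9, 0.1, 0.5, 0.7, 0.3])
mode2 = np.array([1.0, 2.0, 3.0])
mode3 = np.array([1.0, 0.5])
noise = 0.01 * np.random.default_rng(1).standard_normal((5, 3, 2))
tensor = np.einsum("i,j,k->ijk", true_scores, mode2, mode3) + noise

lam, a, b, c = rank1_cp(tensor)

# Rank items from highest to lowest first-mode loading.
ranking = np.argsort(-a).tolist()

# Pearson correlation between recovered loadings and the synthetic scores.
pearson = float(np.corrcoef(a, true_scores)[0, 1])
print(ranking, round(pearson, 4))
```

In this toy setting the recovered loadings are proportional to the latent scores up to noise, so sorting by them reproduces the intended order; the Pearson correlation coefficient is the same kind of agreement measure cited in item (6).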

Notes

If this is your thesis or dissertation and you want to learn how to access it, or would like more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2020

Semester

Fall

Advisor

Foroosh, Hassan

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Format

application/pdf

Identifier

CFE0008296; DP0023733

Language

English

Release Date

December 2020

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)
