Abstract
Visual saliency is the ability to select the most relevant data in a scene and thereby reduce the amount of data that needs to be processed. We propose a novel unsupervised approach to detect visual saliency in videos. We employ a hierarchical segmentation technique to obtain supervoxels of a video and, simultaneously, build a dictionary from cuboids of the video. We then create a feature matrix from the coefficients of the dictionary elements. Next, we decompose this matrix into sparse and redundant parts and obtain salient regions using group lasso. Our experiments provide promising results in terms of predicting eye movement. Moreover, we apply our method to the action recognition task and achieve better results.
While saliency detection only highlights important regions, semantic segmentation aims to assign a semantic label to each pixel in the image. Even though semantic segmentation can be achieved by simply applying classifiers to each pixel or region, the results may not be desirable because general context information is not considered. To address this issue, we propose two supervised methods. The first is an approach to discover interactions between labels and regions using a sparse estimate of the precision matrix obtained by graphical lasso. The second is a knowledge-based method that incorporates dependencies among regions in the image during inference. High-level knowledge rules, such as co-occurrence, are extracted from training data and transformed into constraints in an Integer Programming formulation. A difficulty in most supervised semantic segmentation approaches is the lack of sufficient training data. To address this, we present a semi-supervised learning approach that exploits the plentiful unlabeled images available, as well as synthetic images generated via Generative Adversarial Networks (GANs). Furthermore, we propose an extension of the model that uses additional weakly labeled data.
We demonstrate our approaches on three challenging benchmark datasets.
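As an illustration of the group lasso step mentioned in the abstract, the sketch below applies block soft-thresholding, the standard proximal operator of the group-lasso penalty, to a small matrix whose columns are grouped. It is a minimal sketch under assumptions, not the dissertation's solver: the function names, the column grouping (standing in for supervoxels), the threshold `lam`, and the toy matrix are all hypothetical.

```python
import math


def group_soft_threshold(matrix, groups, lam):
    """Block soft-thresholding (proximal operator of the group-lasso penalty).

    matrix: list of rows (hypothetical sparse component of a feature matrix).
    groups: list of column-index lists; each group is assumed to stand for
            one region, e.g. one supervoxel.
    lam:    threshold; groups whose Frobenius norm is <= lam become zero.
    """
    out = [[0.0] * len(row) for row in matrix]
    for cols in groups:
        # Frobenius norm of the column block belonging to this group.
        norm = math.sqrt(sum(row[c] ** 2 for row in matrix for c in cols))
        if norm > lam:
            # Shrink the whole block toward zero by a common factor.
            scale = 1.0 - lam / norm
            for r, row in enumerate(matrix):
                for c in cols:
                    out[r][c] = scale * row[c]
    return out


def surviving_groups(matrix, groups):
    """Indices of groups that still contain a non-zero entry."""
    return [
        i for i, cols in enumerate(groups)
        if any(row[c] != 0.0 for row in matrix for c in cols)
    ]


# Toy example: group 0 has a large norm (5.0), group 1 a small one (0.2).
toy = [[3.0, 4.0, 0.1, 0.1],
       [0.0, 0.0, 0.1, 0.1]]
toy_groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(toy, toy_groups, lam=1.0)
kept = surviving_groups(shrunk, toy_groups)
```

In this toy setting only the first group survives the threshold, which mirrors the idea that whole regions are either kept as salient candidates or suppressed together rather than entry by entry.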
Notes
If this is your thesis or dissertation and you want to learn how to access it, or want more information about readership statistics, contact us at STARS@ucf.edu
Graduation Date
2017
Semester
Fall
Advisor
Shah, Mubarak
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Computer Science
Degree Program
Computer Science
Format
application/pdf
Identifier
CFE0006918
URL
http://purl.fcla.edu/fcla/etd/CFE0006918
Language
English
Release Date
December 2017
Length of Campus-only Access
None
Access Status
Doctoral Dissertation (Open Access)
STARS Citation
Souly, Nasim, "Visual Saliency Detection and Semantic Segmentation" (2017). Electronic Theses and Dissertations. 5682.
https://stars.library.ucf.edu/etd/5682