Visual saliency is the ability to select the most relevant data in a scene, reducing the amount of data that needs to be processed. We propose a novel unsupervised approach to detect visual saliency in videos. We employ a hierarchical segmentation technique to obtain supervoxels of a video and, simultaneously, build a dictionary from cuboids of the video. We then create a feature matrix from the coefficients of the dictionary elements, decompose this matrix into sparse and redundant parts, and obtain salient regions using group lasso. Our experiments provide promising results in predicting eye movement. Moreover, we apply our method to the action recognition task and achieve improved results.

Whereas saliency detection only highlights important regions, semantic segmentation aims to assign a semantic label to each pixel in the image. Although semantic segmentation can be achieved by simply applying classifiers to each pixel or region, the results may not be desirable because general context information is not considered. To address this issue, we propose two supervised methods: first, an approach that discovers interactions between labels and regions using a sparse estimate of the precision matrix obtained by graphical lasso; second, a knowledge-based method that incorporates dependencies among regions in the image during inference. High-level knowledge rules, such as co-occurrence, are extracted from training data and transformed into constraints in an Integer Programming formulation. A difficulty in most supervised semantic segmentation approaches is the lack of sufficient training data. To address this, we present a semi-supervised learning approach that exploits the plentiful unlabeled images available, as well as synthetic images generated via Generative Adversarial Networks (GANs). Furthermore, we propose an extension of the model that uses additional weakly labeled data. We demonstrate our approaches on three challenging benchmark datasets.
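To make the knowledge-based inference idea concrete, the following is a minimal sketch and not the formulation used in the dissertation. It assumes a toy label set, hypothetical training annotations, and made-up per-region classifier scores. It extracts one kind of rule (label pairs that never co-occur in training) and treats each such pair as a hard constraint. For brevity, exhaustive enumeration stands in for an integer programming solver; that is only feasible for a handful of regions.

```python
from itertools import product

# Hypothetical label set and training annotations (labels present per image).
LABELS = ["sky", "water", "road", "boat", "car"]
TRAIN_IMAGES = [
    {"sky", "water", "boat"},
    {"sky", "road", "car"},
    {"sky", "water"},
    {"road", "car"},
]

# Hypothetical classifier scores: one row per region, one column per label.
SCORES = [
    [0.90, 0.04, 0.03, 0.02, 0.01],  # region 0 looks like sky
    [0.05, 0.80, 0.10, 0.03, 0.02],  # region 1 looks like water
    [0.02, 0.03, 0.05, 0.40, 0.50],  # region 2: car slightly ahead of boat
]


def forbidden_pairs(images, labels):
    """Return label pairs that never appear together in any training image."""
    pairs = set()
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if not any(a in img and b in img for img in images):
                pairs.add(frozenset((a, b)))
    return pairs


def infer(scores, labels, forbidden):
    """Pick one label per region, maximizing the total score subject to
    the constraint that no forbidden pair appears in the same image.
    Brute force over all assignments (assumption: very few regions)."""
    best, best_val = None, float("-inf")
    for assignment in product(range(len(labels)), repeat=len(scores)):
        used = {labels[k] for k in assignment}
        if any(pair <= used for pair in forbidden):
            continue  # violates a co-occurrence constraint
        val = sum(scores[r][k] for r, k in enumerate(assignment))
        if val > best_val:
            best, best_val = assignment, val
    return [labels[k] for k in best]


# Per-region argmax ignores context; constrained inference does not.
independent = [
    LABELS[max(range(len(LABELS)), key=row.__getitem__)] for row in SCORES
]
constrained = infer(SCORES, LABELS, forbidden_pairs(TRAIN_IMAGES, LABELS))
print(independent)
print(constrained)
```

In this toy setup the independent classifier labels the third region "car" next to "water", a combination never seen in the hypothetical training data, while the constrained inference switches it to "boat". A real system would likely use soft penalties or a dedicated solver rather than hard constraints and enumeration.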
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Doctoral Dissertation (Open Access)
Souly, Nasim, "Visual Saliency Detection and Semantic Segmentation" (2017). Electronic Theses and Dissertations. 5682.