Is Noise Always Harmful? Visual Learning From Weakly-Related Data
Keywords
Deep learning; Semi-supervised learning; Weakly-related data
Abstract
Noise is ubiquitous in multimedia data, especially in the Internet era. For example, tags from web users are often incomplete, arbitrary, and only weakly relevant to the visual content. Intuitively, noise in a dataset is harmful to learning tasks, which implies that the huge volumes of image tags from social media cannot be used directly. To collect a reliable training dataset, labor-intensive manual labeling and various learning-based outlier detection techniques are widely used. This paper discusses whether this kind of preprocessing is always needed. We focus on a very common case in image classification, in which the available dataset includes a large number of images that are only weakly related to any target class. We use deep models as the platform and design a series of experiments to compare semi-supervised learning performance with and without weakly related unlabeled data. We find that weakly related data is not always harmful, which is an encouraging finding for research on web image learning.
Publication Date
6-22-2016
Publication Title
Proceedings of 2015 International Conference on Orange Technologies, ICOT 2015
Page Range
181-184
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/ICOT.2015.7498518
Copyright Status
Unknown
Scopus ID
84980349667 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84980349667
STARS Citation
Zhong, Sheng Hua; Liu, Yan; Hua, Kien A.; and Wu, Songtao, "Is Noise Always Harmful? Visual Learning From Weakly-Related Data" (2016). Scopus Export 2015-2019. 4304.
https://stars.library.ucf.edu/scopus2015/4304