Is Noise Always Harmful? Visual Learning From Weakly-Related Data

Keywords

deep learning; semi-supervised learning; weakly-related data

Abstract

Noise is ubiquitous in multimedia data, especially in the Internet era. For example, tags supplied by web users are often incomplete, arbitrary, and only weakly relevant to the visual content. Intuitively, noise in a dataset is harmful to learning tasks, which implies that the huge volumes of image tags from social media cannot be used directly. To collect reliable training data, labor-intensive manual labeling and various learning-based outlier detection techniques are widely used. This paper examines whether such preprocessing is always needed. We focus on a common case in image classification in which the available dataset includes a large number of images that are only weakly related to any of the target classes. Using deep models as the platform, we design a series of experiments to compare semi-supervised learning performance with and without weakly related unlabeled data. We find that weakly related data is not always harmful, which is an encouraging result for research on learning from web images.

Publication Date

6-22-2016

Publication Title

Proceedings of 2015 International Conference on Orange Technologies, ICOT 2015

Pages

181-184

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/ICOT.2015.7498518

Scopus ID

84980349667 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/84980349667
