Fast Zero-Shot Image Tagging
Abstract
The well-known word analogy experiments show that recent word vectors capture fine-grained linguistic regularities in words through linear vector offsets, but it is unclear how well such simple vector offsets can encode visual regularities over words. In this paper we study a particular image-word relevance relation. Our results show that, along a principal direction in the word vector space, the word vectors of an image's relevant tags rank ahead of those of its irrelevant tags. Inspired by this observation, we propose to solve image tagging by estimating the principal direction for an image. In particular, we exploit linear mappings and nonlinear deep neural networks to approximate the principal direction from an input image. The result is a versatile tagging model. It runs fast given a test image, in constant time with respect to the training-set size. It not only gives superior performance on the conventional tagging task on the NUS-WIDE dataset, but also outperforms competitive baselines on annotating images with previously unseen tags.
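To make the ranking idea concrete, below is a minimal Python/NumPy sketch of tagging by a principal direction, under stated assumptions: the mapping W is randomly initialized here purely for illustration (in practice it would be learned, via a linear mapping or a deep network as the abstract describes), and all dimensions, names, and data are hypothetical rather than taken from the authors' code.

    import numpy as np

    WORD_DIM = 300   # assumed word-vector dimensionality (illustrative)
    IMG_DIM = 4096   # assumed image-feature dimensionality (illustrative)

    rng = np.random.default_rng(0)
    # Stand-in for a learned mapping from image features to the
    # principal direction in the word vector space.
    W = rng.standard_normal((WORD_DIM, IMG_DIM)) * 0.01

    def tag_image(image_feature, tag_vectors, tag_names, top_k=5):
        """Rank candidate tags by projecting their word vectors onto the
        principal direction estimated for the image."""
        direction = W @ image_feature           # estimated principal direction
        direction /= np.linalg.norm(direction)  # scale does not change the ranking
        scores = tag_vectors @ direction        # one dot product per candidate tag
        order = np.argsort(-scores)             # relevant tags should rank first
        return [tag_names[i] for i in order[:top_k]]

    # Usage: candidate tags can include words never seen in training
    # (zero-shot), since only their word vectors are needed at test time.
    tags = ["dog", "beach", "sunset", "car"]
    tag_vecs = rng.standard_normal((len(tags), WORD_DIM))  # placeholder word vectors
    img = rng.standard_normal(IMG_DIM)                     # placeholder image feature
    print(tag_image(img, tag_vecs, tags))

Note that test-time cost depends only on the number of candidate tags, not the training-set size, which is the constant-time property the abstract claims.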
Publication Date
12-9-2016
Publication Title
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume
2016-December
Pages
5985-5994
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/CVPR.2016.644
Copyright Status
Unknown
Scopus ID
84986272569 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84986272569
STARS Citation
Zhang, Yang; Gong, Boqing; and Shah, Mubarak, "Fast Zero-Shot Image Tagging" (2016). Scopus Export 2015-2019. 4493.
https://stars.library.ucf.edu/scopus2015/4493