Action Localization In Videos Through Context Walk
Abstract
This paper presents an efficient approach for localizing actions by learning contextual relations, in the form of relative locations between different video regions. We begin by over-segmenting the videos into supervoxels, which preserve action boundaries and reduce the complexity of the problem. During training, we learn context relations that capture the displacements from all supervoxels in a video to those belonging to foreground actions. Then, given a test video, we select a supervoxel at random and use the context information acquired during training to estimate the probability of each supervoxel belonging to the foreground action. The walk proceeds to a new supervoxel and the process is repeated for a few steps. This "context walk" generates a conditional distribution of an action over all the supervoxels. A Conditional Random Field is then used to find action proposals in the video, whose confidences are obtained using SVMs. We validate the proposed approach on several datasets and show that context, in the form of relative displacements between supervoxels, can be extremely useful for action localization. The approach also requires significantly fewer classifier evaluations, in sharp contrast to sliding-window approaches.
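The following is a minimal Python sketch of the context-walk idea summarized in the abstract, not the authors' implementation. It assumes supervoxels are represented by normalized (x, y, t) centroids and reuses one global set of learned displacement vectors at every step; the paper instead retrieves displacements per supervoxel (e.g., via appearance-based matching against training supervoxels), and the Gaussian voting kernel, the greedy move rule, and all names (`context_walk`, `bandwidth`, etc.) are illustrative assumptions. The subsequent CRF and SVM stages are omitted.

```python
import numpy as np

def context_walk(centers, displacements, n_steps=10, bandwidth=0.1, rng=None):
    """Sketch of a context walk over supervoxels.

    centers:        (N, 3) normalized (x, y, t) supervoxel centroids.
    displacements:  (M, 3) offsets learned in training from supervoxels
                    to foreground-action supervoxels (one global set
                    here for simplicity; see lead-in caveats).
    Returns an (N,) distribution over supervoxels of belonging to
    the foreground action.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(centers)
    scores = np.zeros(n)
    current = int(rng.integers(n))  # start the walk at a random supervoxel

    for _ in range(n_steps):
        # Foreground locations implied by the learned context relations.
        predicted = centers[current] + displacements              # (M, 3)
        # Every supervoxel receives a vote from each predicted location,
        # weighted by a Gaussian kernel on their squared distance.
        d2 = ((centers[:, None, :] - predicted[None, :, :]) ** 2).sum(-1)
        scores += np.exp(-d2 / (2.0 * bandwidth ** 2)).sum(axis=1)
        # Greedily move to the currently most confident supervoxel.
        current = int(np.argmax(scores))

    return scores / scores.sum()  # conditional action distribution

if __name__ == "__main__":
    # Synthetic demo: 200 supervoxels, 50 learned displacement vectors.
    rng = np.random.default_rng(0)
    centers = rng.random((200, 3))
    displacements = rng.normal(0.0, 0.05, (50, 3))
    dist = context_walk(centers, displacements, rng=rng)
    print("most likely foreground supervoxel:", dist.argmax())
```

Because each step only scores the N supervoxels rather than scanning all spatiotemporal windows, a few walk steps suffice to concentrate the distribution, which is the source of the reduction in classifier evaluations the abstract claims.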
Publication Date
2-17-2015
Publication Title
Proceedings of the IEEE International Conference on Computer Vision
Volume
2015 International Conference on Computer Vision, ICCV 2015
Number of Pages
3280-3288
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/ICCV.2015.375
Copyright Status
Unknown
Scopus ID
84973931629 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84973931629
STARS Citation
Soomro, Khurram; Idrees, Haroon; and Shah, Mubarak, "Action Localization In Videos Through Context Walk" (2015). Scopus Export 2015-2019. 1946.
https://stars.library.ucf.edu/scopus2015/1946