Title

View invariant action recognition using weighted fundamental ratios

Authors

N. Ashraf; Y. P. Shen; X. C. Cao; H. Foroosh

Abbreviated Journal Title

Comput. Vis. Image Underst.

Keywords

View invariance; Pose transition; Action recognition; Action alignment; Fundamental ratios; HUMAN MOTION ANALYSIS; TRACKING PEOPLE; HUMAN MOVEMENT; SPACE; REPRESENTATION; CAPTURE; FLOW; Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Abstract

In this paper, we fully investigate the concept of fundamental ratios, demonstrate their application and significance in view-invariant action recognition, and explore the importance of different body parts in action recognition. A moving plane observed by a fixed camera induces a fundamental matrix F between two frames, where the ratios among the elements in the upper left 2 x 2 submatrix are herein referred to as the fundamental ratios. We show that fundamental ratios are invariant to camera internal parameters and orientation, and hence can be used to identify similar motions of line segments from varying viewpoints. By representing the human body as a set of points, we decompose a body posture into a set of line segments. The similarity between two actions is therefore measured by the motion of line segments and hence by their associated fundamental ratios. We further investigate to what extent a body part plays a role in recognition of different actions and propose a generic method of assigning weights to different body points. Experiments are performed on three categories of data: the controlled CMU MoCap dataset, the partially controlled IXMAS data, and the more challenging uncontrolled UCF-CIL dataset collected on the internet. Extensive experiments are reported on testing (i) view-invariance, (ii) robustness to noisy localization of body points, (iii) effect of assigning different weights to different body points, (iv) effect of partial occlusion on recognition accuracy, and (v) determining how soon our method recognizes an action correctly from the starting point of the query video. (c) 2013 Elsevier Inc. All rights reserved.
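As a rough illustration of the definition in the abstract, the sketch below extracts the upper-left 2 x 2 block of a 3 x 3 fundamental matrix and rescales it so that only the ratios among its four elements remain. The function name `fundamental_ratios`, the use of absolute values, and the normalisation by the sum of the entries are assumptions made for this example; the abstract does not state how the paper normalises the block. The motivation is only that a fundamental matrix is defined up to scale, so any comparable quantity must not depend on that scale.

```python
def fundamental_ratios(F):
    """Return a scale-free version of the upper-left 2x2 block of F.

    F is a 3x3 fundamental matrix given as nested lists.
    Assumption (not taken from the paper): entries are made non-negative
    and divided by their sum, so F and any nonzero multiple of F give the
    same result.
    """
    if len(F) != 3 or any(len(row) != 3 for row in F):
        raise ValueError("F must be a 3x3 matrix")

    # Upper-left 2x2 submatrix, with signs dropped.
    block = [[abs(float(F[i][j])) for j in range(2)] for i in range(2)]

    total = sum(sum(row) for row in block)
    if total == 0.0:
        raise ValueError("upper-left 2x2 block is all zeros")

    # Dividing by the total keeps only the ratios among the four elements.
    return [[value / total for value in row] for row in block]


# Hypothetical rank-2 matrix used purely as example input.
F = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

ratios = fundamental_ratios(F)
# The same matrix at a different (negative) scale yields the same ratios.
rescaled = fundamental_ratios([[-2.5 * v for v in row] for row in F])
```

Two such 2 x 2 ratio blocks, one per line-segment motion, could then be compared with any matrix distance; which distance the authors use is not stated in the abstract.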

Journal Title

Computer Vision and Image Understanding

Volume

117

Issue/Number

6

Publication Date

1-1-2013

Document Type

Article

Language

English

First Page

587

Last Page

602

WOS Identifier

WOS:000317538500002

ISSN

1077-3142
