Convolutional DLSTM for Crowd Scene Understanding

Keywords

CNN; Crowd Scene; end-to-end; LSTM

Abstract

With the growth of crowd phenomena in the real world, crowd scene understanding is becoming an important task in anomaly detection and public security. Visual ambiguities and occlusions, high density, low mobility and scene semantics, however, make this problem a great challenge. In this paper, we propose an end-to-end deep architecture, Convolutional DLSTM (ConvDLSTM), for crowd scene understanding. ConvDLSTM consists of GoogLeNet Inception V3 convolutional neural networks (CNN) and stacked differential long short-term memory (DLSTM) networks. Different from traditional non-end-to-end solutions which separate the steps of feature extraction and parameter learning, ConvDLSTM utilizes a unified deep model to optimize the parameters of CNN and RNN hand in hand. It thus has the potential of generating a more harmonious model. The proposed architecture takes sequential raw image data as input, and does not rely on tracklet or trajectory detection. It thus has clear advantages over the traditional flow-based and trajectory-based methods, especially in challenging crowd scenarios of high density and low mobility. Taking advantage of the semantic representation of CNN and the memory states of LSTM, ConvDLSTM can effectively analyze both the crowd scene and motion information. Existing LSTM-based crowd scene solutions explore deep temporal information and are claimed to be 'deep in time'. ConvDLSTM, however, models the spatial and temporal information in a unified architecture and achieves 'deep in space and time'. Extensive performance studies on the Violent-Flows and CUHK Crowd datasets show that the proposed technique significantly outperforms state-of-the-art methods.

Publication Date

12-28-2017

Publication Title

Proceedings - 2017 IEEE International Symposium on Multimedia, ISM 2017

Volume

2017-January

Pages

61-68

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/ISM.2017.19

Scopus ID

85045876703 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85045876703
