Multi-object tracking is one of the fundamental problems in computer vision. Almost all multi-object tracking systems consist of two main components: detection and data association. In the detection step, object hypotheses are generated in each frame of a sequence. In the data association step, detections that belong to the same target are linked together to form final trajectories. Several challenges make this problem difficult, such as occlusion, background clutter and pose changes. This dissertation addresses these challenges by focusing on the data association component of tracking, and contributes three novel methods for solving it.

First, this dissertation presents a new framework for multi-target tracking whose data association technique is based on the Generalized Maximum Clique Problem (GMCP) formulation. The majority of current methods, such as bipartite matching, incorporate only a limited temporal locality of the sequence into the data association problem. This makes them inherently prone to ID-switches and to difficulties caused by long-term occlusions, cluttered backgrounds and crowded scenes. In contrast, our approach incorporates both motion and appearance in a global manner. Unlike limited-temporal-locality methods, which incorporate only a few frames into the data association problem, this method incorporates the whole temporal span and solves the data association problem for one object at a time. Generalized Minimum Clique Graph (GMCP) is used to solve the optimization problem of our data association method. The proposed method achieves superior results on several benchmark sequences.

GMCP leads to a more accurate approach to multi-object tracking by considering all the pairwise relationships in a batch of frames; however, it has some limitations. First, it finds target trajectories one by one, without joint optimization over all targets. Second, the optimization uses a greedy solver based on local neighborhood search, which makes it prone to local minima. Finally, the GMCP tracker is slow, which is a burden in time-sensitive applications. To address these problems, we propose a new graph-theoretic problem called the Generalized Maximum Multi Clique Problem (GMMCP). The GMMCP tracker has all the advantages of the GMCP tracker while addressing its limitations. We present a solution to GMMCP in which no simplification is assumed in either the problem formulation or the optimization. GMMCP is NP-hard, but it can be formulated as a Binary-Integer Program for which the solution to small- and medium-sized tracking problems can be found efficiently. To improve speed, Aggregated Dummy Nodes are used to model occlusions and missed detections. This also reduces the size of the input graph without using any heuristics. We show that, with this speed-up method, our tracker lends itself to a real-time implementation, increasing its potential usefulness in many applications. In tests on several tracking datasets, we show that the proposed method outperforms competitive methods.

Thus far we have assumed that the number of people does not exceed a few dozen. However, this is not always the case. In many scenarios, such as marathons, political rallies or religious rites, the number of people in a frame may reach a few hundred or even a few thousand. Tracking in high-density crowd sequences is challenging for several reasons. Human detection methods often fail to localize objects correctly in extremely crowded scenes, which limits the use of data-association-based tracking methods. Additionally, it is hard to extend existing multi-target tracking methods to highly crowded scenes, because the large number of targets increases the computational complexity. Furthermore, the small apparent target size makes it challenging to extract features that discriminate targets from their surroundings. Finally, we present a tracker that addresses these problems. We formulate online crowd tracking as a Binary Quadratic Program in which the detection and data association problems are solved together. Our formulation employs each target's individual information, in the form of appearance and motion, as well as contextual cues in the form of neighborhood motion, spatial proximity and grouping constraints. Due to the large number of targets, state-of-the-art commercial quadratic programming solvers fail to find the solution to the proposed optimization efficiently. To overcome the computational complexity of available solvers, we propose to use the most recent version of the Modified Frank-Wolfe algorithm with SWAP steps. The proposed tracker can track hundreds of targets efficiently and improves on state-of-the-art results by a significant margin on high-density crowd sequences.
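
The global association idea described for the GMCP tracker (one detection chosen per frame, scored by all pairwise relationships, with targets extracted one at a time) can be illustrated with a toy sketch. This is not the dissertation's implementation: the detections, the `affinity` function and its weights are made up for illustration, occlusions and missed detections are ignored, and an exhaustive search stands in for the greedy local neighborhood search mentioned in the abstract.

```python
from itertools import combinations, product
import math

# One inner list per frame; each detection is (x, y, appearance).
# All values are invented for illustration.
frames = [
    [(0.0, 0.0, 0.9), (5.0, 5.0, 0.2)],
    [(1.0, 0.0, 0.8), (5.0, 4.0, 0.25)],
    [(2.0, 0.1, 0.85), (5.0, 3.0, 0.2)],
]

def affinity(a, b):
    """Hypothetical pairwise affinity: higher when two detections are
    close in space and similar in appearance."""
    spatial = math.hypot(a[0] - b[0], a[1] - b[1])
    appearance = abs(a[2] - b[2])
    return -(spatial + 10.0 * appearance)

def best_clique(frames):
    """Pick one detection per frame so that the sum of affinities over
    ALL pairs of picked detections (not only adjacent frames) is maximal.
    Exhaustive search: only feasible for tiny inputs."""
    best_score, best_choice = -math.inf, None
    for choice in product(*(range(len(f)) for f in frames)):
        picked = [frames[t][i] for t, i in enumerate(choice)]
        score = sum(affinity(a, b) for a, b in combinations(picked, 2))
        if score > best_score:
            best_score, best_choice = score, choice
    return best_choice, best_score

def track_one_at_a_time(frames):
    """Extract trajectories sequentially: find the best clique, remove
    its detections, and repeat until some frame has none left."""
    remaining = [list(f) for f in frames]  # copy so the input is untouched
    tracks = []
    while all(remaining):
        choice, _ = best_clique(remaining)
        tracks.append([remaining[t].pop(i) for t, i in enumerate(choice)])
    return tracks

tracks = track_one_at_a_time(frames)
```

Because each trajectory is extracted before the next one is considered, an early choice can never be revised in light of later targets. That is the missing joint optimization the abstract lists as the first limitation of GMCP.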



Shah, Mubarak
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Computer Science
Degree Program: Computer Science
Release Date: May 2016
Access Status: Doctoral Dissertation (Open Access)