In recent years, supervised deep learning has achieved remarkable success on a wide range of visual recognition problems. This success has depended on large-scale labeled datasets, and progress has largely been limited to controlled environments. In this dissertation, we present methods that improve the annotation efficiency of deep visual recognition models, and we propose methods that improve the performance of annotation-efficient models in unconstrained open-world settings.

To address the annotation bottleneck in supervised learning, we introduce a pseudo-labeling framework for semi-supervised learning. While consistency regularization methods dominate the field, they rely heavily on domain-specific data augmentations, which limits their applicability. We argue that pseudo-labeling, although a general approach, performs poorly because poorly calibrated models produce erroneous high-confidence predictions, which leads to noisy training. To overcome this, we propose an uncertainty-aware pseudo-label selection method that greatly reduces the number of noisy pseudo-labels. Furthermore, our framework generalizes the pseudo-labeling process to allow the creation of negative pseudo-labels, which can be used for multi-label classification as well as for negative learning to improve single-label classification.

Although this semi-supervised learning method is very effective at reducing annotation costs, it is not suitable for real-world scenarios in which there is limited control over the data collection process. Hence, our next focus is open-world semi-supervised learning, which assumes that labeled and unlabeled data come from different distributions and that unlabeled data may contain samples from unknown classes. We propose a method that uses a pairwise similarity loss to discover novel classes by implicitly clustering them while recognizing samples from known classes. Using a bi-level optimization rule, this pairwise similarity loss exploits the information available in the labeled set. After discovering novel classes, the method transforms the open-world semi-supervised learning problem into a standard semi-supervised learning problem, so that existing semi-supervised learning methods can deliver additional performance gains. Despite being effective, this solution relies on multiple objective functions, requires prior knowledge of the number of unknown classes, and works only on class-balanced data. To overcome these limitations, we propose a second, more practical and streamlined solution for open-world semi-supervised learning. It uses sample uncertainty and incorporates prior knowledge about the class distribution to generate reliable class-distribution-aware pseudo-labels for unlabeled data belonging to both known and unknown classes. This constrained class-distribution-aware pseudo-label generation is an instance of the optimal transport problem, which we solve using the Sinkhorn-Knopp algorithm. The method works with any class distribution and does not require knowing the number of novel classes, making it more practical for deployment.

In the above two works, we assume that samples from novel classes are available during training. However, this assumption is difficult to satisfy in many real-world scenarios. Therefore, to study how models can achieve human-like performance in open-world settings, i.e., identifying new concepts from only a few examples, we shift our focus to few-shot learning. Learning good generalizable features is crucial for solving this problem. To this end, we propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations. Simultaneous optimization for both of these contrasting objectives allows the model to jointly learn features that are not only independent of the input transformation but also encode the structure of geometric transformations. These complementary sets of features help the model generalize well to novel classes with only a few labeled samples. We achieve additional improvements by incorporating a novel self-supervised distillation objective.
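As an illustration of the optimal transport step mentioned in the abstract, the following is a minimal sketch of how Sinkhorn-Knopp iterations can turn model scores into soft pseudo-labels whose class proportions match a given prior. It is a generic entropy-regularized formulation, not the dissertation's implementation: the function name `sinkhorn_pseudo_labels`, the parameters `epsilon` and `n_iters`, and the use of raw logits as the score matrix are all assumptions made for this example, and the uncertainty-based weighting described in the abstract is not modeled here.

```python
import numpy as np


def sinkhorn_pseudo_labels(logits, class_prior, epsilon=0.5, n_iters=100):
    """Soft pseudo-labels whose column marginals follow `class_prior`.

    Hypothetical helper for illustration only.
    logits:      (n_samples, n_classes) model scores for unlabeled data.
    class_prior: (n_classes,) assumed class proportions (normalized below).
    epsilon:     entropic regularization strength (assumed value).
    """
    logits = np.asarray(logits, dtype=np.float64)
    prior = np.asarray(class_prior, dtype=np.float64)
    n, _ = logits.shape

    col_target = prior / prior.sum()   # mass each class should receive
    row_target = np.full(n, 1.0 / n)   # every sample carries equal mass

    # Gibbs kernel; subtracting the max keeps the exponentials bounded.
    q = np.exp((logits - logits.max()) / epsilon)

    for _ in range(n_iters):
        # Rescale columns so class totals match the prior.
        q *= col_target / q.sum(axis=0)
        # Rescale rows so each sample keeps total mass 1/n.
        q *= (row_target / q.sum(axis=1))[:, None]

    # Multiply by n so that each row is a distribution over classes.
    return q * n


# Usage with synthetic scores and an imbalanced, assumed class prior.
rng = np.random.default_rng(0)
scores = 0.5 * rng.normal(size=(200, 4))
soft_labels = sinkhorn_pseudo_labels(scores, [0.4, 0.3, 0.2, 0.1], epsilon=1.0)
hard_labels = soft_labels.argmax(axis=1)
```

Smaller values of `epsilon` give sharper assignments but generally need more iterations to satisfy the class-proportion constraint, so both values above should be treated as placeholders to tune.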
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Doctoral Dissertation (Open Access)
Rizve, Mamshad Nayeem, "Annotation Efficient Visual Recognition: from Semi-Supervised to Few-Shot Learning" (2023). Electronic Theses and Dissertations, 2020-. 1776.