Keywords
Efficient Deep Neural Networks, Pruning and Compression
Abstract
Recent advances in computer vision tasks are mainly due to the success of large deep neural networks. Current state-of-the-art models have high computational costs during inference and a large memory footprint, so deploying these large networks on edge devices remains a serious challenge. Furthermore, training these over-parameterized networks is computationally expensive and requires long training times. There is therefore a demand for techniques that reduce training costs and make it possible to deploy neural networks on mobile and embedded devices. This dissertation presents techniques such as designing lightweight network architectures and increasing network resource utilization. These solutions improve the efficiency of large networks during both training and inference.
We first propose an efficient micro-architecture (slim modules) to construct a lightweight Slim-CNN for predicting face attributes. Slim modules use depthwise separable convolutions together with pointwise convolutions, making them computationally efficient for embedded applications. Next, we investigate the problem of obtaining a compact pruned model from an untrained original network in a single-stage process. We introduce our RAPID framework, which distills knowledge from a teacher model to a pruned student model in an online setting. We then analyze the phenomenon of inactive channels in a trained neural network. Examining the gradient updates of these channels, we find that they receive no weight updates after the first few epochs. We therefore present a channel regeneration technique that reinitializes the batch normalization gamma values of all inactive channels. After the regeneration step, the gradient updates of these channels improve, increasing their contribution to network performance.
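As an illustration of the channel regeneration step described above, a minimal PyTorch-style sketch is given below; the gamma threshold used to flag inactive channels and the reinitialization value are illustrative assumptions, not the dissertation's exact settings.

import torch
import torch.nn as nn

def regenerate_inactive_channels(model, gamma_threshold=1e-3, reinit_value=1.0):
    # Reinitialize the batch normalization scaling factor (gamma) of channels
    # whose gamma has collapsed toward zero, so that these channels can again
    # receive meaningful gradient updates. Threshold and reinit value are
    # illustrative assumptions.
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, nn.BatchNorm2d):
                inactive = module.weight.abs() < gamma_threshold
                module.weight[inactive] = reinit_value
                module.bias[inactive] = 0.0
    return model

Such a regeneration step would typically be applied during training, after the early epochs in which some channels stop receiving weight updates.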
Finally, we introduce a method to improve computational efficiency in pre-trained vision transformers by reducing redundancy in visual data. Our method selects image windows, or regions, with high objectness measures, since such regions are likely to contain an object of any class. Across all works in this dissertation, we extensively evaluate the proposed methods and demonstrate that they improve the computational efficiency of deep neural networks during both training and inference.
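To make the window selection concrete, the sketch below keeps only the highest-scoring windows before they are passed to a pre-trained vision transformer; the objectness scores, the keep ratio, and the tensor shapes are placeholder assumptions rather than the method's actual components.

import torch

def select_windows_by_objectness(window_tokens, objectness_scores, keep_ratio=0.5):
    # window_tokens: (num_windows, tokens_per_window, dim) patch embeddings grouped by window
    # objectness_scores: (num_windows,) score from any objectness measure
    # keep_ratio: fraction of windows retained (illustrative value)
    num_keep = max(1, int(window_tokens.size(0) * keep_ratio))
    top_idx = torch.topk(objectness_scores, num_keep).indices
    return window_tokens[top_idx]

Discarding low-objectness windows reduces the number of tokens the transformer must process, which is the source of the computational savings described above.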
Completion Date
2023
Semester
Fall
Committee Chair
Foroosh, Hassan
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Computer Science
Degree Program
Computer Science
Format
application/pdf
Identifier
DP0028091
URL
https://purls.library.ucf.edu/go/DP0028091
Language
English
Release Date
December 2023
Length of Campus-only Access
None
Access Status
Doctoral Dissertation (Open Access)
Campus Location
Orlando (Main) Campus
STARS Citation
Sharma, Ankit, "Optimizing Deep Neural Networks Performance: Efficient Techniques For Training and Inference" (2023). Graduate Thesis and Dissertation 2023-2024. 44.
https://stars.library.ucf.edu/etd2023/44