Abstract

Deep convolutional neural networks (CNNs) have achieved remarkable performance on visual recognition problems and have been widely adopted in real-world applications, such as Apple's Face ID security system, autonomous vehicles, and automatic image tagging in online album services. A major concern in the development of CNNs is that their computational complexity grows with their accuracy, so there is a continuing need to balance accuracy against complexity when designing CNN models. This dissertation focuses on designing novel structures that enhance both the performance and the efficiency of CNNs. Our efforts fall into two categories: exploiting the redundancy in standard convolutional neural networks so that comparable learning capability can be achieved at lower computational complexity, and improving network performance with distinctive structures that learn better feature representations while adding negligible computational complexity of their own.

To exploit the redundancy in CNNs and reduce their computational complexity, we propose three designs: the Single Intra-Channel Convolutional (SIC) layer, topological sub-divisioning, and the spatial "bottleneck" structure. The SIC layer reduces redundancy by disentangling spatial 2D convolution from linear channel projection. Topological sub-divisioning reduces the density of connections between input and output channels. The spatial "bottleneck" structure exploits the correlation between adjacent pixels to reduce the complexity of the linear channel projection without reducing the spatial resolution of the subsequent layer. Models built on these structures match the performance of state-of-the-art counterparts on a range of computer vision tasks with several-fold reductions in computational complexity, parameter count, and actual running time.

On the non-linearity side, the most direct way to boost network performance is to design a more powerful activation function. We design a Look-up Table Unit activation function that learns its shape from the data, providing sufficient non-linearity for the network to learn more complex feature representations. We also propose a novel layer structure, referred to as the Wide Hidden Expansion (WHE) layer, which implicitly widens the hidden channels and thereby substantially increases the number of activation functions, enhancing the performance of a variety of network architectures.
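
To make the disentanglement idea concrete, the following is a minimal PyTorch sketch of the decomposition the SIC layer is built on: a per-channel 2D spatial filter followed by a separate 1x1 linear projection across channels. The class name SICLayer and all hyperparameters here are illustrative assumptions for exposition, not the dissertation's actual implementation, which the abstract does not detail.

import torch
import torch.nn as nn

class SICLayer(nn.Module):
    """Illustrative SIC-style layer: per-channel spatial filtering
    followed by a separate 1x1 linear projection across channels."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # groups=channels gives each channel its own 2D spatial filter,
        # i.e. the convolution stays within a single channel.
        self.spatial = nn.Conv2d(channels, channels, kernel_size,
                                 padding=kernel_size // 2,
                                 groups=channels, bias=False)
        # Channel mixing is deferred to a separate 1x1 projection.
        self.project = nn.Conv2d(channels, channels, 1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.project(self.spatial(x))))

# Example: a 64-channel feature map passes through unchanged in shape.
layer = SICLayer(channels=64)
y = layer(torch.randn(1, 64, 32, 32))  # -> torch.Size([1, 64, 32, 32])

The complexity saving of such a decomposition is easy to quantify: for a 3x3 layer with 256 input and 256 output channels, a standard convolution costs 256 * 256 * 9 = 589,824 multiply-accumulates per output pixel, while the decomposed form costs 256 * 9 + 256 * 256 = 67,840, roughly an 8.7x reduction, which is the kind of several-fold saving the abstract claims.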

Graduation Date

2020

Semester

Spring

Advisor

Foroosh, Hassan

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Format

application/pdf

Identifier

CFE0008053; DP0023193

URL

https://purls.library.ucf.edu/go/DP0023193

Language

English

Release Date

May 2020

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)
