Keywords
3D Computer Vision, LiDAR Perception, Deep Learning, Autonomous Driving
Abstract
In recent years, LiDAR has emerged as a crucial perception tool for robotics and autonomous vehicles. However, most LiDAR perception methods are adapted from 2D image-based deep learning methods, which are not well suited to the unique geometric structure of LiDAR point cloud data. This domain gap poses challenges for fast-growing LiDAR perception tasks. This dissertation investigates deep network structures tailored to LiDAR point cloud data, with the goal of designing a more efficient and robust LiDAR perception framework. Our approach to this challenge is twofold. First, we recognize that LiDAR point cloud data is characterized by an imbalanced and sparse distribution in 3D space, which traditional voxel-based convolution methods do not capture effectively because they treat the 3D map uniformly. To address this issue, we aim to develop a more efficient feature extraction method by either counteracting the imbalanced feature distribution or incorporating global contextual information using a transformer decoder. Second, beyond the gap between the 2D and 3D domains, we acknowledge that different LiDAR perception tasks have distinct requirements and are therefore typically handled by separate network designs, resulting in significant network redundancy. To address this, we aim to improve the efficiency of the network design by developing a unified multi-task network that shares the feature extraction stage and performs different tasks using task-specific heads. More importantly, we aim to enhance the accuracy of the individual tasks by leveraging the multi-task learning framework to enable mutual improvements. We propose several models based on these motivations and evaluate them on several large-scale LiDAR point cloud perception datasets, achieving state-of-the-art performance. Lastly, we summarize the key findings of this dissertation and propose future research directions.
Completion Date
2023
Semester
Fall
Committee Chair
Foroosh, Hassan
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Computer Science
Format
application/pdf
Identifier
DP0028106
URL
https://purls.library.ucf.edu/go/DP0028106
Language
English
Release Date
December 2023
Length of Campus-only Access
None
Access Status
Doctoral Dissertation (Open Access)
Campus Location
Orlando (Main) Campus
STARS Citation
Zhou, Zixiang, "Towards a Robust and Efficient Deep Neural Network for the Lidar Point Cloud Perception" (2023). Graduate Thesis and Dissertation 2023-2024. 52.
https://stars.library.ucf.edu/etd2023/52