ORCID

0009-0005-9279-9220

Keywords

accelerator, deep neural network, graph neural network, data reuse, dataflow, sparsity

Abstract

Machine learning (ML) is pervasive across numerous application domains, such as autonomous driving, scientific computing, and robotics. However, the continuous growth of ML model complexity and data size is placing unprecedented computation and communication demands on current computing systems, especially in the era of large language models. The problem is further compounded by unstructured data and technology limitations.

To address these challenges, this dissertation research explores novel accelerator designs tailored to a wide range of machine learning applications. First, this research investigates a flexible communication fabric that enables efficient training of deep learning applications on chiplet-based accelerators. Second, it explores an efficient accelerator architecture that dynamically handles irregular sparsity when accelerating graph convolutional neural networks. Third, the dissertation uncovers extensive intermediate feature data reuse opportunities, and the associated communication bottlenecks, in complex graph neural network models. Together, these accelerator designs deliver high-performance, energy-efficient, and scalable solutions for emerging machine learning workloads and advance their practical deployment at an unprecedented scale.

Completion Date

2024

Semester

Fall

Committee Chair

Zheng, Hao

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Department of Electrical and Computer Engineering

Degree Program

Computer Engineering

Format

PDF

Identifier

DP0029054

Language

English

Release Date

12-15-2024

Access Status

Dissertation

Campus Location

Orlando (Main) Campus

Accessibility Status

PDF accessibility verified using Adobe Acrobat Pro Accessibility Checker