Abstract

The modern information era continuously generates large volumes of data at high speed and across broad geographical distribution. Knowledge and understanding gained by analyzing and learning from such data are invaluable. Such data analytical tasks commonly share several features: the data can be large-scale and geographically distributed; the demand for computing capability can be enormous; tasks can be time-critical; some data can be private; participants can have heterogeneous capabilities and non-IID data; and multiple data analytical tasks can be submitted simultaneously. These features pose challenges to contemporary computing infrastructure and learning models. In view of this, we develop techniques that tackle the above challenges together, toward more efficient collaborative distributed data analysis and learning. We propose a hierarchical framework that supports data analytics on multiple Apache Spark clusters. We propose reinforcement-learning-based resource management approaches that improve overall efficiency and reduce deadline violations when scheduling general and time-critical data analytical workflows among computing resources. We establish a new hybrid framework for efficient privacy-preserving federated learning, and build on it an algorithm that improves asynchronous federated learning among heterogeneous participants with non-IID data. We also propose an asynchronous stochastic gradient descent algorithm, with convergence analysis, for general distributed learning among heterogeneous participants with non-IID data. Experiments demonstrate the efficacy of the proposed approaches.

Notes

If this is your thesis or dissertation and you want to learn how to access it, or would like more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2022

Semester

Spring

Advisor

Wang, Liqiang

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Format

application/pdf

Identifier

CFE0009448; DP0027171

URL

https://purls.library.ucf.edu/go/DP0027171

Language

English

Release Date

November 2023

Length of Campus-only Access

1 year

Access Status

Doctoral Dissertation (Open Access)
