Transformers, Deep learning, Assistive technology, Human activity, Computer vision, Medical image


This dissertation presents a comprehensive exploration and implementation of attention mechanisms and transformers in several healthcare-related and assistive applications. The overarching goal is to demonstrate successful implementation of state-of-the-art approaches and to provide validated models whose performance can inform future research and development. In Chapter 1, attention mechanisms are applied to the fine-grained classification of white blood cells (WBCs), showcasing their efficacy in medical diagnostics. The proposed multi-attention framework classifies WBC subtypes accurately by capturing discriminative features from various layers, and it outperforms existing approaches used in previous work. More importantly, the attention-based method gave consistently better results than the same model without attention in all three backbone architectures tested (ResNet, XceptionNet, and EfficientNet). Chapter 2 introduces a self-supervised framework that leverages vision transformers for object detection, together with semantic and custom algorithms for collision prediction, applied to assistive technology for the visually impaired. In addition, a multimodal sensory feedback system was designed and fabricated to convey environmental information and potential collisions to the user for real-time navigation and grasping assistance. Chapter 3 presents the implementation of a transformer-based method for operation-relevant human activity recognition (HAR) and demonstrates its performance advantage over another deep learning model, long short-term memory (LSTM). In addition, principal component analysis was used for feature engineering to extract the most discriminative and representative motion features from the instrumented sensors, indicating that joint angle features are more important than body segment orientations. Further, a minimal number and placement of wearable sensors is identified for use in real-world data collection and activity recognition, addressing a critical gap in the field and enhancing the practicality and utility of wearable sensors for HAR. The premise and efficacy of attention-based mechanisms and transformers were confirmed by their classification accuracy as compared to LSTM. The outcomes from these three distinct applications of attention-based mechanisms and transformers, and their demonstrated performance over existing models and methods, support their utility and applicability across various biomedical and human activity research fields. Sharing the custom-designed model architectures, implementation methods, and resulting classification performance has a direct impact on the related fields by allowing direct adoption and implementation of the developed methods.

Completion Date




Committee Chair

Park, Joon-Hyuk


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science


Electrical and Computer Engineering

Degree Program

Electrical Engineering









In copyright

Release Date

May 2029

Length of Campus-only Access

5 years

Access Status

Doctoral Dissertation (Campus-only Access)

Campus Location

Orlando (Main) Campus

Accessibility Status

Meets minimum standards for ETDs/HUTs

Restricted to the UCF community until May 2029; it will then be open access.