ORCID

https://orcid.org/0000-0002-4537-2620

Keywords

Representation space, Contrastive multi-modal learning, Multi-task learning, Out-of-distribution detection, LLM tokenizer optimization, Neuro-symbolic representation

Abstract

Deploying artificial intelligence in the real world requires models that can process diverse data modalities while meeting strict standards of computational efficiency and operational reliability. This dissertation investigates structured multi-modal representation spaces, advancing how embedding and tokenization spaces are constructed, evaluated, and used across domains. To address the challenge of combining shared and task-specific learning objectives in multi-modal environments, we first propose a multi-task contrastive learning framework that partitions the embedding space. This structure accommodates both classification and regression tasks, improving overall accuracy and generalization while retaining the robust feature alignment of contrastive learning. Building on these representations, we address system reliability by developing an angular distance-based out-of-distribution detection method. By formulating a distance transformation consistent with the training process, this approach identifies when a model operates outside its knowledge limits, establishing robust safety boundaries without computationally expensive retraining. To improve data utilization efficiency, we then introduce a neuro-symbolic framework for 3D scene representation. This method automatically converts dense 3D point clouds into compact hybrid formats by replacing recognized neural entities with verified symbolic objects through parameter-space optimization, which substantially reduces the computational overhead of downstream tasks. Finally, we optimize the semantic representation space within Large Language Models to improve inference speed and generation quality. By introducing a Bayesian model of token importance, computed from domain-specific token frequencies and gradients, we update the tokenization space to enhance semantic understanding while minimizing memory requirements. By addressing these fundamental challenges, this dissertation lays the groundwork for reliable and transparent AI systems for real-world applications.

Completion Date

2026

Semester

Spring

Committee Chair

Dr. Hao Zheng

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Department of Electrical and Computer Engineering

Format

PDF

Document Type

Dissertation

Identifier

DP0053181

Release Date

5-15-2027

Available for download on Saturday, May 15, 2027

Accessibility Statement

This item was created or digitized prior to April 24, 2027, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.