ORCID
0000-0003-2755-4727
Keywords
Machine Learning, Artificial Intelligence, Transformers, Drug-Drug Interaction, Proteomics, Bioinformatics
Abstract
A significant catalyst of the recent Artificial Intelligence and Machine Learning revolution has been large language models (LLMs), which underpin prominent AI systems such as ChatGPT, Bard, and Gemini. The majority of high-performance LLMs are based on the transformer architecture, and their recent performance gains are largely driven by increases in layer count and dimensionality, resulting in an exponential rise in the total number of learnable parameters and necessitating substantial data for effective training. This may pose a significant barrier in certain application domains, such as bioinformatics, pharmaceutical research, and medicine, where data is constrained by its sensitive nature, confidentiality restrictions, and personal privacy laws. The purpose of this study is to design architectural-level and training-loop modifications to the transformer architecture that make it more suitable for low-data applications. Our proposed modifications include selective transfer learning, cross transfer learning, hierarchical system design, convolutional transformers, and knowledge graph fusion models. To test the efficacy of the proposed techniques, we have applied them to several applications, including Enzyme Commission (EC) number prediction from full-scale protein sequences. Here, we have linked four ProtBert transformer modules in a hierarchical manner and selectively pretrained them, raising accuracy from 89.93% to 97.88% and the F1 score from 0.9323 to 0.9787 compared to conventional transformer architectures. We have also designed a multimodal, knowledge-graph-integrated transformer model that predicts fatal drug-drug interaction events from drug SMILES and achieves accuracy increases of up to 6.78% in inductive test settings. Lastly, we have developed a parallel, modular transformer architecture for predicting Gene Ontology (GO) terms from full-scale protein sequences that obtains better results with less training data and computational resources than conventional transformer models.
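The hierarchical EC-number pipeline mentioned in the abstract can be pictured as one ProtBert encoder per EC level, with each level's classifier conditioned on the previous level's prediction. The sketch below is purely illustrative and is not the dissertation's code: the HierarchicalECClassifier name, the placeholder class counts per level, and the choice to feed the previous level's logits into the next head are assumptions made for this example; only the public Rostlab/prot_bert checkpoint and the four-level structure of EC numbers come from outside the example.

# Illustrative sketch only (assumptions noted above), not the dissertation's implementation.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

PROTBERT = "Rostlab/prot_bert"  # publicly available pretrained protein language model

class HierarchicalECClassifier(nn.Module):
    def __init__(self, classes_per_level=(7, 70, 250, 5000)):  # placeholder class counts
        super().__init__()
        # One ProtBert module per EC level; each module could be pretrained or
        # frozen selectively before fine-tuning (selective transfer learning).
        self.encoders = nn.ModuleList(
            [BertModel.from_pretrained(PROTBERT) for _ in classes_per_level]
        )
        hidden = self.encoders[0].config.hidden_size
        heads, prev = [], 0
        for n_cls in classes_per_level:
            # Each head sees the pooled sequence embedding plus the previous
            # level's logits, making the prediction hierarchical.
            heads.append(nn.Linear(hidden + prev, n_cls))
            prev = n_cls
        self.heads = nn.ModuleList(heads)

    def forward(self, input_ids, attention_mask):
        logits_per_level, prev_logits = [], None
        for encoder, head in zip(self.encoders, self.heads):
            pooled = encoder(input_ids=input_ids,
                             attention_mask=attention_mask).pooler_output
            feats = pooled if prev_logits is None else torch.cat(
                [pooled, prev_logits], dim=-1)
            prev_logits = head(feats)
            logits_per_level.append(prev_logits)
        return logits_per_level  # one logits tensor per EC level

if __name__ == "__main__":
    tokenizer = BertTokenizer.from_pretrained(PROTBERT, do_lower_case=False)
    seq = "M K T A Y I A K Q R"  # ProtBert expects space-separated amino acids
    batch = tokenizer(seq, return_tensors="pt")
    model = HierarchicalECClassifier()
    print([logits.shape for logits in model(batch["input_ids"], batch["attention_mask"])])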
Completion Date
2024
Semester
Fall
Committee Chair
Yuan, Jiann-Shiun
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Electrical and Computer Engineering
Degree Program
Electrical Engineering
Format
Identifier
DP0029040
Language
English
Release Date
12-15-2024
Document Type
Dissertation
Campus Location
Orlando (Main) Campus
STARS Citation
Tamir, Azwad, "Architectural and Training Regime Modifications to Transformer Models for low data Applications" (2024). Graduate Thesis and Dissertation post-2024. 73.
https://stars.library.ucf.edu/etd2024/73
Accessibility Status
PDF accessibility verified using Adobe Acrobat Pro Accessibility Checker