Abstract

Powered by high-throughput genomic technologies, RNA sequencing (RNA-Seq) can measure transcriptome-wide mRNA expression and molecular activity in cells. Elucidating gene expression at isoform resolution enables the detection of better molecular signatures for phenotype prediction, and the identified biomarkers may provide insights into the functional consequences of disease. This dissertation research focuses on developing advanced machine learning algorithms for mining large-scale RNA-Seq data in cancer transcriptome analysis. A platform-integrated model for transcript quantification (IntMTQ) is developed to improve the performance of RNA-Seq on isoform expression estimation. IntMTQ provides more precise RNA-Seq-based isoform quantification, and the gene expressions learned by IntMTQ consistently provide more and better molecular features for downstream analyses. In light of recent challenges posed by the COVID-19 pandemic, computational methods are developed and applied to RNA-Seq data of lung cancer cell lines to detect novel molecular signatures that are highly correlated with SARS-CoV-2 pathogenesis and prognosis for COVID-19 studies. The results of the data analyses demonstrate that post-transcriptional gene regulation provides additional molecular signatures for COVID-19 therapeutic targets beyond the transcriptional signatures. To further investigate post-transcriptional regulation, a pan-cancer analysis is performed to reveal discrete intronic polyadenylation in the human cancer transcriptome. The identified intronic alternative polyadenylation (APA) profile can add prognostic and predictive power beyond conventional gene expression profiles in cancer survival analysis and phenotype prediction. In view of this, a biological pathway-encoded transformer model is proposed to maximize the use of RNA-Seq data for cancer phenotype prediction.

Graduation Date

2023

Semester

Spring

Advisor

Zhang, Wei

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Identifier

CFE0009868; DP0028138

URL

https://purls.library.ucf.edu/go/DP0028138

Language

English

Release Date

November 2023

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)