ORCID

0000-0002-4319-0444

Keywords

RNA 3D Structure, RNA Structural Motifs, Non-redundant Dataset, Motif Clustering, Graph Neural Network, Motif Families

Abstract

Non-coding RNAs perform diverse and essential cellular functions, and the study of RNA structures is expanding rapidly. The growing volume of RNA 3D structures highlights the need for efficient computational methods to identify recurring structural motifs, which are often conserved and critical to RNA function. However, the complexity and redundancy of RNA structures in existing databases pose significant challenges for motif discovery and structural analysis. To address these challenges, this dissertation presents three complementary computational approaches. First, a non-redundant RNA structural dataset, RNA-NRD, was developed to systematically reduce redundancy in RNA structural data and provide a reliable foundation for downstream analysis. RNA-NRD includes an update pipeline that integrates newly solved RNA structures, keeping the dataset current as new data become available. Second, a semi-supervised RNA structural motif (RSM) clustering tool, GINClus, was developed to identify novel RNA structural motifs and motif families by clustering structurally similar motifs. GINClus uses known motif information to guide the clustering process, enabling the discovery of recurring structural patterns that may underpin RNA function. Finally, to overcome the scarcity of annotated motifs available for semi-supervised learning, a self-supervised RSM clustering and identification tool, Auto-GINClus, was designed. Auto-GINClus clusters all loop regions from the RNA-NRD dataset based solely on structural similarity, enabling the discovery of novel motifs without relying on pre-existing annotations. Overall, this work advances the understanding of RNA structure-function relationships and provides tools for RNA structure analysis and functional annotation.

Completion Date

2026

Semester

Spring

Committee Chair

Zhang, Shaojie

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Computer Science

Format

PDF

Document Type

Dissertation

Identifier

DP0053202

Accessibility Statement

This item was created or digitized prior to April 24, 2027, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.