Keywords
Identity by descent, Local ancestry
Abstract
Identity by descent (IBD) and local ancestry are essential for population genetic inference, such as understanding genealogical relationships and demographic history through genomic data. The availability of large, high-resolution datasets brings both new opportunities and new computational challenges, creating a need for efficient and accurate methods to infer IBD segments and local ancestry. This dissertation presents three computational methods designed to assist in population genetic inference. First, RaPID-Query is introduced to efficiently query IBD segments for individual haplotypes in a large genotype dataset. This method is based on the positional Burrows-Wheeler transform (PBWT) algorithm and uses a random projection approach. RaPID-Query extracts IBD segments with high accuracy, making it useful for genealogical search. Second, Recomb-Mix is introduced as an efficient local ancestry inference (LAI) method for admixed individual haplotypes using reference population panels. It is based on the Li and Stephens model and a graph optimization formulation. Recomb-Mix infers local ancestry labels across a diverse set of scenarios while remaining competitive in resource efficiency. The high-quality results it produces are beneficial for analyzing population demographic history. Finally, this dissertation examines the definitions of IBD segments in ancestral recombination graphs (ARGs) and promotes a recombination-based definition called identity by direct descent (IBDD). An ARG-based PBWT algorithm, referred to as TS-PBWT, is presented to efficiently extract IBDD segments from a tree sequence. The IBDD segments are robust against IBD coverage inflation in the centromere region of Chromosome 1 and may be more useful for analyzing distant population demographic history than IBD segments defined by the most recent common ancestor.
Overall, these computational methods improve the efficiency of obtaining IBD segments and local ancestry at high resolution, thereby advancing population genetic studies.
Completion Date
2025
Semester
Summer
Committee Chair
Zhang, Shaojie
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Computer Science
Format
Identifier
DP0029626
Language
English
Document Type
Thesis
Campus Location
Orlando (Main) Campus
STARS Citation
Wei, Yuan, "Computational Methods for Population Genetic Inference Using Identity by Descent and Local Ancestry" (2025). Graduate Thesis and Dissertation post-2024. 388.
https://stars.library.ucf.edu/etd2024/388