Abstract

Metagenomics is a cultivation-independent approach for obtaining the genomic composition of microbial communities. Microbial communities are ubiquitous in nature, and the microbes associated with the human body play important roles in human health and disease. These roles range from protecting us against infection by other bacteria to causing disease. A deeper understanding of these communities and how they function inside our bodies allows for advances in the treatment and prevention of these diseases. Recent developments in metagenomics have been driven by the emergence of Next-Generation Sequencing and Third-Generation Sequencing technologies, which have enabled cost-effective DNA sequencing and the generation of large volumes of genomic data. These technologies have allowed for the introduction of hybrid DNA assembly techniques to recover the genomes of the constituent microbes. Next-Generation Sequencing technologies sequence DNA fragments into short, paired-end reads with a relatively low error rate, whereas Third-Generation Sequencing technologies sequence much longer DNA fragments to generate longer reads, which can bring contigs together into larger scaffolds but have a higher error rate. Hybrid assemblers leverage both short-read and long-read sequencing technologies, combining them to allow for longer assemblies with lower error rates, and can be a critical step in advancing metagenomics. We evaluate the strengths and weaknesses of the hybrid assembly framework using several state-of-the-art assemblers and simulated human microbiome datasets. Our work provides insights into metagenomic assembly and genome recovery, an important step towards a deeper understanding of the microbial communities that influence our well-being.

Thesis Completion

2022

Semester

Spring

Thesis Chair/Advisor

Yooseph, Shibu

Degree

Bachelor of Science (B.S.)

College

College of Engineering and Computer Science

Department

Computer Science

Degree Program

Computer Science

Language

English

Access Status

Open Access

Release Date

5-1-2022
