Abstract
Metagenomics is a cultivation-independent approach for obtaining the genomic composition of microbial communities. Microbial communities are ubiquitous in nature, and microbes associated with the human body play important roles in human health and disease. These roles range from protecting us against infection by other bacteria to causing disease themselves. A deeper understanding of these communities and how they function inside our bodies allows for advances in the treatment and prevention of these diseases. Recent developments in metagenomics have been driven by the emergence of Next-Generation Sequencing and Third-Generation Sequencing technologies, which have enabled cost-effective DNA sequencing and the generation of large volumes of genomic data. These technologies have allowed for the introduction of hybrid DNA assembly techniques to recover the genomes of the constituent microbes. Next-Generation Sequencing technologies produce short, paired-end reads from DNA fragments with a relatively low error rate, whereas Third-Generation Sequencing technologies sequence much longer DNA fragments to generate longer reads, which help join contigs into larger scaffolds but carry a higher error rate. Hybrid assemblers leverage both short-read and long-read sequencing technologies, combining them to produce longer assemblies with lower error rates, and can be a critical step in the advancement of metagenomics. We evaluate the strengths and weaknesses of the hybrid assembly framework using several state-of-the-art assemblers and simulated human microbiome datasets. Our work provides insights into metagenomic assembly and genome recovery, an important step towards a deeper understanding of the microbial communities that influence our well-being.
Thesis Completion
2022
Semester
Spring
Thesis Chair/Advisor
Yooseph, Shibu
Degree
Bachelor of Science (B.S.)
College
College of Engineering and Computer Science
Department
Computer Science
Degree Program
Computer Science
Language
English
Access Status
Open Access
Release Date
5-1-2022
Recommended Citation
Pavini Franco Ferreira, Matheus, "Comparative Evaluation of Assemblers for Metagenomic Data Analysis" (2022). Honors Undergraduate Theses. 1185.
https://stars.library.ucf.edu/honorstheses/1185