Abbreviated Journal Title
BMC Bioinformatics
Keywords
Metagenomics; Binning; Taxonomy-independent; EM Algorithm; Markov properties; INTERPOLATED MARKOV-MODELS; PHYLOGENETIC CLASSIFICATION; TAXONOMIC CLASSIFICATION; MAXIMUM-LIKELIHOOD; GENOMIC FRAGMENTS; DNA-SEQUENCES; MICROBIAL GENOMES; L-TUPLES; ALGORITHM; CHALLENGES; Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
Abstract
Background: Binning environmental shotgun reads is one of the most fundamental tasks in metagenomic studies, in which mixed reads from different species or operational taxonomical units (OTUs) are separated into different groups. While dozens of binning methods are available, there is still room for improvement. Results: We developed a novel taxonomy-independent approach called MBBC (Metagenomic Binning Based on Clustering) to cluster environmental shotgun reads, by considering k-mer frequency in reads and Markov properties of the inferred OTUs. Tested on twelve simulated datasets, MBBC reliably estimated the species number, the genome size, and the relative abundance of each species, independent of whether there are errors in reads. Tested on multiple experimental datasets, MBBC outperformed two state-of-the-art taxonomy-independent methods, in terms of the accuracy of the estimated species number, genome sizes, and percentages of correctly assigned reads, among other metrics. Conclusions: We have developed a novel method for binning metagenomic reads based on clustering. This method is demonstrated to reliably predict species numbers, genome sizes, relative species abundances, and k-mer coverage in simple datasets. Our method also has a high accuracy in read binning. The MBBC software is freely available at http://eecs.ucf.edu/~xiaoman/MBBC/MBBC.html.
Journal Title
BMC Bioinformatics
Volume
16
Publication Date
1-1-2015
Document Type
Article
Language
English
First Page
11
WOS Identifier
ISSN
1471-2105
Recommended Citation
Wang, Ying; Hu, Haiyan; and Li, Xiaoman, "MBBC: an efficient approach for metagenomic binning based on clustering" (2015). Faculty Bibliography 2010s. 6860.
https://stars.library.ucf.edu/facultybib2010/6860