Title

Data partitions and complex models in Bayesian analysis: The phylogeny of Gymnophthalmid lizards

Authors

T. A. Castoe; T. M. Doan; C. L. Parkinson

Abbreviated Journal Title

Syst. Biol.

Keywords

Autocorrelated gamma; Bayesian analysis; combining data; Gymnophthalmidae; likelihood models; partitioning data; Reptilia; site-specific gamma; SITE RATE VARIATION; MAXIMUM-LIKELIHOOD-ESTIMATION; MITOCHONDRIAL RIBOSOMAL DNA; NUCLEOTIDE SUBSTITUTION; MOLECULAR SYSTEMATICS; SECONDARY STRUCTURE; BOOTSTRAP MEASURES; GENE-SEQUENCES; EMPIRICAL-DATA; RNA STRUCTURES; Evolutionary Biology

Abstract

Phylogenetic studies incorporating multiple loci, and multiple genomes, are becoming increasingly common. Coincident with this trend in genetic sampling, model-based likelihood techniques including Bayesian phylogenetic methods continue to gain popularity. Few studies, however, have examined model fit and sensitivity to such potentially heterogeneous data partitions within combined data analyses using empirical data. Here we investigate the relative model fit and sensitivity of Bayesian phylogenetic methods when alternative site-specific partitions of among-site rate variation (with and without autocorrelated rates) are considered. Our primary goal in choosing a best-fit model was to employ the simplest model that was a good fit to the data while optimizing topology and/or Bayesian posterior probabilities. Thus, we were not interested in complex models that did not practically affect our interpretation of the topology under study. We applied these alternative models to a four-gene data set including one protein-coding nuclear gene (c-mos), one protein-coding mitochondrial gene (ND4), and two mitochondrial rRNA genes (12S and 16S) for the diverse yet poorly known lizard family Gymnophthalmidae. Our results suggest that the best-fit model partitioned among-site rate variation separately among the c-mos, ND4, and 12S + 16S gene regions. We found this model yielded identical topologies to those from analyses based on the GTR+I+G model, but significantly changed posterior probability estimates of clade support. This partitioned model also produced more precise (less variable) estimates of posterior probabilities across generations of long Bayesian runs, compared to runs employing a GTR+I+G model estimated for the combined data. We use this three-way gamma partitioning in Bayesian analyses to reconstruct a robust phylogenetic hypothesis for the relationships of genera within the lizard family Gymnophthalmidae. We then reevaluate the higher-level taxonomic arrangement of the Gymnophthalmidae. Based on our findings, we discuss the utility of nontraditional parameters for modeling among-site rate variation and the implications and future directions for complex model building and testing.

Journal Title

Systematic Biology

Volume

53

Issue/Number

3

Publication Date

1-1-2004

Document Type

Article

Language

English

First Page

448

Last Page

469

WOS Identifier

WOS:000222351000006

ISSN

1063-5157
