Thousands of Cis-Regulatory Sequence Combinations Are Shared by Arabidopsis and Poplar
FACTOR-BINDING SITES; DNA-BINDING; GENE-EXPRESSION; STATISTICAL SIGNIFICANCE; TRANSCRIPTION FACTORS; FUNCTIONAL CLUSTERS; TRANSGENIC TOBACCO; PROMOTER SEQUENCES; MODULES REVEALS; PHYTOCHROME-A; Plant Sciences
The identification of cis-regulatory modules (CRMs) can greatly advance our understanding of gene regulatory mechanisms. Although a CRM can contain binding sites for more than three transcription factors (TFs), studies in plants often consider only the co-occurrence of binding sites of one or two TFs. In addition, CRM studies in plants are limited to combinations of only a few families of TFs. It is thus not clear how widely plant TFs work together, which TFs work together to regulate plant genes, and how the combinations of these TFs are shared by different plants. To fill these gaps, we applied a frequent pattern-mining-based approach to identify frequently used cis-regulatory sequence combinations in the promoter sequences of two plant species, Arabidopsis (Arabidopsis thaliana) and poplar (Populus trichocarpa). A cis-regulatory sequence here corresponds to a DNA motif bound by a TF. We identified 18,638 combinations composed of two to six cis-regulatory sequences that are shared by the two plant species. In addition, using known cis-regulatory sequence combinations, gene function annotation, gene expression data, and known functional gene sets, we showed that the functionality of at least 96.8% and 65.2% of these shared combinations in Arabidopsis is partially supported, under a false discovery rate of 0.1 and 0.05, respectively. Finally, we discovered that 796 of the 18,638 combinations might relate to functions that are important in bioenergy research. Our work will facilitate the study of gene transcriptional regulation in plants.
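As a rough illustration of the general idea behind frequent pattern mining on promoters, the sketch below counts how many promoters in each species contain every motif combination of size two to six, keeps combinations that meet a minimum support, and intersects the results across species. It is not the authors' pipeline: the function name `frequent_combinations`, the `min_support` threshold, the gene identifiers, and the toy motif assignments are all assumptions made for illustration, and the brute-force enumeration stands in for whatever mining algorithm the study actually used.

```python
from collections import Counter
from itertools import combinations


def frequent_combinations(promoters, min_support, max_size=6):
    """Count motif combinations of size 2..max_size across promoters.

    promoters: dict mapping a gene ID to the set of motifs found in its
    promoter. Returns {frozenset(motifs): number of promoters containing
    all of them}, keeping only combinations seen in >= min_support promoters.
    Brute-force enumeration; only suitable for small toy inputs.
    """
    counts = Counter()
    for motifs in promoters.values():
        motifs = sorted(set(motifs))
        for k in range(2, min(max_size, len(motifs)) + 1):
            for combo in combinations(motifs, k):
                counts[frozenset(combo)] += 1
    return {c: n for c, n in counts.items() if n >= min_support}


# Hypothetical toy data: gene IDs and motif assignments are invented.
arabidopsis = {
    "AT_gene1": {"ABRE", "G-box", "MYB"},
    "AT_gene2": {"ABRE", "G-box", "MYB"},
    "AT_gene3": {"ABRE", "G-box"},
    "AT_gene4": {"W-box", "MYB"},
}
poplar = {
    "PT_gene1": {"ABRE", "G-box", "MYB"},
    "PT_gene2": {"ABRE", "G-box"},
    "PT_gene3": {"W-box", "G-box"},
}

freq_at = frequent_combinations(arabidopsis, min_support=2)
freq_pt = frequent_combinations(poplar, min_support=2)

# Combinations that are frequent in both species.
shared = set(freq_at) & set(freq_pt)
for combo in shared:
    print(sorted(combo), freq_at[combo], freq_pt[combo])
```

On real genomes the number of candidate combinations grows quickly with motif count, which is why dedicated frequent pattern-mining algorithms prune the search rather than enumerate every subset as this sketch does.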
"Thousands of Cis-Regulatory Sequence Combinations Are Shared by Arabidopsis and Poplar" (2012). Faculty Bibliography 2010s. 2490.