A Memory Efficient Method for Structure-Based RNA Multiple Alignment

    Authors

    D. DeBlasio; J. Bruand; S. J. Zhang

    Abbreviated Journal Title

    IEEE-ACM Trans. Comput. Biol. Bioinform.

    Keywords

    RNA multiple alignment; RNA secondary structure; RNA sequence-structure alignment; iterative alignment; SECONDARY STRUCTURE PREDICTION; SEQUENCE ALIGNMENT; NONCODING RNAS; CLUSTAL-W; PROGRAMS; BENCHMARK; ACCURACY; DATABASE; GENOMES; Biochemical Research Methods; Computer Science, Interdisciplinary Applications; Mathematics, Interdisciplinary Applications; Statistics & Probability

    Abstract

    Structure-based RNA multiple alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for RNA multiple alignment first generate pairwise RNA structure alignments and then build the multiple alignment using only sequence information. Here we present PMFastR, an algorithm that iteratively uses a sequence-structure alignment procedure to build a structure-based RNA multiple alignment from one sequence with known structure and a database of sequences from the same family. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. The algorithm also provides a method to utilize a multicore environment. We present results on benchmark data sets from BRAliBase, which show that PMFastR performs comparably to other state-of-the-art programs. Finally, we regenerate 607 Rfam seed alignments and show that our automated process creates multiple alignments similar to the manually curated Rfam seed alignments. Thus, the techniques presented in this paper allow for the generation of multiple alignments using sequence-structure guidance, while limiting memory consumption. As a result, multiple alignments of long RNA sequences, such as 16S and 23S rRNAs, can easily be generated locally on a personal computer. The software and supplementary data are available at http://genome.ucf.edu/PMFastR.

    Journal Title

    IEEE-ACM Transactions on Computational Biology and Bioinformatics

    Volume

    9

    Issue/Number

    1

    Publication Date

    1-1-2012

    Document Type

    Article

    Language

    English

    First Page

    1

    Last Page

    11

    WOS Identifier

    WOS:000296782200001

    ISSN

    1545-5963
