Title

A Memory Efficient Method for Structure-Based RNA Multiple Alignment

Authors

D. DeBlasio; J. Bruand; S. J. Zhang

Abbreviated Journal Title

IEEE-ACM Trans. Comput. Biol. Bioinform.

Keywords

RNA multiple alignment; RNA secondary structure; RNA sequence-structure alignment; iterative alignment; SECONDARY STRUCTURE PREDICTION; SEQUENCE ALIGNMENT; NONCODING RNAS; CLUSTAL-W; PROGRAMS; BENCHMARK; ACCURACY; DATABASE; GENOMES; Biochemical Research Methods; Computer Science, Interdisciplinary Applications; Mathematics, Interdisciplinary Applications; Statistics & Probability

Abstract

Structure-based RNA multiple alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for RNA multiple alignment first generate pairwise RNA structure alignments and then build the multiple alignment using only sequence information. Here we present PMFastR, an algorithm which iteratively uses a sequence-structure alignment procedure to build a structure-based RNA multiple alignment from one sequence with known structure and a database of sequences from the same family. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. The algorithm also provides a method to utilize a multicore environment. We present results on benchmark data sets from BRAliBase, which show that PMFastR performs comparably to other state-of-the-art programs. Finally, we regenerate 607 Rfam seed alignments and show that our automated process creates multiple alignments similar to the manually curated Rfam seed alignments. Thus, the techniques presented in this paper allow for the generation of multiple alignments using sequence-structure guidance while limiting memory consumption. As a result, multiple alignments of long RNA sequences, such as 16S and 23S rRNAs, can easily be generated locally on a personal computer. The software and supplementary data are available at http://genome.ucf.edu/PMFastR.

Journal Title

IEEE-ACM Transactions on Computational Biology and Bioinformatics

Volume

9

Issue/Number

1

Publication Date

1-1-2012

Document Type

Article

Language

English

First Page

1

Last Page

11

WOS Identifier

WOS:000296782200001

ISSN

1545-5963
