Title

New algorithms for finding monad patterns in DNA sequences

Authors

Authors

R. V. Satya;A. Mukherjee

Comments

Authors: contact us about adding a copy of your work at STARS@ucf.edu

Keywords

Computer Science, Artificial Intelligence; Computer Science, Theory &; Methods

Abstract

In this paper, we present two new algorithms for discovering monad patterns in DNA sequences. Monad patterns are of the form (l,d)-k, where l is the length of the pattern, d is the maximum number of mismatches allowed, and k is the minimum number of times the pattern is repeated in the given sample. The time-complexity of some of the best known algorithms to date is O(nt(2)l(d)sigma(d)), where t is the number of input sequences, n is the length of each input sequence, and sigma = \Sigma\ is the size of the alphabet. The first algorithm that we present in this paper takes O(n(2)t(2)l(d/2)) time and O(ntl(d/2)sigma(d/2)) space; and the second algorithm takes O(n(3)t(3)l(d/2)sigma(d/2)) time using O(l(d/2)sigma(d/2)) space. In practice, our algorithms have much better performance provided the d/l ratio is small. The second algorithm performs very well even for large values l and d as long as the d/l ratio is small.

Journal Title

String Processing and Information Retrieval, Proceedings

Volume

3246

Publication Date

1-1-2004

Document Type

Article

Language

English

First Page

273

Last Page

285

WOS Identifier

WOS:000224377200040

ISSN

0302-9743; 3-540-23210-9

Share

COinS