Faster Computation Of Genome Mappability
Keywords
Genome mappability; Hamming distance; Heavy path decomposition; Suffix tree
Abstract
The k-mappability problem is defined as follows: Given a sequence S[1, n] of length n over a constant alphabet Σ, and two integers k andm ≤ n, compute an array Fk , such that: Fk [i] = |{j , i | dH (S[i, i +m - 1], S[j, j +m - 1]) ≤ k}| The function dH (,) denotes the hamming distance. Derrien et al. [2] introduced this problem in the context of genome analysis. We propose a provably efficient algorithm for 1-mappability with O(n logn) worst case run time. The previous best known bound is O(n log2 n) [1].
Publication Date
8-15-2018
Publication Title
ACM-BCB 2018 - Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Number of Pages
537-
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1145/3233547.3233645
Copyright Status
Unknown
Socpus ID
85056108198 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85056108198
STARS Citation
Hooshmand, Sahar; Abedin, Paniz; Gibney, Daniel; Aluru, Srinivas; and Thankachan, Sharma V., "Faster Computation Of Genome Mappability" (2018). Scopus Export 2015-2019. 10119.
https://stars.library.ucf.edu/scopus2015/10119