Faster Computation Of Genome Mappability

Keywords

Genome mappability; Hamming distance; Heavy path decomposition; Suffix tree

Abstract

The k-mappability problem is defined as follows: Given a sequence S[1, n] of length n over a constant alphabet Σ, and two integers k andm ≤ n, compute an array Fk , such that: Fk [i] = |{j , i | dH (S[i, i +m - 1], S[j, j +m - 1]) ≤ k}| The function dH (,) denotes the hamming distance. Derrien et al. [2] introduced this problem in the context of genome analysis. We propose a provably efficient algorithm for 1-mappability with O(n logn) worst case run time. The previous best known bound is O(n log2 n) [1].

Publication Date

8-15-2018

Publication Title

ACM-BCB 2018 - Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics

Number of Pages

537-

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1145/3233547.3233645

Socpus ID

85056108198 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85056108198

This document is currently not available here.

Share

COinS