Space-Time Trade-Offs For The Shortest Unique Substring Problem

Keywords

Probabilistic z-fast trie; Rabin-Karp fingerprint; Sparsification; Succinct data-structures; Suffix tree

Abstract

Given a string X[1, n] and a position k ∈ [1, n], the Shortest Unique Substring of X covering k, denoted by Sk, is a substring X[i, j] of X which satisfies the following conditions: (i) i ≤ k ≤ j, (ii) i is the only position where there is an occurrence of X[i, j], and (iii) j - i is minimized. The best-known algorithm [Hon et al., ISAAC 2015] can find Sk for all k ∈ [1, n] in time O(n) using the string X and additional 2n words of working space. Let τ be a given parameter. We present the following new results. For any given k ∈ [1, n], we can compute Sk via a deterministic algorithm in O(nτ2 logn n/τ) time using X and additional O(n/τ) words of working space. For every k ∈ [1, n], we can compute Sk via a deterministic algorithm in O(nτ2 log n) time using X and additional O(n/τ) words and 4n + o(n) bits of working space. For both problems above, we present an O(nτ logc+1 n)-time randomized algorithm that uses n/ logc n words in addition to that mentioned above, where c ≥ 0 is an arbitrary constant. In this case, the reported string is unique and covers k, but with probability at most n-O(1), may not be the shortest. As a consequence of our techniques, we also obtain similar space-and-time tradeoffs for a related problem of finding Maximal Unique Matches of two strings [Delcher et al., Nucleic Acids Research 1999].

Publication Date

12-1-2016

Publication Title

Leibniz International Proceedings in Informatics, LIPIcs

Volume

64

Number of Pages

34.1-34.13

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.4230/LIPIcs.ISAAC.2016.34

Socpus ID

85010726080 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85010726080

This document is currently not available here.

Share

COinS