Space–Time Trade-Offs For Finding Shortest Unique Substrings And Maximal Unique Matches

Keywords

Probabilistic z-fast trie; Rabin–Karp fingerprint; Sparsification; Succinct Data-Structures; Suffix tree

Abstract

Given a string X[1,n] and a position k∈[1,n], a Shortest Unique Substring of X covering k, denoted by Sk, is a substring X[i,j] of X which satisfies the following conditions: (i) i≤k≤j, (ii) i is the only position where there is an occurrence of X[i,j], and (iii) j−i is minimized. Current best-known algorithms for finding Sk require Θ(n) words of working space, and O(n) time. Let τ be a given parameter. We present the following new results. • Given a k∈[1,n], we can compute Sk in O(nτ2log⁡[Formula presented]) time using X and an additional O(n/τ) words of working space.• For every k∈[1,n], we can compute Sk in O(nτ2log⁡n) time using X, and an additional O(n/τ) words and 4n+o(n) bits of working space.• We present an O(nτlogc+1⁡n)-time randomized algorithm that uses n/logc⁡n words in addition to that mentioned above, where c≥0 is an arbitrary constant. In this case, the reported string is unique and covers k, but, with probability at most n−O(1), may not be the shortest. By choosing τ=ω(1), our results imply the first sub-linear space (in addition to the input string) solution to these problems. We also present the following two results. • An algorithm that finds Sk for every k∈[1,n] using O(nlog⁡σ) bits of working space in O(nlog⁡n) time, where σ≤n is the number of distinct symbols in X.• A 4n+o(n)-bit index that can report Sk for any k in O(1) time.As a consequence of our techniques, we also obtain similar space-and-time tradeoffs for a related problem of finding Maximal Unique Matches of two strings (Delcher et al., 1999 [14]).

Publication Date

11-14-2017

Publication Title

Theoretical Computer Science

Volume

700

Number of Pages

75-88

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.1016/j.tcs.2017.08.002

Socpus ID

85028863507 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85028863507

This document is currently not available here.

Share

COinS