In-Place Algorithms For Exact And Approximate Shortest Unique Substring Problems
Keywords
In-place algorithms; Shortest unique substring; String pattern matching
Abstract
We revisit the exact shortest unique substring (SUS) finding problem and, motivated by its applications in fields such as computational biology, propose an approximate version in which mismatches are allowed. We design a generic in-place framework that solves both the exact and the approximate k-mismatch SUS finding problems, using the minimum 2n memory words, each of ⌈log_2 n⌉ bits, plus n bytes of space, where n is the input string size. With this framework, we can find the exact and approximate k-mismatch SUS for every string position in a total of O(n) and O(n^2) time, respectively, regardless of the value of k. Our framework does not involve any compressed or succinct data structures and is thus practical and easy to implement. An experimental study shows that the peak memory usage of our proposal is consistently 9n bytes for any string of size n, validating the claim that our solution is in-place. Further, our proposal uses much less memory and is much faster than the best existing work that has an implementation for exact SUS finding.
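To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the paper's in-place framework and does not achieve the stated time or space bounds. It assumes the common definition that the SUS for position p is the shortest substring covering p that has no other occurrence in the string, and that a substring is k-mismatch unique if no other substring of the same length is within Hamming distance k of it; ties are broken by the leftmost start. The function names (`naive_sus`, `is_k_unique`, `hamming_at_most`, `in_place_bytes`) are hypothetical, and `in_place_bytes` assumes 4-byte memory words, which is one reading consistent with the reported 9n bytes.

```python
def hamming_at_most(a, b, k):
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False
    return True


def is_k_unique(s, i, length, k):
    """Return True if s[i:i+length] has no other k-mismatch occurrence in s."""
    target = s[i:i + length]
    for j in range(len(s) - length + 1):
        if j != i and hamming_at_most(target, s[j:j + length], k):
            return False
    return True


def naive_sus(s, p, k=0):
    """Brute-force (start, length) of a shortest k-mismatch unique substring
    covering 0-based position p; k=0 gives the exact SUS (assumed definition)."""
    n = len(s)
    for length in range(1, n + 1):
        # Only starts whose window [i, i+length) contains p and fits in s.
        first = max(0, p - length + 1)
        last = min(p, n - length)
        for i in range(first, last + 1):
            if is_k_unique(s, i, length, k):
                return (i, length)
    return None  # only reached for an empty string or out-of-range p


def in_place_bytes(n, word_bytes=4):
    """Space of 2n words plus n bytes, assuming word_bytes-byte words."""
    return 2 * n * word_bytes + n
```

For example, `naive_sus("aab", 0)` examines "a" (repeated), then "aa" (unique), and returns start 0 with length 2. The whole string is always a valid answer, so the search terminates for any in-range position.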
Publication Date
8-22-2017
Publication Title
Theoretical Computer Science
Volume
690
Pages
12-25
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1016/j.tcs.2017.05.032
Copyright Status
Unknown
Scopus ID
85020810018 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85020810018
STARS Citation
Hon, Wing Kai; Thankachan, Sharma V.; and Xu, Bojian, "In-Place Algorithms For Exact And Approximate Shortest Unique Substring Problems" (2017). Scopus Export 2015-2019. 4784.
https://stars.library.ucf.edu/scopus2015/4784