Title
Dictionary-Based Fast Transform For Text Compression
Abstract
In this paper we present StarNT, a dictionary-based fast lossless text transform algorithm. With a static generic dictionary, StarNT achieves a better compression ratio than almost all other recent efforts based on BWT and PPM. The algorithm uses a ternary search tree to expedite transform encoding. Experimental results show that the average compression time has improved by orders of magnitude compared with our previous algorithm, LIPT, and that the additional time overhead it introduces to the backend compressor is unnoticeable. Based on StarNT, we propose StarZip, a domain-specific lossless text compression utility. Using domain-specific static dictionaries embedded in the system, StarZip achieves an average improvement in compression performance (in terms of BPC) of 13% over bzip2-9, 19% over gzip-9, and 10% over PPMD.
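The abstract says the transform encoder relies on a ternary search tree for fast dictionary lookups. The following is a minimal sketch of that general data structure, not the StarNT implementation: the class names, the `transform` helper, and the toy dictionary with its one-letter codes are all hypothetical, and the paper's actual codeword assignment and escaping rules are not described in this record. A real reversible transform would also need to mark words that are not in the dictionary so they cannot be confused with codes.

```python
from typing import Optional


class Node:
    """One ternary search tree node: a character and three children."""
    __slots__ = ("ch", "lo", "eq", "hi", "code")

    def __init__(self, ch: str) -> None:
        self.ch = ch
        self.lo: Optional["Node"] = None   # characters smaller than ch
        self.eq: Optional["Node"] = None   # next character of the same word
        self.hi: Optional["Node"] = None   # characters larger than ch
        self.code: Optional[str] = None    # set only where a dictionary word ends


class TernarySearchTree:
    """Maps dictionary words to codes (illustrative, not the paper's code)."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, word: str, code: str) -> None:
        if not word:
            raise ValueError("empty words are not supported")
        self.root = self._insert(self.root, word, 0, code)

    def _insert(self, node: Optional[Node], word: str, i: int, code: str) -> Node:
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i, code)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i, code)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1, code)
        else:
            node.code = code
        return node

    def lookup(self, word: str) -> Optional[str]:
        """Return the code for word, or None if word is not in the dictionary."""
        node, i = self.root, 0
        while node is not None and i < len(word):
            ch = word[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(word):
                    return node.code
                node = node.eq
        return None


def transform(text: str, tree: TernarySearchTree) -> str:
    """Replace each whitespace-separated dictionary word with its code.

    Words missing from the dictionary pass through unchanged; this toy
    version does not escape them, so it is not guaranteed to be reversible.
    """
    out = []
    for token in text.split():
        code = tree.lookup(token)
        out.append(code if code is not None else token)
    return " ".join(out)


# Hypothetical toy dictionary; the codes are made up for illustration.
TOY_DICTIONARY = {"the": "a", "of": "b", "compression": "c"}

tree = TernarySearchTree()
for dictionary_word, dictionary_code in TOY_DICTIONARY.items():
    tree.insert(dictionary_word, dictionary_code)

print(transform("the art of compression", tree))
```

Each lookup touches one node per comparison and needs no hashing of the whole word, which is one reason ternary search trees are a common choice for string dictionaries.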
Publication Date
1-1-2003
Publication Title
Proceedings ITCC 2003, International Conference on Information Technology: Computers and Communications
Page Range
176-182
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/ITCC.2003.1197522
Copyright Status
Unknown
Scopus ID
84978916786 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84978916786
STARS Citation
Sun, Weifeng; Zhang, Nan; and Mukherjee, A., "Dictionary-Based Fast Transform For Text Compression" (2003). Scopus Export 2000s. 1927.
https://stars.library.ucf.edu/scopus2000/1927