Title
LIPT: A Lossless Text Transform To Improve Compression
Abstract
We propose a dictionary-based, reversible, lossless text transformation called LIPT (length index preserving transform), which can be applied to a source text to improve the ability of existing algorithms to compress it. In LIPT, the length of the input word and the offset of the word in the dictionary are denoted with letters of the alphabet. Our encoding scheme makes use of the recurrence of same-length words in the English language to create context in the transformed text that entropy coders can exploit. LIPT also achieves some compression at the preprocessing stage while retaining enough context and redundancy for the compression algorithms to give better results. On our test corpus, Bzip2 with LIPT gives a 5.24% improvement in average BPC (bits per character) over Bzip2 without LIPT, and PPMD with LIPT gives a 4.46% improvement in average BPC over PPMD without LIPT.
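The idea of denoting a word's length and its dictionary offset with letters can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `*` marker, the 52-letter alphabet, the offset numbering, the toy dictionary, and all function names are assumptions made for illustration, since the abstract does not specify the codeword format. The sketch also assumes the input text contains no tokens that begin with `*`.

```python
import string

# Assumed alphabet: 26 lowercase + 26 uppercase letters (52 symbols).
LETTERS = string.ascii_lowercase + string.ascii_uppercase


def build_blocks(words):
    """Group dictionary words by length, keeping the given order."""
    blocks = {}
    for w in words:
        blocks.setdefault(len(w), []).append(w)
    return blocks


def offset_to_letters(n):
    """Write an offset as letters; offset 0 maps to the empty string (assumption)."""
    out = []
    while n > 0:
        n -= 1
        out.append(LETTERS[n % 52])
        n //= 52
    return "".join(reversed(out))


def letters_to_offset(s):
    """Inverse of offset_to_letters."""
    n = 0
    for ch in s:
        n = n * 52 + LETTERS.index(ch) + 1
    return n


def encode_word(word, blocks):
    """Replace a dictionary word by marker + length letter + offset letters."""
    block = blocks.get(len(word))
    if block is None or word not in block or len(word) > 52:
        return word  # words outside the dictionary pass through unchanged
    return "*" + LETTERS[len(word) - 1] + offset_to_letters(block.index(word))


def decode_token(token, blocks):
    """Recover the original word from a codeword; other tokens pass through."""
    if not token.startswith("*"):
        return token
    length = LETTERS.index(token[1]) + 1
    return blocks[length][letters_to_offset(token[2:])]


def transform(text, blocks):
    return " ".join(encode_word(w, blocks) for w in text.split(" "))


def inverse(text, blocks):
    return " ".join(decode_token(t, blocks) for t in text.split(" "))


# Toy dictionary (hypothetical); a real one would be far larger.
blocks = build_blocks(["the", "and", "for", "text", "word"])
encoded = transform("the word and the text zebra", blocks)
decoded = inverse(encoded, blocks)
```

In this sketch every three-letter dictionary word shares the prefix `*c` and every four-letter word shares `*d`, which is one way to picture how recurring same-length words could create repeated context for a downstream compressor such as Bzip2 or PPMD.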
Publication Date
1-1-2001
Publication Title
Proceedings - International Conference on Information Technology: Coding and Computing, ITCC 2001
Pages
452-460
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/ITCC.2001.918838
Copyright Status
Unknown
Scopus ID
84961783374 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84961783374
STARS Citation
Awan, F. S. and Mukherjee, A., "LIPT: A Lossless Text Transform To Improve Compression" (2001). Scopus Export 2000s. 336.
https://stars.library.ucf.edu/scopus2000/336