Title: A universal compression strategy using sorting transformation
Authors: Bo Liu; Xi Huang; Xiaoguang Liu; Gang Wang; Ming Xu
Addresses: College of Computer and Control Engineering, Nankai University, Tianjin, 300350, China ' College of Computer and Control Engineering, Nankai University, Tianjin, 300350, China ' College of Computer and Control Engineering, Nankai University, Tianjin, 300350, China ' College of Computer and Control Engineering, Nankai University, Tianjin, 300350, China ' QIHU360 Inc., Beijing, 100020, China
Abstract: Although traditional universal compression algorithms can effectively utilise repetition located in a slide window, they cannot take their own advantages for some message source in which similar messages are distributed uniformly. In this paper, we come up with a universal segmenting-sorting compression algorithm to solve this problem. The key idea is to reorder the message source before compressing it with Lz77 algorithm. We design transformation methods for two common data types, corpus of webpages and access log. The experimental results show that segmenting-sorting transformation is truly beneficial to compression ratio. Our new algorithm is able to make compression ratio 20% to 50% lower than naive Lz77 algorithm does and takes almost the same decompression time. For some read-heavy source segmenting-sorting compression can reduce space cost while guaranteeing throughput.
Keywords: segmenting; sorting; Lz77; compression; universal compression method.
DOI: 10.1504/IJCSE.2019.098531
International Journal of Computational Science and Engineering, 2019 Vol.18 No.3, pp.211 - 216
Received: 23 Jun 2016
Accepted: 30 Aug 2016
Published online: 26 Mar 2019 *