## Engineering the Compression of Massive Tables: An Experimental Approach (2000)

@MISC{Buchsbaum00engineeringthe,

author = {Adam L. Buchsbaum and Donald F. Caldwell and S. Muthukrishnan},

title = {Engineering the Compression of Massive Tables: An Experimental Approach},

year = {2000}

}

### Abstract

We study the problem of compressing massive tables. We devise a novel compression paradigm, training for lossless compression, which assumes that the data exhibit dependencies that can be learned by examining a small amount of training material. We develop an experimental methodology to test the approach. Our result is a system, pzip, which outperforms gzip by factors of two in compressed size and in both compression and uncompression time for various tabular data. Pzip is now in production use in an AT&T network traffic data warehouse.
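The idea of training for lossless compression can be illustrated with a minimal sketch. This is not pzip's actual algorithm, which the abstract does not describe. It assumes fixed-width byte records, uses Python's `zlib` as a stand-in for gzip, and uses a simple greedy rule to group adjacent columns. The function names (`learn_partition`, `compress_table`, `decompress_table`) are hypothetical. A column grouping is learned once from a small training sample and then reused to compress the full table, with each group compressed separately.

```python
import zlib


def project(rows, lo, hi):
    """Concatenate bytes lo..hi-1 of every fixed-width row (row-major)."""
    return b"".join(r[lo:hi] for r in rows)


def cost(rows, lo, hi):
    """Compressed size of one column group on the given rows."""
    return len(zlib.compress(project(rows, lo, hi), 9))


def learn_partition(training_rows, width):
    """Greedy, left-to-right grouping of adjacent columns (an assumption,
    not the paper's method): extend the current group with the next column
    when that compresses the training sample no worse than starting a new
    group for it."""
    groups = [(0, 1)]
    for c in range(1, width):
        lo, hi = groups[-1]
        merged = cost(training_rows, lo, c + 1)
        split = cost(training_rows, lo, hi) + cost(training_rows, c, c + 1)
        if merged <= split:
            groups[-1] = (lo, c + 1)
        else:
            groups.append((c, c + 1))
    return groups


def compress_table(rows, groups):
    """Compress each learned column group separately."""
    return [zlib.compress(project(rows, lo, hi), 9) for lo, hi in groups]


def decompress_table(blobs, groups, n_rows):
    """Invert compress_table: rebuild each row from its group slices."""
    parts = []
    for blob, (lo, hi) in zip(blobs, groups):
        data = zlib.decompress(blob)
        w = hi - lo
        parts.append([data[i * w:(i + 1) * w] for i in range(n_rows)])
    return [b"".join(p[i] for p in parts) for i in range(n_rows)]
```

In use, `learn_partition` would be run once on a small prefix of the table, and the resulting `groups` would be stored alongside the compressed blobs so that `decompress_table` can restore the rows exactly. Whether such a grouping beats compressing the whole table in one pass depends on the data; the sketch makes no claim about the factors reported above.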
