## Data-Streams and Histograms (2001)

Citations: 130 (8 self)

### BibTeX

@INPROCEEDINGS{Guha01data-streamsand,

author = {Sudipto Guha and Nick Koudas and Kyuseok Shim},

title = {Data-Streams and Histograms},

booktitle = {},

year = {2001},

pages = {471--475}

}

### Abstract

Histograms are widely used to capture data distributions by representing the data with a small number of step functions. Dynamic programming algorithms that construct optimal histograms exist, but they run in quadratic time and linear space. In this paper we give a linear-time construction of (1 + epsilon)-approximations of optimal histograms that runs in polylogarithmic space. Our results extend to data streams, and in fact generalize to give (1 + epsilon)-approximations for several data-stream problems that require partitioning the index set into intervals. The only assumptions required are that the cost of an interval is monotonic under inclusion (a larger interval has larger cost) and that the cost can be computed or approximated in small space. This identifies a natural class of problems that admit near-optimal data-stream algorithms.
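
For context, the quadratic-time dynamic program that the abstract names as the baseline might look roughly like the sketch below. It is an illustration only, not the paper's (1 + epsilon) streaming algorithm. It assumes the bucket cost is the sum of squared errors around the bucket mean, since the abstract does not name an error measure, and the function name `optimal_histogram_sse` and its parameters are hypothetical.

```python
from typing import List, Sequence


def optimal_histogram_sse(values: Sequence[float], num_buckets: int) -> float:
    """Minimum total squared error of a histogram with at most num_buckets buckets.

    Assumption (not stated in the abstract): each bucket is represented by its
    mean, and its cost is the sum of squared deviations from that mean.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    n = len(values)
    if n == 0:
        return 0.0

    # Prefix sums of values and squared values give any interval cost in O(1).
    prefix_sum: List[float] = [0.0] * (n + 1)
    prefix_sq: List[float] = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix_sum[i + 1] = prefix_sum[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def interval_cost(i: int, j: int) -> float:
        """Squared error of values[i:j] around its mean (requires i < j)."""
        s = prefix_sum[j] - prefix_sum[i]
        q = prefix_sq[j] - prefix_sq[i]
        # Clamp tiny negative results caused by floating-point rounding.
        return max(q - s * s / (j - i), 0.0)

    # best[j] = minimum error for the prefix values[:j] with at most k buckets.
    best = [0.0] + [interval_cost(0, j) for j in range(1, n + 1)]  # k = 1
    for _ in range(2, min(num_buckets, n) + 1):
        nxt = [0.0] * (n + 1)
        for j in range(1, n + 1):
            # The last bucket is values[i:j]; i = 0 means one bucket covers all.
            nxt[j] = min(best[i] + interval_cost(i, j) for i in range(j))
        best = nxt  # Keep only two rows, so extra space stays linear in n.
    return best[n]
```

For instance, `optimal_histogram_sse([1, 1, 5, 5], 2)` evaluates to `0.0`, because the two buckets `[1, 1]` and `[5, 5]` fit the data exactly, while a single bucket gives `16.0`. The sketch takes time quadratic in the input length for each bucket count and keeps only two rows of the table. The squared-error cost used here is monotonic under inclusion, which is the property the abstract requires of an interval cost.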