Probabilistic Counting Algorithms for Data Base Applications (1985) [213 citations — 5 self]
Abstract:
This paper introduces a class of probabilistic counting algorithms with which one can estimate the number of distinct elements in a large collection of data (typically a large file stored on disk) in a single pass using only a small additional storage (typically less than a hundred binary words) and only a few operations per element scanned. The algorithms are based on statistical observations made on bits of hashed values of records. They are by construction totally insensitive to the replicative structure of elements in the file; they can be used in the context of distributed systems without any degradation of performance and prove especially useful in the context of data base query optimisation. © 1985 Academic Press, Inc.
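The idea summarised in the abstract can be illustrated with a minimal Python sketch. It is not the paper's own code: the class name `DistinctCounter`, the choice of 64 bitmaps, and the use of truncated SHA-1 as the hash function are all assumptions made for illustration. Each record is hashed, some hash bits select a bitmap, and the position of the lowest 1 bit in the remaining bits is recorded there. The estimate is then derived from the lowest unset position in each bitmap, scaled by the correction constant 0.77351 from the paper's analysis.

```python
import hashlib

PHI = 0.77351  # bias-correction constant reported in the Flajolet-Martin analysis


class DistinctCounter:
    """Hypothetical bitmap sketch for estimating the number of distinct items."""

    def __init__(self, num_maps=64):
        self.num_maps = num_maps
        self.bitmaps = [0] * num_maps  # one small bitmap per bucket

    @staticmethod
    def _hash(item):
        # Assumption: 64 bits of SHA-1 stand in for the paper's hash function.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        h = self._hash(item)
        bucket, rest = h % self.num_maps, h // self.num_maps
        # Index of the lowest 1 bit; an all-zero remainder is mapped to bit 63.
        rho = (rest & -rest).bit_length() - 1 if rest else 63
        self.bitmaps[bucket] |= 1 << rho

    def merge(self, other):
        # Sketches built on separate machines combine with a bitwise OR.
        self.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]

    def estimate(self):
        total = 0
        for bitmap in self.bitmaps:
            r = 0
            while (bitmap >> r) & 1:  # index of the lowest 0 bit
                r += 1
            total += r
        return self.num_maps / PHI * 2 ** (total / self.num_maps)


if __name__ == "__main__":
    counter = DistinctCounter()
    for i in range(100_000):
        counter.add(i % 20_000)  # 20,000 distinct values, each seen five times
    print(round(counter.estimate()))
```

Because adding a record only sets a bit, repeated records leave the sketch unchanged, and sketches built over separate parts of a file can be combined with `merge`. With 64 bitmaps the estimate should typically land within roughly 10% of the true count for large inputs; for very small inputs this simple form overestimates.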
Citations
| 386 | Access path selection in a relational database management system – Selinger, Astrahan, et al. - 1979 |
| 31 | Approximate counting: a detailed analysis – Flajolet - 1985 |
| 29 | Counting large numbers of events in small registers – Morris - 1978 |
| 14 | Sorting and searching in multisets – Munro, Spira - 1976 |
| 12 | Handbuch der Laplace-Transformation – Doetsch - 1950 |
| 6 | Key to address transformations: A fundamental study based on large existing format files – Lum, Yuen, et al. - 1971 |
| 1 | The Art of Computer Programming, Vol. 3: Sorting and Searching – Knuth - 1973 |

