Quasi-Cubes: A space-efficient way to support approximate multidimensional databases (1998) [15 citations — 5 self]

by Daniel Barbara, Mark Sullivan

Abstract:

A data cube is a popular organization for summary data. A cube is simply a multidimensional structure that contains at each point an aggregate value, i.e., the result of applying an aggregate function to an underlying relation. In practical situations, cubes can require a large amount of storage. The typical approach to reducing storage cost is to materialize parts of the cube on demand. Unfortunately, this lazy evaluation can be a time-consuming operation. In this paper, we propose an approximation technique that reduces the storage cost of the cube without incurring the run-time cost of lazy evaluation. The idea is to characterize regions of the cube by using statistical models whose descriptions take less space than the data itself. Then, the model parameters can be used to estimate the cube cells with a certain level of accuracy. To increase the accuracy, some of the "outliers," i.e., the cells that incur the largest errors when estimated, can be retained. The storage taken by the mod...
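The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical models the paper uses, so the additive model below (row mean plus column mean minus overall mean) is an assumption chosen for brevity, and every function name is hypothetical. The sketch builds a small two-dimensional SUM cube, fits the model to it, keeps the worst-estimated cells as exact outliers, and answers lookups from the model or from the outliers.

```python
from collections import defaultdict


def build_cube(rows):
    """Aggregate (dim1, dim2, measure) tuples into a dense 2-D SUM cube."""
    cube = defaultdict(float)
    for d1, d2, value in rows:
        cube[(d1, d2)] += value
    return dict(cube)


def fit_additive_model(cube):
    """Fit cell ~ row_mean + col_mean - overall_mean (an assumed model)."""
    row_sum, col_sum = defaultdict(float), defaultdict(float)
    row_n, col_n = defaultdict(int), defaultdict(int)
    for (r, c), v in cube.items():
        row_sum[r] += v
        row_n[r] += 1
        col_sum[c] += v
        col_n[c] += 1
    overall = sum(cube.values()) / len(cube)
    row_mean = {r: row_sum[r] / row_n[r] for r in row_sum}
    col_mean = {c: col_sum[c] / col_n[c] for c in col_sum}
    return row_mean, col_mean, overall


def estimate(model, cell):
    """Estimate one cell from the model parameters alone."""
    row_mean, col_mean, overall = model
    r, c = cell
    return row_mean[r] + col_mean[c] - overall


def compress(cube, n_outliers):
    """Return (model, outliers): parameters plus the worst-estimated cells."""
    model = fit_additive_model(cube)
    by_error = sorted(
        cube, key=lambda cell: abs(cube[cell] - estimate(model, cell)), reverse=True
    )
    outliers = {cell: cube[cell] for cell in by_error[:n_outliers]}
    return model, outliers


def lookup(model, outliers, cell):
    """Exact value for a retained outlier, model estimate otherwise."""
    if cell in outliers:
        return outliers[cell]
    return estimate(model, cell)


if __name__ == "__main__":
    rows = [("a", "x", 4), ("a", "x", 6), ("a", "y", 20),
            ("b", "x", 30), ("b", "y", 40)]
    cube = build_cube(rows)
    model, outliers = compress(cube, n_outliers=1)
    for cell in sorted(cube):
        print(cell, cube[cell], lookup(model, outliers, cell))
```

For an r-by-c region this stores r + c + 1 parameters plus the retained outliers instead of r * c cells, which is the storage trade-off the abstract describes. How regions are chosen and how accuracy is bounded is left to the paper.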

Citations

3011 Pattern Classification and Scene Analysis – Duda, Hart - 1973
1588 A theory for multiresolution signal decomposition: The wavelet representation – Mallat - 1989
1182 Orthonormal bases of compactly supported wavelets – Daubechies - 1988
970 Principal Component Analysis – Jolliffe - 1986
529 Data Cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals – Gray, Bosworth, et al. - 1996
500 Categorical Data Analysis – Agresti - 1990
457 Linear Algebra and Its Applications – Strang - 1976
385 Implementing data cubes efficiently – Harinarayan, Rajaraman, et al. - 1996
244 Online aggregation – Hellerstein, Haas, et al. - 1997
215 Research problems in data warehousing – Widom - 1995
166 Non-negative Matrices and Markov Chains – Seneta - 1981
158 Latent semantic indexing: A probabilistic analysis – Papadimitriou, Tamaki, et al. - 1998
85 Adaptive Selectivity Estimation Using Query Feedback – Roussopoulos - 1994
81 Efficiently supporting ad hoc queries in large datasets of time sequences – Korn, Jagadish - 1997
79 Latent semantic indexing (LSI) and TREC-2 – Dumais - 1994
48 Indexing OLAP data – Sarawagi - 1997
48 Recursive Estimation and Time-Series Analysis – Young - 1984
45 relational and multidimensional database systems – OLAP - 1996
24 Introductory Statistics – Wonnacott, Wonnacott - 1972
20 Information retrieval from an incomplete data cube – Dyreson - 1996
18 Some approaches to index design for cube forests – Johnson, Shasha - 1997
5 Approximate Query Processing with Summary Tables in Statistical Databases – Abad-Mota - 1992
5 Bit string compressor with boolean operation processing capability – Glaser, DesJardins, et al. - 1991
3 The Data Warehouse Toolkit: How to Design Dimensional Data Warehouses – Kimball - 1996
3 Fast Computations of Sparse Cubes – Srivastava, Ross - 1997
2 Technology Group. Designing the Data Warehouse on Relational Databases. White Paper – Stanford