On the Computation of Multidimensional Aggregates (1996) [183 citations — 17 self]

by Sameet Agarwal, Rakesh Agrawal, Prasad M. Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi

Abstract:

At the heart of all OLAP or multidimensional data analysis applications is the ability to simultaneously aggregate across many sets of dimensions. Computing multidimensional aggregates is a performance bottleneck for these applications. This paper presents fast algorithms for computing a collection of group-bys. We focus on a special case of the aggregation problem: computation of the CUBE operator. The CUBE operator requires computing group-bys on all possible combinations of a list of attributes, and is equivalent to the union of a number of standard group-by operations. We show how the structure of CUBE computation can be viewed in terms of a hierarchy of group-by operations. Our algorithms extend sort-based and hash-based grouping methods with several optimizations, such as combining common operations across multiple group-bys, caching, and using pre-computed group-bys for computing other group-bys. Empirical evaluation shows that the resulting algorithms give much better performanc...
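
To make the abstract's description concrete, the following is a minimal Python sketch of what the CUBE operator computes. It is not the paper's sort-based or hash-based algorithms. `naive_cube` is the baseline of one independent group-by per attribute subset, and `roll_up` illustrates the idea of deriving one group-by from a pre-computed, finer one in the hierarchy. The function names, the `sales` rows, and the choice of SUM as the aggregate are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def group_by(rows, attrs, measure):
    """Sum `measure` for each distinct combination of values of `attrs`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        totals[key] += row[measure]
    return dict(totals)


def naive_cube(rows, dims, measure):
    """Run one independent group-by per subset of `dims` (2^n in total)."""
    return {
        attrs: group_by(rows, attrs, measure)
        for k in range(len(dims), -1, -1)
        for attrs in combinations(dims, k)
    }


def roll_up(parent_attrs, parent_result, child_attrs):
    """Derive a coarser group-by from an already computed finer one.

    `child_attrs` must be a subset of `parent_attrs`. This works for
    SUM because partial sums can be added together.
    """
    positions = [parent_attrs.index(a) for a in child_attrs]
    totals = defaultdict(int)
    for key, value in parent_result.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


# Hypothetical fact table with three dimensions and one measure.
sales = [
    {"product": "pen", "store": "north", "year": 1995, "units": 3},
    {"product": "pen", "store": "south", "year": 1995, "units": 5},
    {"product": "ink", "store": "north", "year": 1996, "units": 2},
    {"product": "pen", "store": "north", "year": 1996, "units": 4},
]
dims = ("product", "store", "year")
cube = naive_cube(sales, dims, "units")
```

With three dimensions the cube holds eight group-bys, from the full `(product, store, year)` grouping down to the empty grouping that carries the grand total. Calling `roll_up(dims, cube[dims], ("product",))` reproduces `cube[("product",)]` without rescanning `sales`, which is the kind of reuse the abstract lists among its optimizations.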

Citations

1040 An Introduction to Probability Theory and Its Applications, Volume I, 3rd Edition – Feller - 1968
853 Combinatorial Optimization: Algorithms and Complexity – Papadimitriou, Steiglitz - 1982
718 An Introduction to Probability Theory and Its – Feller - 1971
544 Query evaluation techniques for large databases – Graefe - 1993
385 Implementing data cubes efficiently – Harinarayan, Rajaraman, et al. - 1996
91 Sampling-based estimation of the number of distinct values of an attribute – Haas, Naughton, et al. - 1995
65 On computing the data cube – Sarawagi, Agrawal, et al. - 1996
55 Statistical databases: Characteristics, problems and some solutions – Shoshani - 1982
31 Adaptive Parallel Aggregation Algorithms – Shatdal, Naughton - 1995
29 Data Cube: A Relational Operator Generalizing Group-By, CrossTab and Sub-Totals – Gray, Bosworth, et al. - 1996
26 Hierarchically split cube forests for decision support: description and tuned design – Johnson, Shasha - 1996
22 The data model and access method of summary data management – Chen, McNamee - 1989
16 TBSAM: An access method for efficient processing of statistical queries – Srivastava, Tan, et al. - 1989
14 Sort versus hash revisited – Graefe, Linville, et al. - 1994
13 Statistical and Scientific Databases – Michalewicz, ed - 1991
10 Indexing for aggregation – Salzberg, Reuter - 1996
8 Understanding the Need for On-Line Analytical Servers – Finkelstein - 1995
4 Managing Multidimensional Data: Harnessing the Power – Weldon - 1995
2 Naughton and Raghu Ramakrishnan. Computation of Multidimensional Aggregates – Deshpande, Agarwal, et al. - 1996
2 Providing OLAP: An – Codd - 1993
2 Statistical and Scientific Databases – Michalewicz - 1992
2 TBSAM: An access method for efficient processing of statistical queries – Srivastava, Tan, et al. - 1989
1 Techniques for Processing of Aggregates in Relational Database Systems – Epstein - 1979
1 Naughton and Karthik Ramasamy. Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies – Shukla, Deshpande, et al. - 1996
1 Rajaraman and Jeff Ullman. Implementing Data Cubes Efficiently – Harinarayan, Anand - 1996
1 Venky Harinarayan, Anand Rajaraman and Je – Gupta - 1996
1 Jeffrey F. Naughton and Karthik Ramasamy. Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies – Shukla, Deshpande - 1996