Results 1-10 of 24
Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals
, 1996
Cited by 693 (7 self)
Data analysis applications typically aggregate data across many dimensions looking for anomalies or unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or one-dimensional aggregates. Applications need the N-dimensional generalization of these operators. This paper defines that operator, called the data cube or simply cube. The cube operator generalizes the histogram, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers. The novelty is that cubes are relations. Consequently, the cube operator can be imbedded in more complex non-procedural data analysis programs. The cube operator treats each of the N aggregation attributes as a dimension of N-space. The aggregate of a particular set of attribute values is a point in this space. The set of points forms an N-dimensional cube. Super-aggregates are computed by aggregating the N-cube to lower dimensional spaces. This paper (1) explains the cube and rollup operators, (2) shows how they fit in SQL, (3) explains how users can define new aggregate functions for cubes, and (4) discusses efficient techniques to compute the cube. Many of these features are being added to the SQL Standard.
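As a rough illustration of the operator this abstract describes, the following minimal Python sketch sums a measure over every subset of the grouping attributes, so that the base cells and all super-aggregates end up in one relation-like mapping. The function name `cube`, the `ALL` marker string, and the sample sales rows are assumptions made for this illustration; it is a naive enumeration, not one of the efficient techniques the paper discusses.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # assumed marker for a dimension that has been aggregated away


def cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims`; returns {key_tuple: total}."""
    result = defaultdict(int)
    for k in range(len(dims) + 1):
        for kept in combinations(dims, k):
            for row in rows:
                # Dimensions outside `kept` collapse to the ALL marker.
                key = tuple(row[d] if d in kept else ALL for d in dims)
                result[key] += row[measure]
    return dict(result)


# Hypothetical sample data, used only to exercise the sketch.
sales = [
    {"model": "Chevy", "year": 1994, "units": 10},
    {"model": "Chevy", "year": 1995, "units": 20},
    {"model": "Ford", "year": 1994, "units": 5},
]
totals = cube(sales, ["model", "year"], "units")
```

Here `totals[("Chevy", ALL)]` holds the per-model subtotal and `totals[(ALL, ALL)]` the grand total, alongside the ordinary two-dimensional group-by cells.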
On the computation of multidimensional aggregates
 In Proceedings of the International Conference on Very Large Databases
, 1996
Cited by 205 (18 self)
At the heart of all OLAP or multidimensional data analysis applications is the ability to simultaneously aggregate across many sets of dimensions. Computing multidimensional aggregates is a performance bottleneck for these applications. This paper presents fast algorithms for computing a collection of group-bys. We focus on a special case of the aggregation problem: computation of the CUBE operator. The CUBE operator requires computing group-bys on all possible combinations of a list of attributes, and is equivalent to the union of a number of standard group-by operations. We show how the structure of CUBE computation can be viewed in terms of a hierarchy of group-by operations. Our algorithms extend sort-based and hash-based grouping methods with several optimizations, like combining common operations across multiple group-bys, caching, and using precomputed group-bys for computing other group-bys. Empirical evaluation shows that the resulting algorithms give much better performance compared to straightforward methods. This paper combines work done concurrently on computing the data cube by two different teams as reported in [SAG96] and [DANR96].
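One of the optimizations named in this abstract, using precomputed group-bys to compute other group-bys, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `aggregate`, the tuple layout `(a, b, value)`, and the sample rows are hypothetical, the aggregate is a plain SUM, and none of the paper's sort-based or hash-based machinery is reproduced.

```python
from collections import defaultdict


def aggregate(cells, keep):
    """Roll a {key_tuple: sum} mapping up to the key positions listed in `keep`."""
    out = defaultdict(int)
    for key, total in cells.items():
        out[tuple(key[i] for i in keep)] += total
    return dict(out)


# Hypothetical raw rows: (attribute A, attribute B, measure).
raw = [("a1", "b1", 4), ("a1", "b2", 6), ("a2", "b1", 1), ("a1", "b1", 2)]

# One pass over the raw rows produces the finest group-by, (A, B).
ab = defaultdict(int)
for a, b, value in raw:
    ab[(a, b)] += value
ab = dict(ab)

# Coarser group-bys reuse precomputed results instead of rescanning `raw`.
by_a = aggregate(ab, [0])    # derived from the (A, B) group-by
by_b = aggregate(ab, [1])    # derived from the (A, B) group-by
grand = aggregate(by_a, [])  # derived from the smaller (A) group-by
```

Reusing a parent works here because SUM can be combined from partial sums; an aggregate such as median would not roll up this way.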
Power-Conserving Computation of Order-Statistics over Sensor Networks
 In PODS
, 2004
Cited by 66 (1 self)
We study the problem of power-conserving computation of order statistics in sensor networks. Significant power-reducing optimizations have been devised for computing simple aggregate queries such as count, average, or max over sensor networks. In contrast, aggregate queries such as median have seen little progress over the brute force approach of forwarding all data to a central server. Moreover, battery life of current sensors seems largely determined by communication costs; therefore we aim to minimize the number of bytes transmitted. Unoptimized aggregate queries typically impose extremely high power consumption on a subset of sensors located near the server. Metrics such as total communication cost underestimate the penalty of such imbalance: network lifetime may be dominated by the worst-case replacement time for depleted batteries.
Maintaining Data Cubes under Dimension Updates
, 1999
Cited by 45 (9 self)
OLAP systems support data analysis through a multidimensional data model, according to which data facts are viewed as points in a space of application-related "dimensions", organized into levels which conform a hierarchy. The usual assumption is that the data points reflect the dynamic aspect of the data warehouse, while dimensions are relatively static. However, in practice, dimension updates are often necessary to adapt the multidimensional database to changing requirements. Structural updates can also take place, like addition of categories or modification of the hierarchical structure. When these updates are performed, the materialized aggregate views that are typically stored in OLAP systems must be efficiently maintained. These updates are poorly supported (or not supported at all) in current commercial systems, and have received little attention in the research literature. We present a formal model of dimension updates in a multidimensional model, a collection of primitive opera...
Reasoning about Summarizability in Heterogeneous Multidimensional Schemas
 In IEEE ICDT
, 2001
Cited by 18 (2 self)
In OLAP applications, data are modeled as points in a multidimensional space. Dimensions themselves have structure, described by a schema and an instance; the schema is basically a directed acyclic graph of granularity levels, and the instance consists of a set of elements for each level and mappings between these elements, usually called rollup functions. Current dimension models restrict dimensions in various ways; for example, rollup functions are restricted to be total. We relax these restrictions, yielding what we call heterogeneous schemas, which describe more naturally and cleanly many practical situations. In the context of heterogeneous schemas, the notion of summarizability becomes more complex. An aggregate view defined at some granularity level is summarizable from a set of precomputed views defined at other levels if the rollup functions can be used to compute the first view from the set of views. In order to study summarizability in heterogeneous schemas, ...
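To make the notion of summarizability in this abstract concrete, here is a small Python sketch, offered as an illustration only and not as the authors' formalism. The levels (Store, City, Country), the element names, and the helper `roll_up` are all hypothetical. The Store-to-City rollup is deliberately partial, which is the kind of situation a heterogeneous schema permits, and the sketch compares the Country view computed from base facts with the one derived from a precomputed City view.

```python
from collections import defaultdict


def roll_up(view, mapping):
    """Aggregate a {element: sum} view through a possibly partial rollup mapping."""
    out = defaultdict(int)
    for elem, total in view.items():
        if elem in mapping:  # elements with no parent at the target level are dropped
            out[mapping[elem]] += total
    return dict(out)


# Hypothetical facts at the Store level and hypothetical rollup functions.
sales = {"s1": 10, "s2": 5, "s3": 7}
store_city = {"s1": "Toronto", "s2": "Toronto"}  # partial: s3 has no city
city_country = {"Toronto": "Canada"}
store_country = {"s1": "Canada", "s2": "Canada", "s3": "Canada"}

direct = roll_up(sales, store_country)                         # from base facts
via_city = roll_up(roll_up(sales, store_city), city_country)   # from the City view
summarizable = direct == via_city
```

In this instance the two results differ because store `s3` bypasses the City level, so the Country view cannot be computed from the City view alone.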
Cardinality-based Inference Control in Sum-only Data Cubes
 In Proceedings of the 7th European Symposium on Research in Computer Security (ESORICS 2002)
, 2002
Cited by 11 (6 self)
This paper deals with the inference problems in data warehouses and decision support systems such as online analytical processing (OLAP) systems.
Modeling and Querying Multidimensional Databases: An Overview
, 1999
Cited by 10 (3 self)
This paper presents some highlights about the concept of multidimensional database and On-Line Analytical Processing (OLAP), a technology used in the context of decision support. It mainly focuses on multidimensional data models and manipulations. We propose both an inventory and a classification of the elementary operations underlying OLAP treatments. We describe several typical complex manipulations based on these elementary operations. Throughout the paper, we present the informal concepts stemming from users' needs and the formal proposals of research works. Hence it provides an entry point in the domain of OLAP modeling and querying.
Design and Implementation of On-Line Analytical Processing (OLAP) of Spatial Data
, 1997
Cited by 8 (2 self)
Online analytical processing (OLAP) has gained popularity in the database industry. With a huge amount of data stored in spatial databases and the introduction of spatial components to many relational or object-relational databases, it is important to study the methods for spatial data warehousing and online analytical processing of spatial data. This thesis investigates methods for spatial OLAP, by integration of nonspatial online analytical processing (OLAP) methods with spatial database implementation techniques. A spatial data warehouse model, which consists of both spatial and nonspatial dimensions and measures, is proposed. Methods for computation of spatial data cubes and analytical processing on such spatial data cubes are studied, with several strategies proposed, including approximation and partial materialization of the spatial objects resulting from spatial OLAP operations. Some techniques for selective materialization of the spatial computation results are worked out, a...
Life under your feet: An end-to-end soil ecology sensor network, database, web server, and analysis service
, 2006
Cited by 8 (0 self)
Wireless sensor networks can revolutionize soil ecology by providing measurements at temporal and spatial granularities previously impossible. This paper presents a soil monitoring system we developed and deployed at an urban forest in Baltimore as a first step towards realizing this vision. Motes in this network measure and save soil moisture and temperature in situ every minute. Raw measurements are periodically retrieved by a sensor gateway and stored in a central database where calibrated versions are derived and stored. The measurement database is published through Web Services interfaces. In addition, analysis tools let scientists analyze current and historical data and help manage the sensor network. The article describes the system design, what we learned from the deployment, and initial results obtained from the sensors. The system measures soil factors with unprecedented temporal precision. However, the deployment required device-level programming, sensor calibration across space and time, and cross-referencing measurements with external sources. The database, web server, and data analysis design required considerable innovation and expertise. So, the ratio of computer scientists to ecologists was 3:1. Before sensor networks can fulfill their potential as instruments that can be easily deployed by scientists, these technical problems must be addressed so that the ratio is one nerd per ten ecologists.
Datacube: Its Implementation and Application in OLAP Mining
, 1998
Cited by 6 (0 self)
With huge amounts of data collected in various kinds of applications, the data warehouse is becoming a mainstream information repository for decision support and data analysis, mainly because a data warehouse facilitates online analytical processing (OLAP). It is important to study methods for supporting data warehouses, in particular their OLAP operations, efficiently. In this thesis, we investigate efficient methods for computing datacubes and for using datacubes to support OLAP and data mining. Currently, there are two popular datacube technologies: Relational OLAP (ROLAP) and Multidimensional OLAP (MOLAP). Many efficient algorithms have been designed for ROLAP systems, but not so many for the MOLAP ones. MOLAP systems, though they may suffer from sparsity of data, are generally more efficient than ROLAP systems when the sparse datacube techniques are explored or when the data sets are small to medium sized. We have developed a MOLAP system which combines nice features of both MOLAP and ROLAP....