MOVIE: An incremental maintenance system for materialized object views
, 2003
Cited by 12 (0 self)
View materialization is an important technique for high performance query processing, data integration and replication. Solutions to the problem of incrementally maintaining materialized views are very relevant. So far, most work on this problem has been confined to relational settings and solutions have not been comprehensively evaluated. This paper describes MOVIE, a complete, implemented and evaluated solution to the problem of incrementally maintaining materialized OQL views in ODMG-compliant object databases. The evaluation throws light into how the effectiveness of incremental maintenance is affected by issues such as database size, and the complexity and selectivity of views.
Reduction of Materialized View Staleness Using Online Updates
, 1998
Cited by 4 (0 self)
Updating the materialized views stored in data warehouses usually implies making the warehouse unavailable to users. We propose MAUVE, a new algorithm for online incremental view updates that uses timestamps and allows consistent read-only access to the warehouse while it is being updated. The algorithm propagates the updates to the views more often than the typical once a day in order to reduce view staleness. We have implemented MAUVE on top of the Informix Universal Server and used a synthetic workload generator to experiment with various update workloads and different view update frequencies. Our results show that all kinds of update streams benefit from more frequent view updates, instead of just once a day. However, there is a clear maximum for the view update frequency, for which view staleness is minimal. 1 Introduction Data warehouses contain data replicated from several external sources, collected to answer decision support queries. The replicated data is often copied in re...
Efficient Refreshment of Materialized Views With Multiple Sources
- In the Proceedings of the International Conference on Information and Knowledge Management
, 1999
Cited by 3 (0 self)
A data warehouse collects and maintains a large amount of data from multiple distributed and autonomous data sources. Often the data in it is stored in the form of materialized views in order to provide fast access to the integrated data. However, maintaining a certain level of consistency of warehouse data with the source data is challenging in a distributed multiple source environment. Transactions containing multiple updates at one or more sources further complicate the consistency issue. Following the four-level consistency definition of views in a warehouse, we first present a complete consistency algorithm for maintaining SPJ-type materialized views incrementally. Our algorithm speeds up the view refreshment time, provided that some moderate extra space in the warehouse is available. We then give a variant of the proposed algorithm by taking the update frequencies of sources into account. We finally discuss the relationship between a view's level of consistency and its refresh time. It is difficult to propose an incremental maintenance algorithm such that the view is always kept at a certain level of consistency with the source data and the view's refresh time is as fast as possible. We trade off these two factors by giving an algorithm with a faster view refresh time, although the view it maintains is strongly consistent rather than completely consistent with the source data.
A holistic approach to the evaluation of data warehouse maintenance policies
, 2000
Cited by 2 (1 self)
The research community is addressing a number of issues in response to increased reliance of organisations on data warehousing. Most work addresses individual aspects related to incremental view maintenance, propagation algorithms, consistency requirements, performance of OLAP queries etc. There remains a need to consolidate relevant results into a cohesive framework for data warehouse maintenance. Although data propagation policies, source database characteristics, and user requirements have been addressed individually, their co-dependencies and relationships have not been explored. In this paper, we present a comprehensive, cost-based framework for evaluating data propagation policies against data warehouse requirements and source database characteristics. We formalize data warehouse specification along the dimensions of freshness (or staleness), response time, storage, and computation cost, and classify source databases according to their data propagation capabilities. A detailed cost model is presented for a representative set of policies. A prototype implementation has allowed an exploration of the various trade-offs. The results presented in this paper are for a single source, but the approach and the framework are extensible. Current work is addressing a broader class of sources and a more detailed data warehouse specification that includes multiple sources.
Data Integration in Heterogeneous Environments: Multi-Source Policies, Cost Model, and Implementation
Cited by 2 (2 self)
The research community is addressing a number of issues in response to an increased reliance of organisations on data warehousing. Most work addresses aspects related to the internal operation of a data warehouse server, such as selection of views to materialise, maintenance of aggregate views and performance of OLAP queries. Issues related to data warehouse maintenance, i.e. how changes to autonomous sources should be detected and propagated to a warehouse, have been addressed in a fragmented manner. We have shown earlier that a number of maintenance policies based on source characteristics and timing are relevant and meaningful to single source views. In this report we detail how this work has been extended for multiple sources. We focus on exploring policies for data integration from heterogeneous sources. As the number of policies is very large, we first analyse their behaviour intuitively with respect to broader source and policy characteristics. Further, we extend the single source cost model to these policies and incorporate it into a Policy Analyser for Multiple sources (PAM). We use this to analyse the effect of source characteristics and join alternatives on various policies. We have developed a Testbed for Maintenance of Integrated Data (TMID). We report on experiments conducted to validate the policies that are recommended by the tool, and confirm our initial analysis. Finally, we distil a set of heuristics for the selection of multi-source policies based on quality of service and other requirements.
Experimental Evaluation of Data Warehouse Configuration Algorithms
- In Proc. of the 9th Intl. Workshop on Database and Expert Systems Applications
, 1998
Cited by 1 (1 self)
A Data Warehouse (DW) can be seen as a set of materialized views defined over relations that are stored in remote heterogeneous database systems. When a query is posed to the DW, it is evaluated locally, using only the materialized views. The DW configuration problem is the problem of selecting an optimal set of views to materialize that answer a given set of queries. The objective is the minimization of the combination of the query evaluation and view maintenance costs. In this paper we report on the experimental evaluation of an exhaustive algorithm and we develop new greedy and heuristic algorithms that expand only a small fraction of the states produced by the exhaustive algorithm. The algorithms are described in terms of a state space search problem. Finally, we report on experimental results and discuss the observed behavior of the algorithms. 1. Introduction Data Warehouses (DWs) typically provide access to integrated data from a set of remote heterogeneous databases [16]. A D...
Optimizing Relational Queries by Materializing Natural Joins
- In Proc. Workshop on Information Technologies and Systems
, 1997
Cited by 1 (1 self)
Efficient evaluation of user queries is critical in database applications involving very large amounts of data. In these applications, especially the ones where query answers are expected in real time, performance of query evaluation in a DBMS may be poor; often in such cases, query performance is extremely sensitive to the structure of the database schema. For example, if a query joins several very large relations (in terms of the number of tuples), the joins may be very expensive even with efficient algorithms. On the other hand, if such joins are "materialized", i.e. precomputed, stored, and properly indexed, the joins can be avoided at query evaluation time. In our experiments, the performance improvement by join elimination is significant. Based on the analysis of several large operational database application systems and experimental results, we argue that normalized database schemas should remain for the sake of semantic integrity (upon updates) and that ...
Performance Analysis of WHIPS Incremental Maintenance
, 1998
Cited by 1 (0 self)
Incremental maintenance incorporates new changes automatically and continuously into a data warehouse, and seems to be the best maintenance solution for very large warehouses. However, the performance of incremental maintenance algorithms is not well understood, and commercial incremental maintenance systems are still not widely available. In this paper, we study the performance of WHIPS, a prototype system developed at Stanford focusing on warehouse maintenance. We examine the efficiency of change propagation and study factors that determine the efficiency. We also compare the relative performance of different warehouse maintenance techniques. 1 Introduction A data warehouse stores integrated data from multiple, often distributed and autonomous data sources, and is typically used for decision support. One of the major challenges is to correctly and efficiently update the warehouse as the source data changes. There are basically two categories of techniques for this: refreshing and in...

