Results 1 - 10
of
64
Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS
, 1993
"... Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On ..."
Abstract
-
Cited by 592 (7 self)
- Add to MetaCart
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate it: In order to manipulate large sets of complex objects as efficiently as today’s database systems manipulate simple records, query processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and post-relational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
Querying Semi-Structured Data
, 1997
"... The amount of data of all kinds available electronically has increased dramatically in recent years. The data resides in different forms, ranging from unstructured data in file systems to highly structured in relational database systems. Data is accessible through a variety of interfaces including W ..."
Abstract
-
Cited by 467 (19 self)
- Add to MetaCart
The amount of data of all kinds available electronically has increased dramatically in recent years. The data resides in different forms, ranging from unstructured data in file systems to highly structured in relational database systems. Data is accessible through a variety of interfaces including Web browsers, database query languages, application-specific interfaces, or data exchange formats. Some of this data is raw data, e.g. images or sound. Some of it has structure even if the structure is often implicit, and not as rigid or regular as that found in standard database systems. Sometimes the structure exists but has to be extracted from the data. Sometimes also it exists but we prefer to ignore it for certain purposes such as browsing. We call here semi-structured data this data that is (from a particular viewpoint) neither raw data nor strictly typed, i.e., not table-oriented as in a relational model or sorted-graph as in object databases...
Answering Queries Using Views
, 1995
"... We consider the problem of computing answers to queries by using materialized views. Aside from its potential in optimizing query evaluation, the problem also arises in applications such as Global Information Systems, Mobile Computing and maintaining physical data independence. We consider the probl ..."
Abstract
-
Cited by 390 (30 self)
- Add to MetaCart
We consider the problem of computing answers to queries by using materialized views. Aside from its potential in optimizing query evaluation, the problem also arises in applications such as Global Information Systems, Mobile Computing and maintaining physical data independence. We consider the problem of finding a rewriting of a query that uses the materialized views, the problem of finding minimal rewritings, and finding complete rewritings (i.e., rewritings that use only the views). We show that all the possible rewritings can be obtained by considering containment mappings from the views to the query, and that the problems we consider are NP-complete when both the query and the views are conjunctive and don't involve built-in comparison predicates. We show that the problem has two independent sources of complexity (the number of possible containment mappings, and the complexity of deciding which literals from the original query can be deleted). We describe a polynomial time algorith...
Optimizing Queries with Materialized Views
, 1995
"... While much work has addressed the problem of maintaining materialized views, the important question of optimizing queries in the presence of materialized views has not been resolved. In this paper, we analyze the optimization question and provide a comprehensive and efficient solution. Our solution ..."
Abstract
-
Cited by 217 (4 self)
- Add to MetaCart
While much work has addressed the problem of maintaining materialized views, the important question of optimizing queries in the presence of materialized views has not been resolved. In this paper, we analyze the optimization question and provide a comprehensive and efficient solution. Our solution has the desirable property that it is a simple generalization of the traditional query optimization algorithm. 1 Introduction The idea of using materialized views for the benefit of improved query processing has been proposed in the literature more than a decade ago. In this context, problems such as definition of views, composition of views, maintenance of views [BC79, KP81, SI84, BLT86, CW91, Rou91, GMS93] have been researched but one topic has been conspicuous by its absence. This concerns the problem of the judicious use of materialized views in answering a query. It may seem that materialized views should be used to evaluate a query whenever they are applicable. In fact, blind applicat...
Multiple-Query Optimization
- ACM Transactions on Database Systems
, 1988
"... Some recently proposed extensions to relational database systems, as well as to deductive database systems, require support for multiple-query processing. For example, in a database system enhanced with inference capabilities, a simple query involving a rule with multiple definitions may expand to m ..."
Abstract
-
Cited by 176 (3 self)
- Add to MetaCart
Some recently proposed extensions to relational database systems, as well as to deductive database systems, require support for multiple-query processing. For example, in a database system enhanced with inference capabilities, a simple query involving a rule with multiple definitions may expand to more than one actual query that has to be run over the database. It is an interesting problem then to come up with algorithms that process these queries together instead of one query at a time. The main motivation for performing such an interquery optimization lies in the fact that queries may share common data. We examine the problem of multiple-query optimization in this paper. The first major contribution of the paper is a systematic look at the problem, along with the presentation and analysis of algorithms that can be used for multiple-query optimization. The second contribution lies in the presentation of experimental results. Our results show that using multiple-query processing algorithms may reduce execution cost considerably.
Updating Derived Relations: Detecting Irrelevant and Autonomously Computable Updates
- ACM Transactions on Database Systems
, 1989
"... Consider a database containing not only base relations but also stored derived relations (also called materialized or concrete views). When a base relation is updated, it may also be necessary to update some of the derived relations. This paper gives sufficient and necessary conditions for detecting ..."
Abstract
-
Cited by 151 (2 self)
- Add to MetaCart
Consider a database containing not only base relations but also stored derived relations (also called materialized or concrete views). When a base relation is updated, it may also be necessary to update some of the derived relations. This paper gives sufficient and necessary conditions for detecting when an update of a base relation cannot affect a derived relation (an irrelevant update), and for detecting when a derived relation can be correctly updated using no data other than the derived relation itself and the given update operation (an autonomously computable update). The class of derived relations considered is restricted to those defined by PSJ-expressions, that is, any relational algebra expression constructed from an arbitrary number of project, select and join operations (but containing no self-joins). The class of update operations consists of insertions, deletions, and modifications, where the set of tuples to be deleted or modified is specified by a selection condition on ...
A Query Translation Scheme for Rapid Implementation of Wrappers
, 1995
"... Wrappers provide access to heterogeneous information sources by converting application queries into source specific queries or commands. In this paper we present a wrapper implementation toolkit that facilitates rapid development of wrappers. We focus on the query translation component of the toolki ..."
Abstract
-
Cited by 123 (22 self)
- Add to MetaCart
Wrappers provide access to heterogeneous information sources by converting application queries into source specific queries or commands. In this paper we present a wrapper implementation toolkit that facilitates rapid development of wrappers. We focus on the query translation component of the toolkit, called the converter. The converter takes as input a Query Description and Translation Language (QDTL) description of the queries that can be processed by the underlying source. Based on this description the converter decides if an application query is (a) directly supported, i.e., it can be translated to a query of the underlying system following instructions in the QDTL description; (b) logically supported, i.e., logically equivalent to a directly supported query; (c) indirectly supported, i.e., it can be computed by applying a filter, automatically generated by the converter, to the result of a directly supported query. 1 Introduction A wrapper or translator [C + 94, PGMW95] is a s...
Aggregate-Query Processing in Data Warehousing Environments
- In Proceedings of the International Conference on Very Large Databases
, 1995
"... In this paper we introduce generalized projections (GPs), an extension of duplicate eliminating projections, that capture aggregations, groupbys, duplicate-eliminating projections (distinct), and duplicate-preserving projections in a common unified framework. Using GPs we extend well known and simpl ..."
Abstract
-
Cited by 105 (1 self)
- Add to MetaCart
In this paper we introduce generalized projections (GPs), an extension of duplicate eliminating projections, that capture aggregations, groupbys, duplicate-eliminating projections (distinct), and duplicate-preserving projections in a common unified framework. Using GPs we extend well known and simple algorithms for SQL queries that use distinct projections to derive algorithms for queries using aggregations like sum, max, min, count, and avg. We develop powerful query rewrite rules for aggregate queries that unify and extend rewrite rules previously known in the literature. We then illustrate the power of our approach by solving a very practical and important problem in data warehousing: how to answer an aggregate query on base tables using materialized aggregate views (summary tables).
Efficient and Extensible Algorithms for Multi Query Optimization
, 2000
"... Complex queries are becoming commonplace, with the growing use of decision support systems. These complex queries often have a lot of common sub-expressions, either within a single query, or across multiple such queries run as a batch. Multi-query optimization aims at exploiting common subexpressi ..."
Abstract
-
Cited by 99 (4 self)
- Add to MetaCart
Complex queries are becoming commonplace, with the growing use of decision support systems. These complex queries often have a lot of common sub-expressions, either within a single query, or across multiple such queries run as a batch. Multi-query optimization aims at exploiting common subexpressions to reduce evaluation cost. Multi-query optimization has hither-to been viewed as impractical, since earlier algorithms were exhaustive, and explore a doubly exponential search space. In this paper we demonstrate that multi-query optimization using heuristics is practical, and provides significant benefits. We propose three cost-based heuristic algorithms: Volcano-SH and Volcano-RU, which are based on simple modifications to the Volcano search strategy, and a greedy heuristic. Our greedy heuristic incorporates novel optimizations that improve efficiency greatly. Our algorithms are designed to be easily added to existing optimizers. We present a performance study comparing the algo...
Conjunctive Query Containment Revisited
, 1998
"... We consider the problems of conjunctive query containment and minimization, which are known to be NP-complete, and show that these problems can be solved in polynomial time for the class of acyclic queries. We then generalize the notion of acyclicity and define a parameter called query width that ca ..."
Abstract
-
Cited by 83 (0 self)
- Add to MetaCart
We consider the problems of conjunctive query containment and minimization, which are known to be NP-complete, and show that these problems can be solved in polynomial time for the class of acyclic queries. We then generalize the notion of acyclicity and define a parameter called query width that captures the "degree of cyclicity" of a query: in particular, a query is acyclic if and only if its query width is 1. We give algorithms for containment and minimization that run in time polynomial in n k , where n is the input size and k is the query width. These algorithms naturally generalize those for acyclic queries, and are of practical significance because many queries have small query width compared to their sizes. We show that good bounds on the query width of Q can be obtained using the treewidth of the incidence graph of Q. We then consider the problem of finding an equivalent query to a given conjunctive query Q that has the least number of subgoals. We show that a polynomial tim...

