An overview of query optimization in relational systems
- In PODS, 1998
- Cited by 99 (1 self)
There has been extensive work in query optimization since the early '70s. It is hard to capture the breadth and depth of this large body of work in a short article. Therefore, I have decided to focus primarily on the optimization of SQL queries in relational database systems and present my biased and incomplete view of this field. The goal of this article is not to be comprehensive, but rather to explain the foundations and present samplings of significant work in this area. I would like to apologize to the many contributors in this area whose work I have failed to explicitly acknowledge due to oversight or lack of space. I take the liberty of trading technical precision for ease of presentation.
MonetDB/XQuery: a fast XQuery processor powered by a relational engine
- In SIGMOD, 2006
- Cited by 84 (22 self)
Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state of the art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system are evaluated on the XMark benchmark up to data sizes of 11 GB. The performance section also provides an extensive comparison of all major XMark results published previously, which confirms that the goal of purely relational XQuery processing, namely speed and scalability, was met.
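To illustrate what a range-based encoding of XML into a relational table can look like, here is a minimal sketch. It assumes a pre/size/level numbering over element nodes only; the function names and the column layout are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

def encode(xml_text):
    """Return rows (pre, size, level, tag) in document order (elements only)."""
    rows = []

    def visit(node, level):
        pre = len(rows)          # preorder rank of this node
        rows.append(None)        # reserve the slot until size is known
        for child in node:
            visit(child, level + 1)
        size = len(rows) - pre - 1   # number of descendants
        rows[pre] = (pre, size, level, node.tag)

    visit(ET.fromstring(xml_text), 0)
    return rows

def descendants(rows, pre):
    """Descendants of node `pre` occupy the contiguous range pre+1 .. pre+size."""
    size = rows[pre][1]
    return [r[3] for r in rows[pre + 1 : pre + size + 1]]

rows = encode("<a><b><c/></b><d/></a>")
```

Because the descendants of a node form a contiguous range of `pre` values, a descendant step becomes a range selection over the table, which is the kind of operation a relational engine handles well.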
Iterative Dynamic Programming: A New Class of Query Optimization Algorithms
- ACM Trans. on Database Systems, 1998
- Cited by 36 (5 self)
The query optimizer is one of the most important components of a database system. Most commercial query optimizers today are based on a dynamic-programming algorithm, as proposed in [SAC+79]. While this algorithm produces good optimization results (i.e., good plans), its high complexity can be prohibitive if complex queries need to be processed, new query execution techniques need to be integrated, or in certain programming environments (e.g., distributed database systems). In this paper, we present and thoroughly evaluate a new class of query optimization algorithms that are based on a principle that we call iterative dynamic programming, or IDP for short. IDP has several important advantages: First, IDP-algorithms produce the best plans of all known algorithms in situations in which dynamic programming is not viable because of its high complexity. Second, some IDP variants are adaptive and produce as good plans as dynamic programming if dynamic programming is viable an...
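For orientation, the following is a minimal sketch of the exhaustive dynamic-programming baseline for join ordering, not of IDP itself. The cost model (sum of intermediate result sizes), the input format, and the name `best_plan` are assumptions made for illustration.

```python
from itertools import combinations

def best_plan(cards, sel):
    """cards: {relation: cardinality}; sel: {frozenset({a, b}): selectivity}.
    Returns (cost, plan), where cost sums the sizes of all intermediate results."""
    names = sorted(cards)

    def size(subset):
        # Estimated result size: product of cardinalities and pairwise selectivities.
        n = 1.0
        for r in subset:
            n *= cards[r]
        for a, b in combinations(sorted(subset), 2):
            n *= sel.get(frozenset((a, b)), 1.0)
        return n

    # Best plan for every single relation costs nothing.
    best = {frozenset([r]): (0.0, r) for r in names}
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            s = frozenset(combo)
            out = size(s)
            # Try every split of s into two non-empty sub-plans (bushy trees).
            for i in range(1, k):
                for left in combinations(combo, i):
                    l = frozenset(left)
                    r = s - l
                    cost = best[l][0] + best[r][0] + out
                    if s not in best or cost < best[s][0]:
                        best[s] = (cost, (best[l][1], best[r][1]))
    return best[frozenset(names)]

cost, plan = best_plan(
    {"A": 1000, "B": 10, "C": 100},
    {frozenset(("A", "B")): 0.001, frozenset(("B", "C")): 0.1},
)
```

The table `best` holds one entry per non-empty subset of relations, so its size grows exponentially with the number of relations, which is the complexity the abstract describes as prohibitive for complex queries.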
Supporting ad-hoc ranking aggregates
- In SIGMOD, 2006
- Cited by 22 (4 self)
This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.
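As a point of reference, the naïve materialize-group-sort scheme that the abstract criticizes can be sketched as follows. This shows only the baseline, not the paper's framework; the SUM aggregate, the `(group, value)` row format, and the function name are assumptions.

```python
from collections import defaultdict

def topk_groups_naive(rows, k):
    """rows: iterable of (group, value). Returns the k groups with the largest SUM."""
    sums = defaultdict(float)
    for group, value in rows:          # materialize and group every tuple
        sums[group] += value
    # Sort all groups by aggregate, even though only k of them are returned.
    ranked = sorted(sums.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]

top2 = topk_groups_naive([("a", 1), ("b", 5), ("a", 3), ("c", 2)], 2)
```

Every tuple is aggregated and every group is sorted before the first result is produced, regardless of how small k is.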
On reasoning about structural equality in XML: A description logic approach
- Theoretical Computer Science, 2002
- Cited by 16 (7 self)
We define a boolean complete description logic dialect called DLFDreg that can be used to reason about structural equality in semistructured ordered data in the presence of document type definitions. This application depends on the novel ability of DLFDreg to express functional dependencies over sets of possibly infinite feature paths defined by regular languages. We also present a decision procedure for the associated logical implication problem. The procedure underlies a mapping of such problems to satisfiability problems of Datalog$^{\vee,\neg}_{nS}$ and in turn to the Ackermann case of the decision problem.
Reusing Invariants: A New Strategy for Correlated Queries
- In SIGMOD, 1998
- Cited by 15 (1 self)
Correlated queries are very common and important in decision support systems. Traditional nested iteration evaluation methods for such queries can be very time consuming. When they apply, query rewriting techniques have been shown to be much more efficient. But query rewriting is not always possible. When query rewriting does not apply, can we do something better than the traditional nested iteration methods? In this paper, we propose a new invariant technique to evaluate correlated queries efficiently. The basic idea is to recognize the part of the subquery that is not related to the outer references and cache the result of that part after its first execution. Later, we can reuse the result and combine it with the result of the rest of the subquery that is changing for each iteration. Our technique applies to arbitrary correlated subqueries. This paper introduces algorithms to recognize the invariant part of a data flow tree, and to restructure the evaluation plan to reuse the stored ...
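The basic idea of caching the outer-independent part of a subquery can be sketched as below. This is a minimal illustration under assumed names and data (a count subquery with one invariant filter on `price` and one correlated predicate on `dept`), not the paper's algorithms for data flow trees.

```python
def nested_iteration(outer, inner, threshold):
    """Re-evaluates the whole subquery for every outer row."""
    results = []
    for o in outer:
        # This filter does not reference the outer row, yet it runs every time.
        filtered = [r for r in inner if r["price"] > threshold]
        results.append(sum(1 for r in filtered if r["dept"] == o["dept"]))
    return results

def with_invariant_reuse(outer, inner, threshold):
    """Evaluates the invariant part once and reuses it in later iterations."""
    invariant = None
    results = []
    for o in outer:
        if invariant is None:
            # First execution: compute and cache the outer-independent part.
            invariant = [r for r in inner if r["price"] > threshold]
        # Only the correlated predicate is re-evaluated per outer row.
        results.append(sum(1 for r in invariant if r["dept"] == o["dept"]))
    return results

outer = [{"dept": 1}, {"dept": 2}, {"dept": 3}]
inner = [
    {"dept": 1, "price": 150},
    {"dept": 1, "price": 50},
    {"dept": 2, "price": 200},
    {"dept": 2, "price": 300},
]
counts = with_invariant_reuse(outer, inner, 100)
```

Both functions return the same answer; the second simply avoids repeating work that cannot change between iterations.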
Pattern Tree Algebras: Sets Or Sequences?
- 2005
- Cited by 15 (2 self)
XML and XQuery semantics are very sensitive to the order of the produced output. Although pattern-tree based algebraic approaches are becoming more and more popular for evaluating XML, there is no universally accepted technique which can guarantee both a correct output order and a choice of efficient alternative plans.
A Combined Framework for Grouping and Order Optimization
- In VLDB, 2004
- Cited by 13 (2 self)
Since the introduction of cost-based query optimization by Selinger et al. in their seminal paper, the performance-critical role of interesting orders has been recognized. Some algebraic operators change interesting orders (e.g. sort and select), while others exploit them (e.g. merge join). Likewise, Wang and Cherniack (VLDB 2003) showed that existing groupings should be exploited to avoid redundant grouping operations. Ideally, the reasoning about interesting orderings and groupings should be integrated into one framework.
Implementing Sorting in Database Systems
- ACM Comput. Surv., 2006
- Cited by 12 (3 self)
Most commercial database systems do (or should) exploit many sorting techniques that are publicly known, but not readily available in the research literature. These techniques improve both sort performance on modern computer systems and the ability to adapt gracefully to resource fluctuations in multiuser operations. This survey collects many of these techniques for easy reference by students, researchers, and product developers. It covers in-memory sorting, disk-based external sorting, and considerations that apply specifically to sorting in database systems.
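Disk-based external sorting, one of the topics the survey covers, can be sketched in its simplest textbook form: sort bounded chunks in memory, write each as a run, then merge the runs. The function name, the integer-per-line run format, and the `run_size` parameter are assumptions; none of the survey's refinements are shown.

```python
import heapq
import tempfile

def external_sort(values, run_size):
    """Yield the integers in `values` in ascending order using sorted runs on disk."""
    runs = []
    buf = []

    def flush():
        # Sort the in-memory buffer and write it out as one run.
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{v}\n" for v in sorted(buf))
        run.seek(0)
        runs.append(run)
        buf.clear()

    for v in values:
        buf.append(v)
        if len(buf) >= run_size:
            flush()
    if buf:
        flush()

    try:
        # Merge all runs in a single pass, reading each run sequentially.
        readers = [(int(line) for line in run) for run in runs]
        yield from heapq.merge(*readers)
    finally:
        for run in runs:
            run.close()

ordered = list(external_sort([5, 3, 9, 1, 4, 8, 2], run_size=3))
```

Only `run_size` values are held in memory during run generation, and the merge keeps one value per run in its heap.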
Exploiting Early Sorting and Early Partitioning for Decision Support Query Processing
- 2000
- Cited by 11 (4 self)
Decision support queries typically involve several joins, a grouping with aggregation, and/or sorting of the result tuples. We propose two new classes of query evaluation algorithms that can be used to speed up the execution of such queries. The algorithms are based on (1) early sorting and (2) early partitioning, or a combination of both. The idea is to push the sorting and/or the partitioning to the leaves, i.e., the base relations, of the query evaluation plans (QEPs) and thereby avoid sorting or partitioning large intermediate results generated by the joins. Both early sorting and early partitioning are used in combination with hash-based algorithms for evaluating the join(s) and the grouping. To enable early sorting, the sort order generated at an early stage of the query evaluation plan is retained through an arbitrary number of so-called order-preserving hash joins (OHJs). To make early partitioning applicable to a large class of decision support queries, we generalize the s...

