Optimization techniques for queries with expensive methods
- ACM Transactions on Database Systems (TODS), 1998
Cited by 53 (3 self)
Object-Relational database management systems allow knowledgeable users to define new data types, as well as new methods (operators) for the types. This flexibility produces an attendant complexity, which must be handled in new ways for an Object-Relational database management system to be efficient. In this paper we study techniques for optimizing queries that contain time-consuming methods. The focus of traditional query optimizers has been on the choice of join methods and orders; selections have been handled by "pushdown" rules. These rules apply selections in an arbitrary order before as many joins as possible, using the assumption that selection takes no time. However, users of Object-Relational systems can embed complex methods in selections. Thus selections may take significant amounts of time, and the query optimization model must be enhanced. In this paper, we carefully define a query cost framework that incorporates both selectivity and cost estimates for selections. We develop an algorithm called Predicate Migration, and prove that it produces optimal plans for queries with expensive methods. We then describe our implementation of Predicate Migration in the commercial Object-Relational database management system Illustra, and discuss practical issues that affect our earlier assumptions. We compare Predicate Migration to a variety of simpler optimization techniques, and demonstrate that Predicate Migration is the best general solution to date. The alternative techniques we present may be useful for constrained workloads.
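As a rough illustration of why selection order matters once selections have a cost, the sketch below orders independent selections by the rank metric (selectivity - 1) / cost per tuple and checks the result against brute-force enumeration. It is not the Predicate Migration algorithm itself, which also has to place selections relative to joins. The predicate names, costs, and selectivities are invented for the example.

```python
from itertools import permutations

def expected_cost(preds):
    """Expected per-tuple cost of applying predicates in the given order.

    Each predicate is (name, cost_per_tuple, selectivity). Predicates are
    assumed independent, so a tuple reaches a predicate with probability
    equal to the product of the selectivities applied before it.
    """
    total, surviving = 0.0, 1.0
    for _name, cost, selectivity in preds:
        total += surviving * cost
        surviving *= selectivity
    return total

def order_by_rank(preds):
    """Sort ascending by rank = (selectivity - 1) / cost_per_tuple."""
    return sorted(preds, key=lambda p: (p[2] - 1.0) / p[1])

# Hypothetical predicates: (name, cost per tuple, selectivity).
PREDICATES = [
    ("image_match", 100.0, 0.1),
    ("cheap_filter", 1.0, 0.5),
    ("text_classify", 20.0, 0.9),
]

best = order_by_rank(PREDICATES)
# Exhaustive check over all orders; only feasible for a handful of predicates.
brute_force = min(permutations(PREDICATES), key=expected_cost)
```

With these made-up numbers the cheap, moderately selective filter runs first, so the costly predicates see fewer tuples.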
Query Execution Techniques for Caching Expensive Methods
- In SIGMOD, 1996
Cited by 50 (8 self)
Object-Relational and Object-Oriented DBMSs allow users to invoke time-consuming ("expensive") methods in their queries. When queries containing these expensive methods are run on data with duplicate values, time is wasted redundantly computing methods on the same value. This problem has been studied in the context of programming languages, where "memoization" is the standard solution. In the database literature, sorting has been proposed to deal with this problem. We compare these approaches along with a third solution, a variant of unary hybrid hashing which we call Hybrid Cache. We demonstrate that Hybrid Cache always dominates memoization, and significantly outperforms sorting in many instances. This provides new insights into the tradeoff between hashing and sorting for unary operations. Additionally, our Hybrid Cache algorithm includes some new optimizations for unary hybrid hashing, which can be used for other applications such as grouping and duplicate elimination. We conclude...
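The memoization baseline mentioned in this abstract can be sketched in a few lines. This is plain in-memory memoization, not the Hybrid Cache algorithm. The function `slow_square` and the sample column are hypothetical stand-ins for an expensive method and its input.

```python
def apply_with_memo(values, expensive_method):
    """Apply expensive_method to every value, computing each distinct value once."""
    cache = {}
    results = []
    for v in values:
        if v not in cache:
            cache[v] = expensive_method(v)  # first occurrence: pay the cost
        results.append(cache[v])            # duplicates: reuse the stored result
    return results

calls = []  # records which inputs actually triggered a computation

def slow_square(x):
    # Stand-in for a time-consuming user-defined method (hypothetical).
    calls.append(x)
    return x * x

column = [3, 7, 3, 3, 7, 2]
out = apply_with_memo(column, slow_square)
```

The dictionary grows with the number of distinct input values, so this sketch assumes they all fit in memory.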
Translating OQL into Monoid Comprehensions -- Stuck with Nested Loops?
- 1996
Cited by 14 (9 self)
This work tries to employ the monoid comprehension calculus --- which has proven to be an adequate framework to capture the semantics of modern object query languages featuring a family of collection types like sets, bags, and lists --- in a twofold manner: First, serving as a target language for the translation of ODMG OQL queries. We review work done in this field and also give comprehension calculus equivalents for the recently introduced OQL 1.2 concepts. Second, we use monoid comprehensions as the formalism in which we try to find efficient execution methods working on a rich set of physical structures (including indices, vertical and horizontal decomposition, etc.). The main problem coming up here is the "nested-loop nature" of the calculus expressions. While these loop-based semantics for evaluating comprehensions at least provide a way for executing OQL queries, their execution is almost always much less efficient than alternative physical algorithms of the database e...
Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases
- 1997
Cited by 13 (7 self)
We investigate the optimization and evaluation of queries with universal quantification in the context of the object-oriented and object-relational data models. The queries are classified into 16 categories depending on the variables referenced in the so-called range and quantifier predicates. For the three most important classes we enumerate the known query evaluation plans and devise some new ones. These alternative plans are primarily based on anti-semijoin, division, generalized grouping with count aggregation, and set difference. In order to evaluate the quality of the many different evaluation plans a thorough performance analysis on some sample database configurations was carried out. The quantitative analysis reveals that---if applicable---the anti-semijoin-based plans are superior to all the other alternatives, even if we employ the most sophisticated division algorithms. Furthermore, exploiting object-oriented features, anti-semijoin plans can be derived even when this is not...
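To make the anti-semijoin idea concrete, the sketch below evaluates a universally quantified condition by rewriting "for all x: p(x)" as "there is no x with not p(x)" and then removing every outer item that has a violating partner. It is an illustrative toy under that standard rewrite, not one of the evaluation plans measured in the paper. The supplier and delivery data are invented.

```python
def anti_semijoin(left, right, key_left, key_right):
    """Return the items of left that have no matching item in right."""
    right_keys = {key_right(r) for r in right}
    return [item for item in left if key_left(item) not in right_keys]

# Hypothetical data: (supplier, part, passed_inspection).
suppliers = ["s1", "s2", "s3"]
deliveries = [
    ("s1", "bolt", True),
    ("s1", "nut", True),
    ("s2", "bolt", False),
    ("s2", "nut", True),
]

# Query: suppliers all of whose deliveries passed inspection.
# Step 1: collect the violating deliveries (the negated quantifier predicate).
violations = [d for d in deliveries if not d[2]]
# Step 2: keep suppliers without any violation.
all_passed = anti_semijoin(suppliers, violations, lambda s: s, lambda d: d[0])
```

A supplier with no deliveries at all ("s3" here) qualifies, because a universal quantifier over an empty range is true.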
How to Comprehend Queries Functionally
- 1999
Cited by 7 (3 self)
Compilers and optimizers for declarative query languages use some form of intermediate language to represent user-level queries. The advent of compositional query languages for orthogonal type systems (e.g. OQL) calls for internal query representations beyond extensions of relational algebra. This work adopts a view of query processing which is greatly influenced by ideas from the functional programming domain. A uniform formal framework is presented which covers all query translation phases, including user-level query language compilation, query optimization, and execution plan generation. We pursue the type-based design - based on initial algebras - of a core functional language which is then developed into an intermediate representation that fits the needs of advanced query processing. Based on the principle of structural recursion we extend the language by monad comprehensions (which provide us with a calculus-style sublanguage that proves to be useful during the optimization of nested...
Optimization of DAG-Structured Query Evaluation Plans
- 1998
Cited by 5 (0 self)
In OLAP environments, huge amount of data accumulated over a certain period of time is analyzed to extract or discover relevant information. The activity is CPU and IO-intensive, and it typically takes days to complete the analysis. Of even more serious concern is that the analysis tends to be tremendously storage intensive. Since the base data itself measures in gigabytes, the intermediate relations tend to be formidably huge. In such a scenario, it is not only economical but also crucial that the analysis methodology be carefully optimized. engage.fusion is an analysis engine generator that also incorporates optimization of the methodology presented in the form of engage.fusion query graphs. This thesis project is a part of the consultancy project for Engage Technologies regarding optimization of engage.fusion query graphs. The engage.fusion query graphs are directed acyclic graphs because they allow reuse of common subplans during evaluation. We present a set of transformation rules...
Optimization and Evaluation of Disjunctive Queries
- IEEE Trans. on Knowledge and Data Engineering, 2000
Cited by 5 (3 self)
this paper, we propose a novel technique, called bypass processing, for evaluating such disjunctive queries. The bypass processing technique is based on new selection and join operators that produce two output streams: the true-stream with tuples satisfying the selection (join) predicate and the false-stream with tuples not satisfying the corresponding predicate. Splitting the tuple streams in this way enables us to bypass costly predicates whenever the fate of the corresponding tuple (stream) can be determined without evaluating this predicate. In the paper, we show how to systematically generate bypass evaluation plans utilizing a bottom-up building block approach. We show that our evaluation technique allows to incorporate the standard SQL semantics of null values. For this, we devise two different approaches: One is based on explicitly incorporating three-valued logic into the evaluation plans; the other one relies on two-valued logic by moving all negations to atomic conditions of the selection predicate. We describe how to extend an iterator-based query engine to support bypass evaluation with little extra overhead. This query engine was used to quantitatively evaluate the bypass evaluation plans against the traditional evaluation techniques utilizing a CNF- or DNF-based query predicate
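The two-stream idea can be illustrated with a small sketch for a disjunction `cheap(t) OR expensive(t)`: a selection operator returns one stream of tuples that satisfy its predicate and one stream of tuples that do not, and only the second stream is fed to the costly predicate. This is a simplified, list-based illustration under two-valued logic, not the iterator-based engine or the null handling described in the paper. The predicates `cheap` and `expensive` are hypothetical.

```python
def bypass_select(tuples, predicate):
    """Split tuples into (satisfying, not_satisfying) according to predicate."""
    satisfying, not_satisfying = [], []
    for t in tuples:
        (satisfying if predicate(t) else not_satisfying).append(t)
    return satisfying, not_satisfying

expensive_calls = []  # records which tuples reached the costly predicate

def cheap(t):
    # Inexpensive predicate (hypothetical): even numbers qualify.
    return t % 2 == 0

def expensive(t):
    # Stand-in for a costly predicate (hypothetical): values above 6 qualify.
    expensive_calls.append(t)
    return t > 6

data = list(range(10))

# Tuples that satisfy the cheap predicate already satisfy the disjunction,
# so they bypass the expensive predicate entirely.
passed_cheap, failed_cheap = bypass_select(data, cheap)
passed_expensive, _rejected = bypass_select(failed_cheap, expensive)
result = sorted(passed_cheap + passed_expensive)
```

In this example the expensive predicate runs on five of the ten tuples instead of all ten.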
Optimal Ordering of Selections and Joins in Acyclic Queries with Expensive Predicates
- Lehrstuhl für Praktische Informatik III, RWTH, 1996
Cited by 4 (4 self)
The generally accepted optimization heuristic of pushing selections down does not yield optimal plans in the presence of expensive predicates. Therefore, several researchers have proposed algorithms for the optimal ordering of expensive joins and selections in a query evaluation plan. All of these algorithms have an exponential run time. For a special case, we propose a polynomial algorithm which -- in one integrated step -- computes the optimal join order and places expensive predicates optimally within the join tree. The special case is characterized by the following statements: 1. only left-deep trees are considered, 2. no cross-products are considered, 3. the cost function has to exhibit the ASI property, and 4. cheap selections are pushed beforehand.
1 Introduction
Traditional work on algebraic query optimization has mainly focused on the problem of ordering joins in a query. Restrictions like selections and projections are generally treated by "push-down rules". According to t...
On the Optimal Ordering of Maps, Selections, and Joins under Factorization
We examine the problem of producing the optimal evaluation order for queries containing joins, selections, and maps. Specifically, we look at the case where common subexpressions involving expensive UDF calls can be factored out. First, we show that ignoring factorization during optimization can lead to plans that are far off the best possible plan: the difference in cost between the best plan considering factorization and the best plan not considering factorization can easily reach several orders of magnitude. Then, we introduce optimization strategies that produce optimal left-deep and bushy plans when factorization is taken into account.

