## Answering Queries: Tractable Cases and Optimizations (2001)

Citations: 2 (1 self)

### BibTeX

@TECHREPORT{Scarcello01answeringqueries:,

author = {Francesco Scarcello},

title = {Answering Queries: Tractable Cases and Optimizations},

institution = {},

year = {2001}

}

### Abstract

Answering queries is computationally very expensive, and many approaches have been proposed in the literature to address this fundamental problem. Some of them rely on optimization modules that exploit quantitative information on the database instance, while others exploit structural properties of the query hypergraph. For instance, acyclic queries can be answered in polynomial time, and query containment is also efficiently decidable for acyclic queries. In this report, we review both quantitative and structural methods for optimizing query answering and identifying tractable classes of queries. Moreover, we provide a formal comparison of structural methods.
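
To make the notion of an acyclic query concrete, the following is a minimal sketch of the classical GYO (Graham / Yu-Ozsoyoglu) reduction, a standard test for α-acyclicity of a query hypergraph. It is not taken from the report itself; the function name `is_acyclic` and the representation of a query as a list of variable sets (one set per atom) are assumptions made for illustration only.

```python
def is_acyclic(hyperedges):
    """Return True if the hypergraph is alpha-acyclic (GYO reduction sketch).

    `hyperedges` is an iterable of collections of vertices; for a conjunctive
    query, each hyperedge would hold the variables of one atom (an assumption
    of this sketch, not a definition from the report).
    """
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: delete every vertex that occurs in exactly one hyperedge.
        counts = {}
        for e in edges:
            for v in e:
                counts[v] = counts.get(v, 0) + 1
        for e in edges:
            lonely = {v for v in e if counts[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: delete hyperedges that are empty, properly contained in
        # another hyperedge, or duplicates of an earlier hyperedge.
        kept = []
        for i, e in enumerate(edges):
            absorbed = not e or any(
                (e < f) or (e == f and j < i)
                for j, f in enumerate(edges)
                if j != i
            )
            if absorbed:
                changed = True
            else:
                kept.append(e)
        edges = kept
    # The hypergraph is acyclic exactly when the reduction removes everything.
    return not edges
```

For example, under these assumptions the chain query R(A,B), S(B,C), T(C,D) reduces to the empty hypergraph, so `is_acyclic([{"A", "B"}, {"B", "C"}, {"C", "D"}])` returns `True`, whereas the triangle query R(A,B), S(B,C), T(A,C) cannot be reduced at all and yields `False`. This quadratic sketch favors readability; linear-time acyclicity tests exist but are more involved.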

### Citations

10982 |
Computers and Intractability: A guide to the theory of NP-completeness
- Garey, Johnson
- 1978
Citation Context ...nd the best possible way to compute the answer of Q on db, denoted by Q(db). Note that answering a query is a very difficult task, as the problem of deciding whether Q(db) is not empty is NP-complete [9, 23], even if Q is a conjunctive query. Moreover, even computing a single tuple in the result of Q is NP-hard, and computing all the output relation is exponential, in general. All the available algorithm... |

1536 |
Foundations of Databases
- Abiteboul, Hull, et al.
- 1995
Citation Context ...tudied class of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a formal definition). By well-known results of Yannakakis [68], acyclic conjunctive queries are e... |

948 | Temporal constraint networks
- Dechter, Meiri, et al.
- 1991
Citation Context ...ework. The good results about acyclic conjunctive queries extend to very relevant classes of nearly acyclic queries, such as queries whose associated primal graph has bounded treewidth [53], a cutset [13] of bounded size, or a bounded degree of cyclicity [33]. Thus, a number of query answering techniques based on such structural properties have been proposed. Conceptually, all of them may be viewed as... |

528 |
The complexity of relational query languages
- Vardi
- 1982
Citation Context ...L [26]. Efficient parallel algorithms for Boolean and non-Boolean queries have been proposed in [26] and [28]. [Footnote 1: Note that, since both the database db and the query Q are part of an input-instance of BCQ, what we are considering is the combined complexity of the query [66].] They run on parallel database machines that exploit the inter-operation parallelism [67], i.e., machines that execute different... |

500 |
Graphs and Hypergraphs
- Berge
- 1973
Citation Context ... concept of hypergraph acyclicity among all those defined in the literature. There are several more restrictive notions of acyclicity such as, e.g., β-acyclicity, γ-acyclicity, and Berge-acyclicity [5, 17]. In particular, the following chain of implications holds for any hypergraph H: H is Berge-acyclic ⇒ H is γ-acyclic ⇒ H is β-acyclic ⇒ H is acyclic (i.e., α-acyclic). However, in general, none of ... |

482 | Access path selection in a relational database management system
- Astrahan, Chamberlin, et al.
- 1979
Citation Context ...umeration algorithm of the optimizer determines which plans to enumerate, and the classic enumeration algorithm is based on dynamic programming. This algorithm was pioneered in IBM's System R project [57], and it is used in most query optimizers today. Dynamic programming works very well if all queries are standard SQL-92 queries, the queries are moderately complex, and only simple textbook query exec... |

456 |
Optimal implementation of conjunctive queries in relational databases
- Chandra, Merlin
- 1977
Citation Context ...nd the best possible way to compute the answer of Q on db, denoted by Q(db). Note that answering a query is a very difficult task, as the problem of deciding whether Q(db) is not empty is NP-complete [9, 23], even if Q is a conjunctive query. Moreover, even computing a single tuple in the result of Q is NP-hard, and computing all the output relation is exponential, in general. All the available algorithm... |

433 |
The Theory of Relational Databases
- Maier
- 1983
Citation Context ...tudied class of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a formal definition). By well-known results of Yannakakis [68], acyclic conjunctive queries are e... |

392 |
Network-based heuristics for constraint-satisfaction problems
- Dechter, Pearl
- 1987
Citation Context ...me fixed constant k, is NP-complete [27]. Further interesting methods that do not explicitly generalize acyclic hypergraphs are based on a notion of consistency as used in [19, 20]. Dechter and Pearl [14] introduced the notion of induced width w* which is -- roughly -- the smallest width k of any graph G′ obtained by triangulation methods from the primal graph G of a hypergraph such that G′ ensures k... |

366 |
Graph Minors II. Algorithmic aspects of tree width
- Robertson, Seymour
- 1986
Citation Context ... warehouse framework. The good results about acyclic conjunctive queries extend to very relevant classes of nearly acyclic queries, such as queries whose associated primal graph has bounded treewidth [53], a cutset [13] of bounded size, or a bounded degree of cyclicity [33]. Thus, a number of query answering techniques based on such structural properties have been proposed. Conceptually, all of them m... |

274 |
A sufficient condition for backtrack-free search
- Freuder
- 1982
Citation Context ... a query has width ≤ k, for some fixed constant k, is NP-complete [27]. Further interesting methods that do not explicitly generalize acyclic hypergraphs are based on a notion of consistency as used in [19, 20]. Dechter and Pearl [14] introduced the notion of induced width w* which is -- roughly -- the smallest width k of any graph G′ obtained by triangulation methods from the primal graph G of a hypergraph... |

255 | Optimizing Queries across Diverse Data Sources
- Haas, Kossmann, et al.
- 1997
Citation Context ...any tables [16, 40] or new query optimization and execution techniques need to be integrated into the system in order to optimize queries in a distributed and/or heterogeneous programming environment [35, 43]. In these situations, the search space of query optimization can become very large and dynamic programming is not always viable because of its very high complexity. In general, there is a tradeoff be... |

251 |
Tree clustering for constraint networks
- Dechter, Pearl
- 1989
Citation Context ...h [Figure 5: The BICOMP decomposition of the hypergraph H_b in Example 5.1] 5.2 Tree Clustering (short: TCLUSTER) [15]. The tree clustering method is based on a triangulation algorithm which transforms the primal graph G = (V, E) of any CQ instance I into a chordal graph G′. The acyclic hypergraph H(G′) having the... |

244 |
Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs
- Tarjan, Yannakakis
- 1984
Citation Context ...a distributed environment. 3. Acyclicity is efficiently recognizable, and a join tree of an acyclic hypergraph is efficiently computable. A linear-time algorithm for computing a join tree is shown in [63]; an L^SL method has been provided in [26]. 4. The result of a (non-Boolean) acyclic conjunctive query Q can be computed in time polynomial in the combined size of the input instance and of the output... |

231 | The state of the art in distributed query processing
- Kossmann
Citation Context ...any tables [16, 40] or new query optimization and execution techniques need to be integrated into the system in order to optimize queries in a distributed and/or heterogeneous programming environment [35, 43]. In these situations, the search space of query optimization can become very large and dynamic programming is not always viable because of its very high complexity. In general, there is a tradeoff be... |

206 |
Principles of Database and Knowledge Base Systems
- Ullman
- 1989
Citation Context ...tudied class of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a formal definition). By well-known results of Yannakakis [68], acyclic conjunctive queries are e... |

170 | The Volcano optimizer generator: Extensibility and efficient search
- Graefe, McKenna
- 1993
Citation Context ...[65]. Other representatives of this class are A* search [41] and transformation-based techniques (with top-down dynamic programming) such as those used in EXODUS, Volcano, and some commercial systems [30, 31, 52]. 3.2 Heuristics. Typically, the algorithms of this class have polynomial time and space complexity, but they typically produce worse plans [59]. Representatives of this class of algorithms are mi... |

165 |
A sufficient condition for backtrack-bounded search
- Freuder
- 1985
Citation Context ...s paper. Detailed descriptions of these methods can be found in the corresponding reference (see below) and in many surveys on this subject, e.g., [51, 13]. 5.1 Biconnected Components (short: BICOMP) [20]. Let G = (V, E) be a graph. A vertex p ∈ V is a separating vertex for G if, by removing p from G, the number of connected components of G increases. A biconnected component of G is a maximal set of ve... |

160 | The EXODUS Optimizer Generator
- Graefe, DeWitt
- 1987
Citation Context ...[65]. Other representatives of this class are A* search [41] and transformation-based techniques (with top-down dynamic programming) such as those used in EXODUS, Volcano, and some commercial systems [30, 31, 52]. 3.2 Heuristics. Typically, the algorithms of this class have polynomial time and space complexity, but they typically produce worse plans [59]. Representatives of this class of algorithms are mi... |

155 |
Randomized algorithms for optimizing large join queries
- Ioannidis, Kang
- 1990
Citation Context ... algorithm strongly depends on the plan evaluation function that guides the preference criterion. 3.3 Randomized algorithms. Various variants of randomized algorithms have been proposed in [39], [61], [38], [46], [22], and [59]. The big advantage of randomized algorithms is that they have constant space overhead. The running time of most randomized algorithms cannot be predicted because these algorithm... |

145 | A comparison of structural CSP decomposition methods
- Gottlob, Leone, et al.
- 2000
Citation Context ...y the ≺≺ relation. Each element of the hierarchy represents a DM, apart from that containing Tree Clustering, w*, and Treewidth which are grouped together because they are -equivalent. Theorem 6.1 ([29]) For each pair D₁ and D₂ of decomposition methods represented in Figure 14, the following holds: • There is a directed path from D₁ to D₂ iff D₁ ≺≺ D₂, i.e., iff D₂ strongly generalizes ... |

134 | Treewidth: Algorithmic Techniques and Results
- Bodlaender
Citation Context ...ing induced width ≤ k can be also characterized as partial k-trees [21] or, equivalently, as graphs having treewidth ≤ k [2]. It follows that, for fixed k, checking whether w* ≤ k is feasible in linear time [8]. The approach based on w* is referred to as the w*-Tractability method [13]. Note that this method is implicitly based on hypergraph acyclicity, given that the used triangulation methods enforce chord... |

132 | Conjunctive-query containment and constraint satisfaction
- Kolaitis, Vardi
Citation Context ...es. Exploiting such properties, it is possible to answer large classes of queries in polynomial time. One of the most important and best studied classes of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a f... |

128 | Hypertree decompositions and tractable queries
- Gottlob, Leone, et al.
Citation Context ...lity 2, e.g., the set {X₁, X₄}. 5.7 Hypertree Decompositions. A new class of tractable conjunctive database queries, which generalizes the class of acyclic queries, has recently been identified [27]. This is the class of queries having a bounded-width hypertree decomposition [27]. Deciding whether a given query has this property is feasible in polynomial time and even highly parallelizable. We w... |

120 | Optimization of nonrecursive queries
- Krishnamurthy, Boral, et al.
- 1986
Citation Context ...and space complexity, but they typically produce worse plans [59]. Representatives of this class of algorithms are minimum selectivity and other greedy algorithms [60, 58, 59], the KBZ algorithm [45], and the AB algorithm [62]. Basically, at each step, a partially determined order is extended by choosing the most promising relational operation to be executed, according to some preference criterio... |

111 |
Algorithms for acyclic database schemes
- Yannakakis
- 1981
Citation Context ...es. Exploiting such properties, it is possible to answer large classes of queries in polynomial time. One of the most important and best studied classes of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a f... |

100 | Using semi-joins to solve relational queries - Bernstein, Chiu - 1981 |

97 |
Optimization of large join queries
- Swami, Gupta
- 1988
Citation Context ...greedy algorithm strongly depends on the plan evaluation function that guides the preference criterion. 3.3 Randomized algorithms. Various variants of randomized algorithms have been proposed in [39], [61], [38], [46], [22], and [59]. The big advantage of randomized algorithms is that they have constant space overhead. The running time of most randomized algorithms cannot be predicted because these alg... |

87 |
Query Optimization by Simulated Annealing
- Ioannidis, Wong
- 1987
Citation Context ...y the greedy algorithm strongly depends on the plan evaluation function that guides the preference criterion. 3.3 Randomized algorithms. Various variants of randomized algorithms have been proposed in [39], [61], [38], [46], [22], and [59]. The big advantage of randomized algorithms is that they have constant space overhead. The running time of most randomized algorithms cannot be predicted because the... |

83 | Decomposing constraint satisfaction problems using database techniques
- Gyssens, Jeavons, et al.
- 1994
Citation Context ...es extend to very relevant classes of nearly acyclic queries, such as queries whose associated primal graph has bounded treewidth [53], a cutset [13] of bounded size, or a bounded degree of cyclicity [33]. Thus, a number of query answering techniques based on such structural properties have been proposed. Conceptually, all of them may be viewed as composed of two phases: 1. transform the given query in... |

83 |
Measuring the complexity of join enumeration in query optimization
- Ono, Lohman
- 1990
Citation Context ...y so-called left-deep plans, as originally proposed in [57]. See [59], for a discussion about "left-deep vs. bushy" (see, e.g.,). Efficient ways to implement dynamic programming have been proposed in [49] and [65]. Other representatives of this class are A* search [41] and transformation-based techniques (with top-down dynamic programming) such as those used in EXODUS, Volcano, and some commercial sys... |

82 |
Optimization of large join queries: Combining heuristics and combinatorial techniques
- Swami
- 1989
Citation Context ...algorithms have already been developed for query optimization in database systems. All algorithms proposed so far fall into one of three different classes or are combinations of such basic algorithms [60]. In the following, we will briefly discuss each class of algorithms; a more complete overview and comparison of many of the existing algorithms can be found in [59]. 3.1 Exhaustive search. All publish... |

77 | Degrees of acyclicity for hypergraphs and relational database schemes
- Fagin
- 1983
Citation Context ...es. Exploiting such properties, it is possible to answer large classes of queries in polynomial time. One of the most important and best studied classes of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a f... |

77 |
Complexity of k-tree structured constraint satisfaction problems
- Freuder
- 1990
Citation Context ... graph G′ obtained by triangulation methods from the primal graph G of a hypergraph such that G′ ensures (k + 1)-consistency. Graphs having induced width ≤ k can be also characterized as partial k-trees [21] or, equivalently, as graphs having treewidth ≤ k [2]. It follows that, for fixed k, checking whether w* ≤ k is feasible in linear time [8]. The approach based on w* is referred to as the w*-Tractability me... |

74 | The Complexity of Acyclic Conjunctive Queries
- Gottlob, Leone, et al.
- 2001
Citation Context ...able in the acyclic case. In particular, given two queries Q₁ and Q₂, checking whether Q₁ is contained in Q₂ is feasible in polynomial time, and actually highly parallelizable, if Q₁ is acyclic [26]. This is a fundamental problem for answering queries using views, in the data warehouse framework. The good results about acyclic conjunctive queries extend to very relevant classes of nearly acyclic... |

67 | Heuristic and randomized optimization for the join ordering problem
- Steinbrunn, Moerkotte, et al.
- 1997
Citation Context ...binations of such basic algorithms [60]. In the following, we will briefly discuss each class of algorithms; a more complete overview and comparison of many of the existing algorithms can be found in [59]. 3.1 Exhaustive search. All published algorithms of this class have exponential time and space complexity and are guaranteed to find the best plan according to the optimizer's cost model. The most pro... |

58 | A simplified universal relation assumption and its properties
- Fagin, Mendelzon, et al.
- 1982

57 |
The power of natural semijoins
- Bernstein, Goodman
- 1981

57 |
On the effectiveness of optimization search strategies for parallel execution spaces
- Lanzelotte, Valduriez, et al.
- 1993
Citation Context ...ithm strongly depends on the plan evaluation function that guides the preference criterion. 3.3 Randomized algorithms. Various variants of randomized algorithms have been proposed in [39], [61], [38], [46], [22], and [59]. The big advantage of randomized algorithms is that they have constant space overhead. The running time of most randomized algorithms cannot be predicted because these algorithms are ... |

53 |
On the Optimal Nesting Order for Computing N-Relational Joins
- Ibaraki, Kameda
- 1984
Citation Context ... algorithms have lower complexity than dynamic programming, but these algorithms are not able to find as low-cost plans as dynamic programming. Since the problem of finding an optimal plan is NP-hard [37, 56], implementors of query optimizers will probably always have to take this fundamental tradeoff between algorithm complexity and quality of plans into account when they decide which enumeration algorit... |

51 |
On the complexity of database queries
- Papadimitriou, Yannakakis
- 1999

50 | Rapid bushy join-order optimization with cartesian products - Vance, Maier - 1996 |

45 | Iterative dynamic programming: a new class of query optimization algorithms
- Kossmann, Stocker
Citation Context ...xpensive than an optimal plan. 3.4 Iterative Dynamic Programming. This technique is based on iteratively applying dynamic programming and can be seen as a combination of dynamic and greedy programming [44]. The essence of this heuristic is that instead of fully enumerating all query processing plans, a resource limit is established (defined by a parameter k). During each dynamic programming stage, all ... |

42 | Parallel Evaluation of Multi-Join Queries
- Wilschut, Flokstra, et al.
- 1995
Citation Context ...s the combined complexity of the query [66]. ... Boolean and non-Boolean queries have been proposed in [26] and [28]. They run on parallel database machines that exploit the inter-operation parallelism [67], i.e., machines that execute different relational operations in parallel. These algorithms can be also employed for solving acyclic queries efficiently in a distributed environment. 3. Acyclicity is ... |

41 | A survey of tractable constraint satisfaction problems
- Pearson, Jeavons
- 1997
Citation Context ...fined in the decomposition methods we consider in this paper. Detailed descriptions of these methods can be found in the corresponding reference (see below) and in many surveys on this subject, e.g., [51, 13]. 5.1 Biconnected Components (short: BICOMP) [20]. Let G = (V, E) be a graph. A vertex p ∈ V is a separating vertex for G if, by removing p from G, the number of connected components of G increases. A ... |

37 | Multi-join optimization for symmetric multiprocessors
- Shekita, Young, et al.
- 1993
Citation Context ...this class have polynomial time and space complexity, but they typically produce worse plans [59]. Representatives of this class of algorithms are minimum selectivity and other greedy algorithms [60, 58, 59], the KBZ algorithm [45], and the AB algorithm [62]. Basically, at each step, a partially determined order is extended by choosing the most promising relational operation to be executed, according to ... |

32 |
A polynomial time algorithm for optimizing join queries
- Swami, Iyer
- 1993
Citation Context ...hey typically produce worse plans [59]. Representatives of this class of algorithms are minimum selectivity and other greedy algorithms [60, 58, 59], the KBZ algorithm [45], and the AB algorithm [62]. Basically, at each step, a partially determined order is extended by choosing the most promising relational operation to be executed, according to some preference criterion. Obviously, the quality o... |

29 | A Blackboard Architecture for Query Optimization in Object Bases
- Kemper, Moerkotte, et al.
- 1993
Citation Context ...[59], for a discussion about "left-deep vs. bushy" (see, e.g.,). Efficient ways to implement dynamic programming have been proposed in [49] and [65]. Other representatives of this class are A* search [41] and transformation-based techniques (with top-down dynamic programming) such as those used in EXODUS, Volcano, and some commercial systems [30, 31, 52]. 3.2 Heuristics. Typically, the algorithms of th... |

28 | The complexity of transformation-based join enumeration
- Pellenkoft, Galindo-Legaria, et al.
- 1997
Citation Context ...[65]. Other representatives of this class are A* search [41] and transformation-based techniques (with top-down dynamic programming) such as those used in EXODUS, Volcano, and some commercial systems [30, 31, 52]. 3.2 Heuristics. Typically, the algorithms of this class have polynomial time and space complexity, but they typically produce worse plans [59]. Representatives of this class of algorithms are mi... |

25 |
On the desirability of acyclic database schemes
- Beeri, Fagin, et al.
- 1983
Citation Context ...ant and best studied class of tractable queries is the class of acyclic queries [7, 11, 12, 17, 18, 24, 42, 48, 50, 54, 55, 68, 70]. It was shown that acyclic queries coincide with the tree queries [4], see also [1, 47, 64]. The latter are queries which are representable by a join tree (or join forest) (see Section 4.2, for a formal definition). By well-known results of Yannakakis [68], acyclic con... |