## Heuristic and Randomized Optimization for the Join Ordering Problem (1997)

### Cached

### Download Links

- [dodgers.fmi.uni-passau.de]
- [pi3.informatik.uni-mannheim.de]
- [medoc.springer.de:9999]
- DBLP

### Other Repositories/Bibliography

Venue: | VLDB Journal |

Citations: | 67 - 2 self |

### BibTeX

@ARTICLE{Steinbrunn97heuristicand,

author = {Michael Steinbrunn and Guido Moerkotte and Alfons Kemper},

title = {Heuristic and Randomized Optimization for the Join Ordering Problem},

journal = {VLDB Journal},

year = {1997},

volume = {6},

pages = {191--208}

}

### Years of Citing Articles

### OpenURL

### Abstract

Recent developments in database technology, such as deductive database systems, have given rise to the demand for new, cost-effective optimization techniques for join expressions. In this paper many different algorithms that compute approximate solutions for optimizing join orders are studied since traditional dynamic programming techniques are not appropriate for complex problems. First, two possible solution spaces, the space of left-deep and bushy processing trees, respectively, are evaluated from a statistical point of view. The result is that the common limitation to leftdeep processing trees is only advisable for certain join graph types. Basically, optimizers from three classes are analysed: heuristic, randomized and genetic algorithms. Each one is extensively scrutinized with respect to its working principle and its fitness for the desired application. It turns out that randomized and genetic algorithms are well suited for optimizing join expressions. They generate solutions of...

### Citations

7342 |
J.H.: Genetic Algorithms and
- Goldberg, Holland
- 1988
(Show Context)
Citation Context ...hem to achieve further improvement. We provide a brief overview of the terminology and the working principles of genetic algorithms. For a comprehensive introduction, the reader is referred to, e.g., =-=[Gol89]-=-. 4.3.1 Terminology Because genetic algorithms are designed to simulate biological evolution, much of the terminology used to describe them is borrowed from biology. One of the most important characte... |

617 |
Fundamentals of Database Systems
- Elmasri, Navathe
- 2006
(Show Context)
Citation Context ...is cost is C write (R 1 \Gamma R 2 ) = oe 12 \Delta jR 1 j \Delta jR 2 j bs=ts R12 2.2.1 Nested Loop Join The cost for performing a nested loop join (depending on the presence of index structures) is =-=[EN94]-=-: 1. Without index support C nl (R 1 \Gamma R 2 ) = b R1 -----z read R 1 + /& b R1 ms \Gamma 1 ' \Delta b R2 ! --- --zsread R 2 and perform join 2. Primary B + -tree index on the join attribute of R 2... |

436 |
On the Shortest Spanning Subtree of a Graph and the Travelling Salesman Problem
- Kruskal
- 1956
(Show Context)
Citation Context ... of the optimization. The algorithm can be extended to general (cyclic) join graphs in a straightforward way, namely by reducing the query graph to its minimal spanning tree using Kruskal's algorithm =-=[Kru56]-=-. The weight of the join graph's edges is determined by the selectivity of the appropriate join, and the minimal spanning tree is determined as the tree with the lowest product of edge weights, rather... |

178 |
Join processing in database systems with large main memories
- Shapiro
- 1986
(Show Context)
Citation Context ... on the amount of main memory available. 2.2.3 Hash Join We assume that a "Hybrid Hash Join" is carried out. This algorithm performs very well over a large range of available main memory. Th=-=e cost is [Sha86]-=-: b R 1 + b R 2 + 2 \Delta (b R 1 + b R 2 ) \Delta (1 \Gamma q) where q denotes the fraction of R 1 whose hash table fits into main memory. It is computed as: q = ms \Gamma l 1:4\Deltab R 1 \Gammams m... |

155 |
Randomized algorithms for optimizing large join queries
- Ioannidis, Kang
- 1990
(Show Context)
Citation Context ...2 R 5 , whereas "3Cycle" could yield R 5 R 2 R 1 R 4 R 3 . If the complete solution space with arbitrarily shaped (bushy) processing trees is considered, the moves depicted in Figure 7 (intr=-=oduced in [IK90]-=-) are used for traversal of the solution space. 4.2.1 Iterative Improvement If the solution space of the join optimization problem did contain but one global cost minimum without any local minima, we ... |

134 | Join processing in relational databases
- Mishra, Eich
- 1992
(Show Context)
Citation Context ...esting order in which the joins are to be performed. In this article, we shall be concentrating on the generation of low-cost join nesting orders while disregarding the specifics of join computing--- =-=[ME92]-=- provides a good overview on this subject. In relational database systems where queries are stated interactively, join expressions that involve more than about five or six relations are rarely encount... |

120 | Optimization of nonrecursive queries
- Krishnamurthy, Boral, et al.
- 1986
(Show Context)
Citation Context ...r in polynomial time, provided the query graph forms a tree (i.e., no cycles) and the cost function is a member of a certain class. Based on this result, Krishnamurthy, Boral and Zaniolo developed in =-=[KBZ86]-=- an algorithm (from now on called KBZ algorithm) that computes the optimal solution for a tree query in O(n 2 ) time, where n is the number of joins. In the first step, every relation plays, in turn, ... |

112 | Genetic algorithm in search, optimization and machine learning - DE - 1989 |

97 |
Optimization of large join queries
- Swami, Gupta
- 1988
(Show Context)
Citation Context ...e next set of benchmarks is carried out with randomized algorithms (cf. Section 4.2) and genetic algorithms (cf. Section 4.3). We will compare three variants of Iterative Improvement (called IIJ, IIH =-=[SG88]-=-, IIIO [IK90]), and Simulated Annealing (called SAJ, SAH [SG88], SAIO [IK90]) and two variants of genetic algorithms (Genetic, BushyGenetic). Furthermore, the results of the System-R optimization are ... |

87 | Query optimization by simulated annealing - Ioannidis, Wong - 1987 |

83 | A Strategy for Query Processing
- Wong, Youseffi
- 1976
(Show Context)
Citation Context ...m selectivity heuristic builds a left-deep processing tree step by step while trying to keep intermediate relations as small as possible. In this regard, this resembles Ingres' decomposition strategy =-=[WY76]-=-; however, unlike the decomposion strategy, which considers only the operands' cardinalities, the minimum selectivity heuristic makes use of the selectivity factor oe of the join R 1 \Gamma R 2 to ach... |

82 |
Measuring the complexity of join enumeration in query optimization
- Ono, Lohman
- 1990
(Show Context)
Citation Context ...rd right-deep (i.e., the join nodes' left operands are base relations), the smaller is the size of this subset. For a right-deep tree, none of its join processing nodes is capable of pipelining. 7 In =-=[OL90]-=-, an adaptable plan enumeration strategy for linear (chain) and star-shaped join graphs is proposed that reduces the number of plans whose costs have to be evaluated considerably. If n denotes the num... |

82 | Optimization of large join queries: Combining heuristic and combinatorial techniques - Swami - 1989 |

70 | Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization
- Ioannidis, Kang
- 1991
(Show Context)
Citation Context ...00 100000 5 10 15 20 25 30 Scaled Cost Number of Relations MinSel KBZ System R Figure 18: Deterministic Algorithms; Chain Join Graph optimization algorithms' capability to locate these solutions (cf. =-=[IK91]-=-). In the remainder of this section, we will investigate whether the bushy tree optimizers can exploit the potential of good solutions in the bushy tree solution space. Benchmark Results for Determini... |

69 |
Sequencing jobs to minimize total weighted completion time subject to precedence constraints. Annals of Discrete Mathematics
- Lawler
- 1978
(Show Context)
Citation Context ... far) intermediate result and moved from R remaining to R used . Figure 3 shows the complete algorithm for left-deep processing trees. 4.1.3 Krishnamurthy-Boral-Zaniolo Algorithm On the foundation of =-=[Law78]-=- and [MS79], Ibaraki and Kameda showed in [IK84] that it is possible to compute the optimal nesting order in polynomial time, provided the query graph forms a tree (i.e., no cycles) and the cost funct... |

57 |
On the effectiveness of optimization search strategies for parallel execution spaces
- Lanzelotte, Valduriez, et al.
- 1993
(Show Context)
Citation Context ...st interesting special cases, although other tree shapes might be contemplated, e.g., right-deep trees or zig-zag trees, which are mainly of interest in distributed computing environments (cf., e.g., =-=[LVZ93]). 3.1 Lef-=-t-Deep Trees This subset consists of all processing trees where the inner relation of each join is a base relation. For a fixed number of base relations, the specification "left-deep" does n... |

50 | Rapid bushy join-order optimization with cartesian products
- Vance, Maier
- 1996
(Show Context)
Citation Context ...utions. That (and the exponential running time) makes its application for queries that involve more than about ten to fifteen relations prohibitively expensive. In a very recent work, Vance and Maier =-=[VM96]-=- devised a very efficient, socalled light-weight implementation of dynamic programming for bushy-tree join optimization. Their method allows to optimize join queries with up to about 18 relations---al... |

41 | Access support relations: An indexing method for object bases
- Kemper, Moerkotte
- 1997
(Show Context)
Citation Context ...lized access mechanisms and would not be treated as ordinary join operations, this is not true for backward traversal. This would require appropriate index structures such as Access Support Relations =-=[KM92]-=-, processing of which, in turn, involves handling of potentially very complex join expressions for both initial materialization and maintenance. Hence, there is a demand for optimization techniques th... |

38 |
Sequencing with Series-Parallel Precedence Constraints
- Monma, Sidney
- 1979
(Show Context)
Citation Context ...ediate result and moved from R remaining to R used . Figure 3 shows the complete algorithm for left-deep processing trees. 4.1.3 Krishnamurthy-Boral-Zaniolo Algorithm On the foundation of [Law78] and =-=[MS79]-=-, Ibaraki and Kameda showed in [IK84] that it is possible to compute the optimal nesting order in polynomial time, provided the query graph forms a tree (i.e., no cycles) and the cost function is a me... |

36 | Optimal nesting for computing Nrelational joins - Ibaraki, Kameda - 1984 |

32 |
A polynomial time algorithm for optimizing join queries
- Swami, Iyer
- 1993
(Show Context)
Citation Context ...the optimal solution in O(n 2 ) time. In the following section, an algorithm is discussed that tries to remedy this situation. 4.1.4 AB Algorithm The AB algorithm has been developed by Swami and Iyer =-=[SI93]-=-. It is based on the KBZ algorithm with various enhancements, trying to remove the restrictions that are imposed on the join method placement. The algorithm permits the use of two different join metho... |

28 |
Object-Oriented Database Management
- Kemper, Moerkotte
- 1994
(Show Context)
Citation Context ... query-generating database system frontends and complex views. In both cases, very complex queries may be issued without the end user being aware of that fact. Even in objectoriented database systems =-=[KM94]-=-, complex join expressions may be encountered: while forward traversal of object references are usually very well supported by specialized access mechanisms and would not be treated as ordinary join o... |

25 | A genetic algorithm for database query optimization
- Bennett, Ferris, et al.
- 1991
(Show Context)
Citation Context ...mma R 3 ) \Gamma R 2 ) \Gamma R 5 ) is encoded as "14325". (b) Bushy trees: Bushy trees without cartesian products are encoded as an ordered list of join graph edges. This scheme has been pr=-=oposed in [BFI91]-=-. As an example of this encoding scheme, we represent the processing tree depicted in Figure 11b as a character string. In a preliminary step, every edge of the join graph is labelled by an arbitrary ... |

10 | Query processing in a relational database management system - Youssefi, Wong |

8 | On the shortest spanning subtree of a graph and the traveling salesman problem - JB - 1956 |

5 | Moerkotte G - Kemper - 1994 |

4 |
randomized join-order selection---why use transformations
- Fast
- 1994
(Show Context)
Citation Context ...s cost) than for Simulated Annealing with a random starting point, so the annealing process does not spend much time accepting moves that do not improve the current solution. 4.2.5 Random Sampling In =-=[GLPK94]-=-, a radically different idea is pursued. All randomized algorithms discussed so far are based on transformations that attempt to reduce a given solution's evaluation cost according to a set of rules u... |

3 | Navathe SB - Elmasri - 1989 |

3 | Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization - YE, Kang - 1991 |

3 | Boral H, Zaniolo C - Krishnamurthy - 1986 |

2 |
Optimal nesting for computing N - relational joins
- Ibaraki, Kameda
- 1984
(Show Context)
Citation Context ... exhaustive search is perfectly feasible---it takes but a few seconds CPU time. But if more than about eight relations are to be joined, the generally NP-hard problem of determining the optimal order =-=[IK84]-=- cannot be solved exactly anymore. We have to rely on algorithms that compute (hopefully) good approximate solutions. Those algorithms fall into two classes: first, augmentation heuristics that build ... |

2 | Randomized algorithms for optimizing large join queries - YE, Kang - 1990 |

2 | Join processing in database systems with large main memories - LD - 1986 |

1 | Fast, randomized join-order selectoin -- why use transformations - Galindo-Legaria, Pellenkoft, et al. - 1994 |