## Expanding From Discrete To Continuous Estimation Of Distribution Algorithms: The IDEA (2000)

Venue: Parallel Problem Solving From Nature - PPSN VI

Citations: 33 (8 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Bosman00expandingfrom,
  author    = {Peter A.N. Bosman and Dirk Thierens},
  title     = {Expanding From Discrete To Continuous Estimation Of Distribution Algorithms: The IDEA},
  booktitle = {Parallel Problem Solving From Nature - PPSN VI},
  year      = {2000},
  pages     = {767--776},
  publisher = {Springer}
}
```

### Abstract

The direct application of statistics to stochastic optimization based on iterated density estimation has become more important and present in evolutionary computation over the last few years. The estimation of densities over selected samples, and the sampling from the resulting distributions, is a combination of the recombination and mutation steps used in evolutionary algorithms. We introduce the framework named IDEA to formalize this notion. By combining continuous probability theory with techniques from existing algorithms, this framework allows us to define new continuous evolutionary optimization algorithms.

**1 Introduction** Algorithms in evolutionary optimization guide their search through statistics based on a vector of samples, often called a population. By using this stochastic information, non-deterministic induction is performed in order to attempt to use the structure of the search space and thereby aid the search for the optimal solution. In order to perform induct...
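The loop the abstract describes (estimate a density over the selected samples, then sample the next population from it) can be sketched as follows. This is a minimal illustration using an independent normal per variable, in the spirit of the univariate pds discussed in the citation contexts below; all names and parameter values are chosen for the example, not taken from the paper.

```python
import numpy as np

def univariate_idea(f, l, n=100, tau=0.3, generations=50, seed=0):
    """Minimal iterated density estimation loop: select the best tau*n
    samples, fit one independent normal per variable, resample."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5.0, 5.0, size=(n, l))           # initial population
    for _ in range(generations):
        fitness = np.apply_along_axis(f, 1, pop)
        sel = pop[np.argsort(fitness)[: int(tau * n)]]  # truncation selection
        mu = sel.mean(axis=0)                           # density estimation:
        sigma = sel.std(axis=0) + 1e-12                 # per-variable normal
        pop = rng.normal(mu, sigma, size=(n, l))        # sample new population
    return pop[np.argmin(np.apply_along_axis(f, 1, pop))]

# Usage: minimize the 5-dimensional sphere function.
best = univariate_idea(lambda y: np.sum(y ** 2), l=5)
```

On a separable function such as the sphere, this univariate model suffices; the point of the IDEA framework is that the same loop accepts richer factorizations when variables interact.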

### Citations

8321 |
Genetic Algorithms
- Goldberg
- 1989
Citation Context: ...ed so as to generate new solutions that will hopefully be closer to the optimum. As this process is iterated, convergence is intended to lead the algorithm to a final solution. In the genetic algorithm [11, 14] and many variants thereof, values for problem variables are often exchanged and subsequently individually adapted. Another way of combining the samples is to regard them as being representative of so...

297 | A survey of optimization by building and using probabilistic models
- Pelikan, Goldberg, et al.
- 2002
Citation Context: ...proposed for discrete spaces [2-4, 12, 13, 15, 17, 19, 21], as well as in a limited way for continuous spaces [5, 10, 15, 22, 23]. An overview of this field has been given by Pelikan, Goldberg and Lobo [20]. Our goal in this paper is to apply the search for good probability density models to continuous spaces. To this end, we formalize the notion of building and using probabilistic models in a new frame...

273 | BOA: the Bayesian Optimization Algorithm
- Pelikan, Goldberg, et al.
- 1999
Citation Context: ...ition to having an acyclic pds graph, each node may have at most κ parents; the pds is constrained to ∀ i∈L ⟨|π(i)| ≤ κ⟩. This general approach is used in the BOA by Pelikan, Goldberg and Cantú-Paz [19], as well as the LFDA by Mühlenbein and Mahnig [16] and the EBNA by Larrañaga, Etxeberria, Lozano and Peña. In the case of κ = 1, a polynomial time algorithm can be used to minimize the KL divergen...

250 | From recombination of genes to the estimation of distributions, I. Binary parameters
- Mühlenbein, Paaß
- 1996
Citation Context: ...In the univariate distribution, all variables are regarded independently of each other. The PBIL by Baluja and Caruana [2], the cGA by Harik, Lobo and Goldberg [13], the UMDA by Mühlenbein and Paaß [18], and all known approaches in the continuous case prior to the IDEA [10, 22, 23], use this pds. It can be modelled by ∀ i∈L ⟨π(i) = () ∧ ω_i = i⟩, giving: P̂⁽π,ω⁾(Z) = ∏_{i=0}^{l−1} P̂(Z_i). In the ...

237 | The Compact Genetic Algorithm
- Harik, Lobo, et al.
- 1999
Citation Context: ...orithms that we use in our experiments. In the univariate distribution, all variables are regarded independently of each other. The PBIL by Baluja and Caruana [2], the cGA by Harik, Lobo and Goldberg [13], the UMDA by Mühlenbein and Paaß [18], and all known approaches in the continuous case prior to the IDEA [10, 22, 23], use this pds. It can be modelled by ∀ i∈L ⟨π(i) = () ∧ ω_i = i⟩, giving: P̂⁽π,ω⁾(...

198 | Linkage learning via probabilistic modeling in the ECGA
- Harik
- 1999
Citation Context: ...dy algorithm is used that iteratively adds arcs to the pds graph. There are other special case algorithms, such as the optimal dependency trees approach by Baluja and Davies [3] and the ECGA by Harik [12]. Like the LFDA, the ECGA uses minimum description length as a search metric. This metric has the advantage that the resulting pds will not be overly complex. Using the KL divergence, this can only be...

186 | Removing the genetics from the standard genetic algorithm
- Baluja, Caruana
- 1995
Citation Context: ...hree previously introduced pds search algorithms that we use in our experiments. In the univariate distribution, all variables are regarded independently of each other. The PBIL by Baluja and Caruana [2], the cGA by Harik, Lobo and Goldberg [13], the UMDA by Mühlenbein and Paaß [18], and all known approaches in the continuous case prior to the IDEA [10, 22, 23], use this pds. It can be modelled by ∀...

170 |
Introduction to Graphical Modelling
- Edwards
- 1995
Citation Context: ...literature, a pds is also called a factorization. Let a⊔b be the splicing of a and b such that the elements of b are placed behind the elements of a, giving |a⊔b| = |a|+|b|. Using graphical modelling [9], we can denote any non-clustered pds with conditional probabilities P(Z⟨a⟩ | Z⟨b⟩) = P(Z⟨a⊔b⟩)/P(Z⟨b⟩). We let π() be a function that returns a vector π(i) = (π(i)₀, π(i)₁, …, π(i)_{|π(i)|−1}...

135 | MIMIC: Finding Optima by Estimating Probability Densities
- Bonet, Isbell, et al.
- 1997
Citation Context: ...ase prior to the IDEA [10, 22, 23], use this pds. It can be modelled by ∀ i∈L ⟨π(i) = () ∧ ω_i = i⟩, giving: P̂⁽π,ω⁾(Z) = ∏_{i=0}^{l−1} P̂(Z_i). In the MIMIC algorithm by De Bonet, Isbell and Viola [4], the pds is a chain which is constrained to π(ω_{l−1}) = () ∧ ∀ i∈L∖{l−1} ⟨π(ω_i) = (ω_{i+1})⟩, giving P̂⁽π,ω⁾(Z) = (∏_{i=0}^{l−2} P̂(Z_{ω_i} | Z_{ω_{i+1}})) P̂(Z_{ω_{l−1}}). To find the chain, an O(l²) greed...
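The chain factorization quoted in this MIMIC context can be made concrete by ancestral sampling: draw Z_{ω_{l−1}} from its marginal first, then each Z_{ω_i} from its conditional given Z_{ω_{i+1}}. The sketch below does this for binary variables with frequency estimates; the function names and the Laplace smoothing are illustrative choices, not taken from the paper.

```python
import numpy as np

def fit_chain(samples, order):
    """Estimate the marginal P(Z_{order[-1]}=1) and the conditionals
    P(Z_{order[i]}=1 | Z_{order[i+1]}=v) from binary samples,
    with Laplace smoothing to avoid zero counts."""
    p_last = (samples[:, order[-1]].sum() + 1) / (len(samples) + 2)
    cond = {}
    for i in range(len(order) - 1):
        a, b = order[i], order[i + 1]
        cond[a] = [
            (samples[samples[:, b] == v][:, a].sum() + 1)
            / ((samples[:, b] == v).sum() + 2)
            for v in (0, 1)
        ]
    return p_last, cond

def sample_chain(p_last, cond, order, rng):
    """Ancestral sampling along the chain: draw Z_{omega_{l-1}} first,
    then each Z_{omega_i} conditioned on the already-drawn Z_{omega_{i+1}}."""
    z = np.zeros(len(order), dtype=int)
    z[order[-1]] = rng.random() < p_last
    for i in range(len(order) - 2, -1, -1):
        a, b = order[i], order[i + 1]
        z[a] = rng.random() < cond[a][z[b]]
    return z
```

Finding a good chain order ω is the O(l²) greedy search the snippet refers to; the code above only fits and samples a given order.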

119 | Using optimal dependency-trees for combinatorial optimization: Learning the structure of the search space
- Baluja, Davies
- 1997
Citation Context: ...the case of κ > 1, a greedy algorithm is used that iteratively adds arcs to the pds graph. There are other special case algorithms, such as the optimal dependency trees approach by Baluja and Davies [3] and the ECGA by Harik [12]. Like the LFDA, the ECGA uses minimum description length as a search metric. This metric has the advantage that the resulting pds will not be overly complex. Using the KL d...

92 | The bivariate marginal distribution algorithm - Pelikan, Mühlenbein - 1999

67 | FDA — A scalable evolutionary algorithm for the optimization of additively decomposed functions
- Mühlenbein, Mahnig
- 1999
Citation Context: ...have at most κ parents, the pds is constrained to ∀ i∈L ⟨|π(i)| ≤ κ⟩. This general approach is used in the BOA by Pelikan, Goldberg and Cantú-Paz [19], as well as the LFDA by Mühlenbein and Mahnig [16] and the EBNA by Larrañaga, Etxeberria, Lozano and Peña. In the case of κ = 1, a polynomial time algorithm can be used to minimize the KL divergence [5]. In the case of κ > 1, a greedy algorithm is ...

66 | Extending population-based incremental learning to continuous search spaces
- Sebag, Ducoulombier
- 1998
Citation Context: ...it, is a global statistical type of inductive iterated search. Such algorithms have been proposed for discrete spaces [2-4, 12, 13, 15, 17, 19, 21], as well as in a limited way for continuous spaces [5, 10, 15, 22, 23]. An overview of this field has been given by Pelikan, Goldberg and Lobo [20]. Our goal in this paper is to apply the search for good probability density models to continuous spaces. To this end, we for...

47 | Continuous iterated density estimation evolutionary algorithms within the IDEA framework
- Bosman, Thierens
- 2000
Citation Context: ...ce. We write Y instead of Z from now on to indicate the use of continuous random variables instead of either the discrete or continuous case. Using our definitions, the KL divergence can be written as [7]: D(P̂⁽π⁺,ω⁺⁾(Y) ‖ P̂⁽π,ω⁾(Y)) = −h(P̂⁽π⁺,ω⁺⁾(Y)) + Σ_{i=0}^{l−1} h(P̂(Y_{ω_i} | Y⟨π(ω_i)⟩)) (3). Let a ⊑ L, b ⊑ L, where a ⊑ L means that a contains only elements of L. In equation 3, h(...

46 | Optimization by Learning and Simulation of Bayesian and Gaussian Networks
- Larranaga, Etxeberria, et al.
- 1999
Citation Context: ...it, is a global statistical type of inductive iterated search. Such algorithms have been proposed for discrete spaces [2-4, 12, 13, 15, 17, 19, 21], as well as in a limited way for continuous spaces [5, 10, 15, 22, 23]. An overview of this field has been given by Pelikan, Goldberg and Lobo [20]. Our goal in this paper is to apply the search for good probability density models to continuous spaces. To this end, we for...

43 | Linkage information processing in distribution estimation algorithms
- Bosman, Thierens
- 1999
Citation Context: ...straints. The probabilistic models used in previously proposed algorithms range from lower order structures to structures of unbounded complexity. It has been empirically shown by Bosman and Thierens [6] that a higher order pds is required to solve higher order building block problems. We shortly state three previously introduced pds search algorithms that we use in our experiments. In the univariate...

35 |
Evolution strategies I: Variants and their computational implementation
- Bäck, Schwefel
- 1995
Citation Context: ...Fig. 1. Results on C1 (left, linear) and C3 (right, logarithmic) for increasing dimension. We compared the IDEA using the normal pdf to Evolution Strategies [1] (ES) on C1. The ES has a (μ, λ) strategy with λ/μ = 7 and independent mutations using either individual standard deviations n_σ = l or a single standard deviation n_σ = 1. We initialized the standar...

29 |
Adaptation in Natural and Artificial Systems. Ann Arbor: The University of Michigan Press
- Holland
- 1975
Citation Context: ...ed so as to generate new solutions that will hopefully be closer to the optimum. As this process is iterated, convergence is intended to lead the algorithm to a final solution. In the genetic algorithm [11, 14] and many variants thereof, values for problem variables are often exchanged and subsequently individually adapted. Another way of combining the samples is to regard them as being representative of so...

24 | An Algorithmic Framework For Density Estimation Based Evolutionary Algorithms. Utrecht University technical report UU-CS-1999-46
- Bosman, Thierens
- 1999
Citation Context: ...it, is a global statistical type of inductive iterated search. Such algorithms have been proposed for discrete spaces [2-4, 12, 13, 15, 17, 19, 21], as well as in a limited way for continuous spaces [5, 10, 15, 22, 23]. An overview of this field has been given by Pelikan, Goldberg and Lobo [20]. Our goal in this paper is to apply the search for good probability density models to continuous spaces. To this end, we for...

18 |
Schemata, distributions and graphical models in evolutionary optimization
- Mühlenbein, Mahnig, et al.
- 1999
Citation Context: ...t end. We call an IDEA with m, sel() and rep() so chosen, a monotonic IDEA. If we set m in the IDEA to n and set rep() to replace P with O, we obtain the EDA by Mühlenbein, Mahnig and Rodriguez [17]. In the EDA however, the threshold t cannot be enforced. Note how EDA is thus an instance of IDEA. 3 Probability density structure search algorithms In order to search for a pds, a metric is requir...

6 | IDEAs based on the normal kernels probability density function
- Bosman, Thierens
- 2000
Citation Context: ...normal kernels pdf places a normal pdf over every available sample point. Let s_i be a fixed standard deviation in the i-th dimension. The conditional pdf and the entropy can then be stated as follows [8]: f_NK(y_{a₀} | y⟨a⟩∖a₀) = Σ_{i=0}^{|S|−1} αᵢ (1/(s_{a₀}√(2π))) e^{−(y_{a₀} − yⁱ_{a₀})²/(2s²_{a₀})} (8), where αᵢ = e^{−Σ_{j=1}^{|a|−1} (y_{a_j} − yⁱ_{a_j})²/(2s²_{a_j})} / Σ_{k=0}^{|S|−1} e^{−Σ_{j=1}^{|a|−1} (y_{a_j} − yᵏ_{a_j})²/(2s²_{a_j})}. h(Y⟨a⟩...
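Read as code, the normal kernels conditional density in this snippet is a mixture with one Gaussian kernel per sample point in the target dimension, where the mixture weights αᵢ measure how close the conditioning values lie to each sample point. A sketch, with argument layout and names my own rather than the paper's:

```python
import numpy as np

def normal_kernels_conditional(y0, y_cond, points, s):
    """Conditional normal-kernels pdf: one Gaussian kernel per sample
    point in the target dimension, weighted by responsibilities alpha_i
    computed from the conditioning dimensions.

    points: (|S|, d) sample points, column 0 = target dimension,
            columns 1.. = conditioning dimensions; s: per-dimension stds."""
    # Responsibilities alpha_i from the conditioning dimensions
    # (log-space with max-subtraction for numerical stability).
    log_w = -np.sum((y_cond - points[:, 1:]) ** 2 / (2 * s[1:] ** 2), axis=1)
    alpha = np.exp(log_w - log_w.max())
    alpha /= alpha.sum()
    # Gaussian kernel in the target dimension, one per sample point.
    kernels = np.exp(-(y0 - points[:, 0]) ** 2 / (2 * s[0] ** 2)) / (
        s[0] * np.sqrt(2 * np.pi)
    )
    return float(alpha @ kernels)  # weighted kernel density
```

With a single sample point the weights collapse to α₀ = 1 and the expression reduces to an ordinary normal pdf centered on that point.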

5 |
Real-valued evolutionary optimization using a probability density estimator
- Gallagher, Fream, et al.
- 1999

5 |
Telephone network traffic overloading diagnosis and evolutionary computation technique
- Servet, Trave-Massuyes, et al.
- 1997