## 17 Evolutionary Methods for Learning Bayesian Network Structures

### BibTeX

@MISC{Brouard_17evolutionary,

author = {Thierry Brouard and Alain Delaplace and Hubert Cardot},

title = {17 Evolutionary Methods for Learning Bayesian Network Structures},

year = {}

}

### Abstract

Bayesian networks (BN) are a family of probabilistic graphical models representing a joint distribution for a set of random variables. Conditional dependencies between these variables are symbolized by a Directed Acyclic Graph (DAG). Two classical approaches are often encountered when automatically determining an appropriate graphical structure from

### Citations

682 | Approximating discrete probability distributions with dependence trees
- Chow, Liu
- 1968
Citation Context ...phs. When one cycle is detected within a graph, the operator suppresses the one arc in the cycle bearing the weakest mutual information. The mutual information between two variables is defined as in (Chow & Liu, 1968): $$W(X_A, X_B) = \sum_{x_a, x_b} \frac{N_{ab}}{N} \log\left(\frac{N_{ab}\, N}{N_a N_b}\right) \tag{6}$$ where the mutual information $W(X_A, X_B)$ between two variables $X_A$ and $X_B$ is calculated according to the number of times $N_{ab}$ that X... |
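The count-based mutual information quoted in the context above, and the rule of dropping the weakest arc of a cycle, can be illustrated with a minimal Python sketch. It assumes complete, discrete data held as aligned lists. The names `mutual_information`, `weakest_arc`, and the `data` dictionary are hypothetical and do not come from the chapter, and the natural logarithm is an arbitrary choice of base.

```python
import math
from collections import Counter


def mutual_information(xa, xb):
    """Empirical mutual information between two aligned discrete columns.

    Computes sum over observed (a, b) of (N_ab / N) * log(N_ab * N / (N_a * N_b)).
    Pairs that never occur contribute nothing to the sum.
    """
    if len(xa) != len(xb):
        raise ValueError("columns must have the same length")
    n = len(xa)
    if n == 0:
        return 0.0
    n_ab = Counter(zip(xa, xb))  # joint counts N_ab
    n_a = Counter(xa)            # marginal counts N_a
    n_b = Counter(xb)            # marginal counts N_b
    return sum(
        (count / n) * math.log(count * n / (n_a[a] * n_b[b]))
        for (a, b), count in n_ab.items()
    )


def weakest_arc(cycle_arcs, data):
    """Return the arc of a cycle with the lowest mutual information.

    `cycle_arcs` is a list of (parent, child) names; `data` maps each
    variable name to its column of observations (illustrative layout).
    Ties are resolved in favour of the first arc listed.
    """
    return min(
        cycle_arcs,
        key=lambda arc: mutual_information(data[arc[0]], data[arc[1]]),
    )
```

For example, with `data = {"A": [0, 0, 1, 1], "B": [0, 0, 1, 1], "C": [0, 1, 0, 1]}`, the arc between `A` and `B` carries the most information, so a repair step built on `weakest_arc` would remove one of the arcs touching `C`.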

263 | Regression and time series model selection in small samples
- Hurvich, Tsai
- 1989
Citation Context ...tives have emerged, for example CAIC - Consistent AIC - (Bozdogan, 1987). If the size of the database is very small, it is generally preferable to use AICC - Akaike Information Corrected Criterion - (Hurvich & Tsai, 1989). The MDL criterion (Rissanen, 1978; Suzuki, 1996) incorporates a penalizing scheme for the structures which are too complex. It takes into account the complexity of the model and the complexity of e... |

173 | Graphical models for associations between variables, some of which are qualitative and some - Lauritzen, Wermuth - 1989 |

162 | Adaptive Probabilistic Networks with Hidden Variables - Binder, Koller, et al. - 1997 |

67 | Learning Belief Networks from Data: An Information Theory Based Approach
- Cheng, Bell, et al.
- 1997
Citation Context ...ed when automatically determining an appropriate graphical structure from a database of cases. The first one consists in the detection of (in)dependencies between the variables (Spirtes et al., 2001; Cheng et al., 2002). The second one uses a scoring metric (Chickering, 2002a). But neither the first nor the second is really satisfactory. The first one uses statistical tests which are not reliable enough when in pr... |

41 | The Bayes Net Toolbox for MATLAB. Computing Science and Statistics - Murphy - 2001 |

29 | Estimating the dimension of a model. The Annals of Statistics
- Schwarz
- 1978
Citation Context ...are too complex. It takes into account the complexity of the model and the complexity of encoding data related to this model. Finally, the BIC criterion (Bayesian Information Criterion), proposed in (Schwarz, 1978), is similar to the AIC criterion. Properties such as equivalence, decomposability of the score and consistency are introduced. Due to its tendency to return the simplest models (Bouckaert, 1994), ... |

16 | Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs - Acid, Campos - 2003 |

16 | BNT structure learning package: Documentation and experiments - Francois, Leray - 2004 |

10 | Reasons for Premature Convergence of Self-Adapting Mutation Rates. Presented at Evolutionary Computation
- Glickman, Sycara
- 2000
Citation Context ...uired to reach the optimum. However, applying this kind of policy can do more harm than good. When there are many local optima, as in our case, we can be confronted with the bowl effect described in (Glickman & Sycara, 2000). That is: when the population is clustered around a local optimum and the mutation rate is too low to allow at least one individual to escape this local optimum, a strictly decrementing adaptive pol... |

7 | A Bayesian network scoring metric that is based on globally uniform parameter priors
- Kayaalp, Cooper
- 2002
Citation Context ...figuration of a variable Xi and of its parents Pa(Xi) from being regarded as impossible. A variant, BDeu, initializes the prior probability distributions of parameters according to a uniform law. In (Kayaalp & Cooper, 2002), the authors showed that, under certain conditions, this algorithm was able to detect arcs corresponding to low-weighted conditional dependencies. AIC, the Akaike Information Criterion (Akaike, 1970) ... |

7 | A skeleton-based approach to learning Bayesian networks from data - Dijk, Gaag, et al. - 2003 |

5 | Monotone DAG faithfulness: A bad assumption
- Chickering, Meek
- 2003
Citation Context ...a structure G. Tests of conditional independence amount to choosing a threshold for the mutual information (conditional or not) between the pairs of variables involved. In the latter case, one study (Chickering & Meek, 2003) calls into question the reliability of BNPC. Many algorithms that conduct a causal search are quite similar. These algorithms propose a gradual construction of the structure returned. However, we n... |

4 | A primer on the evolution of equivalence classes of Bayesian network structures
- Muruzábal, Cotta
- 2004
Citation Context ...et al., 1994) for a scoring approach. In this field of research, evolutionary methods such as Genetic Algorithms – GAs (De Jong, 2006) have already been used in various forms (Larrañaga et al., 1996; Muruzábal & Cotta, 2004; Wong et al., 1999; Wong et al., 2002; Van Dijk et al., 2003b; Acid & De Campos, 2003). Among these works, two lines of research are interesting. The first idea is to effectively reduce the search sp... |

1 | Chickering, D.M. (2002a). Optimal structure identification with greedy search
- Chickering, Geiger, et al.
- 1994
Citation Context ...te DAGs. Finally, in the case of the automatic determination of the appropriate graphical structure of a BN, it was shown that the search space is huge (Robinson, 1976) and that it is an NP-hard problem (Chickering et al., 1994) for a scoring approach. In this field of research, evolutionary methods such as Genetic Algorithms – GAs (De Jong, 2006) have already been used in various forms (Larrañaga et al., 1996; Muruzábal & ... |

1 | Evolutionary Computation: A Unified Approach - De Jong - 2006 |

1 | Genetic algorithms: What fitness scaling is optimal? Cybernetics and Systems - No - 1993 |

1 | Exact inference in networks with discrete children of continuous parents - No - 2001 |
