## A New Hybrid Method for Bayesian Network Learning With Dependency Constraints

Citations: 2 - 1 self

### BibTeX

@MISC{Schulte_anew,
    author = {Oliver Schulte and Gustavo Frigo and Russell Greiner and Wei Luo and Hassan Khosravi},
    title = {A New Hybrid Method for Bayesian Network Learning With Dependency Constraints},
    year = {}
}

### Abstract

A Bayes net has qualitative and quantitative aspects: the qualitative aspect is its graphical structure, which corresponds to correlations among the variables in the Bayes net; the quantitative aspects are the net parameters. This paper develops a hybrid criterion for learning Bayes net structures that is based on both aspects. We combine model selection criteria measuring data fit with correlation information from statistical tests: given a sample d, search for a structure G that maximizes score(G, d) over the set of structures G that satisfy the dependencies detected in d. We rely on the statistical test only to accept conditional dependencies, not conditional independencies. We show how to adapt local search algorithms to accommodate the observed dependencies. Simulation studies with GES search and the BDeu/BIC scores provide evidence that the additional dependency information leads to Bayes nets that better fit the target model in distribution and structure.
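The criterion in the abstract (maximize score(G, d) subject to G satisfying the detected dependencies) can be illustrated with a toy sketch. This is not the authors' IGES procedure: it assumes binary variables, replaces local search with exhaustive enumeration of DAGs, uses BIC as the score, and tests only marginal (unconditional) dependencies with a G-test at the 5% level (critical value 3.841 for one degree of freedom). All function names are hypothetical. It relies on the fact that two nodes are d-connected given the empty set exactly when they share an ancestor, counting each node as its own ancestor.

```python
import itertools
import math
from collections import Counter

def is_acyclic(nodes, edges):
    # Repeatedly peel off nodes with no incoming edge from the remaining nodes.
    remaining = set(nodes)
    while remaining:
        sources = {v for v in remaining
                   if not any(c == v and p in remaining for p, c in edges)}
        if not sources:
            return False
        remaining -= sources
    return True

def all_dags(nodes):
    # Each unordered pair is absent (0), oriented a->b (1), or b->a (2).
    pairs = list(itertools.combinations(nodes, 2))
    for choice in itertools.product((0, 1, 2), repeat=len(pairs)):
        edges = frozenset((a, b) if c == 1 else (b, a)
                          for (a, b), c in zip(pairs, choice) if c)
        if is_acyclic(nodes, edges):
            yield edges

def ancestors_or_self(v, edges):
    seen, stack = {v}, [v]
    while stack:
        x = stack.pop()
        for p, c in edges:
            if c == x and p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def marginally_d_connected(x, y, edges):
    # With an empty conditioning set, an active path contains no collider,
    # so x and y are d-connected iff they have a common ancestor-or-self.
    return bool(ancestors_or_self(x, edges) & ancestors_or_self(y, edges))

def bic(rows, nodes, edges):
    # Assumption: every variable is binary, so each parent configuration
    # contributes one free parameter.
    n, total = len(rows), 0.0
    for v in nodes:
        parents = sorted(p for p, c in edges if c == v)
        joint = Counter((tuple(r[p] for p in parents), r[v]) for r in rows)
        marg = Counter(tuple(r[p] for p in parents) for r in rows)
        loglik = sum(k * math.log(k / marg[pa]) for (pa, _), k in joint.items())
        total += loglik - 0.5 * math.log(n) * 2 ** len(parents)
    return total

def g_test_dependent(rows, x, y, critical=3.841):
    # G-test of marginal independence for two binary variables (1 d.o.f.).
    n = len(rows)
    xy = Counter((r[x], r[y]) for r in rows)
    cx = Counter(r[x] for r in rows)
    cy = Counter(r[y] for r in rows)
    g = 2.0 * sum(k * math.log(k * n / (cx[a] * cy[b]))
                  for (a, b), k in xy.items())
    return g > critical

def constrained_best(rows, nodes):
    # Accept only detected dependencies; a failure to reject is never
    # turned into an independence constraint.
    deps = [(x, y) for x, y in itertools.combinations(nodes, 2)
            if g_test_dependent(rows, x, y)]
    feasible = [g for g in all_dags(nodes)
                if all(marginally_d_connected(x, y, g) for x, y in deps)]
    return max(feasible, key=lambda g: bic(rows, nodes, g)), deps

if __name__ == "__main__":
    counts = [(0, 0, 0, 10), (0, 0, 1, 10), (1, 1, 0, 10), (1, 1, 1, 10),
              (0, 1, 0, 1), (0, 1, 1, 1), (1, 0, 0, 1), (1, 0, 1, 1)]
    data = [{"A": a, "B": b, "C": c} for a, b, c, k in counts for _ in range(k)]
    best, deps = constrained_best(data, ["A", "B", "C"])
    print(sorted(best), deps)
```

The feasible set is never empty, because a complete DAG d-connects every pair. Exhaustive enumeration is only workable for a handful of variables, which is why the paper adapts local search instead.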

### Citations

7052 | Probabilistic reasoning in intelligent systems: networks of plausible inference - Pearl - 1988
Citation Context: ...arch and the BDeu/BIC scores provide evidence that the additional dependency information leads to Bayes nets that better fit the target model in distribution and structure. I. INTRODUCTION Bayes nets [1] are a widely used formalism for representing and reasoning with uncertain knowledge, with many applications ranging from medical diagnosis to scientific discovery. A Bayes net (BN) model is a directe...

2852 | Controlling the false discovery rate: a practical and powerful approach to multiple testing - Benjamini, Hochberg - 1995
Citation Context: ... testing arise. Our system architecture is modular, so any multiple hypothesis testing method can be employed to implement the functionality of find-new-dependencies, such as the methods described in [21], [22]. Many constraint-based and hybrid systems simply carry out multiple hypotheses at the same fixed significance level [2], [6], [12]. Our simulations follow this approach, to facilitate comparison...

1240 | On information and sufficiency - Kullback, Leibler - 1951
Citation Context: ...fore convergence. As in other Bayes net learning studies (e.g., [6], [18]), the distributional criterion considered is the Kullback-Leibler (KL) divergence of the fitted model to the true distribution [23]. Given a target distribution f that generates the training sample, and a DAG G inferred from the sample, let $\hat{f}_G$ be the fitted distribution (with MLE estimation of parameters [4]). Then the KL diver...

903 | Learning Bayesian networks: The combination of knowledge and statistical data - Heckerman, Geiger, et al. - 1995

854 | A tutorial on learning with Bayesian networks - Heckerman - 1995
Citation Context: ...earning BN structure. Constraint-based (CB) methods employ a statistical test to detect conditional (in)dependencies given a sample d, and then compute a BN G that fits the (in)dependencies [2], [3], [4]. By contrast, score-based methods search for models that maximize a model selection score [3], [4]. Recent research into hybrid methods aims to combine the strengths of both approaches [5], [6]. Add...

245 | Learning Bayesian networks - Neapolitan - 2004
Citation Context: ... to learning BN structure. Constraint-based (CB) methods employ a statistical test to detect conditional (in)dependencies given a sample d, and then compute a BN G that fits the (in)dependencies [2], [3], [4]. By contrast, score-based methods search for models that maximize a model selection score [3], [4]. Recent research into hybrid methods aims to combine the strengths of both approaches [5], [6]...

180 | Learning of Bayesian network structure from massive datasets: The “sparse candidate” algorithm - Friedman, Nachman, et al. - 1999
Citation Context: ...endence test (type II errors). While the single edge deletion strategy has to address type II error, the issue for our system is type I error. Other previous hybrid BN learning algorithms (e.g., [6], [17]) consider statistical measures (e.g., mutual information), but do not incorporate the outcome of a statistical test as a constraint that the learned model must satisfy. Our algorithm can be seen as a...

159 | Optimal structure identification with greedy search - Hemmecke, Chickering
Citation Context: ...e edges from the BN structure, maintaining the Markov boundary condition, until a local score optimum is reached. For experimental evaluation, we adapted the state-of-the-art GES search procedure [9], [10] for constrained optimization; we refer to the resulting procedure as IGES (for “Imap + GES”). We report a number of simulations comparing GES search with and without dependency constraints, based on ...

158 | Adaptive probabilistic networks with hidden variables - Binder, Koller, et al. - 1997
Citation Context: ...s with Insurance and Alarm Networks: We followed the same simulation protocol for generating samples and testing learning methods on two well-known real world BNs: Alarm [26] (37 nodes) and Insurance [27] (25 nodes) networks. We found that for larger graphs, the significance level should be adjusted downward to maintain a suitable false discovery rate for the testing strategy. A static approach is to ...

92 | Learning Bayesian networks from data: an information-theory based approach - Tan, Cheng, et al.
Citation Context: ... to both constraint-based and score-based methods. b) Related Work. CB and Hybrid Methods: There are many constraint-based algorithms that employ statistical tests to discover BN structure [2], [12], [13]. Many of these methods use the “single link deletion” strategy [14]: if a significance test does not reject an independence null hypothesis X is independent of Y given S, then infer a conditional ind...

75 | The max-min hill-climbing Bayesian network structure learning algorithm - Tsamardinos, Brown, et al. - 2006
Citation Context: ...[2], [3], [4]. By contrast, score-based methods search for models that maximize a model selection score [3], [4]. Recent research into hybrid methods aims to combine the strengths of both approaches [5], [6]. Additional motivation for the hybrid approach comes from cognitive science and observations of human intelligence: Psychological studies have shown that people infer causal models on the basis ...

44 | Graphical models, selecting causal and statistical models - Meek - 1997
Citation Context: ...remove edges from the BN structure, maintaining the Markov boundary condition, until a local score optimum is reached. For experimental evaluation, we adapted the state-of-the-art GES search procedure [9], [10] for constrained optimization; we refer to the resulting procedure as IGES (for “Imap + GES”). We report a number of simulations comparing GES search with and without dependency constraints, bas...

28 | Critical remarks on single link search in learning belief networks - Xiang, Wong, et al. - 1996
Citation Context: ...CB and Hybrid Methods: There are many constraint-based algorithms that employ statistical tests to discover BN structure [2], [12], [13]. Many of these methods use the “single link deletion” strategy [14]: if a significance test does not reject an independence null hypothesis X is independent of Y given S, then infer a conditional independence and mark variables X and Y as nonadjacent. As we do not in...

25 | The ALARM monitoring system - Beinlich, Suermondt, et al. - 1989
Citation Context: ...arned structure. 2) Simulations with Insurance and Alarm Networks: We followed the same simulation protocol for generating samples and testing learning methods on two well-known real world BNs: Alarm [26] (37 nodes) and Insurance [27] (25 nodes) networks. We found that for larger graphs, the significance level should be adjusted downward to maintain a suitable false discovery rate for the testing stra...

17 | A scoring function for learning Bayesian networks based on mutual information and conditional independence tests. JMLR - Campos - 2006
Citation Context: ...[3], [4]. By contrast, score-based methods search for models that maximize a model selection score [3], [4]. Recent research into hybrid methods aims to combine the strengths of both approaches [5], [6]. Additional motivation for the hybrid approach comes from cognitive science and observations of human intelligence: Psychological studies have shown that people infer causal models on the basis of ob...

15 | Determining the number of non-spurious arcs in a learned DAG model: Investigation of a Bayesian and a frequentist approach - Listgarten, Heckerman
Citation Context: ...formation for the true dependencies detected by the statistical test but missed by the score-based search without testing. a multiple hypothesis testing method is the false discovery rate (FDR) [21], [25], which is defined as #rejected true independence hypotheses/#tested independence hypotheses. Figure 5 shows that in our simulations, with the significance level fixed at α = 5%, the FDR in random gra...

14 | A Theory of Causal Learning - Gopnik, Glymour, et al. - 2004

13 | Robust Independence Testing for Constraint-Based Learning of Causal Structure - Dash, Druzdzel - 2003
Citation Context: ...l tests follows this recommendation and is more conservative than the use of tests in previous CB algorithms. For more discussion of independence tests in CB algorithms, see [3, p.593], [2, Sec.5.6], [16]. A recent hybrid method (max-min hill climbing) that incorporates the single link deletion strategy is presented in [5]. While this work indicates that independence constraints from a statistical tes...

11 | Model selection criteria for learning belief nets - Allen, Greiner - 2000
Citation Context: ...core-based search. c) Related Work. Score+Search Methods: Several previous studies have observed the tendency of many score-based methods towards graphs that are sparser than the target structure [6], [18], [10]. The following simple experiment illustrates how standard model-selection scores can fail to capture statistically significant associations on small-to-medium sample sizes. It is meant only to e...

2 | Understanding the effects of search constraints on structure learning - Hay, Fast, et al.
Citation Context: ...frequencies of events. A natural approach to a hybrid system is to treat the information from statistical tests as a constraint on the model selection search that effectively reduces the search space [8]. In this paper we propose a new hybrid criterion: find a Bayes net that maximizes the score given the constraint that the net must satisfy the dependencies detected by a suitable statistical test. We...

2 | induction via local neighbor - net - 2000
Citation Context: ...proach to both constraint-based and score-based methods. b) Related Work. CB and Hybrid Methods: There are many constraint-based algorithms that employ statistical tests to discover BN structure [2], [12], [13]. Many of these methods use the “single link deletion” strategy [14]: if a significance test does not reject an independence null hypothesis X is independent of Y given S, then infer a condition...

2 | A SINful approach to Bayesian graphical model selection - Drton - 2008
Citation Context: ...over these and plotting the number of independence tests as a function of number of candidate graphs examined during the search, we find that the number of tests performed is about 6 times the models [28]. Fig. 6. Comparing GES/BDeu (left) and IGES/BDeu (right) on the Insurance network structure. For each sample size of 500, 1000, and 5000, we generated 14 random samples and compared the outputs of GE...

1 | “Statistical prudence and statistical inference,” in The Significance Test Controversy - Hogben - 1970
Citation Context: ...ilure to reject is a less reliable indicator that the null hypothesis is true. Many statisticians recommend against inferring the truth of the null hypothesis when the null hypothesis is not rejected [15]; our use of statistical tests follows this recommendation and is more conservative than the use of tests in previous CB algorithms. For more discussion of independence tests in CB algorithms, see [3,...

1 | “Occam’s hammer,” in Learning theory - Blanchard, Fleuret - 2007
Citation Context: ...ng arise. Our system architecture is modular, so any multiple hypothesis testing method can be employed to implement the functionality of find-new-dependencies, such as the methods described in [21], [22]. Many constraint-based and hybrid systems simply carry out multiple hypotheses at the same fixed significance level [2], [6], [12]. Our simulations follow this approach, to facilitate comparisons with...