## Controlled Generation of Hard and Easy Bayesian Networks: Impact on Maximal Clique Tree in Tree Clustering (2006)

Venue: | Artificial Intelligence |

Citations: | 8 - 7 self |

### BibTeX

@ARTICLE{Mengshoel06controlledgeneration,
    author = {Ole J. Mengshoel and David C. Wilkins and Dan Roth},
    title = {Controlled Generation of Hard and Easy Bayesian Networks: Impact on Maximal Clique Tree in Tree Clustering},
    journal = {Artificial Intelligence},
    year = {2006},
    volume = {170},
    pages = {2006}
}

### Abstract

This article presents and analyzes algorithms that systematically generate random Bayesian networks of varying difficulty levels, with respect to inference using tree clustering. The results are relevant to research on efficient Bayesian network inference, such as computing a most probable explanation or belief updating, since they allow controlled experimentation to determine the impact of improvements to inference algorithms. The results are also relevant to research on machine learning of Bayesian networks, since they support controlled generation of a large number of data sets at a given difficulty level. Our generation algorithms, called BPART and MPART, support controlled but random construction of bipartite and multipartite Bayesian networks. The Bayesian network parameters that we vary are the total number of nodes, the degree of connectivity, the ratio of the number of non-root nodes to the number of root nodes, the regularity of the underlying graph, and characteristics of the conditional probability tables. The main dependent parameter is the size of the maximal clique as generated by tree clustering. This article presents extensive empirical analysis using the Hugin tree clustering approach as well as theoretical analysis related to the random generation of Bayesian networks using BPART and MPART.
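The kind of experiment the abstract describes can be illustrated with a minimal Python sketch. It is not the published BPART procedure: the function names, the greedy balancing rule used for the "regular" case, and the min-degree elimination heuristic (swapped in here for Hugin's tree clustering) are all assumptions made for illustration. The sketch builds a random bipartite DAG from a given number of roots, leaves, and parents per leaf, moralizes it, and reports an upper bound on the maximal clique size.

```python
import itertools
import random


def random_bipartite_dag(num_roots, num_leaves, parents_per_leaf, regular, seed=None):
    """Return {leaf: [root parents]} for a random bipartite DAG (roots -> leaves).

    Hypothetical helper for illustration; not the authors' BPART algorithm.
    """
    if not 0 < parents_per_leaf <= num_roots:
        raise ValueError("parents_per_leaf must be in 1..num_roots")
    rng = random.Random(seed)
    roots = list(range(num_roots))
    child_count = {r: 0 for r in roots}
    parents = {}
    for i in range(num_leaves):
        leaf = num_roots + i
        if regular:
            # Assumed balancing rule: take the roots with the fewest children
            # so far (shuffle + stable sort gives a random tie-break), which
            # keeps root out-degrees within one of each other.
            rng.shuffle(roots)
            chosen = sorted(roots, key=child_count.__getitem__)[:parents_per_leaf]
        else:
            # Irregular case: parents drawn uniformly at random.
            chosen = rng.sample(roots, parents_per_leaf)
        for r in chosen:
            child_count[r] += 1
        parents[leaf] = sorted(chosen)
    return parents


def moral_graph(num_roots, parents):
    """Undirected moral graph: link each leaf to its parents and marry the parents."""
    adj = {v: set() for v in range(num_roots)}
    for leaf, ps in parents.items():
        adj[leaf] = set(ps)
        for p in ps:
            adj[p].add(leaf)
        for a, b in itertools.combinations(ps, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def max_clique_upper_bound(adj):
    """Greedy min-degree elimination; returns the largest clique (in nodes) it creates.

    This is a heuristic upper bound, not the clique tree Hugin would build.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    largest = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        nbrs = adj.pop(v)
        largest = max(largest, len(nbrs) + 1)
        # Eliminating v connects all of its remaining neighbours.
        for a, b in itertools.combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
    return largest


if __name__ == "__main__":
    # Vary the leaf/root ratio at a fixed number of roots and compare regularity.
    for num_leaves in (10, 20, 40):
        for regular in (True, False):
            dag = random_bipartite_dag(20, num_leaves, 3, regular, seed=0)
            bound = max_clique_upper_bound(moral_graph(20, dag))
            print(num_leaves, regular, bound)
```

Running the script sweeps the leaf-to-root ratio for regular and irregular graphs, which mirrors the independent and dependent parameters named in the abstract. The numbers it prints depend on the heuristic and the seed, so they should not be read as reproducing the article's results.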

### Citations

7047 |
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ... inference problems is an important research problem. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms [2,33,38,39,46,65], conditioning algorithms =-=[16,17,22,32,57,58,64]-=-, and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Treewidth was initially a theoretical concept relate... |

1283 |
Local computations with probabilities on graphical structures and their application to expert systems
- Lauritzen, Spiegelhalter
- 1988
Citation Context ...], developing efficient algorithms for these inference problems is an important research problem. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms =-=[2,33,38,39,46,65]-=-, conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Tree... |

1156 |
Information Theory, Inference, and Learning Algorithms
- MacKay
- 2005
Citation Context ...ignment of a corresponding CNF formula to construct hard instances for the MPE problem. These BPART networks are also similar in structure to application BNs from medicine [67] and information theory =-=[27, 28, 49]-=-, and our results should be relevant to these two areas of research. The other class of BNs, the MPART networks, is closely related to an approach of Kask and Dechter [41]. For these MPART networks we... |

904 |
An Introduction to Bayesian Networks
- Jensen
- 1996
Citation Context ... in multiply connected Bayesian networks [58]. Like other tree clustering algorithms, the Hugin algorithm employs two phases: a compilation (or clustering) phase and a propagation (or run-time) phase =-=[2, 37, 39, 46]-=-. During compilation, a Bayesian network is transformed into cliques organized in a clique tree. During propagation, evidence is propagated in the clique tree, leading to belief updating or belief rev... |

680 | A New Method for Solving Hard Satisfiability Problems
- Selman, Levesque, et al.
- 1992
Citation Context ...ference in BNs has to be experimental and rely on the use of BN instances. Similar experiments are needed and have indeed been performed for other problems, including the satisfiability problem (SAT) =-=[11,13,26,55,62,63]-=-. For SAT, it has been established empirically that there is a phase transition in the probability of satisfiability of an instance drawn from a certain distribution [55]. This phase transition phen... |

580 |
The computational complexity of probabilistic inference using Bayesian belief networks
- Cooper
- 1990
Citation Context ...f Bayesian networks using BPART and MPART. 1 Introduction Essentially all inference problems studied using the Bayesian network (BN) formalism are known to be computationally hard in the general case =-=[14, 60, 66]-=-. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis [9,45], language understanding [10,12], i... |

577 | Where the really hard problems are
- Cheeseman, Kanefsky, et al.
- 1991
Citation Context ...ference in BNs has to be experimental and rely on the use of BN instances. Similar experiments are needed and have indeed been performed for other problems, including the satisfiability problem (SAT) =-=[11,13,26,55,62,63]-=-. For SAT, it has been established empirically that there is a phase transition in the probability of satisfiability of an instance drawn from a certain distribution [55]. This phase transition phen... |

422 |
Low-density parity check codes
- Gallager
Citation Context ... size sample means were from 3.0 to 5.4 times greater for Class A BNs compared to Class B BNs. These results also shed new light on the computational benefit of irregularity in information theory BNs =-=[27, 28, 48]-=-. Our studies with MPART used quite different input parameters for BN sample generation. Still, the regression results bear resemblance to BPART’s regression results. There turned out to be approximat... |

389 | Network-based heuristics for constraint-satisfaction problems - Dechter, Pearl - 1987 |

364 | Graph minors. II. Algorithmic aspects of tree-width - Robertson, Seymour - 1986 |

361 | Noise strategies for improving local search
- Selman, Kautz, et al.
- 1994
Citation Context ...ference in BNs has to be experimental and rely on the use of BN instances. Similar experiments are needed and have indeed been performed for other problems, including the satisfiability problem (SAT) =-=[11,13,26,55,62,63]-=-. For SAT, it has been established empirically that there is a phase transition in the probability of satisfiability of an instance drawn from a certain distribution [55]. This phase transition phen... |

324 |
Complexity of finding embeddings in a k-tree
- Arnborg, Corneil, et al.
- 1987
Citation Context ...eewidth ϖ* for a graph is in itself computationally hard. In particular, the problem of determining whether the treewidth of a given graph is bounded by an integer k has been shown to be NP-complete =-=[6]-=-. However, it is possible to empirically establish lower bounds for treewidth as well as upper bounds, computed using heuristics, for treewidth in polynomial time [44]. Optimal triangulation is closel... |

287 |
Bayesian Updating in Causal Probabilistic Networks by Local Computations
- Jensen, Lauritzen, et al.
- 1990
Citation Context ...], developing efficient algorithms for these inference problems is an important research problem. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms =-=[2,33,38,39,46,65]-=-, conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Tree... |

283 |
Hard and easy distributions of SAT problems
- Mitchell, Selman, et al.
- 1992 |

270 | Bucket elimination: A unifying framework for reasoning
- Dechter
- 1999
Citation Context ...m. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms [2,33,38,39,46,65], conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms =-=[19, 47, 72]-=- – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Treewidth was initially a theoretical concept related to graph minors [59]; it has more recen... |

256 |
Fast Probabilistic Algorithms for Hamiltonian Circuits and Matchings
- Angluin, Valiant
- 1979
Citation Context ...rating problem instances randomly, a common practice in the BN community [7, 17, 34, 35, 41, 56, 69, 70], may result in easy inference problems that do not present a challenge to inference algorithms =-=[4,11,25]-=-, even though worst case complexity results show that both exact and approximate MPE computation is NP-hard [1, 66]. In this article we extend previous research on randomly generating BN instances and... |

256 |
Graphical Models for Machine Learning and Digital Communication
- Frey
- 1998
Citation Context ...reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis [9,45], language understanding [10,12], intelligent data analysis [40,54,61], error correction coding =-=[27,28,48, 49]-=-, and biological pedigree analysis [68], developing efficient algorithms for these inference problems is an important research problem. The performance of exact Bayesian network inference algorithms –... |

216 | On the Hardness of approximate reasoning
- Roth
- 1996
Citation Context ...f Bayesian networks using BPART and MPART. 1 Introduction Essentially all inference problems studied using the Bayesian network (BN) formalism are known to be computationally hard in the general case =-=[14, 60, 66]-=-. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis [9,45], language understanding [10,12], i... |

206 |
Efficient algorithms for combinatorial problems on graphs with bounded decomposability – a survey
- Arnborg
- 1985
Citation Context ...[2,33,38,39,46,65], conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree =-=[5,17,20,21]-=-. Treewidth was initially a theoretical concept related to graph minors [59]; it has more recently been established that the notion of treewidth plays a key role in the analysis of algorithms [8,20,44... |

190 |
Some theorems on abstract graphs
- Dirac
- 1952
Citation Context ...guaranteed. There are cases where one might not be able to show that a Hamiltonian cycle must exist, but one can compute the longest cycle c(γ′) and apply the following theorem due to Dirac =-=[23]-=-. Theorem 44 (Longest cycle, Dirac) Let c(G) be the length of the longest cycle of an undirected graph G. If G is 2-connected then c(G) ≥ min(n(G), 2δ(G)). The longest cycle is determined by the f... |

179 |
Nonserial Dynamic Programming
- Bertelè, Brioschi
- 1972
Citation Context ...7,20,21]. Treewidth was initially a theoretical concept related to graph minors [59]; it has more recently been established that the notion of treewidth plays a key role in the analysis of algorithms =-=[8,20,44]-=-. A significant component of research on inference in BNs has to be experimental and rely on the use of BN instances. Similar experiments are needed and have indeed been performed for other problems, ... |

170 | Improved low-density parity-check codes using irregular graphs and belief propagation
- Luby, Shokrollahi, et al.
Citation Context ...reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis [9,45], language understanding [10,12], intelligent data analysis [40,54,61], error correction coding =-=[27,28,48, 49]-=-, and biological pedigree analysis [68], developing efficient algorithms for these inference problems is an important research problem. The performance of exact Bayesian network inference algorithms –... |

155 | Exploiting causal independence in Bayesian network inference
- Zhang, Poole
- 1996
Citation Context ...m. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms [2,33,38,39,46,65], conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms =-=[19, 47, 72]-=- – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Treewidth was initially a theoretical concept related to graph minors [59]; it has more recen... |

151 |
HUGIN: A Shell for Building Bayesian Belief Universes for Expert Systems
- Andersen, Olesen, et al.
- 1989
Citation Context ...], developing efficient algorithms for these inference problems is an important research problem. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms =-=[2,33,38,39,46,65]-=-, conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Tree... |

149 | Inference in belief networks: A procedural guide
- Huang, Darwiche
- 1996 |

141 |
Recursive conditioning
- Darwiche
- 2001
Citation Context ... inference problems is an important research problem. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms [2,33,38,39,46,65], conditioning algorithms =-=[16,17,22,32,57,58,64]-=-, and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Treewidth was initially a theoretical concept relate... |

133 |
An algebra of Bayesian belief universes for knowledge-based systems
- Jensen, Olesen, et al.
- 1990 |

115 | Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base
- Middleton, Shwe, et al.
- 1991
Citation Context ...N) formalism are known to be computationally hard in the general case [14, 60, 66]. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis =-=[3, 43, 67]-=-, probabilistic risk analysis [9,45], language understanding [10,12], intelligent data analysis [40,54,61], error correction coding [27,28,48, 49], and biological pedigree analysis [68], developing ef... |

92 |
Bounded Conditioning: Flexible Inference for Decision under Scarce Resources
- Horvitz, Suermondt, et al.
- 1989
Citation Context ... inference problems is an important research problem. The performance of exact Bayesian network inference algorithms – including tree clustering algorithms [2,33,38,39,46,65], conditioning algorithms =-=[16,17,22,32,57,58,64]-=-, and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree [5,17,20,21]. Treewidth was initially a theoretical concept relate... |

91 |
Finding MAPs for belief networks is NP-hard
- Shimony
- 1994
Citation Context ...f Bayesian networks using BPART and MPART. 1 Introduction Essentially all inference problems studied using the Bayesian network (BN) formalism are known to be computationally hard in the general case =-=[14, 60, 66]-=-. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis [9,45], language understanding [10,12], i... |

87 | Applications of a general propagation algorithm for probabilistic expert systems - Dawid - 1992 |

86 |
Munin—a causal probabilistic network for interpretation of electromyographic findings
- Andreassen, Woldbye, et al.
- 1987
Citation Context ...N) formalism are known to be computationally hard in the general case [14, 60, 66]. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis =-=[3, 43, 67]-=-, probabilistic risk analysis [9,45], language understanding [10,12], intelligent data analysis [40,54,61], error correction coding [27,28,48, 49], and biological pedigree analysis [68], developing ef... |

77 | Approximation algorithms for the loop cutset problem
- Becker, Geiger
- 1994
Citation Context ... the C/V -ratio and treewidth or maximal clique size? Answering these research questions is important for several reasons. Generating problem instances randomly, a common practice in the BN community =-=[7, 17, 34, 35, 41, 56, 69, 70]-=-, may result in easy inference problems that do not present a challenge to inference algorithms [4,11,25], even though worst case complexity results show that both exact and approximate MPE computatio... |

66 | Inductive and Bayesian learning in medical diagnosis
- Kononenko
- 1993
Citation Context ...N) formalism are known to be computationally hard in the general case [14, 60, 66]. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis =-=[3, 43, 67]-=-, probabilistic risk analysis [9,45], language understanding [10,12], intelligent data analysis [40,54,61], error correction coding [27,28,48, 49], and biological pedigree analysis [68], developing ef... |

65 | Where gravity fails: Local search topology
- Frank, Cheeseman, et al.
- 1997 |

64 | A compendium of NP optimization problems - Crescenzi, Kann, et al. |

55 | Topological parameters for time-space tradeoff
- Dechter, El Fattah
- 2001
Citation Context ...[2,33,38,39,46,65], conditioning algorithms [16,17,22,32,57,58,64], and elimination algorithms [19, 47, 72] – depends on the treewidth or the optimal maximal clique size of a BN’s induced clique tree =-=[5,17,20,21]-=-. Treewidth was initially a theoretical concept related to graph minors [59]; it has more recently been established that the notion of treewidth plays a key role in the analysis of algorithms [8,20,44... |

53 |
A valuation-based language for expert systems
- Shenoy
- 1989 |

45 | Global conditioning for probabilistic inference in belief networks
- Shachter, Szolovits
- 1994 |

44 | Local search and the number of solutions
- Clark, Frank, et al.
- 1996 |

43 |
Probabilistic analysis of the Davis Putnam procedure for solving the satisfiability problem
- Franco, Paull
- 1983
Citation Context ...rating problem instances randomly, a common practice in the BN community [7, 17, 34, 35, 41, 56, 69, 70], may result in easy inference problems that do not present a challenge to inference algorithms =-=[4,11,25]-=-, even though worst case complexity results show that both exact and approximate MPE computation is NP-hard [1, 66]. In this article we extend previous research on randomly generating BN instances and... |

43 | Treewidth: Computational Experiments
- Koster, Bodlaender, et al.
- 2001
Citation Context ...7,20,21]. Treewidth was initially a theoretical concept related to graph minors [59]; it has more recently been established that the notion of treewidth plays a key role in the analysis of algorithms =-=[8,20,44]-=-. A significant component of research on inference in BNs has to be experimental and rely on the use of BN instances. Similar experiments are needed and have indeed been performed for other problems, ... |

39 |
Search-based methods to bound diagnostic probabilities in very large belief nets
- Henrion
- 1991
Citation Context ...connection to an optimization problem. Concerning (ii), we argue that the bipartite topology is interesting in its own right and inference algorithms have been specifically designed for bipartite BNs =-=[31]-=-. Important classes of application BNs, including medical diagnosis BNs such as the QMR-DT BN [67] and information theory BNs [27, 28], are essentially bipartite. Bipartite medical BNs typically ... |

39 |
A Constraint Propagation Approach to Probabilistic Reasoning
- Pearl
- 1986 |

35 |
Probabilistic models of language processing and acquisition. Trends in Cognitive Sciences
- Chater, Manning
- 2006
Citation Context ...4, 60, 66]. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis [9,45], language understanding =-=[10,12]-=-, intelligent data analysis [40,54,61], error correction coding [27,28,48, 49], and biological pedigree analysis [68], developing efficient algorithms for these inference problems is an important rese... |

34 | Stochastic local search for Bayesian networks
- Kask, Dechter
- 1999
Citation Context ... the C/V -ratio and treewidth or maximal clique size? Answering these research questions is important for several reasons. Generating problem instances randomly, a common practice in the BN community =-=[7, 17, 34, 35, 41, 56, 69, 70]-=-, may result in easy inference problems that do not present a challenge to inference algorithms [4,11,25], even though worst case complexity results show that both exact and approximate MPE computatio... |

32 | Improving the analysis of dependable systems by mapping fault trees into Bayesian networks. Reliability Engineering and System Safety
- Bobbio, Portinale, et al.
- 2001
Citation Context ...lly hard in the general case [14, 60, 66]. Given the central role of BNs in a wide range of automated reasoning applications, for example in medical diagnosis [3, 43, 67], probabilistic risk analysis =-=[9,45]-=-, language understanding [10,12], intelligent data analysis [40,54,61], error correction coding [27,28,48, 49], and biological pedigree analysis [68], developing efficient algorithms for these inferen... |

30 |
Probabilistic inference in multiply connected belief networks using loop cutsets, Int
- Suermondt, Cooper
- 1990
Citation Context ... the C/V -ratio and treewidth or maximal clique size? Answering these research questions is important for several reasons. Generating problem instances randomly, a common practice in the BN community =-=[7, 17, 34, 35, 41, 56, 69, 70]-=-, may result in easy inference problems that do not present a challenge to inference algorithms [4,11,25], even though worst case complexity results show that both exact and approximate MPE computatio... |

27 |
Approximating MAPs for belief networks is NP-hard and other theorems
- Abdelbar, Hedetniemi
- 1998
Citation Context ...in easy inference problems that do not present a challenge to inference algorithms [4,11,25], even though worst case complexity results show that both exact and approximate MPE computation is NP-hard =-=[1, 66]-=-. In this article we extend previous research on randomly generating BN instances and present an experimental paradigm for systematically generating increasingly hard random Bayesian network instances... |

27 | Local conditioning in Bayesian networks
- Diez
- 1996 |