## Chain Graphs for Learning (1995)

Venue: Uncertainty in Artificial Intelligence

Citations: 28 (1 self)

### BibTeX

@INPROCEEDINGS{Buntine95chaingraphs,

author = {Wray Buntine},

title = {Chain Graphs for Learning},

booktitle = {Uncertainty in Artificial Intelligence},

year = {1995},

pages = {46--54},

publisher = {Morgan Kaufmann}

}

### Abstract

Chain graphs combine directed and undirected graphs and their underlying mathematics combines properties of the two. This paper gives a simplified definition of chain graphs based on a hierarchical combination of Bayesian (directed) and Markov (undirected) networks. Examples of a chain graph are multivariate feed-forward networks, clustering with conditional interaction between variables, and forms of Bayes classifiers. Chain graphs are then extended using the notation of plates so that samples and data analysis problems can be represented in a graphical model as well. Implications for learning are discussed in the conclusion.

1 Introduction

Probabilistic networks are a notational device that allow one to abstract forms of probabilistic reasoning without getting lost in the mathematical detail of the underlying equations. They offer a framework whereby many forms of probabilistic reasoning can be combined and performed on probabilistic models without careful hand programming. Efforts ...
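As an illustration of the abstract's description of chain graphs as a mix of directed and undirected edges, the following minimal Python sketch checks the standard structural condition for a chain graph: nodes joined by undirected edges form chain components, no directed edge may lie inside a component, and the directed edges between components must be acyclic. The function names `chain_components` and `is_chain_graph`, and the toy node names, are hypothetical and do not come from the paper; the sketch uses the usual textbook characterization, not the paper's hierarchical definition.

```python
from collections import defaultdict


def chain_components(nodes, undirected):
    """Map each node to a representative of its undirected connected component."""
    adj = defaultdict(set)
    for a, b in undirected:
        adj[a].add(b)
        adj[b].add(a)
    comp = {}
    for start in nodes:
        if start in comp:
            continue
        comp[start] = start
        stack = [start]
        while stack:  # depth-first search over undirected edges only
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp[w] = start
                    stack.append(w)
    return comp


def is_chain_graph(nodes, directed, undirected):
    """Return True if the mixed graph has no partially directed cycle."""
    comp = chain_components(nodes, undirected)
    succ = defaultdict(set)
    for a, b in directed:
        if comp[a] == comp[b]:
            return False  # a directed edge inside a chain component
        succ[comp[a]].add(comp[b])

    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every component starts WHITE (unvisited)

    def has_cycle(c):
        color[c] = GREY
        for d in succ[c]:
            if color[d] == GREY or (color[d] == WHITE and has_cycle(d)):
                return True
        color[c] = BLACK
        return False

    return not any(color[c] == WHITE and has_cycle(c) for c in set(comp.values()))


# Toy example (assumed): two inputs feeding two outputs that interact undirectedly.
nodes = ["x1", "x2", "y1", "y2"]
directed = [("x1", "y1"), ("x2", "y2")]
undirected = [("y1", "y2")]
ok = is_chain_graph(nodes, directed, undirected)
```

In the toy example the components are `{x1}`, `{x2}` and `{y1, y2}`, and all directed edges point from the input components into the output component. Adding a directed edge from `y2` back to `x1` would create a cycle among components and make the check fail.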

### Citations

7052 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference - Pearl - 1988 |

Citation Context: ...interpretation theorem, Theorem 2, are the major technical contribution of this paper. First, Sections 2 and 3 review basic results on directed and undirected networks, as for instance introduced in (Pearl, 1988; Whittaker, 1990). Necessary independence properties and functional representations of these networks, as needed for chain graphs, are summarized. Then the notion of conditional networks are formaliz...

3921 | Pattern Classification and Scene Analysis - Duda, Hart - 1973 |

1774 | Introduction to the Theory of Neural Computation - Hertz, Krogh, Palmer - 1991 |

1284 | Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems (with Discussion) - Lauritzen, Spiegelhalter - 1988 |

376 | Evaluating influence diagrams - Shachter - 1986 |

Citation Context: ...ssiveness of chain graphs is illustrated in Section 6 where a number of models are represented. Decision theoretic constructs could also be used to represent the decisions and utilities of a problem (Shachter, 1986), although this is not done here. In this paper, I define a chain graph as a hierarchical combination of directed (Bayesian) and undirected (Markov) networks. This definition extends the notion of bl...

305 | Planning and control - Dean, Wellman - 1991 |

247 | Operations for learning with graphical models - Buntine - 1994 |

Citation Context: ...first suggested by Lauritzen and Spiegelhalter (Lauritzen & Spiegelhalter, 1988), and has subsequently been developed by several groups (Gilks, Thomas, & Spiegelhalter, 1993; Dawid & Lauritzen, 1993; Buntine, 1994; Shachter, Eddy, & Hasselblad, 1990). Whereas, an introduction to learning of Bayesian networks can be found in (Heckerman, 1995). This paper uses chain graphs (Lauritzen & Wermuth, 1989) as a genera...

224 | Bayesian image restoration, with two applications in spatial statistics (with discussion) - Besag, York, et al. - 1991 |

217 | Equivalence and synthesis of causal models - Verma, Pearl - 1990 |

216 | The EM algorithm for graphical association models with missing data, Computational Statistics and Data Analysis - Lauritzen - 1995 |

191 | Bayesian analysis in expert systems - Spiegelhalter, Dawid, et al. - 1993 |

186 | Conditional independence in statistical theory - Dawid - 1979 |

181 | Connectionist learning of belief networks - Neal - 1992 |

Citation Context: ...tic neural networks Stochastic networks form the basis of the stochastic Boltzmann machine, and the Hopfield network (Hertz, Krogh, & Palmer, 1991), which both have relationships to graphical models (Neal, 1992). A stochastic network corresponds to an undirected network with hidden variables, except interactions involve quadratic terms at most. A simple configuration is given in Figure 6. On the left is a r...

164 | Graphical models for associations between variables, some of which are qualitative and some quantitative, Annals of Statistics - Lauritzen, Wermuth - 1989 |

Citation Context: ...random fields and various Markov models. Lauritzen and Wermuth demonstrated that chain graphs are a powerful tool for modeling statistical analysis, research hypotheses, and hence learning (Wermuth & Lauritzen, 1989). Chain graphs when augmented with deterministic nodes can represent many well known models as a special case including generalized linear models, various forms of clustering, feed-forward neural net...

140 | Independence properties of directed Markov fields - Lauritzen, Dawid, et al. - 1990 |

120 | Hyper Markov laws in the statistical analysis of decomposable models - Dawid, Lauritzen - 1993 |

111 | A language and program for complex Bayesian modelling. The Statistician - Gilks, Thomas, et al. - 1994 |

105 | The chain graph Markov property - Frydenberg - 1990 |

Citation Context: ...of directed (Bayesian) and undirected (Markov) networks. This definition extends the notion of block recursive models used in (Wermuth & Lauritzen, 1989; Højsgaard & Thiesson, 1995) and analyzed in (Frydenberg, 1990, Theorem 4.1) by allowing blocks to include directed networks as well as undirected networks. This definition allows the complex independence properties and functional form of a chain graph (Frydenbe...

58 | Random Fields and Inverse Problems in Imaging - Geman |

Citation Context: ...C(C); (2) for some functions f_C > 0. The proof follows directly from (Frydenberg, 1990; Buntine, 1994). A form of this theorem for finite discrete domains is called the Hammersley-Clifford Theorem (Geman, 1990; Besag, York, & Mollie, 1991). Again, Equation (2) is used as the interpretation of an undirected network. 4 Conditional networks Networks can also represent conditional probability distributions. Co...

57 | Real-world applications of Bayesian networks - Heckerman, Mamdani, et al. - 1995 |

52 | On substantive research hypotheses, conditional independence graphs and graphical chain models - Wermuth, Lauritzen - 1990 |

Citation Context: ...itzen, 1993; Buntine, 1994; Shachter, Eddy, & Hasselblad, 1990). Whereas, an introduction to learning of Bayesian networks can be found in (Heckerman, 1995). This paper uses chain graphs (Lauritzen & Wermuth, 1989) as a general probabilistic network model. Chain graphs mix undirected and directed graphs (or networks) to give a probabilistic representation that includes Markov random fields and various Markov m...

30 | On the Markov equivalence of chain graphs, undirected graphs and acyclic digraphs - Andersson, Madigan, et al. - 1997 |

29 | An entropy-based learning algorithm of Bayesian conditional trees - Geiger - 1992 |

Citation Context: ...supervised learning systems. Bayesian networks offer a rich representation for designing many different kinds of Bayesian classifiers, for instance illustrated with the Bayesian conditional trees of (Geiger, 1992). Chain graphs offer a richer family again of Bayesian classifiers, and a nice framework for their elicitation. During elicitation we can interpret the directed arcs in the "true" chain graph as be...

23 | An Ordered Examination of Influence Diagrams - Shachter - 1990 |

Citation Context: ...[Figure 4: A feed-forward network and its chain graph; panels (a) and (b)] ...influence diagrams is considered by (Shachter, 1990). The network outputs m1 and m2 represent the mean of a bivariate Gaussian. To analyze these nodes, we need to extend the usual definition of a parent and a child for a graph. Only one case is give...

20 | Thinking Backwards for Knowledge Acquisition - Shachter, Heckerman - 1987 |

14 | BIFROST --- Block recursive models Induced From Relevant knowledge - Højsgaard - 1992 |

7 | An influence diagram approach to medical technology assessment - Shachter, Eddy, et al. - 1990 |

5 | Network methods in statistics - Ripley - 1994 |

Citation Context: ...ralized linear models, various forms of clustering, feed-forward neural networks and various stochastic neural networks. This includes a large number of the more general network models now available (Ripley, 1994). These many different models are formed by combining basic nodes in the network representing for instance, Gaussian variables or deterministic Sigmoid units. The expressiveness of chain graphs is il...

3 | Bayesian networks for knowledge representation and learning, in Advances in Knowledge Discovery and Data - Heckerman - 1995 |

Citation Context: ...l groups (Gilks, Thomas, & Spiegelhalter, 1993; Dawid & Lauritzen, 1993; Buntine, 1994; Shachter, Eddy, & Hasselblad, 1990). Whereas, an introduction to learning of Bayesian networks can be found in (Heckerman, 1995). This paper uses chain graphs (Lauritzen & Wermuth, 1989) as a general probabilistic network model. Chain graphs mix undirected and directed graphs (or networks) to give a probabilistic representati...