A Bayesian approach to learning Bayesian networks with local structure
 In Proceedings of Thirteenth Conference on Uncertainty in Artificial Intelligence
, 1997
Cited by 167 (14 self)
Recently several researchers have investigated techniques for using data to learn Bayesian networks containing compact representations for the conditional probability distributions (CPDs) stored at each node. The majority of this work has concentrated on using decision-tree representations for the CPDs. In addition, researchers typically apply non-Bayesian (or asymptotically Bayesian) scoring functions such as MDL to evaluate the goodness-of-fit of networks to the data. In this paper we investigate a Bayesian approach to learning Bayesian networks that contain the more general decision-graph representations of the CPDs. First, we describe how to evaluate the posterior probability (that is, the Bayesian score) of such a network, given a database of observed cases. Second, we describe various search spaces that can be used, in conjunction with a scoring function and a search procedure, to identify one or more high-scoring networks. Finally, we present an experimental evaluation of the search spaces, using a greedy algorithm and a Bayesian scoring function.
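As a rough illustration of the kind of Bayesian score the abstract above refers to, the sketch below computes a Dirichlet-multinomial log marginal likelihood for one node whose CPD is split into parent contexts (the leaves of a tree or graph). It is not taken from the paper: the uniform hyperparameter `alpha` and the function names `log_marginal_likelihood`, `node_score`, and `merge_gain` are assumptions made for this example.

```python
import math

def log_marginal_likelihood(counts, alphas):
    """Log marginal likelihood of one multinomial under a Dirichlet prior.

    counts[k] is how often state k was observed in this parent context;
    alphas[k] is the (assumed) Dirichlet hyperparameter for state k.
    """
    score = math.lgamma(sum(alphas)) - math.lgamma(sum(alphas) + sum(counts))
    for n, a in zip(counts, alphas):
        score += math.lgamma(a + n) - math.lgamma(a)
    return score

def node_score(counts_by_context, alpha=1.0):
    """Sum the per-context scores; each context plays the role of a CPD leaf."""
    return sum(
        log_marginal_likelihood(counts, [alpha] * len(counts))
        for counts in counts_by_context
    )

def merge_gain(counts_a, counts_b, alpha=1.0):
    """Score change from letting two contexts share a single distribution.

    Assumes the merged leaf keeps the same uniform prior, which is a
    simplification chosen for this sketch.
    """
    merged = [a + b for a, b in zip(counts_a, counts_b)]
    return node_score([merged], alpha) - node_score([counts_a, counts_b], alpha)
```

A positive `merge_gain` means the data favor sharing one distribution across the two contexts, which is the type of local comparison a greedy search over decision-graph structures could use.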
Probabilistic independence networks for hidden Markov probability models
, 1996
Cited by 167 (12 self)
Graphical techniques for modeling the dependencies of random variables have been explored in a variety of different areas including statistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics. Formalisms for manipulating these models have been developed relatively independently in these research communities. In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a self-contained review of the basic principles of PINs. It is shown that the well-known forward-backward (FB) and Viterbi algorithms for HMMs are special cases of more general inference algorithms for arbitrary PINs. Furthermore, the existence of inference and estimation algorithms for more general graphical models provides a set of analysis tools for HMM practitioners who wish to explore a richer class of HMM structures. Examples of relatively complex models to handle sensor fusion and coarticulation in speech recognition are introduced and treated within the graphical model framework to illustrate the advantages of the general approach.
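To make the relationship between the two HMM algorithms named above more concrete, here is a minimal sketch of the forward pass and the Viterbi pass for a discrete HMM. The two functions share the same recursion and differ only in using a sum versus a max, which is the pattern that generalizes to other graphical models. The argument layout (`init`, `trans`, `emit` as nested lists) is an assumption for this example, not the paper's notation.

```python
def forward(init, trans, emit, obs):
    """Return P(obs) by summing over hidden paths (forward recursion)."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

def viterbi(init, trans, emit, obs):
    """Return (most probable hidden path, its joint probability with obs)."""
    n = len(init)
    delta = [init[s] * emit[s][obs[0]] for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = [max(range(n), key=lambda r: delta[r] * trans[r][s]) for s in range(n)]
        delta = [delta[prev[s]] * trans[prev[s]][s] * emit[s][o] for s in range(n)]
        back.append(prev)
    state = max(range(n), key=lambda s: delta[s])
    best = delta[state]
    path = [state]
    for prev in reversed(back):  # trace the best predecessors backwards
        state = prev[state]
        path.append(state)
    path.reverse()
    return path, best
```

For long observation sequences a practical version would work in log space or rescale `alpha` and `delta` at each step to avoid underflow; that is left out here to keep the recursion visible.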
A Bayesian Approach to Causal Discovery
, 1997
Cited by 79 (1 self)
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraint-based counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structures (both quantitative and qualitative) can be made. Three, information from several models can be combined to make better inferences and to better ...
On predictive distributions and Bayesian networks
 Statistics and Computing
, 2000
Cited by 38 (29 self)
In this paper we are interested in discrete prediction problems for a decision-theoretic setting, where the ...
Accelerated Quantification of Bayesian Networks with Incomplete Data
 In Proceedings of First International Conference on Knowledge Discovery and Data Mining
, 1995
Cited by 28 (2 self)
Probabilistic expert systems based on Bayesian networks (BNs) require initial specification of both a qualitative graphical structure and quantitative assessment of conditional probability tables. This paper considers statistical batch learning of the probability tables on the basis of incomplete data and expert knowledge. The EM algorithm with a generalized conjugate gradient acceleration method has been dedicated to quantification of BNs by maximum posterior likelihood estimation for a superclass of the recursive graphical models. This new class of models allows a great variety of local functional restrictions to be imposed on the statistical model, which thereby extends the control and applicability of the constructed method for quantifying BNs.
Introduction
The construction of probabilistic expert systems (Pearl 1988, Andreassen et al. 1989) based on Bayesian networks (BNs) is often a challenging process. It is typically divided into two parts: First the constructi...
Learning Mixtures of Bayesian Networks
 in Cooper & Moral
, 1997
Cited by 21 (1 self)
We describe a heuristic method for learning mixtures of Bayesian Networks (MBNs) from possibly incomplete data. The considered class of models is mixtures in which each mixture component is a Bayesian network encoding a conditional Gaussian distribution over a fixed set of variables. Some variables may be hidden or otherwise have missing observations. A key idea in our approach is to treat expected data as real data. This allows us to interleave structure and parameter search and to take advantage of closed form approximations for marginal likelihood. In addition, by treating expected data as real data, the search criterion factors by variable, making the search processes more efficient. We evaluate our approach on synthetic and real-world data sets.
Keywords: Mixture models, Bayesian networks, structure learning, parameter learning, hidden variables, EM algorithm.
1 Introduction
There is growing interest in a class of models for density estimation known as Bayesian networks. In the...
Challenge: Where is the Impact of Bayesian Networks in Learning?
 In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence
, 1997
Cited by 8 (3 self)
Bayesian networks are graphical representations of probability distributions. Over the last decade, these representations have become the method of choice for representation of uncertainty in artificial intelligence. Today, they play a crucial role in modern expert systems, diagnosis engines, and decision support systems. In recent years, there has been much interest in learning Bayesian networks from data. Learning such models is desirable simply because there is a wide array of off-the-shelf tools that can apply the learned models as described above. Practitioners also claim that adaptive Bayesian networks have advantages in their own right as a nonparametric method for density estimation, data analysis, pattern classification, and modeling. Among the reasons cited we find: their semantic clarity and understandability by humans, the ease of acquisition and incorporation of prior knowledge, the ease of integration with optimal decision-making methods, the possibility of causal interp...
Edinburgh University
Learning the structure of discrete Bayesian networks has been the subject of extensive research in machine learning, with most Bayesian approaches focusing on fully observed networks. One of the few methods that can handle networks with latent variables is the "structural EM algorithm", which interleaves greedy structure search with the estimation of latent variables and parameters, maintaining a single best network at each step. We introduce Structural Expectation Propagation (SEP), an extension of EP which can infer the structure of Bayesian networks having latent variables and missing data. SEP performs variational inference in a joint model of structure, latent variables, and parameters, offering two advantages: (i) it accounts for uncertainty in structure and parameter values when making local distribution updates, and (ii) it returns a variational distribution over network structures rather than a single network. We demonstrate the performance of SEP both on synthetic problems and on real-world clinical data.
Aspects of the Interface between Statistics and . . .
, 1999
In recent years the cross-fertilisation of ideas between the statistics and machine learning communities has become increasingly important. This exchange of ideas resulted from a recognition that the two communities often have to tackle similar problems and has resulted in an exchange which has enriched both disciplines. There is much to be gained in considering the two literatures in tandem, and the aim of this thesis is to build on some of the research currently taking place at the interface between these two disciplines. Specifically we will be considering a class of models called Bayesian belief networks. These are models which are closely related to neural networks, a type of model often used in machine learning but largely eschewed by statisticians due to their "black box" approach. Neural networks, while useful tools, lack transparency; by their nature it is difficult to interpret the method in which neural network