Results 1  10
of
187
A Bayesian method for the induction of probabilistic networks from data
 Machine Learning
, 1992
"... Abstract. This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computerassisted hypothesis testing, automated scientific discovery, and automated construction of ..."
Abstract

Cited by 1079 (26 self)
 Add to MetaCart
Abstract. This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computerassisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. Finally, we relate the methods in this paper to previous work, and we discuss open problems.
Learning Bayesian networks: The combination of knowledge and statistical data
 Machine Learning
, 1995
"... We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. We identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly simpl ..."
Abstract

Cited by 905 (34 self)
 Add to MetaCart
We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. We identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly simplify the encoding of a user’s prior knowledge. In particular, a user can express his knowledge—for the most part—as a single prior Bayesian network for the domain. 1
Loopy Belief Propagation for Approximate Inference: An Empirical Study
 In Proceedings of Uncertainty in AI
, 1999
"... Recently, researchers have demonstrated that "loopy belief propagation"  the use of Pearl's polytree algorithm in a Bayesian network with loops  can perform well in the context of errorcorrecting codes. The most dramatic instance of this is the near Shannonlimit performance of "Turbo ..."
Abstract

Cited by 471 (18 self)
 Add to MetaCart
Recently, researchers have demonstrated that "loopy belief propagation"  the use of Pearl's polytree algorithm in a Bayesian network with loops  can perform well in the context of errorcorrecting codes. The most dramatic instance of this is the near Shannonlimit performance of "Turbo Codes"  codes whose decoding algorithm is equivalent to loopy belief propagation in a chainstructured Bayesian network. In this paper we ask: is there something special about the errorcorrecting code context, or does loopy propagation work as an approximate inference scheme in a more general setting? We compare the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two realworld networks: ALARM and QMR. We find that the loopy beliefs often converge and when they do, they give a good approximation to the correct marginals. However, on the QMR network, the loopy beliefs oscillated and had no obvious relationship ...
Learning Bayesian Networks With Local Structure
, 1996
"... . We examine a novel addition to the known methods for learning Bayesian networks from data that improves the quality of the learned networks. Our approach explicitly represents and learns the local structure in the conditional probability distributions (CPDs) that quantify these networks. This inc ..."
Abstract

Cited by 234 (13 self)
 Add to MetaCart
. We examine a novel addition to the known methods for learning Bayesian networks from data that improves the quality of the learned networks. Our approach explicitly represents and learns the local structure in the conditional probability distributions (CPDs) that quantify these networks. This increases the space of possible models, enabling the representation of CPDs with a variable number of parameters. The resulting learning procedure induces models that better emulate the interactions present in the data. We describe the theoretical foundations and practical aspects of learning local structures and provide an empirical evaluation of the proposed learning procedure. This evaluation indicates that learning curves characterizing this procedure converge faster, in the number of training instances, than those of the standard procedure, which ignores the local structure of the CPDs. Our results also show that networks learned with local structures tend to be more complex (in terms of a...
Being Bayesian about network structure
 Machine Learning
, 2000
"... Abstract. In many multivariate domains, we are interested in analyzing the dependency structure of the underlying distribution, e.g., whether two variables are in direct interaction. We can represent dependency structures using Bayesian network models. To analyze a given data set, Bayesian model sel ..."
Abstract

Cited by 203 (5 self)
 Add to MetaCart
Abstract. In many multivariate domains, we are interested in analyzing the dependency structure of the underlying distribution, e.g., whether two variables are in direct interaction. We can represent dependency structures using Bayesian network models. To analyze a given data set, Bayesian model selection attempts to find the most likely (MAP) model, and uses its structure to answer these questions. However, when the amount of available data is modest, there might be many models that have nonnegligible posterior. Thus, we want compute the Bayesian posterior of a feature, i.e., the total posterior probability of all models that contain it. In this paper, we propose a new approach for this task. We first show how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed order over network variables. This allows us to compute, for a given order, both the marginal probability of the data and the posterior of a feature. We then use this result as the basis for an algorithm that approximates the Bayesian posterior of a feature. Our approach uses a Markov Chain Monte Carlo (MCMC) method, but over orders rather than over network structures. The space of orders is smaller and more regular than the space of structures, and has much a smoother posterior “landscape”. We present empirical results on synthetic and reallife datasets that compare our approach to full model averaging (when possible), to MCMC over network structures, and to a nonBayesian bootstrap approach.
Learning Bayesian belief networks: An approach based on the MDL principle
 Computational Intelligence
, 1994
"... A new approach for learning Bayesian belief networks from raw data is presented. The approach is based on Rissanen's Minimal Description Length (MDL) principle, which is particularly well suited for this task. Our approach does not require any prior assumptions about the distribution being learned. ..."
Abstract

Cited by 188 (8 self)
 Add to MetaCart
A new approach for learning Bayesian belief networks from raw data is presented. The approach is based on Rissanen's Minimal Description Length (MDL) principle, which is particularly well suited for this task. Our approach does not require any prior assumptions about the distribution being learned. In particular, our method can learn unrestricted multiplyconnected belief networks. Furthermore, unlike other approaches our method allows us to tradeo accuracy and complexity in the learned model. This is important since if the learned model is very complex (highly connected) it can be conceptually and computationally intractable. In such a case it would be preferable to use a simpler model even if it is less accurate. The MDL principle o ers a reasoned method for making this tradeo. We also show that our method generalizes previous approaches based on Kullback crossentropy. Experiments have been conducted to demonstrate the feasibility of the approach. Keywords: Knowledge Acquisition � Bayes Nets � Uncertainty Reasoning. 1
A Bayesian approach to learning Bayesian networks with local structure
 In Proceedings of Thirteenth Conference on Uncertainty in Artificial Intelligence
, 1997
"... Recently several researchers have investigated techniques for using data to learn Bayesian networks containing compact representations for the conditional probability distributions (CPDs) stored at each node. The majority of this work has concentrated on using decisiontree representations for the C ..."
Abstract

Cited by 166 (13 self)
 Add to MetaCart
Recently several researchers have investigated techniques for using data to learn Bayesian networks containing compact representations for the conditional probability distributions (CPDs) stored at each node. The majority of this work has concentrated on using decisiontree representations for the CPDs. In addition, researchers typically apply nonBayesian (or asymptotically Bayesian) scoring functions such as MDL to evaluate the goodnessoffit of networks to the data. In this paper we investigate a Bayesian approach to learning Bayesian networks that contain the more general decisiongraph representations of the CPDs. First, we describe how to evaluate the posterior probability— that is, the Bayesian score—of such a network, given a database of observed cases. Second, we describe various search spaces that can be used, in conjunction with a scoring function and a search procedure, to identify one or more highscoring networks. Finally, we present an experimental evaluation of the search spaces, using a greedy algorithm and a Bayesian scoring function. 1
Learning Equivalence Classes Of Bayesian Network Structures
, 1996
"... Approaches to learning Bayesian networks from data typically combine a scoring metric with a heuristic search procedure. Given aBayesian network structure, many of the scoring metrics derived in the literature return a score for the entire equivalence class to which the structure belongs. When ..."
Abstract

Cited by 128 (1 self)
 Add to MetaCart
Approaches to learning Bayesian networks from data typically combine a scoring metric with a heuristic search procedure. Given aBayesian network structure, many of the scoring metrics derived in the literature return a score for the entire equivalence class to which the structure belongs. When using such a metric, it is appropriate for the heuristic search algorithm to searchover equivalence classes of Bayesian networks as opposed to individual structures. We present the general formulation of a search space for which the states of the search correspond to equivalence classes of structures. Using this space, anyoneofanumber of heuristic searchalgorithms can easily be applied. We compare greedy search performance in the proposed search space to greedy search performance in a search space for which the states correspond to individual Bayesian network structures. 1
Learning Belief Networks in the Presence of Missing Values and Hidden Variables
 Proceedings of the Fourteenth International Conference on Machine Learning
, 1997
"... In recent years there has been a flurry of works on learning probabilistic belief networks. Current state of the art methods have been shown to be successful for two learning scenarios: learning both network structure and parameters from complete data, and learning parameters for a fixed network fr ..."
Abstract

Cited by 120 (14 self)
 Add to MetaCart
In recent years there has been a flurry of works on learning probabilistic belief networks. Current state of the art methods have been shown to be successful for two learning scenarios: learning both network structure and parameters from complete data, and learning parameters for a fixed network from incomplete datathat is, in the presence of missing values or hidden variables. However, no method has yet been demonstrated to effectively learn network structure from incomplete data. In this paper, we propose a new method for learning network structure from incomplete data. This method is based on an extension of the ExpectationMaximization (EM) algorithm for model selection problems that performs search for the best structure inside the EM procedure. We prove the convergence of this algorithm, and adapt it for learning belief networks. We then describe how to learn networks in two scenarios: when the data contains missing values, and in the presence of hidden variables. We provide...