Results 1 - 10 of 16
Factorized normalized maximum likelihood criterion for learning Bayesian network structures
 Submitted for PGM 2008
, 2008
Abstract

Cited by 3 (2 self)
This paper introduces a new scoring criterion, factorized normalized maximum likelihood, for learning Bayesian network structures. The proposed scoring criterion requires no parameter tuning, and it is decomposable and asymptotically consistent. We compare the new scoring criterion to other scoring criteria and describe its practical implementation. Empirical tests confirm its good performance.
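To make the normalization behind this criterion concrete, the sketch below computes the NML normalizing sum C(K, n) of a single K-valued multinomial by brute-force enumeration of count vectors; fNML applies one such term per variable and parent configuration. Function names are illustrative, not from the paper, and real implementations use the linear-time recurrence of Kontkanen and Myllymäki rather than enumeration.

```python
from math import factorial

def compositions(n, k):
    """Yield all count vectors (h1, ..., hk) of nonnegative ints summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def multinomial_nml_complexity(K, n):
    """Normalizing sum C(K, n) of the NML distribution for one K-valued
    multinomial observed n times: the sum over all count vectors of the
    maximized likelihood of that vector. Brute force, so only feasible
    for tiny K and n."""
    total = 0.0
    for counts in compositions(n, K):
        coef = factorial(n)              # multinomial coefficient n! / prod h!
        for h in counts:
            coef //= factorial(h)
        p = 1.0                          # maximized likelihood prod (h/n)^h
        for h in counts:
            if h > 0:
                p *= (h / n) ** h
        total += coef * p
    return total
```

For example, C(2, 2) = 2.5, since the count vectors (2,0), (1,1), (0,2) contribute 1, 0.5, and 1 respectively; the fNML local score subtracts log C from the maximized log-likelihood of each family.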
Learning networks determined by the ratio of prior and data
 In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence
, 2010
Abstract

Cited by 3 (0 self)
Recent reports have described that the equivalent sample size (ESS) in a Dirichlet prior plays an important role in learning Bayesian networks. This paper provides an asymptotic analysis of the marginal likelihood score for a Bayesian network. Results show that the ratio of the ESS and sample size determines the penalty of adding arcs in learning Bayesian networks. The number of arcs increases monotonically as the ESS increases; the number of arcs monotonically decreases as the ESS decreases. Furthermore, the marginal likelihood score provides a unified expression of various score metrics by changing prior knowledge.
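The ESS effect described above can be observed directly in the BDeu local score, which is the standard marginal-likelihood score with a uniform Dirichlet prior of total mass `ess`. A minimal sketch, with illustrative function names:

```python
from math import lgamma

def bdeu_local_score(counts, ess):
    """BDeu local log-score for one variable.
    counts: one count vector per parent configuration, e.g.
            [[3, 1], [0, 4]] = 2 parent configs, 2 variable states.
    ess:    equivalent sample size of the Dirichlet prior."""
    q = len(counts)           # number of parent configurations
    r = len(counts[0])        # number of states of the variable
    a_j = ess / q             # prior mass per parent configuration
    a_jk = ess / (q * r)      # prior mass per (config, state) cell
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score
```

Sweeping `ess` and comparing the score of a variable with and without a candidate parent reproduces, on toy counts, the monotone arc-count behavior the abstract describes.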
Locally Minimax Optimal Predictive Modeling with Bayesian Networks
Abstract

Cited by 1 (1 self)
We propose an information-theoretic approach for predictive modeling with Bayesian networks. Our approach is based on the minimax optimal Normalized Maximum Likelihood (NML) distribution, motivated by the MDL principle. In particular, we present a parameter learning method which, together with a previously introduced NML-based model selection criterion, provides a way to construct highly predictive Bayesian network models from data. The method is parameter-free and robust, unlike the currently popular Bayesian marginal likelihood approach which has been shown to be sensitive to the choice of prior hyperparameters. Empirical tests show that the proposed method compares favorably with the Bayesian approach in predictive tasks.
One-Shot Learning with Bayesian Networks
Abstract

Cited by 1 (0 self)
Humans often make accurate inferences given a single exposure to a novel situation. Some of these inferences can be achieved by discovering and using near-deterministic relationships between attributes. Approaches based on Bayesian networks are good at discovering and using soft probabilistic relationships between attributes, but typically fail to identify and exploit near-deterministic relationships. Here we develop a Bayesian network approach that overcomes this limitation by learning a hyperparameter for each distribution in the network that specifies whether it is non-deterministic or near-deterministic. We apply our approach to one-shot learning problems based on a real-world database of immigration records, and show that it outperforms a more standard Bayesian network approach.
Learning optimal Bayesian networks with heuristic search
 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING, MISSISSIPPI STATE UNIVERSITY
, 2012
Abstract

Cited by 1 (0 self)
Bayesian networks are a widely used graphical model that formalizes reasoning under uncertainty. Unfortunately, construction of a Bayesian network by an expert is time-consuming, and, in some cases, all experts may not agree on the best structure for a problem domain. Additionally, for some complex systems, such as those present in molecular biology, experts with an understanding of the entire domain and how individual components interact may not exist. In these cases, we must learn the network structure from available data. This dissertation focuses on score-based structure learning. In this context, a scoring function is used to measure the goodness of fit of a structure to data. The goal is to find the structure which optimizes the scoring function. The first contribution of this dissertation is a shortest-path finding perspective for the problem of learning optimal Bayesian network structures. This perspective builds on earlier dynamic programming strategies, but, as we show, offers much more flexibility. Second, we develop a set of data structures to improve the efficiency of many of the ...
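The dynamic programming strategy that the shortest-path perspective builds on can be sketched compactly: the best score of a subnetwork over a variable set S decomposes as the best score over S minus one sink variable, plus that sink's best local score. Traversing subsets in size order is equivalent to a shortest path through the "order graph" from the empty set to the full set. This is a generic sketch, not the dissertation's implementation; it assumes `local_score(v, candidates)` returns the best (lowest) local score of v already minimized over parent subsets of `candidates`.

```python
from itertools import combinations

def best_network_score(variables, local_score):
    """Exact score-based structure learning by dynamic programming over
    subsets. local_score(v, candidates) -> best local score of variable v
    with parents drawn from the candidate set (lower is better)."""
    best = {frozenset(): 0.0}            # score of the empty network
    vs = list(variables)
    for size in range(1, len(vs) + 1):
        for subset in combinations(vs, size):
            s = frozenset(subset)
            # pick the best last (sink) variable for this subset
            best[s] = min(
                best[s - {v}] + local_score(v, s - {v})
                for v in s
            )
    return best[frozenset(vs)]
```

The table has 2^n entries, which is why heuristic-search refinements of this idea, like those in the dissertation, matter for larger networks.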
AN NML-BASED METHOD FOR LEARNING BAYESIAN NETWORKS
Abstract
Bayesian networks are among the most popular model classes for discrete vector-valued i.i.d. data. Currently, the most popular model selection criterion for Bayesian networks follows the Bayesian paradigm. However, this method has ...
Learning Discrete Decomposable Graphical Models via Constraint Optimization
 Statistics and Computing
, 2015
Abstract
Statistical model learning problems are traditionally solved using either heuristic greedy optimization or stochastic simulation, such as Markov chain Monte Carlo or simulated annealing. Recently, there has been an increasing interest in the use of combinatorial search methods, including those based on computational logic. Some of these methods are particularly attractive since they can also be successful in proving the global optimality of solutions, in contrast to stochastic algorithms that only guarantee optimality in the limit. Here we improve and generalize a recently introduced constraint-based method for learning undirected graphical models. The new method combines perfect elimination ...
Learning Extended Tree Augmented Naive Structures
Abstract
This work proposes an extended version of the well-known tree-augmented naive Bayes (TAN) classifier where the structure learning step is performed without requiring features to be connected to the class. Based on a modification of Edmonds' algorithm, our structure learning procedure explores a superset of the structures that are considered by TAN, yet achieves global optimality of the learning score function in a very efficient way (quadratic in the number of features, the same complexity as learning TANs). We enhance our procedure with a new score function that only takes into account arcs that are relevant to predict the class, as well as an optimization over the equivalent sample size during learning. These ideas may be useful for structure learning of Bayesian networks in general. A range of experiments show that we obtain models with better prediction accuracy than Naive Bayes and TAN, and comparable to the accuracy of the state-of-the-art classifier averaged one-dependence estimator (AODE). We release our implementation of ETAN so that it can be easily installed and run within Weka.
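The "globally optimal tree under a decomposable score" idea that TAN-family learners share can be illustrated with the simpler undirected case: weight each feature pair (e.g. by conditional mutual information) and take a maximum-weight spanning tree, as in Chow-Liu. Note the hedge in the comment: ETAN itself uses a modified Edmonds' algorithm for directed structures; this Kruskal sketch, with illustrative names, only shows the spanning-tree step.

```python
def maximum_spanning_tree(n, weights):
    """Kruskal's algorithm for a maximum-weight spanning tree over nodes
    0..n-1. weights maps (i, j) with i < j to an edge weight, e.g.
    pairwise mutual information between features as in Chow-Liu / TAN-style
    learners. Returns the chosen edge list. (ETAN proper uses a modified
    Edmonds' algorithm for *directed* trees; this is the undirected sketch.)"""
    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:                  # edge joins two components: keep it
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

Because the score decomposes over edges, the greedy tree is globally optimal, which is the same property ETAN's directed variant preserves while dropping TAN's requirement that every feature attach to the class.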
CALCULATING THE NML DISTRIBUTION FOR TREE-STRUCTURED BAYESIAN NETWORKS
Abstract
We are interested in model class selection. We want to compute a criterion which, given two competing model classes, chooses the better one. When learning Bayesian network structures from sample data, an important issue is how to evaluate the goodness of alternative network structures. Perhaps the most commonly used model (class) selection criterion is the marginal likelihood, which is obtained by integrating over a prior distribution for the model parameters. However, the problem of determining a reasonable prior for the parameters is a highly controversial issue, and no completely satisfying Bayesian solution has yet been presented in the noninformative setting [1]. The normalized maximum likelihood (NML), based on Rissanen's information-theoretic Minimum Description Length (MDL) methodology [2, ...
Learning genetic epistasis using Bayesian network scoring criteria
Abstract
Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data-mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model.