Results 1–10 of 17
Learning Bayesian Network Structure using LP Relaxations
Abstract

Cited by 19 (2 self)
We propose to solve the combinatorial problem of finding the highest scoring Bayesian network structure from data. This structure learning problem can be viewed as an inference problem where the variables specify the choice of parents for each node in the graph. The key combinatorial difficulty arises from the global constraint that the graph structure has to be acyclic. We cast the structure learning problem as a linear program over the polytope defined by valid acyclic structures. In relaxing this problem, we maintain an outer bound approximation to the polytope and iteratively tighten it by searching over a new class of valid constraints. If an integral solution is found, it is guaranteed to be the optimal Bayesian network. When the relaxation is not tight, the fast dual algorithms we develop remain useful in combination with a branch and bound method. Empirical results suggest that the method is competitive or faster than alternative exact methods based on dynamic programming.
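The dynamic-programming baselines mentioned above can be made concrete with a deliberately naive exact search (a hypothetical sketch, not the paper's LP relaxation): because the score is decomposable, each node in a fixed topological order can independently pick its best parent set among its predecessors, and exhaustive search over orders is then exact but exponential — exactly the cost that LP and DP methods aim to avoid.

```python
from itertools import permutations, combinations

def best_network(nodes, score):
    """Brute-force exact structure search over topological orders.
    `score(v, parent_set)` is an assumed decomposable per-family score;
    the network score is the sum over (node, parent-set) families.
    Acyclicity holds by construction: parents precede children."""
    best_val, best_parents = float("-inf"), None
    for order in permutations(nodes):
        total, parents = 0.0, {}
        for i, v in enumerate(order):
            preds = order[:i]
            # best-scoring parent set for v drawn from its predecessors
            cand = max(
                (frozenset(c) for r in range(len(preds) + 1)
                 for c in combinations(preds, r)),
                key=lambda c: score(v, c),
            )
            parents[v] = cand
            total += score(v, cand)
        if total > best_val:
            best_val, best_parents = total, parents
    return best_val, best_parents
```

The double exponential blow-up (orders times parent subsets) is what motivates the polytope formulation: the same optimum becomes a linear objective over a relaxed feasible set.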
Efficient structure learning of Bayesian networks using constraints
 Journal of Machine Learning Research
Abstract

Cited by 13 (1 self)
This paper addresses the problem of learning Bayesian network structures from data based on score functions that are decomposable. It describes properties that strongly reduce the time and memory costs of many known methods without losing global optimality guarantees. These properties are derived for different score criteria such as Minimum Description Length (or Bayesian Information Criterion), Akaike Information Criterion and Bayesian Dirichlet Criterion. Then a branch-and-bound algorithm is presented that integrates structural constraints with data in a way that guarantees global optimality. As an example, structural constraints are used to map the problem of structure learning in Dynamic Bayesian networks into a corresponding augmented Bayesian network. Finally, we show empirically the benefits of using the properties with state-of-the-art methods and with the new algorithm, which is able to handle larger data sets than before.
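Decomposability, the property the abstract relies on, means the network score is a sum of independent per-family terms. A minimal sketch of one such term — the BIC contribution of a single (child, parent-set) family from discrete count data; the data layout is an assumption for illustration:

```python
import math
from collections import Counter

def family_bic(data, child, parents):
    """BIC contribution of one (child, parent-set) family.
    `data` is assumed to be a list of {variable: value} dicts.
    BIC is decomposable: the network score is the sum over families,
    so branch-and-bound can prune on partial structures."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    # maximum log-likelihood of the family: sum N_ijk * log(N_ijk / N_ij)
    ll = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    r = len({row[child] for row in data})  # child's arity
    q = max(len(marg), 1)                  # observed parent configurations
    return ll - 0.5 * math.log(n) * q * (r - 1)
```

A higher family score for a non-empty parent set than for the empty one is the signal that the edge pays for its penalty.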
Characteristic imsets for learning Bayesian network structure
, 2012
Abstract

Cited by 3 (2 self)
The motivation for the paper is the geometric approach to learning Bayesian network (BN) structure. The basic idea of our approach is to represent every BN structure by a certain uniquely determined vector so that usual scores for learning BN structure become affine functions of the vector representative. The original proposal from [26] was to use a special vector having integers as components, called the standard imset, as the representative. In this paper we introduce a new unique vector representative, called the characteristic imset, obtained from the standard imset by an affine transformation. Characteristic imsets are (shown to be) zero-one vectors and have many elegant properties, suitable for the intended application of linear/integer programming methods to learning BN structure. They are much closer to the graphical description; we describe a simple transition between the characteristic imset and the essential graph, known as a traditional unique graphical representative of the BN structure. In the end, we relate our proposal to other recent approaches which apply linear programming methods in probabilistic reasoning.
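A small sketch of the zero-one representative described above, under the usual definition of the characteristic imset (c(S) = 1 iff some node i in S has all of S \ {i} among its parents); the DAG encoding as a parent-set dict is an assumption for illustration:

```python
from itertools import combinations

def characteristic_imset(nodes, parents):
    """Characteristic imset of a DAG: a 0/1 vector indexed by node
    sets S with |S| >= 2. Markov-equivalent DAGs (same BN structure)
    are assumed to map to the same vector, which is what makes it a
    unique representative suitable for linear/integer programming."""
    c = {}
    for k in range(2, len(nodes) + 1):
        for S in combinations(sorted(nodes), k):
            s = set(S)
            c[S] = int(any(s - {i} <= set(parents[i]) for i in S))
    return c
```

For the chain a -> b -> c and its reversal, the vectors coincide, illustrating score equivalence of the representation.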
Efficient and Accurate Learning of Bayesian Networks using Chi-Squared Independence Tests
Abstract

Cited by 1 (1 self)
Bayesian network structure learning is a well-known NP-complete problem, whose solution is of importance in machine learning. Two algorithms are proposed, both of which assess dependency between variables using the chi-squared test of independence between pairs of variables and the log-likelihood evaluation criterion for the network. The first determines the effect of adding a potential edge (in both directions) on the log-likelihood. The second uses KL divergence to determine direction, and edges to be included are determined by thresholding normalized chi-squared statistics. Experiments on multinomial data show that the proposed algorithms are more efficient and accurate than an optimized branch-and-bound algorithm, and human experts.
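The pairwise test the abstract builds on is the standard Pearson chi-squared statistic; a minimal self-contained sketch (the data layout is an assumption, and in practice one would compare the statistic against a chi-squared quantile, e.g. via scipy.stats):

```python
from collections import Counter

def chi_squared(data, x, y):
    """Pearson chi-squared statistic for independence of two discrete
    variables; `data` is assumed to be a list of {variable: value}
    dicts. Large values suggest dependence; compare against the
    chi-squared distribution with (r-1)(c-1) degrees of freedom."""
    n = len(data)
    nx = Counter(row[x] for row in data)
    ny = Counter(row[y] for row in data)
    nxy = Counter((row[x], row[y]) for row in data)
    stat = 0.0
    for a in nx:
        for b in ny:
            expected = nx[a] * ny[b] / n   # count under independence
            observed = nxy.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat
```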
University Medal Finalist – One of the 6 Most Distinguished Graduating Seniors in the University (Honors and Awards)
, 2010
David Sontag – Research Statement
Abstract
In recent years, advances in science and low-cost permanent storage have resulted in the availability of massive data sets. Together with advances in machine learning, this data has the potential to lead to many new breakthroughs. For example, high-throughput genomic and proteomic experiments can be used to enable personalized medicine. Large data sets of search queries can be used to improve information retrieval. Historical climate data can be used to understand global warming and to better predict weather. However, to take full advantage of this data, we need models that are capable of explaining the data, and algorithms that can use the models to make predictions about the future. The goal of my research is to develop theory and practical algorithms for learning and probabilistic inference in very large statistical models. My research focuses on a class of statistical models called graphical models that describe multivariate probability distributions. Graphical models provide a useful abstraction for quantifying uncertainty, describing complex dependencies in data while making the model’s structure explicit so that it can be exploited by algorithms. These models have been widely applied across diverse fields such as statistical machine learning, computational biology, statistical physics, communication theory, and information retrieval.
Refining Diagnostic POMDPs with User Feedback
Abstract
Bayesian networks have been widely used for diagnostics. These models can be extended to POMDPs to select the best action. This allows modeling partial observability due to causes and the utility of executing various tests. We describe the problem of refining diagnostic POMDPs when user feedback is available. We propose utilizing user feedback to pose constraints on the model, i.e., the transition, observation and reward functions. These constraints can then be used to efficiently learn the POMDP model and incorporate expert knowledge about the problem.
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence: Learning Optimal Bayesian Networks Using A* Search
Abstract
This paper formulates learning an optimal Bayesian network as a shortest-path finding problem. An A* search algorithm is introduced to solve the problem. With the guidance of a consistent heuristic, the algorithm learns an optimal Bayesian network by only searching the most promising parts of the solution space. Empirical results show that the A* search algorithm significantly improves the time and space efficiency of existing methods on a set of benchmark datasets.
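The shortest-path view can be sketched as A* over an "order graph": a state is the set of variables already placed, and expanding a state places one more variable, paying the cheapest family cost with parents drawn from the placed set. This is a toy sketch, not the paper's implementation; `cost(v, candidates)` is a hypothetical callable assumed to return the minimum family cost for `v` over parent subsets of `candidates`.

```python
import heapq
from itertools import count

def astar_structure(nodes, cost):
    """A* over subsets of variables (the 'order graph'). The heuristic
    relaxes acyclicity: each unplaced variable takes its cheapest
    family cost over all other variables, which never overestimates
    the true remaining cost, so the first goal pop is optimal."""
    full = frozenset(nodes)

    def h(placed):
        return sum(cost(v, full - {v}) for v in full - placed)

    tie = count()  # tie-breaker: frozensets are not orderable in the heap
    start = frozenset()
    pq = [(h(start), next(tie), 0.0, start)]
    best_g = {start: 0.0}
    while pq:
        _, _, g, placed = heapq.heappop(pq)
        if g > best_g.get(placed, float("inf")):
            continue  # stale queue entry
        if placed == full:
            return g  # cost of an optimal network
        for v in full - placed:
            g2 = g + cost(v, placed)
            w = placed | {v}
            if g2 < best_g.get(w, float("inf")):
                best_g[w] = g2
                heapq.heappush(pq, (g2 + h(w), next(tie), g2, w))
    return None
```

The heuristic is admissible because enlarging the candidate parent pool can only lower a family's best cost; consistency follows by the same monotonicity.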
Sensor Data Fusion for Accurate Cloud Presence Prediction Using Dempster-Shafer Evidence Theory
, 2010
Exact Maximum Margin Structure Learning of Bayesian Networks
Abstract
Recently, there has been much interest in finding globally optimal Bayesian network structures. These techniques were developed for generative scores and cannot be directly extended to discriminative scores, as desired for classification. In this paper, we propose an exact method for finding network structures maximizing the probabilistic soft margin, a successfully applied discriminative score. Our method is based on branch-and-bound techniques within a linear programming framework and maintains an anytime solution, together with worst-case suboptimality bounds. We apply a set of order constraints for enforcing the network structure to be acyclic, which allows a compact problem representation and the use of general-purpose optimization techniques. In classification experiments, our methods clearly outperform generatively trained network structures and compete with support vector machines.