Results 11 - 20 of 33
On Separation Criterion and Recovery Algorithm for Chain Graphs
, 1996
Cited by 5 (1 self)
Chain graphs (CGs) give a natural unifying point of view on Markov and Bayesian networks and enlarge the potential of graphical models for describing conditional independence structures. In the paper, a direct graphical separation criterion for CGs, which generalizes the d-separation criterion for Bayesian networks, is introduced (recalled). It is equivalent to the classic moralization criterion for CGs and complete in the sense that for every CG there exists a probability distribution satisfying exactly the independencies derivable from the CG by the separation criterion. Every class of Markov equivalent CGs can be uniquely described by a natural representative, called the largest CG. A recovery algorithm, which finds the corresponding largest CG on the basis of the (conditional) dependency model given by a CG, is presented. 1 INTRODUCTION Traditional graphical models for description of probabilistic conditional independence structure use either undirected graphs (UGs), named also Markov n...
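As a point of reference for the moralization criterion mentioned in the abstract above, the following is a minimal sketch for the special case of a Bayesian network (a DAG), not for general chain graphs and not the paper's own separation criterion. It assumes the DAG is given as a hypothetical `parents` mapping from each node to its parents; the function name `d_separated` is likewise an illustrative choice.

```python
from collections import deque


def d_separated(parents, xs, ys, zs):
    """Test whether xs and ys are separated given zs in a DAG.

    `parents` maps a node to an iterable of its parents (assumed format).
    Uses the classic recipe: ancestral set, moralization, plain separation.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Keep only the nodes in xs, ys, zs and all of their ancestors.
    ancestral, stack = set(), list(xs | ys | zs)
    while stack:
        v = stack.pop()
        if v not in ancestral:
            ancestral.add(v)
            stack.extend(parents.get(v, ()))

    # 2. Moralize: connect every node to its parents and marry co-parents.
    adj = {v: set() for v in ancestral}
    for v in ancestral:
        pa = list(parents.get(v, ()))
        for p in pa:
            adj[v].add(p)
            adj[p].add(v)
        for i, p in enumerate(pa):
            for q in pa[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)

    # 3. Breadth-first search from xs that never enters zs.
    seen = set(xs - zs)
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        if v in ys:
            return False
        for w in adj[v] - zs - seen:
            seen.add(w)
            queue.append(w)
    return True
```

For example, with the chain A -> B -> C encoded as `{"B": ["A"], "C": ["B"]}`, the call `d_separated(chain, {"A"}, {"C"}, {"B"})` reports separation, while conditioning on the middle node of a collider A -> C <- B connects A and B.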
Efficient Learning using Constrained Sufficient Statistics
- Proceedings of the 7th International Workshop on Artificial Intelligence and Statistics
, 1999
Cited by 5 (0 self)
Learning Bayesian networks is a central problem for pattern recognition, density estimation and classification. In this paper, we propose a new method for speeding up the computational process of learning Bayesian network structure. This approach uses constraints imposed by the statistics already collected from the data to guide the learning algorithm. This allows us to reduce the number of statistics collected during learning and thus speed up the learning time. We show that our method is capable of learning structure from data more efficiently than traditional approaches. Our technique is of particular importance when the size of the datasets is large or when learning from incomplete data. The basic technique that we introduce is general and can be used to improve learning performance in many settings where sufficient statistics must be computed. In addition, our technique may be useful for alternate search strategies such as branch and bound algorithms. 1 Introduction In recent yea...
Mind Change Optimal Learning of Bayes Net Structure. O. Schulte
- In Proceedings of the 20th Annual Conference on Learning Theory
, 2007
Cited by 5 (1 self)
This paper analyzes the problem of learning the structure of a Bayes net (BN) in the theoretical framework of Gold’s learning paradigm. Bayes nets are one of the most prominent formalisms for knowledge representation and probabilistic and causal reasoning. We follow constraint-based approaches to learning Bayes net structure, where learning is based on observed conditional dependencies between variables of interest (e.g., “X is dependent on Y given any assignment to variable Z”). Applying learning criteria in this model leads to the following results. (1) The mind change complexity of identifying a Bayes net graph over variables V from dependency data is $\binom{|V|}{2}$, the maximum number of edges. (2) There is a unique fastest mind-change optimal Bayes net learner; convergence speed is evaluated using Gold’s dominance notion of “uniformly faster convergence”. This learner conjectures a graph if it is the unique Bayes net pattern that satisfies the observed dependencies with a minimum number of edges, and outputs “no guess” otherwise. Therefore we are using standard learning criteria to define a natural and novel Bayes net learning algorithm. We investigate the complexity of computing the output of the fastest mind-change optimal learner, and show that this problem is NP-hard (assuming P=RP). To our knowledge this is the first NP-hardness result concerning the existence of a uniquely optimal Bayes net structure.
A Scoring Function for Learning Bayesian Networks based on Mutual Information and Conditional Independence Tests
- Journal of Machine Learning Research
, 2006
Cited by 5 (0 self)
We propose a new scoring function for learning Bayesian networks from data using score search algorithms. This is based on the concept of mutual information and exploits some well-known properties of this measure in a novel way. Essentially, a statistical independence test based on the chi-square distribution, associated with the mutual information measure, together with a property of additive decomposition of this measure, are combined in order to measure the degree of interaction between each variable and its parent variables in the network. The result is a non-Bayesian scoring function called MIT (mutual information tests) which belongs to the family of scores based on information theory. The MIT score also represents a penalization of the Kullback-Leibler divergence between the joint probability distributions associated with a candidate network and with the available data set. Detailed results of a complete experimental evaluation of the proposed scoring function and its comparison with the well-known K2, BDeu and BIC/MDL scores are also presented.
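The abstract above pairs mutual information with a chi-square independence test. The sketch below shows only the standard textbook link between the two, namely that 2N times the empirical mutual information (in nats) is the G statistic, which is asymptotically chi-square distributed under independence. It is not the MIT score itself, and the function names are illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum over observed cells only; unobserved cells contribute zero.
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def g_statistic(xs, ys):
    """Return (2 * N * MI, degrees of freedom) for an independence test.

    Degrees of freedom are (r_x - 1) * (r_y - 1), with r_x and r_y the
    numbers of distinct observed values of each variable.
    """
    n = len(xs)
    dof = (len(set(xs)) - 1) * (len(set(ys)) - 1)
    return 2.0 * n * mutual_information(xs, ys), dof
```

The returned statistic would then be compared with a chi-square critical value for the given degrees of freedom; that lookup is left out to keep the sketch free of third-party dependencies.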
Combining Multiple Perspectives
- In Proceedings of the Seventeenth International Conference on Machine Learning
, 2000
Cited by 3 (3 self)
We consider a group of Bayesian learners whose interactions with the environment and other agents allow them to improve their model of the dependency among various factors that influence their interactions with the environment. Effective collaboration can improve the performance of isolated individual learners. We present a mechanism to pool together the knowledge of many modelers in the domain, each of whom may have only partial access to the environment. The application domain used in this study is a multiagent negotiation problem. We present results comparing the performance of such knowledge composition against isolated learners, and also against a learner who has complete access to the environment.
Bayesian network classifiers in Weka for version 3-5-5
- http://www.cs.waikato.ac.nz/ml/weka
, 2007
Cited by 3 (0 self)
Various Bayesian network classifier learning algorithms are implemented in Weka [12]. This note provides some user documentation and implementation details. Summary of main capabilities:
• Structure learning of Bayesian networks using various hill climbing (K2, B, etc.) and general purpose (simulated annealing, tabu search) algorithms.
• Local score metrics implemented: Bayes, BDe, MDL, entropy, AIC.
• Global score metrics implemented: leave-one-out cv, k-fold cv and cumulative cv.
• Conditional independence based causal recovery algorithm available.
• Parameter estimation using direct estimates and Bayesian model averaging.
• GUI for easy inspection of Bayesian networks.
• Part of Weka allowing systematic experiments to compare Bayes
A reconstruction algorithm for the essential graph
, 2008
Cited by 2 (2 self)
A standard graphical representative of a Bayesian network structure is a special chain graph, known as an essential graph. An alternative algebraic approach to the mathematical description of this statistical model uses instead a certain integer-valued vector, known as a standard imset. We give a direct formula for the translation of any chain graph describing a Bayesian network structure into the standard imset. Moreover, we present a two-stage algorithm which makes it possible to reconstruct the essential graph on the basis of the standard imset. The core of this paper is the proof of the correctness of the algorithm.
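To make the notion of a standard imset concrete, here is a minimal sketch for the DAG case only. It assumes the usual definition u_G = δ_N − δ_∅ + Σ_i (δ_pa(i) − δ_({i} ∪ pa(i))), where δ_A is the indicator of the set A. The paper's formula for chain graphs and its reconstruction algorithm are not reproduced here, and the `parents` mapping and function name are illustrative assumptions.

```python
from collections import Counter


def standard_imset(parents, nodes):
    """Standard imset of a DAG as a sparse dict {frozenset: integer}.

    `parents` maps a node to an iterable of its parents (assumed format);
    `nodes` lists all variables. Zero entries are dropped.
    """
    u = Counter()
    u[frozenset(nodes)] += 1   # + delta_N
    u[frozenset()] -= 1        # - delta_emptyset
    for v in nodes:
        pa = frozenset(parents.get(v, ()))
        u[pa] += 1             # + delta_pa(v)
        u[pa | {v}] -= 1       # - delta_({v} union pa(v))
    return {s: c for s, c in u.items() if c != 0}
```

Under this definition, Markov equivalent DAGs yield the same vector: the chains A -> B -> C and C -> B -> A give identical imsets, while the collider A -> C <- B gives a different one.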
Finding Optimal Bayesian Network Given a Super-Structure
Cited by 2 (0 self)
Classical approaches used to learn Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm significantly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and uses it to constrain the directed acyclic graphs (DAGs) considered during the search-and-score phase. Building on this, we formalize the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We develop a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree m of S. Empirically, complexity depends on the average degree $\tilde{m}$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches. Keywords: Bayesian networks, structure learning, optimal search, super-structure, connected subset
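The super-structure constraint described above (the skeleton of every candidate network must be a subgraph of S) can be sketched as follows. This illustrates the constraint only, not the COS algorithm; the edge-list and `parents` formats and all function names are assumptions made for the example.

```python
from itertools import combinations


def skeleton(parents):
    """Undirected edge set of a DAG given as {child: [parents]}."""
    return {frozenset((child, p)) for child, pa in parents.items() for p in pa}


def respects_super_structure(parents, super_edges):
    """True if the DAG's skeleton is a subgraph of the super-structure S."""
    allowed = {frozenset(e) for e in super_edges}
    return skeleton(parents) <= allowed


def candidate_parent_sets(node, super_edges):
    """Yield every parent set for `node` permitted by S.

    Only neighbours of `node` in S may be parents, so a node of degree d
    has 2**d candidate sets instead of 2**(n - 1).
    """
    neighbours = sorted(
        {v for e in super_edges if node in e for v in e if v != node}
    )
    for k in range(len(neighbours) + 1):
        for subset in combinations(neighbours, k):
            yield set(subset)
```

With S given as `[("A", "B"), ("B", "C")]`, the network A -> B -> C is admissible, whereas any network containing an edge between A and C is excluded before scoring.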
A geometric view on learning Bayesian network structures
, 2009
Cited by 1 (1 self)
We recall the basic idea of an algebraic approach to learning Bayesian network (BN) structures, namely to represent every BN structure by a certain (uniquely determined) vector, called a standard imset. The main result of the paper is that the set of standard imsets is the set of vertices (= extreme points) of a certain polytope. Motivated by the geometric view, we introduce the concept of the geometric neighborhood for standard imsets, and, consequently, for BN structures. Then we show that it always includes the inclusion neighborhood, which was introduced earlier in connection with the greedy equivalence search (GES) algorithm. The third result is that the global optimum of an affine function over the polytope coincides with the local optimum relative to the geometric neighborhood. To illustrate the new concept by an example, we describe the geometric neighborhood in the case of three variables and show that it differs from the inclusion neighborhood. This leads to a simple example of the failure of the GES algorithm if data are not “generated” from a perfectly Markovian distribution. The point is that one can avoid this failure if the search technique is based on the geometric neighborhood instead. We also determine the geometric neighborhood in the cases of four and five variables.
Reading Dependencies from Polytree-Like Bayesian Networks Revisited
Cited by 1 (1 self)
We present a graphical criterion for reading dependencies from the minimal directed independence map G of a graphoid p, under the assumption that G is a polytree and p satisfies weak transitivity. We prove that the criterion is sound and complete. We argue that assuming weak transitivity is not too restrictive.

