Results 1–10 of 41
Optimal Structure Identification with Greedy Search
2002
"... In this paper we prove the socalled "Meek Conjecture". In particular, we show that if a is an independence map of another DAG then there exists a finite sequence of edge additions and covered edge reversals in such that (1) after each edge modification and (2) after all mod ..."
Abstract

Cited by 173 (1 self)
 Add to MetaCart
In this paper we prove the socalled "Meek Conjecture". In particular, we show that if a is an independence map of another DAG then there exists a finite sequence of edge additions and covered edge reversals in such that (1) after each edge modification and (2) after all modifications H.
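The edge operation at the heart of this result has a simple local characterization: an edge x → y is covered when y's parents are exactly x's parents plus x itself, and reversing a covered edge yields a Markov-equivalent DAG. A minimal sketch of that test, assuming a dict-of-parent-sets graph representation chosen for illustration:

```python
# Minimal sketch of the "covered edge" test used in covered edge reversals:
# an edge x -> y is covered iff Pa(y) = Pa(x) ∪ {x}. The representation
# (dict mapping node -> set of parents) is an assumption for illustration.

def is_covered(parents, x, y):
    """Return True iff the edge x -> y is covered in the DAG."""
    assert x in parents[y], "x -> y must be an edge of the DAG"
    return parents[y] == parents[x] | {x}

# Toy DAG: A -> B, A -> C, B -> C.
parents = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
print(is_covered(parents, "B", "C"))  # True: Pa(C) = {A, B} = Pa(B) ∪ {B}
print(is_covered(parents, "A", "C"))  # False: Pa(A) ∪ {A} = {A} != Pa(C)
```

Reversing B → C here keeps the same skeleton and v-structures, which is why covered edge reversals can walk between equivalent DAGs.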
Learning factor graphs in polynomial time and sample complexity
JMLR, 2006
"... We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree can be learned in polynomial time and from a polynomial number of training examples, assuming that the data is generated ..."
Abstract

Cited by 48 (0 self)
 Add to MetaCart
We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree can be learned in polynomial time and from a polynomial number of training examples, assuming that the data is generated by a network in this class. This result covers both parameter estimation for a known network structure and structure learning. It implies as a corollary that we can learn factor graphs for both Bayesian networks and Markov networks of bounded degree, in polynomial time and sample complexity. Importantly, unlike standard maximum likelihood estimation algorithms, our method does not require inference in the underlying network, and so applies to networks where inference is intractable. We also show that the error of our learned model degrades gracefully when the generating distribution is not a member of the target class of networks. In addition to our main result, we show that the sample complexity of parameter learning in graphical models has an O(1) dependence on the number of variables in the model when using the KLdivergence normalized by the number of variables as the performance criterion.
Exact Bayesian structure learning from uncertain interventions
In AI & Statistics, 2007
"... We show how to apply the dynamic programming algorithm of Koivisto and Sood [KS04, Koi06], which computes the exact posterior marginal edge probabilities p(Gij = 1D) of a DAG G given data D, to the case where the data is obtained by interventions (experiments). In particular, we consider the case w ..."
Abstract

Cited by 27 (5 self)
 Add to MetaCart
(Show Context)
We show how to apply the dynamic programming algorithm of Koivisto and Sood [KS04, Koi06], which computes the exact posterior marginal edge probabilities p(Gij = 1D) of a DAG G given data D, to the case where the data is obtained by interventions (experiments). In particular, we consider the case where the targets of the interventions are a priori unknown. We show that it is possible to learn the targets of intervention at the same time as learning the causal structure. We apply our exact technique to a biological data set that had previously been analyzed using MCMC [SPP + 05, EW06, WGH06]. 1
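The quantity this algorithm computes, the exact marginal probability of each edge under the posterior over DAGs, can be illustrated by brute force on a tiny node set. The sketch below enumerates every DAG over three nodes and averages a uniform weight per DAG; the uniform weight is an assumption standing in for a real data-dependent marginal likelihood, and the point of the Koivisto–Sood DP is precisely to avoid this exponential enumeration by summing over orders:

```python
# Brute-force sketch of exact marginal edge probabilities p(Gij = 1 | D):
# enumerate all DAGs over 3 nodes and average per-DAG weights. A uniform
# weight (an assumption, in place of a marginal likelihood) gives the prior
# probability of each edge under a uniform distribution over DAGs.
from itertools import product

nodes = (0, 1, 2)
all_edges = [(i, j) for i in nodes for j in nodes if i != j]

def is_acyclic(edge_set):
    # Repeatedly peel off nodes with no remaining parents (Kahn-style check).
    active = set(nodes)
    while active:
        free = [v for v in active
                if not any(i in active and j == v for (i, j) in edge_set)]
        if not free:
            return False
        active.difference_update(free)
    return True

graphs = (frozenset(e for e, keep in zip(all_edges, mask) if keep)
          for mask in product((0, 1), repeat=len(all_edges)))
dags = [g for g in graphs if is_acyclic(g)]

p_edge = sum((0, 1) in g for g in dags) / len(dags)
print(len(dags), p_edge)  # 25 DAGs over 3 labeled nodes; p(0 -> 1) = 8/25
```

Even at 5 nodes there are already 29,281 DAGs, which is why the O(n 2^n) dynamic program over orders matters.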
On Local Optima in Learning Bayesian Networks
2003
"... This paper proposes and evaluates the kgreedy equivalence search algorithm (KES) for learning Bayesian networks (BNs) from complete data. The main characteristic of KES is that it allows a tradeoff between greediness and randomness, thus exploring different good local optima when run repeatedly. W ..."
Abstract

Cited by 17 (4 self)
 Add to MetaCart
This paper proposes and evaluates the kgreedy equivalence search algorithm (KES) for learning Bayesian networks (BNs) from complete data. The main characteristic of KES is that it allows a tradeoff between greediness and randomness, thus exploring different good local optima when run repeatedly. When
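One common reading of the k-greedy idea is: among the score-improving neighbors of the current state, sample a fraction k and move to the best of the sample, so k = 1 recovers pure greedy search while smaller k injects the randomness that lets repeated runs reach different local optima. A hedged sketch of that selection rule on a toy state space (bit vectors with a hypothetical score, standing in for the space of BN equivalence classes):

```python
# Hedged sketch of a k-greedy selection step in the spirit of KES: sample a
# fraction k of the score-improving neighbors and move to the best of the
# sample. The state space and score here are toy assumptions, not the BN
# equivalence-class search space used in the paper.
import random

def k_greedy_step(state, neighbors_fn, score_fn, k, rng):
    current = score_fn(state)
    improving = [s for s in neighbors_fn(state) if score_fn(s) > current]
    if not improving:
        return None  # local optimum reached
    sample_size = max(1, int(k * len(improving)))
    sample = rng.sample(improving, sample_size)
    return max(sample, key=score_fn)

# Toy problem: maximize the number of ones in a bit vector.
def neighbors(bits):
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]

rng = random.Random(0)
state = (0, 0, 0, 0)
while True:
    nxt = k_greedy_step(state, neighbors, sum, k=0.5, rng=rng)
    if nxt is None:
        break
    state = nxt
print(state)  # (1, 1, 1, 1): the unique optimum of this toy score
```

Restarting the loop with different seeds is what explores different local optima when the score landscape is multimodal.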
Bayesian structure learning using dynamic programming and MCMC
In UAI, 2007
"... We show how to significantly speed up MCMC sampling of DAG structures by using a powerful nonlocal proposal based on Koivisto’s dynamic programming (DP) algorithm (11; 10), which computes the exact marginal posterior edge probabilities by analytically summing over orders. Furthermore, we show how s ..."
Abstract

Cited by 14 (1 self)
 Add to MetaCart
We show how to significantly speed up MCMC sampling of DAG structures by using a powerful nonlocal proposal based on Koivisto’s dynamic programming (DP) algorithm (11; 10), which computes the exact marginal posterior edge probabilities by analytically summing over orders. Furthermore, we show how sampling in DAG space can avoid subtle biases that are introduced by approaches that work only with orders, such as Koivisto’s DP algorithm and MCMC order samplers (6; 5). 1
Consistent Feature Selection for Pattern Recognition in Polynomial Time
"... We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMALOPTIMAL) vs. finding all features relevant to the target variable (ALLRELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene exp ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
(Show Context)
We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMALOPTIMAL) vs. finding all features relevant to the target variable (ALLRELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of data distributions for which there exist consistent, polynomialtime algorithms. We also prove that ALLRELEVANT is much harder than MINIMALOPTIMAL and propose two consistent, polynomialtime algorithms. We argue that the distribution classes considered are reasonable in many practical cases, so that our results simplify feature selection in a wide range of machine learning tasks.
Parent assignment is hard for the MDL, AIC, and NML costs
In Proceedings of the 19th Annual Conference on Learning Theory (COLT 2006), Lecture Notes in Computer Science 4005, 2006
"... Abstract. Several hardness results are presented for the parent assignment problem: Given m observations of n attributes x1; : : : ; xn, nd the best parents for xn, that is, a subset of the preceding attributes so as to minimize a xed cost function. This attribute or feature selection task plays a ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
(Show Context)
Abstract. Several hardness results are presented for the parent assignment problem: Given m observations of n attributes x1; : : : ; xn, nd the best parents for xn, that is, a subset of the preceding attributes so as to minimize a xed cost function. This attribute or feature selection task plays an important role, e.g., in structure learning in Bayesian networks, yet little is known about its computational complexity. In this paper we prove that, under the commonly adopted fullmultinomial likelihood model, the MDL, BIC, or AIC cost cannot be approximated in polynomial time to a ratio less than 2 unless there exists a polynomialtime algorithm for determining whether a directed graph with n nodes has a dominating set of size log n, a LOGSNPcomplete problem for which no polynomialtime algorithm is known; as we also show, it is unlikely that these penalized maximum likelihood costs can be approximated to within any constant ratio. For the NML (normalized maximum likelihood) cost we prove an NPcompleteness result. These results both justify the application of existing methods and motivate research on heuristic and superpolynomialtime algorithms. 1
Join Bayes Nets: A New Type of Bayes Net for Relational Data
"... Many realworld data are maintained in relational format, with different tables storing information about entities and their links or relationships. The structure (schema) of the database is essentially that of a logical language, with variables ranging over individual entities and predicates for re ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
(Show Context)
Many realworld data are maintained in relational format, with different tables storing information about entities and their links or relationships. The structure (schema) of the database is essentially that of a logical language, with variables ranging over individual entities and predicates for relationships and attributes. Our work combines the graphical structure of Bayes nets with the logical structure of relational databases to achieve knowledge discovery in databases. We introduce Join Bayes nets, a new type of Bayes nets for representing and learning classlevel dependencies between attributes from the same table and from different tables; such dependencies are important for policy making and strategic planning. Focusing on classlevel dependencies brings advantages in terms of the simplicity of the model and the tractability of inference and learning. As usual with Bayes nets, the graphical structure supports efficient inference and reasoning. We show that applying standard Bayes net inference algorithms to the learned models provides fast and accurate probability estimates for queries that involve attributes and relationships from multiple tables. 1