Supplement to “Learning high-dimensional directed acyclic graphs with latent and selection variables.” DOI: 10.1214/11-AOS940SUPP
, 2012
Consistent Feature Selection for Pattern Recognition in Polynomial Time
Abstract

Cited by 13 (2 self)
We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMAL-OPTIMAL) vs. finding all features relevant to the target variable (ALL-RELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of data distributions for which there exist consistent, polynomial-time algorithms. We also prove that ALL-RELEVANT is much harder than MINIMAL-OPTIMAL and propose two consistent, polynomial-time algorithms. We argue that the distribution classes considered are reasonable in many practical cases, so that our results simplify feature selection in a wide range of machine learning tasks.
Geometry of the faithfulness assumption in causal inference
 Annals of Statistics
Active learning of causal networks with intervention experiments and optimal designs
, 2008
Abstract

Cited by 11 (1 self)
Causal discovery from data is important for various scientific investigations. Because we cannot distinguish between the different directed acyclic graphs (DAGs) in a Markov equivalence class learned from observational data, we have to collect further information on causal structures from experiments with external interventions. In this paper, we propose an active learning approach for discovering causal structures in which we first find a Markov equivalence class from observational data and then orient the undirected edges in every chain component separately via intervention experiments. In the experiments, some variables are manipulated through external interventions. We discuss two kinds of intervention experiments: randomized experiments and quasi-experiments. Furthermore, we give two optimal designs of experiments, a batch-intervention design and a sequential-intervention design, to minimize the number of manipulated variables and the set of candidate structures based on the minimax and maximum entropy criteria. We show theoretically that structural learning can be done locally in subgraphs of chain components, without needing to check for illegal v-structures and cycles in the whole network, and that the Markov equivalence subclass obtained after each intervention can still be depicted as a chain graph.
High-dimensional sparse covariance estimation via directed acyclic graphs
, 2009
Abstract

Cited by 9 (1 self)
We present a graph-based technique for estimating sparse covariance matrices and their inverses from high-dimensional data. The method is based on learning a directed acyclic graph (DAG) and estimating the parameters of a multivariate Gaussian distribution based on that DAG. For inferring the underlying DAG we use the PC-algorithm [27], and for estimating the DAG-based covariance matrix and its inverse we use a Cholesky decomposition approach, which provides a positive (semi-)definite sparse estimate. We present a consistency result in the high-dimensional framework and compare our method with the Glasso [12, 8, 2] on simulated and real data.
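The covariance construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the DAG is assumed given as a parent list in topological order (e.g. from a structure-learning step such as the PC-algorithm, which is not implemented here), and `dag_covariance` is a hypothetical name. Each variable is regressed on its parents; the coefficient matrix B and residual variances D give the Cholesky-type factorization K = (I−B)ᵀ D⁻¹ (I−B) of the inverse covariance, which is positive definite and sparse.

```python
import numpy as np

def dag_covariance(X, parents):
    """Estimate a sparse covariance matrix and its inverse from data X
    (n x p) given a DAG as a parent list.  Variables are assumed to be
    in topological order, so B is strictly lower triangular."""
    n, p = X.shape
    B = np.zeros((p, p))   # regression coefficients of each node on its parents
    d = np.zeros(p)        # residual variances
    for j in range(p):
        pa = parents[j]
        if pa:
            # OLS regression of X_j on its parent columns
            coef, *_ = np.linalg.lstsq(X[:, pa], X[:, j], rcond=None)
            B[j, pa] = coef
            resid = X[:, j] - X[:, pa] @ coef
        else:
            resid = X[:, j]
        d[j] = resid.var()
    A = np.eye(p) - B
    Ainv = np.linalg.inv(A)
    sigma = Ainv @ np.diag(d) @ Ainv.T    # covariance estimate
    kappa = A.T @ np.diag(1.0 / d) @ A    # its inverse: positive definite, sparse
    return sigma, kappa
```

Zeros in `kappa` appear exactly where the DAG has neither an edge nor a common child, which is what makes the inverse estimate sparse.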
HIGH-DIMENSIONAL STRUCTURE ESTIMATION IN ISING MODELS: LOCAL SEPARATION CRITERION
, 2012
Abstract

Cited by 9 (0 self)
We consider the problem of high-dimensional Ising (graphical) model selection. We propose a simple algorithm for structure estimation based on thresholding the empirical conditional variation distances. We introduce a novel criterion for tractable graph families, where this method is efficient, based on the presence of sparse local separators between node pairs in the underlying graph. For such graphs, the proposed algorithm has a sample complexity of n = Ω(J_min^{-2} log p), where p is the number of variables and J_min is the minimum (absolute) edge potential in the model. We also establish non-asymptotic necessary and sufficient conditions for structure estimation.
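The thresholding idea can be illustrated on binary ±1 data. This is a simplified sketch, not the paper's algorithm: `ising_edges` and `cond_variation` are hypothetical names, separators are searched only up to size `eta`, and the variation distance is averaged over observed separator configurations. A non-edge (i, j) with a small separator S sees its conditional variation distance driven to zero when conditioning on S, while a true edge keeps a distance bounded below by the edge potential.

```python
import numpy as np
from itertools import combinations

def cond_variation(X, i, j, S):
    """Empirical variation distance between the conditional laws of X_i
    given X_j = +1 and X_j = -1, averaged over observed configurations
    of the separator X_S (weighted by their frequency)."""
    n = X.shape[0]
    groups = {}
    for t in range(n):
        groups.setdefault(tuple(X[t, S]), []).append(t)
    total = 0.0
    for rows in groups.values():
        sub = X[rows]
        pos = sub[sub[:, j] == 1]
        neg = sub[sub[:, j] == -1]
        if len(pos) == 0 or len(neg) == 0:
            continue  # distance undefined in this cell; skip it
        diff = abs((pos[:, i] == 1).mean() - (neg[:, i] == 1).mean())
        total += diff * len(rows) / n
    return total

def ising_edges(X, eta=1, tau=0.25):
    """Keep edge (i, j) iff the conditional variation distance stays
    above tau for EVERY candidate separator S with |S| <= eta."""
    n, p = X.shape
    edges = set()
    for i, j in combinations(range(p), 2):
        others = [k for k in range(p) if k != i and k != j]
        dmin = min(
            cond_variation(X, i, j, list(S))
            for size in range(eta + 1)
            for S in combinations(others, size)
        )
        if dmin > tau:
            edges.add((i, j))
    return edges
```

On a three-node chain, conditioning on the middle node annihilates the end-to-end dependence, so only the two true edges survive the threshold.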
Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs
, 2012
Abstract

Cited by 7 (2 self)
The investigation of directed acyclic graphs (DAGs) encoding the same Markov property, that is, the same conditional independence relations of multivariate observational distributions, has a long tradition; many algorithms exist for model selection and structure learning in Markov equivalence classes. In this paper, we extend the notion of Markov equivalence of DAGs to the case of interventional distributions arising from multiple intervention experiments. We show that, under reasonable assumptions on the intervention experiments, interventional Markov equivalence defines a finer partitioning of DAGs than observational Markov equivalence and hence improves the identifiability of causal models. We give a graph-theoretic criterion for two DAGs to be Markov equivalent under interventions and show that each interventional Markov equivalence class can, analogously to the observational case, be uniquely represented by a chain graph called the interventional essential graph (known as the CPDAG in the observational case). These are key insights for deriving a generalization of the Greedy Equivalence Search algorithm aimed at structure learning from interventional data. This new algorithm is evaluated in a simulation study.
Learning Gaussian graphical models of gene networks with false discovery rate control
 In 6th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics
Abstract

Cited by 7 (0 self)
In many cases, what matters is not whether a false discovery is made but the expected proportion of false discoveries among all the discoveries made, i.e. the so-called false discovery rate (FDR). We present an algorithm that aims to control the FDR of edges when learning Gaussian graphical models (GGMs). The algorithm is particularly suitable when dealing with more nodes than samples, e.g. when learning GGMs of gene networks from gene expression data. We illustrate this on the Rosetta compendium [8].
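A simplified illustration of FDR-controlled edge selection, not the cited algorithm (which handles the p > n regime via conditional-independence tests): here full-order partial correlations are computed from the inverse sample covariance, converted to p-values with Fisher's z-transform, and screened with the Benjamini-Hochberg step-up procedure. `ggm_edges_fdr` is a hypothetical name, and this version requires n > p so the sample covariance is invertible.

```python
import numpy as np
from math import sqrt, erfc

def ggm_edges_fdr(X, q=0.05):
    """Select GGM edges from data X (n x p, n > p) while controlling
    the FDR of edges at level q via Benjamini-Hochberg."""
    n, p = X.shape
    K = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(K))
    pcorr = -K / np.outer(d, d)                 # full-order partial correlations
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    pvals = []
    for i, j in pairs:
        r = pcorr[i, j]
        z = 0.5 * np.log((1 + r) / (1 - r))     # Fisher z-transform
        se = 1.0 / sqrt(n - p - 1)              # std. error with p-2 controls
        pvals.append(erfc(abs(z) / (se * sqrt(2))))  # two-sided normal p-value
    # Benjamini-Hochberg step-up: reject the k smallest p-values, where k
    # is the largest rank with p_(rank) <= q * rank / m
    order = np.argsort(pvals)
    m = len(pairs)
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= q * rank / m:
            k = rank
    return {pairs[idx] for idx in order[:k]}
```

Controlling the FDR rather than the family-wise error is what keeps the procedure usable when the number of candidate edges grows quadratically in the number of genes.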