Results 1-10 of 12
The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs
Cited by 41 (11 self)
Recent methods for estimating sparse undirected graphs for real-valued data in high dimensional problems rely heavily on the assumption of normality. We show how to use a semiparametric Gaussian copula, or “nonparanormal,” for high dimensional inference. Just as additive models extend linear models by replacing linear functions with a set of one-dimensional smooth functions, the nonparanormal extends the normal by transforming the variables by smooth functions. We derive a method for estimating the nonparanormal, study the method’s theoretical properties, and show that it works well in many examples.
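The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea of transforming each variable by a smooth monotone function so that it looks Gaussian. It uses a plain rank-based normal-score transform, which is an assumption made for illustration and not necessarily the paper's estimator; the function name `normal_scores` is hypothetical.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map one variable to approximate standard-normal scores via its ranks.

    Illustrative assumption: a plain normal-score transform, not the
    estimator derived in the paper.
    """
    n = len(values)
    # Rank each observation (1 = smallest). Ties are broken by position,
    # which is a simplification.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()
    # Dividing by n + 1 keeps the empirical CDF strictly inside (0, 1),
    # so the inverse normal CDF stays finite.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# A heavily skewed sample: the outlier 40.0 is pulled back to a moderate score.
skewed = [0.1, 0.5, 1.2, 3.0, 40.0]
scores = normal_scores(skewed)
```

Applying such a transform to each column separately and then fitting a Gaussian graphical model to the transformed data is one way to read the "transform, then treat as normal" idea; the choice of `n + 1` in the denominator is a convenience, not something the abstract specifies.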
Estimation of Gaussian graphs by model selection
Electron. J. Stat., 2008
Cited by 9 (3 self)
We investigate in this paper the estimation of Gaussian graphs by model selection from a nonasymptotic point of view. We start from an n-sample of a Gaussian law P_C in R^p and focus on the disadvantageous case where n is smaller than p. To estimate the graph of conditional dependences of P_C, we introduce a collection of candidate graphs and then select one of them by minimizing a penalized empirical risk. Our main result assesses the performance of the procedure in a nonasymptotic setting. We pay special attention to the maximal degree D of the graphs that we can handle, which turns out to be roughly n/(2 log p).
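To give a feel for the degree bound quoted in this abstract, here is a small worked example. The sample size and dimension are made-up values chosen for illustration, the helper name `approx_max_degree` is hypothetical, and the natural logarithm is assumed.

```python
import math


def approx_max_degree(n, p):
    """Rough maximal graph degree n / (2 log p) quoted in the abstract.

    Assumes the natural logarithm; n is the sample size and p the
    number of variables.
    """
    return n / (2 * math.log(p))


# Hypothetical setting with fewer observations than variables (n < p).
bound = approx_max_degree(n=50, p=200)  # about 4.7
```

Under these assumed values, only graphs whose nodes have at most four or so neighbours fall within the regime the abstract describes.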
The hidden life of latent variables: Bayesian learning with mixed graph models
2008
Cited by 7 (3 self)
Directed acyclic graphs (DAGs) have been widely used as a representation of conditional independence in machine learning and statistics. Moreover, hidden or latent variables are often an important component of graphical models. However, DAG models suffer from an important limitation: the family of DAGs is not closed under marginalization of hidden variables. This means that in general we cannot use a DAG to represent the independencies over a subset of variables in a larger DAG. Directed mixed graphs (DMGs) are a representation that includes DAGs as a special case, and overcomes this limitation. This paper introduces algorithms for performing Bayesian inference in Gaussian and probit DMG models. An important requirement for inference is the characterization of the distribution over parameters of the models. We introduce a new distribution for covariance matrices of Gaussian DMGs. We discuss and illustrate how several Bayesian machine learning tasks can benefit from the principle presented here: the power to model dependencies that are generated from hidden variables, but without necessarily modelling such variables explicitly.
Assessing the validity domains of graphical Gaussian models in order to infer relationships among components of complex biological systems
2008
Cited by 5 (0 self)
The study of the interactions of cellular components is an essential first step toward understanding the structure and dynamics of biological networks, and various methods have recently been developed for this purpose. While most of them combine different types of data and a priori knowledge, methods based on Graphical Gaussian Models are capable of learning the network directly from raw data. They use full-order partial correlations, which are partial correlations between two variables given all the remaining ones, to model direct links between variables. Statistical methods were developed for estimating these links when the number of observations is larger than the number of variables. However, the rapid advance of new technologies that allow the simultaneous measurement of genome expression has led to large-scale datasets where the number of variables is far larger than the number of observations. To get around this dimensionality problem, different strategies and new statistical methods were proposed. In this study we focus on recently published statistical methods. All are based on the assumption that the number of direct relationships between variables is very small relative to the number of possible relationships, p(p-1)/2. In the biological context, this assumption is not always satisfied over the whole graph, so it is essential to know precisely how the methods behave with respect to the characteristics of the studied object before applying them. For this purpose, we evaluated the validity domain of each method on wide-ranging simulated datasets. We then illustrated our results using recently published biological data.
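The full-order partial correlations mentioned in this abstract can be read off the inverse covariance (precision) matrix using the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). The sketch below illustrates that identity on a made-up 3-variable covariance matrix; the helper names are hypothetical, and it only applies when the covariance matrix is invertible, which is the "more observations than variables" case, not the high-dimensional methods the study compares.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Full-order partial correlations from a covariance matrix."""
    prec = invert(cov)
    n = len(prec)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)]
            for i in range(n)]


# Made-up chain structure X1 - X2 - X3: X1 and X3 are correlated (0.25),
# but only through X2, so their partial correlation should vanish.
cov = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.5],
       [0.25, 0.5, 1.0]]
pcor = partial_correlations(cov)
```

In this example the marginal correlation between the first and third variables is nonzero while their partial correlation is zero, which is why partial correlations, not plain correlations, are used to represent direct links.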
Markov Properties for Linear Causal Models with Correlated Errors
Cited by 2 (0 self)
A linear causal model with correlated errors, represented by a DAG with bidirected edges, can be tested by the set of conditional independence relations implied by the model. A global Markov property specifies, by the d-separation criterion, the set of all conditional independence relations holding in any model associated with a graph. A local Markov property specifies a much smaller set of conditional independence relations which will imply all other conditional independence relations which hold under the global Markov property. For DAGs with bidirected edges associated with arbitrary probability distributions, a local Markov property is given in Richardson (2003) which may invoke an exponential number of conditional independencies. In this paper, we show that for a class of linear structural equation models with correlated errors the local Markov property will invoke only a linear number of conditional independence relations. For general linear models, we provide a local Markov property that often invokes far fewer conditional independencies than that in Richardson (2003). The results have applications in testing linear structural equation models with correlated errors.
Transelliptical graphical models
In: Advances in Neural Information Processing Systems
Cited by 1 (1 self)
We advocate the use of a new distribution family, the transelliptical, for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work and the transelliptical component analysis proposed by Han and Liu (2012).
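The abstract does not name the rank statistic behind its estimator. As a hedged illustration of what a rank-based correlation estimate can look like, the sketch below uses Kendall's tau with the sine transform sin(pi/2 * tau), a common choice for elliptical-type models; treating this as the statistic is an assumption, the function names are hypothetical, and the regularization step is omitted entirely.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau (tau-a): concordant minus discordant pairs over all pairs."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                total += 1      # concordant pair
            elif sign < 0:
                total -= 1      # discordant pair
    return total / (n * (n - 1) / 2)


def rank_based_correlation(x, y):
    """Correlation estimate from ranks only (assumed sine transform of tau)."""
    return math.sin(math.pi / 2 * kendall_tau(x, y))


# The estimate depends only on ranks, so a monotone transformation of a
# variable (here exp) leaves it unchanged.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(v) for v in x]
r_monotone = rank_based_correlation(x, y)

# One swapped pair out of six: tau = 4/6, estimate = sin(pi/3).
r_swapped = rank_based_correlation([1, 2, 3, 4], [1, 3, 2, 4])
```

Because only the ordering of the observations enters the computation, unknown monotone transformations of the variables do not affect the estimate, which matches the "transform the variables using univariate functions" description above.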
Estimation of Gaussian graphs by model selection (DOI: 10.1214/08-EJS228)
, 710
We investigate in this paper the estimation of Gaussian graphs by model selection from a nonasymptotic point of view. We start from an n-sample of a Gaussian law P_C in R^p and focus on the disadvantageous case where n is smaller than p. To estimate the graph of conditional dependences of P_C, we introduce a collection of candidate graphs and then select one of them by minimizing a penalized empirical risk. Our main result assesses the performance of the procedure in a nonasymptotic setting. We pay special attention to the maximal degree D of the graphs that we can handle, which turns out to be roughly n/(2 log p).
Graphical Model Selection with Applications to Biological Networks
2009
"... Multiple Testing Procedures for ..."
Thème COG
2008
Research report, ISSN 0249-6399, ISRN INRIA/RR--6354--FR+ENG. Goodness-of-fit tests for high-dimensional Gaussian linear models
Multivariate detection of gene-gene interactions
2011
Unraveling the nature of genetic interactions is crucial to obtaining a more complete picture of complex diseases. It is thought that gene-gene interactions play an important role in the etiology of cancer, cardiovascular and immune-mediated disease. Interactions among genes are defined as phenotypic effects that differ from those observed for independent contributions of each gene, usually detected by univariate logistic regression methods. Using a multivariate extension of linkage disequilibrium, we have developed a novel method, based on distances between sample covariance matrices for groups of SNPs, to test for gene-gene interactions associated with a disease phenotype. Since a disease-associated interacting locus will often be in linkage disequilibrium with more than one marker in the region, a method that examines a set of markers in a region collectively can offer greater power than traditional methods. Our method effectively identifies interaction effects in simulated data, as well as in data on the genetic contributions to the risk for graft-versus-host disease following hematopoietic cell transplantation.