Results 1-10 of 21
Sparse Permutation Invariant Covariance Estimation
Electronic Journal of Statistics, 2008
Cited by 83 (5 self)
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension p and sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. The estimator is required to be positive definite, but we avoid having to use semidefinite programming by reparameterizing the objective function
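For orientation, a lasso-penalized normal likelihood for a concentration matrix is commonly written in the form below. This is a sketch of the standard formulation, not a quotation from the paper: the symbols $\hat{\Sigma}$ (sample covariance), $\Omega = (\omega_{ij})$ (candidate concentration matrix) and $\lambda$ (tuning parameter) are notation assumed here, and penalizing only the off-diagonal entries is likewise an assumption.

```latex
\hat{\Omega}_{\lambda}
  = \arg\min_{\Omega \succ 0}
    \Big\{ \operatorname{tr}\big(\Omega \hat{\Sigma}\big)
           - \log\det \Omega
           + \lambda \sum_{i \neq j} \lvert \omega_{ij} \rvert \Big\}
```

The first two terms are, up to constants, the negative Gaussian log-likelihood; the last term is the lasso-type penalty that drives entries of $\Omega$ to zero.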
Binary models for marginal independence
Journal of the Royal Statistical Society Series B, 2005
Cited by 16 (2 self)
A number of authors have considered multivariate Gaussian models for marginal independence. In this paper we develop models for binary data with the same independence structure. The models can be parameterized based on Möbius inversion and maximum likelihood estimation can be performed using a version of the Iterated Conditional Fitting algorithm. The approach is illustrated on a simple example. Relations to multivariate logistic and dependence ratio models are discussed.
Graphical methods for efficient likelihood inference in Gaussian covariance models
Journal of Machine Learning, 2008
Cited by 8 (3 self)
In graphical modelling, a bidirected graph encodes marginal independences among random variables that are identified with the vertices of the graph. We show how to transform a bidirected graph into a maximal ancestral graph that (i) represents the same independence structure as the original bidirected graph, and (ii) minimizes the number of arrowheads among all ancestral graphs satisfying (i). Here the number of arrowheads of an ancestral graph is the number of directed edges plus twice the number of bidirected edges. In Gaussian models, this construction can be used for more efficient iterative maximization of the likelihood function and to determine when maximum likelihood estimates are equal to empirical counterparts.
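The arrowhead count defined in this abstract can be illustrated with a minimal Python sketch. The function name `count_arrowheads` and the edge-list representation are hypothetical choices made here for illustration; they are not taken from the paper.

```python
def count_arrowheads(directed_edges, bidirected_edges):
    """Arrowheads = (number of directed edges) + 2 * (number of bidirected edges).

    Both arguments are lists of (u, v) vertex pairs; this representation is an
    assumption of the sketch, not something specified by the abstract.
    """
    return len(directed_edges) + 2 * len(bidirected_edges)


# Bidirected chain a <-> b <-> c: two bidirected edges.
bidirected_chain = count_arrowheads([], [("a", "b"), ("b", "c")])

# Collider a -> b <- c: two directed edges, no bidirected edges.
collider = count_arrowheads([("a", "b"), ("c", "b")], [])

print(bidirected_chain, collider)
```

Both small graphs leave a and c marginally independent, so the collider is an example of a graph with fewer arrowheads for the same independence statement.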
Analysis of Gene Sets Based on the Underlying Regulatory Network
Cited by 7 (5 self)
Networks are often used to represent the interactions among genes and proteins. These interactions are known to play an important role in vital cell functions and should be included in the analysis of genes that are differentially expressed. Methods of gene set analysis take advantage of external biological information and analyze a priori defined sets of genes. These methods can potentially preserve the correlation among genes; however, they do not directly incorporate the information about the gene network. In this paper, we propose a latent variable model that directly incorporates the network information. We then use the theory of mixed linear models to present a general inference framework for the problem of testing the significance of subnetworks. Several possible test procedures are discussed and a network-based method for testing the changes in expression levels of genes as well as the structure of the network is presented. The performance of the proposed method is compared with methods of gene set analysis using both simulation studies and real data on genes related to the galactose utilization pathway in yeast.
Clique Matrices for Statistical Graph Decomposition and Parameterising Restricted Positive Definite Matrices
Cited by 3 (2 self)
We introduce Clique Matrices as an alternative representation of undirected graphs, being a generalisation of the incidence matrix representation. Here we use clique matrices to decompose a graph into a set of possibly overlapping clusters, defined as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. Inference is carried out using a variational approximation. Clique matrices also play a natural role in parameterising positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive definite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.
High dimensional sparse covariance estimation via directed acyclic graphs
2009
Cited by 3 (1 self)
We present a graph-based technique for estimating sparse covariance matrices and their inverses from high-dimensional data. The method is based on learning a directed acyclic graph (DAG) and estimating parameters of a multivariate Gaussian distribution based on a DAG. For inferring the underlying DAG we use the PC-algorithm [27] and for estimating the DAG-based covariance matrix and its inverse, we use a Cholesky decomposition approach which provides a positive (semi)definite sparse estimate. We present a consistency result in the high-dimensional framework and we compare our method with the Glasso [12, 8, 2] for simulated and real data.
Covariance Estimation: The GLM and Regularization Perspectives
Cited by 2 (0 self)
Finding an unconstrained and statistically interpretable reparameterization of a covariance matrix is still an open problem in statistics. Its solution is of central importance in covariance estimation, particularly in the recent high-dimensional data environment where enforcing the positive-definiteness constraint could be computationally expensive. We provide a survey of the progress made in modeling covariance matrices from the perspectives of generalized linear models (GLM) or parsimony and use of covariates in low dimensions, regularization (shrinkage, sparsity) for high-dimensional data, and the role of various matrix factorizations. A viable and emerging regression-based setup which is suitable for both the GLM and the regularization approaches is to link a covariance matrix, its inverse or their factors to certain regression models and then solve the relevant (penalized) least squares problems. We point out several instances of this regression-based setup in the literature. A notable case is in the Gaussian graphical models where linear regressions with LASSO penalty are used to estimate the neighborhood of one node at a time (Meinshausen and Bühlmann, 2006). Some advantages
Identifying Graph Clusters using Variational Inference and links to
Cited by 1 (0 self)
Finding clusters of well-connected nodes in a graph is useful in many domains, including Social Network, Web and molecular interaction analyses. From a computational viewpoint, finding these clusters or graph communities is a difficult problem. We consider the framework of Clique Matrices to decompose a graph into a set of possibly overlapping clusters, defined as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. The formal intractability of inferring the clusters is addressed using a variational approximation which has links to mean-field theories in statistical mechanics. Clique matrices also play a natural role in parameterising positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive definite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.