Sparsistency and rates of convergence in large covariance matrix estimation (2009)

by C. Lam, J. Fan
Venue: Ann. Statist.
Results 1 - 10 of 110 citing documents

Sure independence screening for ultra-high dimensional feature space

by Jianqing Fan, Jinchi Lv, 2006
"... Variable selection plays an important role in high dimensional statistical modeling which nowa-days appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, estimation accuracy and computational cost are two top concerns. In a recent paper, ..."
Abstract - Cited by 283 (26 self) - Add to MetaCart
Variable selection plays an important role in high dimensional statistical modeling, which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using L1 regularization and show that it achieves the ideal risk up to a logarithmic factor log p. Their innovative procedure and remarkable result are challenged when the dimensionality is ultra high, as the factor log p can be large and their uniform uncertainty principle can fail. Motivated by these concerns, we introduce the concept of sure screening and propose a sure screening method based on correlation learning, called Sure Independence Screening (SIS), to reduce dimensionality from high to a moderate scale that is below sample size. In a fairly general asymptotic framework, SIS is shown to have the sure screening property even for exponentially growing dimensionality. As a methodological extension, an iterative SIS (ISIS) is also proposed to enhance its finite sample performance. With dimension reduced accurately from high to below sample size, variable selection can be improved in both speed and accuracy, and can then be accomplished …
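
The screening step described above amounts to ranking predictors by the magnitude of their marginal correlation with the response and retaining a submodel of size below n. A minimal sketch of that componentwise step, assuming a numeric design matrix X and response y (the function name sis_screen and the fixed retained size d are our choices for illustration, not the authors'):

```python
import numpy as np

def sis_screen(X, y, d):
    """Keep the d columns of X with the largest absolute marginal correlation with y."""
    Xc = X - X.mean(axis=0)                      # center columns
    yc = y - y.mean()
    col_norms = np.linalg.norm(Xc, axis=0)
    corr = (Xc.T @ yc) / (col_norms * np.linalg.norm(yc) + 1e-12)
    return np.argsort(-np.abs(corr))[:d]         # indices of the top-d predictors
```

The iterative extension (ISIS) would alternate this screening with a penalized fit such as SCAD or the lasso on the retained variables; that loop is not shown here.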

A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers

by Sahand Negahban, Pradeep Ravikumar, Martin J. Wainwright, Bin Yu
"... ..."
Abstract - Cited by 218 (32 self) - Add to MetaCart
Abstract not found

Sparse Permutation Invariant Covariance Estimation

by Adam Rothman, Peter Bickel, Elizaveta Levina, Ji Zhu - Electronic Journal of Statistics, 2008
"... The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of con-vergence in the Fro ..."
Abstract - Cited by 164 (8 self) - Add to MetaCart
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both the data dimension p and the sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. The estimator is required to be positive definite, but we avoid having to use semi-definite programming by re-parameterizing the objective function.
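
Up to constants, the penalized normal likelihood described above is an ℓ1-penalized log-determinant program over concentration matrices. A sketch of the objective, with the lasso-type penalty placed on the off-diagonal entries as the abstract indicates (the symbols λ and ω_jk are our notation):

```latex
\hat{\Omega}_{\lambda} \;=\; \arg\min_{\Omega \succ 0}
  \Bigl\{ \operatorname{tr}\bigl(\hat{\Sigma}\,\Omega\bigr) \;-\; \log\det\Omega
  \;+\; \lambda \sum_{j \neq k} |\omega_{jk}| \Bigr\}
```

Here \hat{\Sigma} is the sample covariance matrix; the correlation-based variant mentioned above would replace it with the sample correlation matrix.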

Covariance regularization by thresholding

by Peter J. Bickel, Elizaveta Levina, 2007
"... This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Ga ..."
Abstract - Cited by 148 (11 self) - Add to MetaCart
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
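
Hard thresholding itself is a one-line operation on the sample covariance. A minimal sketch, with the convention (ours) of leaving the diagonal variances untouched; the theory above suggests thresholds of order sqrt(log(p)/n), with the constant chosen in practice by the resampling scheme the paper discusses:

```python
import numpy as np

def hard_threshold_cov(S, t):
    """Zero every off-diagonal entry of the covariance estimate S with |entry| < t."""
    T = np.where(np.abs(S) >= t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))   # keep the variances as estimated
    return T
```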

Citation Context

...computationally intensive. A faster algorithm that employs the lasso was proposed by Friedman et al. [16]. This approach has also been extended to more general penalties like SCAD [15] by Lam and Fan [25] and Fan et al. [14]. In specific applications, there have been other permutation-invariant approaches that use different notions of sparsity: Zou et al. [35] apply the lasso penalty to loadings in PC...

High dimensional covariance matrix estimation using a factor model

by Jianqing Fan, Yuan Liao, Martina Mincheva - Princeton Univ., 2008
"... ar ..."
Abstract - Cited by 99 (13 self) - Add to MetaCart
Abstract not found

Optimal rates of convergence for covariance matrix estimation

by T. Tony Cai, Cun-Hui Zhang, Harrison H. Zhou - Ann. Statist., 2010
"... Covariance matrix plays a central role in multivariate statistical analysis. Significant advances have been made recently on developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet been developed. In this paper we establish the optimal rates ..."
Abstract - Cited by 88 (19 self) - Add to MetaCart
The covariance matrix plays a central role in multivariate statistical analysis. Significant advances have been made recently on developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet to be developed. In this paper we establish the optimal rates of convergence for estimating the covariance matrix under both the operator norm and the Frobenius norm. It is shown that optimal procedures under the two norms are different and consequently matrix estimation under the operator norm is fundamentally different from vector estimation. The minimax upper bound is obtained by constructing a special class of tapering estimators and by studying their risk properties. A key step in obtaining the optimal rate of convergence is the derivation of the minimax lower bound. The technical analysis requires new ideas that are quite different from those used in the more conventional function/sequence estimation problems.
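
A tapering estimator multiplies the sample covariance entrywise by weights that decay with the distance |i - j| from the diagonal. A sketch with one common trapezoidal weight scheme, full weight within k/2 of the diagonal and linear decay to zero at distance k (the exact weights used for the minimax upper bound may differ; this is an illustrative assumption):

```python
import numpy as np

def taper_cov(S, k):
    """Entrywise taper of a p x p covariance estimate S with bandwidth k."""
    p = S.shape[0]
    dist = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    w = np.clip((k - dist) / (k / 2.0), 0.0, 1.0)   # 1 if dist <= k/2, 0 if dist >= k
    return w * S
```

Banding is the special case in which the weights are 0/1.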

Latent Variable Graphical Model Selection via Convex Optimization

by Venkat Chandrasekaran, Pablo A. Parrilo, Alan S. Willsky , 2010
"... Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistic ..."
Abstract - Cited by 76 (4 self) - Add to MetaCart
Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latent-variable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out” over most of the observed variables. Next we propose a tractable convex program based on regularized maximum-likelihood for model selection in this latent-variable setting; the regularizer uses both the ℓ1 norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables. These results are applicable in the high-dimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of low-rank matrices play an important role in our analysis.
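
In the convex program sketched above, the concentration matrix of the observed variables is modeled as a sparse matrix S (the conditional graphical model) minus a low-rank matrix L (the effect of marginalizing out the latent variables). With λ_n and γ as our notation for the regularization parameters, the regularized maximum-likelihood problem takes the form:

```latex
(\hat{S}, \hat{L}) \;=\; \arg\min_{S - L \succ 0,\; L \succeq 0}
  \Bigl\{ \operatorname{tr}\bigl((S - L)\,\hat{\Sigma}\bigr) \;-\; \log\det(S - L)
  \;+\; \lambda_n \bigl( \gamma \|S\|_1 + \operatorname{tr}(L) \bigr) \Bigr\}
```

where \|S\|_1 is the elementwise ℓ1 norm and, since L ⪰ 0, tr(L) equals its nuclear norm.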

High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence

by Pradeep Ravikumar, Martin J. Wainwright, Garvesh Raskutti, Bin Yu , 2008
"... ..."
Abstract - Cited by 67 (15 self) - Add to MetaCart
Abstract not found

Generalized thresholding of large covariance matrices,

by Adam J. Rothman, Elizaveta Levina, Ji Zhu - J. Amer. Statist. Assoc., 2009
"... We propose a new class of generalized thresholding operators that combine thresholding with shrinkage, and study generalized thresholding of the sample covariance matrix in high dimensions. Generalized thresholding of the covariance matrix has good theoretical properties and carries almost no compu ..."
Abstract - Cited by 64 (4 self) - Add to MetaCart
We propose a new class of generalized thresholding operators that combine thresholding with shrinkage, and study generalized thresholding of the sample covariance matrix in high dimensions. Generalized thresholding of the covariance matrix has good theoretical properties and carries almost no computational burden. We obtain an explicit convergence rate in the operator norm that shows the tradeoff between the sparsity of the true model, dimension, and the sample size, and shows that generalized thresholding is consistent over a large class of models as long as the dimension p and the sample size n satisfy (log p)/n → 0. In addition, we show that generalized thresholding has the “sparsistency” property, meaning it estimates true zeros as zeros with probability tending to 1, and, under an additional mild condition, is sign consistent for nonzero elements. We show that generalized thresholding covers, as special cases, hard and soft thresholding, smoothly clipped absolute deviation, and adaptive lasso, and compare different types of generalized thresholding in a simulation study and in an example of gene clustering from a microarray experiment with tumor tissues.
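
Soft thresholding is the simplest member of this class: shrink every entry toward zero by λ and set to zero those that cross it. A minimal sketch applied entrywise to a covariance estimate, with the diagonal left alone (a convention on our part):

```python
import numpy as np

def soft_threshold_cov(S, lam):
    """Apply s(x) = sign(x) * max(|x| - lam, 0) to the off-diagonal entries of S."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T
```

Hard thresholding, SCAD, and the adaptive lasso arise by swapping in other generalized thresholding rules satisfying the paper's conditions.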

Non-concave penalized likelihood with NP-dimensionality

by Jianqing Fan, Jinchi Lv - IEEE Trans. Inform. Theory, 2011
"... Abstract—Penalized likelihood methods are fundamental to ultrahigh dimensional variable selection. How high dimensionality such methods can handle remains largely unknown. In this paper, we show that in the context of generalized linear models, such methods possess model selection consistency with o ..."
Abstract - Cited by 55 (14 self) - Add to MetaCart
Abstract—Penalized likelihood methods are fundamental to ultrahigh dimensional variable selection. How high dimensionality such methods can handle remains largely unknown. In this paper, we show that in the context of generalized linear models, such methods possess model selection consistency with oracle properties even for dimensionality of nonpolynomial (NP) order of sample size, for a class of penalized likelihood approaches using folded-concave penalty functions, which were introduced to ameliorate the bias problems of convex penalty functions. This fills a long-standing gap in the literature where the dimensionality is allowed to grow slowly with the sample size. Our results are also applicable to penalized likelihood with the L1-penalty, which is a convex function at the boundary of the class of folded-concave penalty functions under consideration. The coordinate optimization is implemented for finding the solution paths, whose performance is evaluated by a few simulation examples and the real data analysis. Index Terms—Coordinate optimization, folded-concave penalty, high dimensionality, Lasso, nonconcave penalized likelihood, oracle property, SCAD, variable selection, weak oracle property.
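
Among the folded-concave penalties referred to here, SCAD is the standard example: its derivative equals λ near the origin (lasso-like sparsity) and decays to zero beyond aλ, so large coefficients are left nearly unpenalized. A sketch of the SCAD penalty derivative of Fan and Li (2001), with the customary choice a = 3.7:

```python
import numpy as np

def scad_derivative(t, lam, a=3.7):
    """p'_lam(t) for t >= 0: lam if t <= lam, (a*lam - t)_+/(a - 1) if t > lam, zero beyond a*lam."""
    t = np.asarray(t, dtype=float)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))
```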