Tracy-Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices. The Annals of Probability (2007)

by Noureddine El Karoui
Results 1 - 10 of 68

Covariance regularization by thresholding

by Peter J. Bickel, Elizaveta Levina, 2007
"... This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Ga ..."
Abstract - Cited by 148 (11 self) - Add to MetaCart
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
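
For concreteness, here is a minimal numpy sketch of the hard-thresholding estimator; the constant in the threshold t ≍ √(log p / n) and the choice to leave the diagonal untouched are illustrative tuning decisions, not prescriptions from the paper (whose own recommendation is to pick the threshold by the resampling scheme mentioned above):

    import numpy as np

    def hard_threshold_cov(X, t):
        # Sample covariance of the n-by-p data matrix X, with every
        # off-diagonal entry of magnitude <= t zeroed out.
        S = np.cov(X, rowvar=False)
        T = np.where(np.abs(S) > t, S, 0.0)
        np.fill_diagonal(T, np.diag(S))
        return T

    rng = np.random.default_rng(0)
    n, p = 200, 500
    X = rng.standard_normal((n, p))      # true covariance: identity, maximally sparse
    t = 2.0 * np.sqrt(np.log(p) / n)     # rate from the theory; the constant 2.0 is a guess
    S_raw = np.cov(X, rowvar=False)
    print("operator-norm error, thresholded:",
          np.linalg.norm(hard_threshold_cov(X, t) - np.eye(p), 2))
    print("operator-norm error, raw:        ",
          np.linalg.norm(S_raw - np.eye(p), 2))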

Citation Context

...large. Many results in random matrix theory illustrate this, from the classical Marčenko-Pastur law [24] to the more recent work of Johnstone and his students on the theory of the largest eigenvalues [12, 20, 25] and associated eigenvectors [21]. However, with the exception of a method for estimating the covariance spectrum [11], these probabilistic results do not offer alternatives to the sample covariance m...

Operator norm consistent estimation of large-dimensional sparse covariance matrices

by Noureddine El Karoui - Annals of Statistics
"... Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n×p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard es ..."
Abstract - Cited by 69 (1 self) - Add to MetaCart
Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n×p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard estimators like the sample covariance matrix perform in general very poorly. In this “large n, large p” setting, it is sometimes the case that practitioners are willing to assume that many elements of the population covariance matrix are equal to 0, and hence this matrix is sparse. We develop an estimator to handle this situation. The estimator is shown to be consistent in operator norm, when, for instance, we have p ≍ n as n → ∞. In other words, the largest singular value of the difference between the estimator and the population covariance matrix goes to zero. This implies consistency of all the eigenvalues and consistency of eigenspaces associated to isolated eigenvalues. We also propose a notion of sparsity for matrices that is “compatible” with spectral analysis and is independent of the ordering of the variables.
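
The step from operator-norm consistency to consistency of all the eigenvalues, asserted in the abstract, is Weyl's inequality (standard linear algebra, not specific to this paper): with eigenvalues sorted in decreasing order,

    |λi(Σ̂) − λi(Σ)| ≤ ‖Σ̂ − Σ‖op = σmax(Σ̂ − Σ)  for every i,

so if the largest singular value of the difference goes to zero, every eigenvalue converges simultaneously; consistency of eigenspaces attached to isolated eigenvalues then follows from Davis-Kahan-type perturbation bounds.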

Citation Context

...r strong distributional assumptions, one can characterize the fluctuation behavior of the largest eigenvalue of sample covariance matrices for quite a large class of population covariance (see, e.g., [11] for recent results), or the fluctuation behavior of linear functionals of eigenvalues (see [1, 3, 19]). However, until very recently there has been less work in the direction of using these powerful ...

Beta ensembles, stochastic Airy spectrum, and a diffusion

by José A. Ramírez, Brian Rider, Bálint Virág, 2008
"... We prove that the largest eigenvalues of the beta ensembles of random matrix theory converge in distribution to the low-lying eigenvalues of the random Schrödinger operator − d2 dx 2 + x + 2 √ β b ′ x restricted to the positive half-line, where b ′ x is white noise. In doing so we extend the definit ..."
Abstract - Cited by 67 (9 self) - Add to MetaCart
We prove that the largest eigenvalues of the beta ensembles of random matrix theory converge in distribution to the low-lying eigenvalues of the random Schrödinger operator −d²/dx² + x + (2/√β) b′x restricted to the positive half-line, where b′x is white noise. In doing so we extend the definition of the Tracy-Widom(β) distributions to all β > 0, and also analyze their tails. Last, in a parallel development, we provide a second characterization of these laws in terms of a one-dimensional diffusion. The proofs rely on the associated tridiagonal matrix models and a universality result showing that the spectrum of such models converges to that of their continuum operator limit. In particular, we show how Tracy-Widom laws arise from a functional central limit theorem.
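
The tridiagonal matrix models mentioned here are the Dumitriu-Edelman β-Hermite ensembles, which make the theorem easy to probe numerically. A minimal Monte Carlo sketch follows; the normalization placing the spectral edge at 2√n is one common convention, and for β = 2 the samples should concentrate around the Tracy-Widom₂ mean of about −1.77:

    import numpy as np

    def tw_beta_sample(n, beta, rng):
        # Dumitriu-Edelman tridiagonal beta-Hermite model, scaled so the
        # spectral edge sits at 2*sqrt(n): diagonal N(0, 2/beta), k-th
        # off-diagonal chi with beta*(n-k) degrees of freedom, over sqrt(beta).
        diag = rng.standard_normal(n) * np.sqrt(2.0 / beta)
        off = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1))) / np.sqrt(beta)
        H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
        lam_max = np.linalg.eigvalsh(H)[-1]
        return n ** (1.0 / 6.0) * (lam_max - 2.0 * np.sqrt(n))  # approx. Tracy-Widom(beta)

    rng = np.random.default_rng(0)
    samples = [tw_beta_sample(500, beta=2.0, rng=rng) for _ in range(200)]
    print("empirical mean:", np.mean(samples))  # TW_2 mean is roughly -1.77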

Citation Context

...eralizes the so-called “null” Wishart ensembles, distinguishing the important class of WΣW† type matrices with non-identity Σ. For progress on the spectral edge in the non-null case, consult [2] and [12]. We now proceed with the proof of Theorem 1.4. It suffices to prove the claim along a further subsequence of any given subsequence. This allows us to assume that κ = κ(n) is an increasing function of...

SPECTRUM ESTIMATION FOR LARGE DIMENSIONAL COVARIANCE MATRICES USING RANDOM MATRIX THEORY

by Noureddine El Karoui - SUBMITTED TO THE ANNALS OF STATISTICS
"... Estimating the eigenvalues of a population covariance matrix from a sample covariance matrix is a problem of fundamental importance in multivariate statistics; the eigenvalues of covariance matrices play a key role in many widely techniques, in particular in Principal Component Analysis (PCA). In ma ..."
Abstract - Cited by 66 (4 self) - Add to MetaCart
Estimating the eigenvalues of a population covariance matrix from a sample covariance matrix is a problem of fundamental importance in multivariate statistics; the eigenvalues of covariance matrices play a key role in many widely used techniques, in particular in Principal Component Analysis (PCA). In many modern data analysis problems, statisticians are faced with large datasets where the sample size, n, is of the same order of magnitude as the number of variables p. Random matrix theory predicts that in this context, the eigenvalues of the sample covariance matrix are not good estimators of the eigenvalues of the population covariance. We propose to use a fundamental result in random matrix theory, the Marčenko-Pastur equation, to better estimate the eigenvalues of large dimensional covariance matrices. The Marčenko-Pastur equation holds in very wide generality and under weak assumptions. The estimator we obtain can be thought of as “shrinking” in a nonlinear fashion the eigenvalues of the sample covariance matrix to estimate the population eigenvalues. Inspired by ideas of random matrix theory, we also suggest a change of point of view when thinking about estimation of high-dimensional vectors: we do not try to estimate directly the vectors but rather a probability measure that describes them. We think this is a theoretically more fruitful way to think about these problems. Our estimator gives fast and good or very good results in extended simulations. Our algorithmic approach is based on convex optimization. We also show that the proposed estimator is consistent.
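
The Marčenko-Pastur equation invoked here can be written, in one standard (Silverstein) form with γ = p/n, as a fixed-point relation for the Stieltjes transform v(z) of the limiting spectral distribution of the n × n companion matrix:

    −1/v(z) = z − γ ∫ λ dH(λ) / (1 + λ v(z)),   Im z > 0,

where H is the population spectral distribution. Note that the relation is linear in the measure H; this is what makes the inversion step, estimating v from the sample eigenvalues and then searching for a discretized Ĥ that nearly satisfies the relation on a grid of z values, tractable as a convex optimization problem.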

FINITE SAMPLE APPROXIMATION RESULTS FOR PRINCIPAL COMPONENT ANALYSIS: A MATRIX PERTURBATION APPROACH

by Boaz Nadler
"... Principal Component Analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the non-asymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite ..."
Abstract - Cited by 66 (15 self) - Add to MetaCart
Principal Component Analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the non-asymptotic relation of the eigenvalues and eigenvectors of PCA computed on a finite sample of size n to those of the limiting population PCA as n → ∞. As in machine learning, we present a finite sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we consider the relation between finite sample PCA and the asymptotic results in the joint limit p, n → ∞, with p/n = c. We present a matrix perturbation view of the “phase transition phenomenon”, and a simple linear-algebra based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies for finite p, n, where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size n, the eigenvector of sample PCA may exhibit a sharp “loss of tracking”, suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
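
The asymptotic eigenvalue and eigenvector overlaps described here are easy to check by simulation. A sketch, assuming the standard single-spike formulas from this literature (the spike ℓ is detectable only above the phase-transition threshold ℓ > 1 + √c), which the paper rederives by matrix perturbation:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 2000, 1000                 # aspect ratio c = p/n = 0.5
    c = p / n
    ell = 3.0                         # spike eigenvalue, above the threshold 1 + sqrt(c)

    # Population covariance: identity plus one spike along the first coordinate.
    Z = rng.standard_normal((n, p))
    Z[:, 0] *= np.sqrt(ell)           # Sigma^(1/2) is diagonal here
    S = Z.T @ Z / n                   # sample covariance

    eigvals, eigvecs = np.linalg.eigh(S)
    lam_hat, u_hat = eigvals[-1], eigvecs[:, -1]

    lam_pred = ell * (1 + c / (ell - 1))                           # top eigenvalue limit
    overlap_pred = (1 - c / (ell - 1) ** 2) / (1 + c / (ell - 1))  # squared overlap limit

    print(f"top eigenvalue: {lam_hat:.3f}  (predicted {lam_pred:.3f})")
    print(f"squared overlap: {u_hat[0] ** 2:.3f}  (predicted {overlap_pred:.3f})")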

Citation Context

...to our matrix perturbation analysis, the value T(α∗) is equal to the spectral norm of C, the covariance matrix of the noise. We remark that a formula similar to (2.24) was recently derived by El Karoui [11], who also studied the finite p, n fluctuations around this mean. Corollary 3. Consider the general spiked covariance model as in Theorem 2.4, and assume c = p/n ≫ 1. Let µ1 = ∫ ρh(ρ) dρ, µ2² = ∫ (ρ −...

The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices

by Florent Benaych-Georges, Raj Rao Nadakuditi, 2011
"... ..."
Abstract - Cited by 63 (8 self) - Add to MetaCart
Abstract not found

Citation Context

...we dramatically extend the results found in the literature for the eigenvalue phase transition in such finite, low rank perturbation models well beyond the Gaussian [3, 4, 28, 21, 17, 13, 6], Wishart [16, 27, 25] and Jacobi settings [24]. In our situation, the distribution µX in Figure 1 can be any probability measure. Consequently, the aforementioned results in the literature can be rederived rather simply u...

High dimensional statistical inference and random matrices

by Iain M. Johnstone - IN: PROCEEDINGS OF INTERNATIONAL CONGRESS OF MATHEMATICIANS, 2006
"... Multivariate statistical analysis is concerned with observations on several variables which are thought to possess some degree of inter-dependence. Driven by problems in genetics and the social sciences, it first flowered in the earlier half of the last century. Subsequently, random matrix theory ..."
Abstract - Cited by 49 (1 self) - Add to MetaCart
Multivariate statistical analysis is concerned with observations on several variables which are thought to possess some degree of inter-dependence. Driven by problems in genetics and the social sciences, it first flowered in the earlier half of the last century. Subsequently, random matrix theory (RMT) developed, initially within physics, and more recently widely in mathematics. While some of the central objects of study in RMT are identical to those of multivariate statistics, statistical theory was slow to exploit the connection. However, with vast data collection ever more common, data sets now often have as many or more variables than the number of individuals observed. In such contexts, the techniques and results of RMT have much to offer multivariate statistics. The paper reviews some of the progress to date.

Fundamental limit of sample generalized eigenvalue based detection of signals in noise using relatively few signal-bearing and noise-only samples

by Raj Rao Nadakuditi, Jack W. Silverstein
"... ..."
Abstract - Cited by 42 (6 self) - Add to MetaCart
Abstract not found

Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples

by Raj Rao Nadakuditi, Alan Edelman, 2007
"... ..."
Abstract - Cited by 39 (3 self) - Add to MetaCart
Abstract not found

Citation Context

...hat did not require us to make any subjective decisions on setting threshold levels. Thus, we did not consider largest eigenvalue tests in sample-starved settings of the sort described in [40], [51], [56], and the references therein. Nevertheless, if the performance can be significantly improved using a sequence of nested hypothesis tests, then this might be a price we might be ready to pay. This is e...

Dynamic factor models

by James H. Stock, Mark W. Watson - PREPARED FOR THE OXFORD HANDBOOK OF ECONOMIC FORECASTING, 2010
"... ..."
Abstract - Cited by 38 (1 self) - Add to MetaCart
Abstract not found