Regularized estimation of large covariance matrices
 Ann. Statist
, 2008
Cited by 89 (13 self)
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (log p)/n → 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of the eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
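As an illustration of the banding operation this abstract refers to, here is a minimal sketch. It assumes the usual definition, in which entries of the sample covariance more than k positions from the diagonal are set to zero; the function names `sample_covariance` and `band` and the divisor n are illustrative choices, not taken from the paper.

```python
def sample_covariance(rows):
    """Sample covariance (divisor n, an assumption) of n observations of p variables."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
         for j in range(p)]
        for i in range(p)
    ]


def band(cov, k):
    """Keep entries with |i - j| <= k and zero out everything else."""
    p = len(cov)
    return [
        [cov[i][j] if abs(i - j) <= k else 0.0 for j in range(p)]
        for i in range(p)
    ]


if __name__ == "__main__":
    data = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 4.0, 2.5], [4.0, 3.0, 0.0]]
    for row in band(sample_covariance(data), 1):
        print(row)
```

The banding parameter k is left to the caller here; the abstract proposes a resampling approach for choosing it, which this sketch does not implement.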
Sparse Permutation Invariant Covariance Estimation
 Electronic Journal of Statistics
, 2008
Cited by 78 (5 self)
The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension p and sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. The estimator is required to be positive definite, but we avoid having to use semidefinite programming by reparameterizing the objective function
Covariance regularization by thresholding
, 2007
Cited by 62 (9 self)
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
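Hard thresholding as described in this abstract can be sketched as follows. This assumes the common convention that an entry is kept when its absolute value is at least the threshold s and is set to zero otherwise, applied to every entry including the diagonal; the name `hard_threshold` is hypothetical and the threshold is supplied by the caller.

```python
def hard_threshold(cov, s):
    """Zero every entry whose absolute value is below the threshold s."""
    return [[x if abs(x) >= s else 0.0 for x in row] for row in cov]


if __name__ == "__main__":
    cov = [
        [2.0, 0.1, -0.7],
        [0.1, 3.0, 0.05],
        [-0.7, 0.05, 1.5],
    ]
    for row in hard_threshold(cov, 0.5):
        print(row)
```

Unlike banding, this operation does not depend on the ordering of the variables, since it looks only at the size of each entry. The resampling scheme for choosing s mentioned in the abstract is not implemented here.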
Weak and Strong Cross Section Dependence and Estimation of Large Panels
, 2009
Cited by 37 (19 self)
This paper introduces the concepts of time-specific weak and strong cross section dependence. A double-indexed process is said to be cross sectionally weakly dependent at a given point in time, t, if its weighted average along the cross section dimension (N) converges to its expectation in quadratic mean, as N is increased without bounds for all weights that satisfy certain ‘granularity’ conditions. Relationship with the notions of weak and strong common factors is investigated and an application to the estimation of panel data models with an infinite number of weak factors and a finite number of strong factors is also considered. The paper concludes with a set of Monte Carlo experiments where the small sample properties of estimators based on principal components and CCE estimators are investigated and compared under various assumptions on the nature of the unobserved common effects.
FINITE SAMPLE APPROXIMATION RESULTS FOR PRINCIPAL COMPONENT ANALYSIS: A MATRIX PERTURBATION APPROACH
Cited by 25 (11 self)
Principal Component Analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the non-asymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n, to those of the limiting population PCA as n → ∞. As in machine learning, we present a finite sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we also consider the relation between finite sample PCA and the asymptotic results in the joint limit p, n → ∞, with p/n = c. We present a matrix perturbation view of the “phase transition phenomenon”, and a simple linear-algebra-based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies for finite p, n where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size n, the eigenvector of sample PCA may exhibit a sharp “loss of tracking”, suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
The largest eigenvalue of finite rank deformation of large Wigner matrices: convergence and nonuniversality of the fluctuations
, 2007
Large panels with common factors and spatial correlations
 IZA DISCUSSION PAPER
, 2007
Cited by 22 (6 self)
This paper considers the statistical analysis of large panel data sets where even after conditioning on common observed effects the cross section units might remain dependently distributed. This could arise when the cross section units are subject to unobserved common effects and/or if there are spill over effects due to spatial or other forms of local dependencies. The paper provides an overview of the literature on cross section dependence, introduces the concepts of time-specific weak and strong cross section dependence and shows that the commonly used spatial models are examples of weak cross section dependence. It is then established that the Common Correlated Effects (CCE) estimator of panel data model with a multifactor error structure, recently advanced by Pesaran (2006), continues to provide consistent estimates of the slope coefficient, even in the presence of spatial error processes. Small sample properties of the CCE estimator under various patterns of cross section dependence, including spatial forms, are investigated by Monte Carlo experiments. Results show that the CCE approach works well in the presence of weak and/or strong cross sectionally correlated errors. We also explore the role of certain characteristics of spatial processes in determining the performance of CCE estimators, such as the form and intensity of spatial dependence, and the sparseness of the spatial weight matrix.
The largest eigenvalue of rank one deformation of large Wigner matrices
 COMM. MATH. PHYS
, 2008
Cited by 21 (4 self)
The purpose of this paper is to establish universality of the fluctuations of the largest eigenvalue of some not necessarily Gaussian complex Deformed Wigner Ensembles. The real model is also considered. Our approach is close to the one used by A. Soshnikov (cf. [12]) in the investigations of classical real or complex Wigner Ensembles. It is based on the computation of moments of traces of high powers of the random matrices under consideration.
Structural estimation of high-dimensional factor models
, 2007
Cited by 7 (1 self)
We develop econometric theory for the estimation of large N, T factor models in structural macro-finance. We employ non-commutative probability theory to derive a new estimator for the number of latent factors based on the moments of the eigenvalue distribution of the empirical covariance matrix. Our test combines a minimum distance procedure for the estimation of structural model parameters with a specification test on the empirical eigenvalues to solve the problem of separating the factors from the noise. We also relate the second order unbiased estimation of factor loadings to instrumental variable methods where the number of instruments is large relative to the sample size, and derive a number of alternatives to principal components with excellent finite sample properties. Using a large dataset of international stock returns, we estimate global supply and demand shocks in a structural New Keynesian macro-finance model of the US economy. We uncover 23 global factors over the period 1973-2006, many of which impact the supply side of the US economy. We show that omitting these factors masks