Results 1 - 7 of 7
Weak and Strong Cross Section Dependence and Estimation of Large Panels
, 2009
Cited by 38 (20 self)
This paper introduces the concepts of time-specific weak and strong cross section dependence. A double-indexed process is said to be cross sectionally weakly dependent at a given point in time, t, if its weighted average along the cross section dimension (N) converges to its expectation in quadratic mean, as N is increased without bound, for all weights that satisfy certain ‘granularity’ conditions. The relationship with the notions of weak and strong common factors is investigated, and an application to the estimation of panel data models with an infinite number of weak factors and a finite number of strong factors is also considered. The paper concludes with a set of Monte Carlo experiments where the small sample properties of estimators based on principal components and CCE estimators are investigated and compared under various assumptions on the nature of the unobserved common effects.
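The distinction between weak and strong dependence can be illustrated with a minimal sketch. It assumes equal weights 1/N (one simple choice of granular weights) and two hypothetical covariance structures that are not taken from the paper: purely idiosyncratic errors, and a single common factor with unit loadings. The function names are illustrative only.

```python
import numpy as np

def weighted_average_variance(cov, weights):
    """Variance of the weighted cross-section average w'z when Cov(z) = cov."""
    weights = np.asarray(weights, dtype=float)
    return float(weights @ cov @ weights)

def compare(n_units):
    """Return (variance under idiosyncratic errors, variance under one strong factor)."""
    w = np.full(n_units, 1.0 / n_units)        # equal weights (assumed example)
    weak = np.eye(n_units)                      # identity covariance: no common component
    loadings = np.ones(n_units)                 # hypothetical unit factor loadings
    strong = np.outer(loadings, loadings) + np.eye(n_units)
    return weighted_average_variance(weak, w), weighted_average_variance(strong, w)

for n_units in (10, 100, 1000):
    print(n_units, compare(n_units))
```

Under these assumptions the first variance equals 1/N and vanishes as N grows, while the second equals 1 + 1/N and stays bounded away from zero, which is the behaviour the definition separates.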
High dimensional statistical inference and random matrices
 IN: PROCEEDINGS OF INTERNATIONAL CONGRESS OF MATHEMATICIANS
, 2006
Cited by 25 (1 self)
Multivariate statistical analysis is concerned with observations on several variables which are thought to possess some degree of interdependence. Driven by problems in genetics and the social sciences, it first flowered in the earlier half of the last century. Subsequently, random matrix theory (RMT) developed, initially within physics, and more recently widely in mathematics. While some of the central objects of study in RMT are identical to those of multivariate statistics, statistical theory was slow to exploit the connection. However, with vast data collection ever more common, data sets now often have as many or more variables than the number of individuals observed. In such contexts, the techniques and results of RMT have much to offer multivariate statistics. The paper reviews some of the progress to date.
FINITE SAMPLE APPROXIMATION RESULTS FOR PRINCIPAL COMPONENT ANALYSIS: A MATRIX PERTURBATION APPROACH
Cited by 24 (11 self)
Principal Component Analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n and those of the limiting population PCA as n → ∞. As in machine learning, we present a finite sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we also consider the relation between finite sample PCA and the asymptotic results in the joint limit p, n → ∞, with p/n = c. We present a matrix perturbation view of the “phase transition phenomenon”, and a simple linear-algebra-based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies to finite p, n, where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size n, the eigenvector of sample PCA may exhibit a sharp “loss of tracking”, suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
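The phase transition described above can be explored numerically with a small simulation. The sketch below assumes a single spike of strength theta added to unit-variance white noise, so that the population covariance is theta·vv' + I, and it compares the simulated squared overlap with the standard asymptotic expression from the spiked-covariance literature. The parameter values and function names are illustrative and are not taken from the paper.

```python
import numpy as np

def predicted_overlap_sq(theta, c):
    """Asymptotic squared overlap for one spike of strength theta, with c = p/n."""
    if theta <= np.sqrt(c):
        return 0.0                      # below the threshold: no asymptotic overlap
    return (1.0 - c / theta**2) / (1.0 + c / theta)

def simulated_overlap_sq(theta, p, n, rng):
    """Squared inner product between the top sample eigenvector and the true spike."""
    v = np.zeros(p)
    v[0] = 1.0                          # true spike direction (assumed, without loss of generality)
    scores = rng.standard_normal(n)
    X = np.sqrt(theta) * np.outer(scores, v) + rng.standard_normal((n, p))
    S = X.T @ X / n                     # sample covariance (data have mean zero by construction)
    _, eigvecs = np.linalg.eigh(S)      # eigenvalues are returned in ascending order
    return float(eigvecs[:, -1] @ v) ** 2

rng = np.random.default_rng(0)
p, n = 100, 200
for theta in (0.2, 1.0, 5.0):
    print(theta, predicted_overlap_sq(theta, p / n), simulated_overlap_sq(theta, p, n, rng))
```

For finite p and n the simulated overlap fluctuates around the asymptotic value, so individual runs should be read as indicative only.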
Large panels with common factors and spatial correlations
 IZA DISCUSSION PAPER
, 2007
Cited by 23 (6 self)
This paper considers the statistical analysis of large panel data sets where even after conditioning on common observed effects the cross section units might remain dependently distributed. This could arise when the cross section units are subject to unobserved common effects and/or if there are spillover effects due to spatial or other forms of local dependencies. The paper provides an overview of the literature on cross section dependence, introduces the concepts of time-specific weak and strong cross section dependence and shows that the commonly used spatial models are examples of weak cross section dependence. It is then established that the Common Correlated Effects (CCE) estimator of a panel data model with a multifactor error structure, recently advanced by Pesaran (2006), continues to provide consistent estimates of the slope coefficient, even in the presence of spatial error processes. Small sample properties of the CCE estimator under various patterns of cross section dependence, including spatial forms, are investigated by Monte Carlo experiments. Results show that the CCE approach works well in the presence of weak and/or strong cross sectionally correlated errors. We also explore the role of certain characteristics of spatial processes in determining the performance of CCE estimators, such as the form and intensity of spatial dependence, and the sparseness of the spatial weight matrix.
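The idea behind the CCE approach can be sketched as follows: cross-section averages of the dependent variable and the regressor act as proxies for the unobserved common factor and are partialled out before pooling. The data-generating process below (one factor, one regressor, homogeneous slope, no spatial errors) and all names are assumptions chosen for illustration, not the paper's Monte Carlo design.

```python
import numpy as np

def simulate_panel(n_units, n_periods, beta, rng):
    """Panel with one unobserved factor that drives both x and the error of y."""
    f = rng.standard_normal(n_periods)                       # unobserved common factor
    load_y = 1.0 + 0.5 * rng.standard_normal(n_units)        # factor loadings in y
    load_x = 1.0 + 0.5 * rng.standard_normal(n_units)        # factor loadings in x
    x = np.outer(f, load_x) + rng.standard_normal((n_periods, n_units))
    y = beta * x + np.outer(f, load_y) + rng.standard_normal((n_periods, n_units))
    return y, x                                              # both have shape (T, N)

def pooled_ols(y, x):
    """Pooled OLS slope that ignores the common factor."""
    return float(np.sum(x * y) / np.sum(x * x))

def cce_pooled(y, x):
    """Pooled slope after partialling out an intercept and cross-section averages."""
    n_periods = y.shape[0]
    h = np.column_stack([np.ones(n_periods), y.mean(axis=1), x.mean(axis=1)])
    m = np.eye(n_periods) - h @ np.linalg.pinv(h)            # residual-maker for h
    mx, my = m @ x, m @ y
    return float(np.sum(mx * my) / np.sum(mx * mx))

rng = np.random.default_rng(0)
y, x = simulate_panel(n_units=100, n_periods=100, beta=1.0, rng=rng)
print("pooled OLS:", pooled_ols(y, x))
print("CCE pooled:", cce_pooled(y, x))
```

In this design the factor enters both the regressor and the error, so the naive pooled slope is biased, while the augmented regression should recover a value close to the true slope.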
Instrumental Variable Estimation in a Data Rich Environment
 mimeo
, 2006
Cited by 11 (1 self)
We consider estimation of parameters in a regression model in which the endogenous regressors are just a few of the many other endogenous variables driven by a small number of unobservable exogenous common shocks. We show that the method of principal components can be used to estimate factors that can be used as instrumental variables. These are not only valid instruments, they are more efficient than the observed variables in our framework. Consistency and asymptotic normality of the single equation factor instrumental variable estimator (FIV) are established. We also show that consistent estimates can be obtained from large panel data regressions by constructing valid instruments from the endogenous regressors that are themselves invalid instruments in a conventional sense. To reduce the bias that might arise from using too many instruments, we use boosting to select out the most relevant ones. Boosting necessitates a stopping rule. We derive the condition on the stopping parameter that arises from boosting estimated factors instead of observed variables.
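A minimal sketch of the factor-as-instrument idea follows. It assumes a single common shock, a single endogenous regressor, and a just-identified IV estimator. The boosting step is omitted, and the data-generating process and function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def simulate(n_periods, n_series, beta, rng):
    """One endogenous regressor plus a large panel driven by the same common shock."""
    f = rng.standard_normal(n_periods)                      # unobserved exogenous common shock
    eps = rng.standard_normal(n_periods)                    # structural error
    u = 0.8 * eps + 0.6 * rng.standard_normal(n_periods)    # correlated with eps: endogeneity
    x = f + u
    y = beta * x + eps
    loadings = 1.0 + 0.5 * rng.standard_normal(n_series)
    panel = np.outer(f, loadings) + rng.standard_normal((n_periods, n_series))
    return y, x, panel

def first_principal_component(panel):
    """Estimated factor: first left singular vector of the centered panel, rescaled."""
    centered = panel - panel.mean(axis=0)
    left, _, _ = np.linalg.svd(centered, full_matrices=False)
    return left[:, 0] * np.sqrt(panel.shape[0])

def ols_slope(y, x):
    return float(x @ y / (x @ x))

def iv_slope(y, x, z):
    """Just-identified IV estimator with a single instrument z."""
    return float(z @ y / (z @ x))

rng = np.random.default_rng(0)
y, x, panel = simulate(n_periods=1000, n_series=50, beta=1.0, rng=rng)
f_hat = first_principal_component(panel)
print("OLS:", ols_slope(y, x))
print("factor IV:", iv_slope(y, x, f_hat))
```

The sign of the estimated factor is arbitrary, but it cancels in the IV ratio, so no sign normalization is needed in this simple case.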
Minimax Rank Estimation for Subspace Tracking
, 2009
Cited by 3 (0 self)
Rank estimation is a classical model order selection problem that arises in a variety of important statistical signal and array processing systems, yet is addressed relatively infrequently in the extant literature. Here we present sample covariance asymptotics stemming from random matrix theory, and bring them to bear on the problem of optimal rank estimation in the context of the standard array observation model with additive white Gaussian noise. The most significant of these results demonstrates the existence of a phase transition threshold, below which eigenvalues and associated eigenvectors of the sample covariance fail to provide any information on population eigenvalues. We then develop a decision-theoretic rank estimation framework that leads to a simple ordered selection rule based on thresholding; in contrast to competing approaches, however, it admits asymptotic minimax optimality and is free of tuning parameters. We analyze the asymptotic performance of our rank selection procedure and conclude with a brief simulation study demonstrating its practical efficacy in the context of subspace tracking.
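To give a feel for threshold-based rank selection, the sketch below counts the sample eigenvalues that exceed the upper edge of the Marchenko-Pastur bulk for white noise. This is a generic illustration that assumes a known noise variance. It is not the minimax rule derived in the paper, and the function names and example eigenvalues are hypothetical.

```python
import math

def bulk_edge(n_variables, n_samples, noise_var=1.0):
    """Upper edge of the Marchenko-Pastur bulk for white noise of known variance."""
    ratio = n_variables / n_samples
    return noise_var * (1.0 + math.sqrt(ratio)) ** 2

def estimate_rank(sample_eigenvalues, n_samples, noise_var=1.0):
    """Count the sample covariance eigenvalues that lie above the noise bulk edge."""
    edge = bulk_edge(len(sample_eigenvalues), n_samples, noise_var)
    return sum(1 for value in sample_eigenvalues if value > edge)

# Hypothetical eigenvalues of a 5 x 5 sample covariance built from 20 samples.
eigenvalues = [9.0, 4.0, 2.0, 1.0, 0.5]
print(bulk_edge(len(eigenvalues), 20), estimate_rank(eigenvalues, 20))
```

In finite samples the largest noise eigenvalue fluctuates around the bulk edge, so a practical rule would need to account for that fluctuation.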
ON THE IMPORTANCE OF SECTORAL AND REGIONAL SHOCKS FOR PRICE-SETTING
, 1334
Cited by 1 (0 self)
In 2011 all ECB publications feature a motif taken from the €100 banknote. NOTE: This Working Paper should not be reported as representing the views of the European Central Bank (ECB). The views expressed are those of the authors and do not necessarily reflect those of the ECB. This paper can be downloaded without charge from