Determining the Number of Factors in Approximate Factor Models
Econometrica, 2002
Cited by 224 (19 self)
In this paper we develop some statistical theory for factor models of large dimensions. The focus is the determination of the number of factors, which is an unresolved issue in the rapidly growing literature on multifactor models. We propose a panel Cp criterion and show that the number of factors can be consistently estimated using the criterion. The theory is developed under the framework of large cross-sections (N) and large time dimensions (T). No restriction is imposed on the relation between N and T. Simulations show that the proposed criterion yields almost precise estimates of the number of factors for configurations of the panel data encountered in practice. The idea that variations in a large number of economic variables can be modelled by a small number of reference variables is appealing and is used in many economic analyses. In the finance literature, the arbitrage pricing theory (APT) of Ross (1976) assumes that a small number of factors can be used to explain a large number of asset returns.
The Generalized Dynamic Factor Model: Identification and Estimation
Review of Economics and Statistics, 2000
Cited by 77 (16 self)
This paper proposes a factor model with infinite dynamics and nonorthogonal idiosyncratic components. The model, which we call the generalized dynamic factor model, is novel to the literature, and generalizes the static approximate factor model of Chamberlain and Rothschild (1983), as well as the exact factor model à la Sargent and Sims (1977). We provide identification conditions, propose an estimator of the common components, prove convergence as both time and cross-sectional size go to infinity at appropriate rates, and present simulation results. We use our model to construct a coincident index for the European Union. This index is defined as the common component of real GDP within a model including several macroeconomic variables for each European country.
Are more data always better for factor analysis?
Journal of Econometrics, 2006
Cited by 65 (0 self)
Factors estimated from large macroeconomic panels are being used in an increasing number of applications. However, little is known about how the size and composition of the data affect the factor estimates. In this paper, we ask whether it is possible to use more series to extract the factors and yet obtain factors that are less useful for forecasting, and the answer is yes. Such a problem tends to arise when the idiosyncratic errors are cross-correlated. It can also arise if forecasting power is provided by a factor that is dominant in a small dataset but is a dominated factor in a larger dataset. In a real-time forecasting exercise, we find that factors extracted from as few as 40 prescreened series often yield satisfactory or even better results than using all 147 series. Our simulation analysis is unique in that special attention is paid to cross-correlated idiosyncratic errors, and we also allow the factors to have weak loadings on groups of series. It thus allows us to better understand the properties of the principal components estimator in empirical applications.
Confidence intervals for diffusion index forecasts and inference for factor-augmented regressions
2003
Cited by 57 (11 self)
We consider the situation when there is a large number of series, N, each with T observations, and each series has some predictive ability for some variable of interest. A methodology of growing interest is first to estimate common factors from the panel of data by the method of principal components and then to augment an otherwise standard regression with the estimated factors. In this paper, we show that the least squares estimates obtained from these factor-augmented regressions are √T-consistent and asymptotically normal if √T/N → 0. The conditional mean predicted by the estimated factors is min[√T, √N]-consistent and asymptotically normal. Except when T/N goes to zero, inference should take into account the effect of “estimated regressors” on the estimated conditional mean. We present analytical formulas for prediction intervals that are valid regardless of the magnitude of N/T and that can also be used when the factors are nonstationary.
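The two-step procedure described in this abstract (principal-components factors, then a factor-augmented regression) can be sketched as follows. This is a minimal illustration under strong assumptions, not the authors' implementation: a single factor, extracted by power iteration on the T×T cross-product matrix of the demeaned panel, and a contemporaneous regression of y on a constant and the estimated factor. The function names `leading_factor` and `factor_augmented_ols` and the toy data are hypothetical.

```python
import math
import random


def leading_factor(X, iters=200, seed=0):
    """Estimate one common factor from a T x N panel X (list of T rows).

    Assumption for illustration: a single factor, obtained as the leading
    eigenvector of Z Z' (Z = column-demeaned X) via power iteration and
    scaled so that sum(F_t^2) / T = 1. The sign of the factor is arbitrary.
    """
    T, N = len(X), len(X[0])
    col_means = [sum(row[i] for row in X) / T for i in range(N)]
    Z = [[row[i] - col_means[i] for i in range(N)] for row in X]
    # T x T cross-product matrix Z Z'.
    S = [[sum(a * b for a, b in zip(Z[s], Z[t])) for t in range(T)]
         for s in range(T)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(T)]
    for _ in range(iters):
        w = [sum(S[s][t] * v[t] for t in range(T)) for s in range(T)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [math.sqrt(T) * x for x in v]


def factor_augmented_ols(y, F):
    """Regress y on a constant and the estimated factor F by least squares."""
    T = len(y)
    my, mf = sum(y) / T, sum(F) / T
    sff = sum((f - mf) ** 2 for f in F)
    sfy = sum((f - mf) * (yt - my) for f, yt in zip(F, y))
    slope = sfy / sff
    intercept = my - slope * mf
    fitted = [intercept + slope * f for f in F]
    return intercept, slope, fitted


# Toy panel with an exact one-factor structure: X[t][i] = loading[i] * f[t].
f = [-2.0, -1.0, 0.0, 1.0, 2.0]
loadings = [1.0, 0.5, -1.5]
X = [[lam * ft for lam in loadings] for ft in f]
y = [1.0 + 2.0 * ft for ft in f]

F_hat = leading_factor(X)
intercept, slope, fitted = factor_augmented_ols(y, F_hat)
```

Because the sign of an estimated factor is not identified, the slope is determined only up to sign; the fitted values are unaffected. The sketch computes no standard errors or prediction intervals.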
Implications of dynamic factor models for VAR analysis
NBER Working Paper, 2005
Cited by 50 (4 self)
This paper considers VAR models incorporating many time series that interact through a few dynamic factors. Several econometric issues are addressed, including estimation of the number of dynamic factors and tests for the factor restrictions imposed on the VAR. Structural VAR identification based on timing restrictions, long-run restrictions, and restrictions on factor loadings is discussed, and practical computational methods are suggested. Empirical analysis using U.S. data suggests several (7) dynamic factors, rejection of the exact dynamic factor model but support for an approximate factor model, and sensible results for a SVAR that identifies monetary policy shocks using timing restrictions.
A PANIC Attack on Unit Roots and Cointegration
2003
Cited by 47 (2 self)
This paper develops a new methodology that makes use of the factor structure of large dimensional panels to understand the nature of nonstationarity in the data. We refer to it as PANIC – a ‘Panel Analysis of Nonstationarity in Idiosyncratic and Common components’. PANIC consists of univariate and panel tests with a number of novel features. It can detect whether the nonstationarity is pervasive, or variable-specific, or both. It tests the components of the data instead of the observed series. Inference is therefore more accurate when the components have different orders of integration. PANIC also permits the construction of valid panel tests even when cross-section correlation invalidates pooling of statistics constructed using the observed data. The key to PANIC is consistent estimation of the components even when the regressions are individually spurious. We provide a rigorous theory for estimation and inference. In Monte Carlo simulations, the tests have very good size and power. PANIC is applied to a panel of inflation series.
Panel Data Models with Interactive Fixed Effects
2005
Cited by 40 (4 self)
This paper considers large N and large T panel data models with unobservable multiple interactive effects. These models are useful for both micro and macro econometric modeling. In earnings studies, for example, workers' motivation, persistence, and diligence combine to influence earnings, in addition to the usual argument of innate ability. In macroeconomics, the interactive effects represent unobservable common shocks and their heterogeneous responses over cross sections. Since the interactive effects are allowed to be correlated with the regressors, they are treated as fixed-effects parameters to be estimated along with the common slope coefficients. The model is estimated by the least squares method, which provides the interactive-effects counterpart of the within estimator. We first consider model identification, and then derive the rate of convergence and the limiting distribution of the interactive-effects estimator of the common slope coefficients. The estimator is shown to be √(NT)-consistent. This rate is valid even in the presence of correlations and heteroskedasticities in both dimensions, in striking contrast with the fixed-T framework, in which serial correlation and heteroskedasticity imply unidentification. The asymptotic distribution is not necessarily centered at zero. Bias-corrected estimators are derived. We also derive the constrained estimator and its limiting distribution, imposing additivity coupled with interactive effects. The problem of testing additive versus interactive effects is also studied. We also derive identification conditions for models with grand mean, time-invariant regressors, and common regressors. It is shown that there exists a set of necessary and sufficient identification conditions for those models. Given identification, the rate of convergence and limiting results continue to hold. Key words and phrases: incidental parameters, additive effects, interactive effects, factor
Two-Pass Tests of Asset Pricing Models with Useless Factors
1997
Cited by 38 (4 self)
In this paper we investigate the properties of the standard two-pass methodology of testing beta pricing models with misspecified factors. In a setting where a factor is useless, defined as being independent of all the asset returns, we provide theoretical results and simulation evidence that the second-pass cross-sectional regression tends to find the beta risk of the useless factor priced more often than it should. More surprisingly, this misspecification bias worsens as the number of time-series observations increases. Possible ways of detecting useless factors are also examined. When testing asset pricing models relating risk premiums on assets to their betas, the primary question of interest is whether the beta risk of a particular factor is priced (i.e., whether the estimated risk premium associated with a given factor is significantly different from zero). Black, Jensen, and Scholes (1972) and Fama and MacBeth (1973) develop a two-pass methodology in which the beta of each asset with respect to a factor is estimated in a first-pass time-series regression, and estimated betas are then used in second-pass cross-sectional regressions (CSRs) to estimate the risk premium of the factor. This two-pass methodology is very intuitive and has been widely used in the literature. The properties of the test statistics and goodness-of-fit measures under the two-pass methodology are usually developed under the assumptions that the asset pricing model is correctly specified and that the factors are correctly identified. Shanken (1992) provides an excellent discussion of this two-pass methodology, especially the large sample properties of the two-pass CSR for the correctly specified model under the assumption that returns are conditionally homoskedastic. Jagannathan and Wa...
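The two-pass methodology described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes a single factor, noise-free toy returns, and a period-by-period second pass whose slopes are averaged; it computes no standard errors or test statistics. The names `ols` and `two_pass` and all numbers are hypothetical.

```python
from statistics import mean


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_pass(returns, factor):
    """returns[i][t] is the return of asset i in period t; factor[t] is the factor."""
    # First pass: one time-series regression per asset to estimate its beta.
    betas = [ols(factor, r_i)[1] for r_i in returns]
    # Second pass: one cross-sectional regression per period of returns on
    # the estimated betas; the slope is that period's premium estimate.
    period_premia = []
    for t in range(len(factor)):
        cross_section = [r_i[t] for r_i in returns]
        period_premia.append(ols(betas, cross_section)[1])
    # Average the per-period slopes to obtain the risk premium estimate.
    return betas, mean(period_premia)


# Toy data (assumed): returns are exactly beta_i * (premium + factor_t).
factor = [-2.0, -1.0, 0.0, 1.0, 2.0]   # sample mean is zero
true_betas = [0.5, 1.0, 1.5]
true_premium = 0.3
returns = [[b * (true_premium + ft) for ft in factor] for b in true_betas]

betas_hat, premium_hat = two_pass(returns, factor)
```

In this noise-free setup the first pass recovers the betas and the averaged second-pass slope recovers the premium. The abstract's concern is the case where the factor is independent of all asset returns, so that the first-pass betas are pure estimation noise.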