Probabilistic Principal Component Analysis
Journal of the Royal Statistical Society, Series B, 1999
Cited by 474 (5 self)
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss, with illustrative examples, the advantages conveyed by this probabilistic approach to PCA. Keywords: Principal component analysis
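As a rough illustration of what "principal axes through maximum-likelihood estimation" can look like in practice, the following Python sketch fits a Gaussian latent variable model of the form x = W z + mu + noise. It uses the well-known closed-form solution (noise variance set to the mean of the discarded eigenvalues) rather than the EM algorithm the abstract mentions. The function name `ppca_ml`, the synthetic data, and all parameter names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form ML fit of a probabilistic PCA model (illustrative sketch).

    X : (n, d) data matrix, q : latent dimension with q < d.
    Returns the mean, the (d, q) loading matrix W (up to rotation),
    and the isotropic noise variance sigma2.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / n                      # ML sample covariance (divides by n)
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()            # average variance in discarded directions
    # Scale the leading eigenvectors by sqrt(lambda_i - sigma2).
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2

# Hypothetical synthetic data: 2 latent dimensions embedded in 5 observed ones.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
A = rng.normal(size=(2, 5))
X = Z @ A + 0.1 * rng.normal(size=(500, 5))

mu, W, sigma2 = ppca_ml(X, q=2)
```

The columns of `W` span the principal subspace, and `W @ W.T + sigma2 * np.eye(d)` is the covariance implied by the fitted model.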
Mixtures of Probabilistic Principal Component Analysers
1998
Cited by 396 (6 self)
Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a combination of local linear PCA projections. However, conventional PCA does not correspond to a probability density, and so there is no unique way to combine PCA models. Previous attempts to formulate mixture models for PCA have therefore to some extent been ad hoc. In this paper, PCA is formulated within a maximum-likelihood framework, based on a specific form of Gaussian latent variable model. This leads to a well-defined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context of clustering, density modelling and local dimensionality reduction, and we demonstrate its applicat...
The Effect of Schooling and Ability on Achievement Test Scores
Journal of Econometrics (forthcoming), 2004
Cited by 66 (27 self)
This research is supported by grants from NICHD40404300085261 and NSF SES0099195. We thank Chris Winship for a stimulating discussion which influenced this paper. (See his related research, Winship, 2001.) We have benefitted from numerous comments by Derek Neal and Chris Winship on various aspects of this paper.
Dynamic Discrete Choice and Dynamic Treatment Effects
2005
Cited by 65 (15 self)
This paper considers semiparametric identification of structural dynamic discrete choice models and models for dynamic treatment effects. Time to treatment and counterfactual outcomes associated with treatment times are jointly analyzed. We examine the implicit assumptions of the dynamic treatment model using the structural model as a benchmark. For the structural model we show the gains from using cross equation restrictions connecting choices to associated measurements and outcomes. In the dynamic discrete choice model, we identify both subjective and objective outcomes, distinguishing ex post and ex ante outcomes. We show how to identify agent information sets.
Panel Data Models with Interactive Fixed Effects
2005
Cited by 40 (4 self)
This paper considers large N and large T panel data models with unobservable multiple interactive effects. These models are useful for both micro and macro econometric modeling. In earnings studies, for example, workers' motivation, persistence, and diligence combine to influence earnings in addition to the usual argument of innate ability. In macroeconomics, the interactive effects represent unobservable common shocks and their heterogeneous responses over cross sections. Since the interactive effects are allowed to be correlated with the regressors, they are treated as fixed effects parameters to be estimated along with the common slope coefficients. The model is estimated by the least squares method, which provides the interactive-effects counterpart of the within estimator. We first consider model identification, and then derive the rate of convergence and the limiting distribution of the interactive-effects estimator of the common slope coefficients. The estimator is shown to be √(NT)-consistent. This rate is valid even in the presence of correlations and heteroskedasticities in both dimensions, in striking contrast with the fixed-T framework, in which serial correlation and heteroskedasticity imply unidentification. The asymptotic distribution is not necessarily centered at zero. Bias-corrected estimators are derived. We also derive the constrained estimator and its limiting distribution, imposing additivity coupled with interactive effects. The problem of testing additive versus interactive effects is also studied. We also derive identification conditions for models with grand mean, time-invariant regressors, and common regressors. It is shown that there exists a set of necessary and sufficient identification conditions for those models. Given identification, the rate of convergence and limiting results continue to hold. Key words and phrases: incidental parameters, additive effects, interactive effects, factor
Learning with Matrix Factorization
2004
Cited by 38 (4 self)
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent
Algebraic factor analysis: tetrads, pentads and beyond
Cited by 28 (12 self)
Factor analysis refers to a statistical model in which observed variables are conditionally independent given fewer hidden variables, known as factors, and all the random variables follow a multivariate normal distribution. The parameter space of a factor analysis model is a subset of the cone of positive definite matrices. This parameter space is studied from the perspective of computational algebraic geometry. Gröbner bases and resultants are applied to compute the ideal of all polynomial functions that vanish on the parameter space. These polynomials, known as model invariants, arise from rank conditions on a symmetric matrix under elimination of the diagonal entries of the matrix. Besides revealing the geometry of the factor analysis model, the model invariants also furnish useful statistics for testing goodness-of-fit.
Temporal BYY Learning for State Space Approach, Hidden Markov Model, and Blind Source Separation
2000
Cited by 26 (20 self)
Temporal BYY (TBYY) learning has been presented for modeling signals in a general state space approach, which provides not only a unified point of view on the Kalman filter, hidden Markov model (HMM), independent component analysis (ICA), and blind source separation (BSS) with extensions, but also further advances on these studies, including a higher order HMM, independent HMM for binary BSS, temporal ICA (TICA), and temporal factor analysis for real BSS without and with noise. Adaptive algorithms are developed for implementation and criteria are provided for selecting an appropriate number of states or sources. Moreover, theorems are given on the conditions for source separation by linear and nonlinear TICA. In particular, it has been shown that not only non-Gaussian but also Gaussian sources can be separated by TICA via exploring temporal dependence. Experiments are also demonstrated.
BYY Harmony Learning, Independent State Space, and Generalized APT Financial Analyses
2001
Cited by 23 (20 self)
First, the relationship between factor analysis (FA) and the well-known arbitrage pricing theory (APT) for financial markets has been discussed comparatively, with a number of to-be-improved problems listed. An overview has been made from a unified perspective on the related studies in the literatures of statistics, control theory, signal processing, and neural networks. Second, we introduce the fundamentals of the Bayesian Ying Yang (BYY) system and the harmony learning principle, which has been systematically developed in the past several years as a unified statistical framework for parameter learning, regularization and model selection, in both nontemporal and temporal stochastic environments. We further show that a specific case of the framework, called the BYY independent state space (ISS) system, provides a general guide for systematically tackling various FA-related learning tasks and the above to-be-improved problems for the APT analyses. Third, on various specific cases of the BYY ISS s...
A Bayesian Approach to Blind Source Separation
1999
Cited by 20 (6 self)
This paper adopts a Bayesian statistical approach and a linear synthesis model