Results 11-20 of 95
Automatic choice of dimensionality for PCA
2000
Cited by 72 (1 self)
A central issue in principal component analysis (PCA) is choosing the number of principal components to be retained. By interpreting PCA as density estimation, we show how to use Bayesian model selection to estimate the true dimensionality of the data. The resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data. The estimate involves an integral over the Stiefel manifold of k-frames, which is difficult to compute exactly. But after choosing an appropriate parameterization and applying Laplace's method, an accurate and practical estimator is obtained. In simulations, it is convincingly better than cross-validation and other proposed algorithms, plus it runs much faster.
Model Selection and Accounting for Model Uncertainty in Linear Regression Models
1993
Cited by 49 (6 self)
We consider the problems of variable selection and accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. The complete Bayesian solution to this problem involves averaging over all possible models when making inferences about quantities of interest. This approach is often not practical. In this paper we offer two alternative approaches. First we describe a Bayesian model selection algorithm called "Occam's Window" which involves averaging over a reduced set of models. Second, we describe a Markov chain Monte Carlo approach which directly approximates the exact solution. Both these model averaging procedures provide better predictive performance than any single model which might reasonably have been selected. In the extreme case where there are many candidate predictors but there is no relationship between any of them and the response, standard variable selection procedures often choose some subset of variables that yields a high R² and a highly significant overall F value. We refer to this unfortunate phenomenon as "Freedman's Paradox" (Freedman, 1983). In this situation, Occam's Window usually indicates the null model as the only one to be considered, or else a small number of models including the null model, thus largely resolving the paradox.
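The abstract above describes Occam's Window only as averaging over a reduced set of models. The sketch below illustrates one common form of that idea, keeping every model whose posterior probability is within a factor `c` of the best model and renormalising. The function name, the threshold `c = 20`, and the log marginal likelihoods are assumptions made for illustration; the paper's full rule may also discard models that have a better-supported simpler submodel, which is not shown here.

```python
import math


def occams_window(log_evidence, prior=None, c=20.0):
    """Keep models whose posterior probability is at least 1/c of the best.

    log_evidence: dict mapping model name -> log marginal likelihood.
    prior: optional dict of prior model probabilities (uniform if omitted).
    Returns renormalised posterior weights over the retained models.
    """
    names = list(log_evidence)
    if prior is None:
        prior = {m: 1.0 / len(names) for m in names}
    # Unnormalised log posterior model probabilities.
    log_post = {m: log_evidence[m] + math.log(prior[m]) for m in names}
    best = max(log_post.values())
    # Window: drop models more than a factor c less probable than the best.
    kept = {m: lp for m, lp in log_post.items() if lp >= best - math.log(c)}
    # Renormalise over the window (subtract best for numerical stability).
    total = sum(math.exp(lp - best) for lp in kept.values())
    return {m: math.exp(lp - best) / total for m, lp in kept.items()}


# Hypothetical log marginal likelihoods for four candidate regressions.
log_evidence = {"null": -102.1, "x1": -100.0, "x1+x2": -100.9, "x1+x2+x3": -105.5}
weights = occams_window(log_evidence, c=20.0)
for model, w in weights.items():
    print(f"{model}: {w:.3f}")
```

Predictions would then be averaged over the retained models using these weights in place of the full model space.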
Computing Bayes factors using thermodynamic integration
 Syst Biol
Cited by 42 (6 self)
In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic works, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration. We describe the method, propose an implementation, and show on two analytical examples that this numerical method yields reliable estimates. In contrast, the harmonic mean estimator leads to a strong overestimation of the marginal likelihood, which is all the more pronounced as the model is higher dimensional. As a result, the harmonic mean estimator systematically favors more parameter-rich models, an artefact that might explain some recent puzzling observations, based on harmonic mean estimates, suggesting that Bayes factors tend to overscore complex models. Finally, we apply our method to the comparison of several alternative models of amino-acid replacement. We confirm our previous observations, indicating that modeling pattern heterogeneity across sites tends to yield better models than standard empirical matrices. [Bayes factor; harmonic mean; mixture model; path sampling; phylogeny; thermodynamic integration.] Bayesian methods have become popular in molecular phylogenetics over the recent years. The simple and intuitive interpretation of the concept of probabilities...
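As a rough illustration of thermodynamic integration (path sampling), the sketch below uses the identity log Z = ∫₀¹ E_β[log L(θ)] dβ, where the expectation is taken under the power posterior proportional to prior × likelihood^β. Everything concrete here is an assumption chosen so the answer can be checked analytically: a toy model with unit-variance Gaussian data and a N(0, τ²) prior on the mean, exact draws from each power posterior in place of MCMC, a cubic temperature ladder, and the trapezoid rule. It is not the implementation proposed in the article.

```python
import math
import random


def log_lik(theta, y):
    """Log-likelihood of i.i.d. N(theta, 1) observations."""
    n = len(y)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * sum((yi - theta) ** 2 for yi in y)


def power_posterior_draws(beta, y, tau2, m, rng):
    """Exact draws from prior(theta) * L(theta)^beta with prior N(0, tau2)."""
    prec = 1.0 / tau2 + beta * len(y)
    mean = beta * sum(y) / prec
    sd = math.sqrt(1.0 / prec)
    return [rng.gauss(mean, sd) for _ in range(m)]


def thermodynamic_log_evidence(y, tau2=1.0, n_temps=40, m=2000, seed=0):
    """Estimate the log marginal likelihood by integrating E_beta[log L] over beta."""
    rng = random.Random(seed)
    # Cubic ladder: more temperatures near beta = 0, where the integrand is steepest.
    betas = [(i / n_temps) ** 3 for i in range(n_temps + 1)]
    means = []
    for b in betas:
        draws = power_posterior_draws(b, y, tau2, m, rng)
        means.append(sum(log_lik(t, y) for t in draws) / m)
    # Trapezoid rule over the (non-uniform) temperature grid.
    return sum(
        0.5 * (betas[i + 1] - betas[i]) * (means[i] + means[i + 1])
        for i in range(n_temps)
    )


def exact_log_evidence(y, tau2=1.0):
    """Closed-form log marginal likelihood for the same conjugate model."""
    n, s, ss = len(y), sum(y), sum(yi * yi for yi in y)
    return (
        -0.5 * n * math.log(2 * math.pi)
        - 0.5 * math.log(1 + n * tau2)
        - 0.5 * (ss - tau2 * s * s / (1 + n * tau2))
    )


# Hypothetical data, fixed so the run is reproducible.
y = [0.8, 1.3, 0.2, 1.9, 1.1, 0.5, 1.6, 0.9, 1.4, 0.3]
ti_estimate = thermodynamic_log_evidence(y)
exact = exact_log_evidence(y)
print(f"thermodynamic integration: {ti_estimate:.3f}, exact: {exact:.3f}")
```

In a realistic setting the exact power-posterior draws would be replaced by an MCMC run at each temperature, and a Bayes factor would be the difference of two such log-evidence estimates.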
Learning Probabilistic Networks
The Knowledge Engineering Review, 1998
Cited by 41 (2 self)
A probabilistic network is a graphical model that encodes probabilistic relationships between variables of interest. Such a model records qualitative influences between variables in addition to the numerical parameters of the probability distribution. As such it provides an ideal form for combining prior knowledge, which might be limited solely to experience of the influences between some of the variables of interest, and data. In this paper, we first show how data can be used to revise initial estimates of the parameters of a model. We then progress to showing how the structure of the model can be revised as data is obtained. Techniques for learning with incomplete data are also covered.
Bayesian Deviance, the Effective Number of Parameters, and the Comparison of Arbitrarily Complex Models
1998
Cited by 35 (7 self)
We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. We follow Dempster in examining the posterior distribution of the log-likelihood under each model, from which we derive measures of fit and complexity (the effective number of parameters). These may be combined into a Deviance Information Criterion (DIC), which is shown to have an approximate decision-theoretic justification. Analytic and asymptotic identities reveal the measure of complexity to be a generalisation of a wide range of previous suggestions, with particular reference to the neural network literature. The contributions of individual observations to fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. The procedure is illustrated in a number of examples, and throughout it is emphasised that the required quantities are trivial to compute in a Markov chain Monte Carlo analysis, and require no analytic work for new...
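To make the "trivial to compute" remark concrete, here is a minimal sketch of the usual DIC recipe: the mean deviance D̄ over posterior draws measures fit, p_D = D̄ − D(θ̄) is the effective number of parameters, and DIC = D̄ + p_D. The toy model (Gaussian data with known unit variance and a flat prior on the mean), the simulated data, and the use of exact posterior draws in place of MCMC output are all assumptions for illustration. For this one-parameter model p_D should come out close to 1.

```python
import math
import random


def deviance(theta, y, sigma=1.0):
    """D(theta) = -2 log p(y | theta) for i.i.d. N(theta, sigma^2) data."""
    n = len(y)
    sse = sum((yi - theta) ** 2 for yi in y)
    return n * math.log(2 * math.pi * sigma ** 2) + sse / sigma ** 2


def dic(samples, y, sigma=1.0):
    """Return (DIC, p_D, mean deviance) from posterior draws of theta."""
    # Fit: posterior mean of the deviance.
    d_bar = sum(deviance(t, y, sigma) for t in samples) / len(samples)
    # Complexity: mean deviance minus deviance at the posterior mean.
    theta_bar = sum(samples) / len(samples)
    p_d = d_bar - deviance(theta_bar, y, sigma)
    return d_bar + p_d, p_d, d_bar


rng = random.Random(0)
# Hypothetical data set of 50 observations.
y = [rng.gauss(2.0, 1.0) for _ in range(50)]
y_bar = sum(y) / len(y)
# With a flat prior and sigma = 1 the posterior is N(y_bar, 1/n);
# exact draws stand in for the output of an MCMC sampler.
draws = [rng.gauss(y_bar, 1.0 / math.sqrt(len(y))) for _ in range(20000)]
dic_value, p_d, d_bar = dic(draws, y)
print(f"DIC = {dic_value:.2f}, p_D = {p_d:.2f}")
```

Comparing two models then amounts to computing this quantity from each model's posterior draws and preferring the smaller DIC.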
PAC-Bayesian generalization error bounds for Gaussian process classification
Journal of Machine Learning Research, 2002
Modelling functional integration: a comparison of structural equation and dynamic causal models
NeuroImage, 2004
Cited by 25 (2 self)
The brain appears to adhere to two fundamental principles of functional organisation, functional integration and functional specialisation, where the integration within and among specialised areas is mediated by effective connectivity. In this paper we review two different approaches to modelling effective connectivity from fMRI data, Structural Equation Models (SEMs) and Dynamic Causal Models (DCMs). Common to both approaches are model comparison frameworks in which inferences can be made about effective connectivity per se and about how that connectivity can be changed by perceptual or cognitive set. Underlying the two approaches, however, are two very different generative models. In DCM a distinction is made between the ‘neuronal level’ and the ‘hemodynamic level’. Experimental inputs cause changes in effective connectivity expressed at the level of neurodynamics, which in turn cause changes in the observed hemodynamics. In SEM, changes in effective connectivity lead directly to changes in the covariance structure of the observed hemodynamics. Because changes in effective connectivity in the brain occur at a neuronal level, DCM is the preferred model for fMRI data. This review focuses on the underlying assumptions and limitations of each model and demonstrates their application to data from a study of attention to visual motion.
Multiple imputation for model checking: Completed-data plots with missing and latent data
Biometrics, 2005
Cited by 13 (3 self)
In problems with missing or latent data, a standard approach is to first impute the unobserved data, then perform all statistical analyses on the completed dataset (the observed data together with the imputed unobserved data) using standard procedures for complete-data inference. Here, we extend this approach to model checking by demonstrating the advantages of the use of completed-data model diagnostics on imputed completed datasets. The approach is set in the theoretical framework of Bayesian posterior predictive checks (but, as with missing-data imputation, our methods of missing-data model checking can also be interpreted as "predictive inference" in a non-Bayesian context). We consider the graphical diagnostics within this framework. Advantages of the completed-data approach include: (1) One can often check model fit in terms of quantities that are of key substantive interest in a natural way, which is not always possible using observed data alone. (2) In problems with missing data, checks may be devised that do not require modeling the missingness or inclusion mechanism; the latter is useful for the analysis of ignorable but unknown data collection mechanisms, such as are often assumed in the analysis of sample surveys and observational studies. (3) In many problems with latent data, it is possible to check qualitative features of the model (for example, independence of two variables) that can be naturally formalized with the help of the latent data. We illustrate with several applied examples.
model order selection and dynamic source models
In Independent Component Analysis: Principles and Practice