The practical implementation of Bayesian model selection
- Institute of Mathematical Statistics
, 2001
Cited by 48 (2 self)
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post-data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.
Bayesian Wavelet Regression on Curves with Application to a Spectroscopic Calibration Problem
, 2000
Cited by 15 (4 self)
Motivated by calibration problems in near infrared spectroscopy, we consider the linear regression setting where the many predictor variables arise from sampling an essentially continuous curve at equally spaced points, and where there may be multiple predictands. We tackle this regression problem by calculating the wavelet transforms of the discretized curves, and then applying a Bayesian variable selection method using mixture priors to the multivariate regression of predictands on wavelet coefficients. For prediction purposes, we average over a set of likely models. Applied to a particular problem in near infrared spectroscopy, this approach was able to find subsets of the wavelet coefficients with overall better predictive performance than the more usual approaches. In the application, the predictors available are measurements of the near infrared reflectance spectrum of biscuit dough pieces at 256 equally spaced wavelengths. The aim is to predict the composition, i.e. the fat, flour, sugar an...
Variable selection in clustering via Dirichlet process mixture models
, 2006
Cited by 14 (0 self)
The increased collection of high-dimensional data in various fields has raised strong interest in clustering algorithms and variable selection procedures. In this paper, we propose a model-based method that addresses the two problems simultaneously. We introduce a latent binary vector to identify discriminating variables and use Dirichlet process mixture models to define the cluster structure. We update the variable selection index using a Metropolis algorithm and obtain inference on the cluster structure via a split-merge Markov chain Monte Carlo technique. We explore the performance of the methodology on simulated data and illustrate an application with a DNA microarray study.
Evolutionary Monte Carlo: Applications to C_p Model Sampling and Change Point Problem
- STATISTICA SINICA
, 2000
Cited by 13 (1 self)
Motivated by the success of genetic algorithms and simulated annealing in hard optimization problems, the authors propose a new Markov chain Monte Carlo (MCMC) algorithm called evolutionary Monte Carlo. The algorithm incorporates several attractive features of genetic algorithms and simulated annealing into the framework of MCMC. It works by simulating a population of Markov chains in parallel, where each chain is attached to a different temperature. The population is updated by mutation (Metropolis update), crossover (partial state swapping) and exchange (full state swapping) operators. The algorithm is illustrated through examples of C_p-based model selection and change-point identification. The numerical results and the extensive comparisons show that evolutionary Monte Carlo is a promising approach for simulation and optimization.
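The three operators named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy energy (Hamming distance to a made-up `target` vector), the temperature ladder, the uniform choice of chains and crossover point, and all function names are assumptions introduced here. Each move is accepted with a Metropolis rule on the tempered target, in which chain i is weighted by exp(-H(x_i)/t_i).

```python
import math
import random

def energy(x, target):
    """Toy energy: Hamming distance to a hypothetical target vector."""
    return sum(a != b for a, b in zip(x, target))

def accept(log_ratio, rng):
    """Metropolis rule: accept with probability min(1, exp(log_ratio))."""
    return log_ratio >= 0 or rng.random() < math.exp(log_ratio)

def emc_step(pop, temps, target, rng):
    """Apply one mutation sweep, one crossover and one exchange in place."""
    n, d = len(pop), len(target)
    # Mutation: flip one random bit in each chain (Metropolis update).
    for i in range(n):
        prop = pop[i][:]
        k = rng.randrange(d)
        prop[k] = 1 - prop[k]
        delta = energy(pop[i], target) - energy(prop, target)
        if accept(delta / temps[i], rng):
            pop[i] = prop
    # Crossover: swap the tails of two random chains (partial state swap).
    i, j = rng.sample(range(n), 2)
    c = rng.randrange(1, d)
    xi, xj = pop[i][:c] + pop[j][c:], pop[j][:c] + pop[i][c:]
    log_r = ((energy(pop[i], target) - energy(xi, target)) / temps[i]
             + (energy(pop[j], target) - energy(xj, target)) / temps[j])
    if accept(log_r, rng):
        pop[i], pop[j] = xi, xj
    # Exchange: swap the full states of two neighbouring temperatures.
    i = rng.randrange(n - 1)
    log_r = ((energy(pop[i], target) - energy(pop[i + 1], target))
             * (1 / temps[i] - 1 / temps[i + 1]))
    if accept(log_r, rng):
        pop[i], pop[i + 1] = pop[i + 1], pop[i]

rng = random.Random(0)
target = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical optimum
temps = [0.2, 0.5, 1.0, 2.0]        # assumed temperature ladder
pop = [[rng.randint(0, 1) for _ in target] for _ in temps]
best = min(energy(x, target) for x in pop)
for _ in range(2000):
    emc_step(pop, temps, target, rng)
    best = min(best, min(energy(x, target) for x in pop))
```

The published algorithm chooses crossover parents and operators in its own way; the uniform choices above are used only because they keep the proposal symmetric, so the plain Metropolis ratio applies.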
The choice of variables in multivariate regression: a non-conjugate Bayesian decision theory approach
, 1999
Cited by 13 (1 self)
INTRODUCTION. Choice of regressor variables in linear regression has attracted considerable attention in the literature, from forward, backward and stepwise regression, through model choice criteria such as Akaike's information criterion, to Bayesian techniques. We will focus on the Bayesian decision theory framework, first given by Lindley (1968) for univariate multiple regression, where costs attach to the inclusion of regressor variables. Here it is required to predict a future vector observation Y_f comprising r components. Predictions are judged by quadratic loss to which is added a cost penalty on the regressor variables, x_f
Author affiliations: University of Kent at Canterbury, Institute of Mathematics and Statistics, Cornwallis Building, Canterbury, CT2 7NF, UK (FAX 01227-827932, email Philip.J.Brown@ukc.ac.uk); University College London, UK; Texas A & M University, USA.
Bayesian Variable Selection Using the Gibbs Sampler
, 2000
Cited by 7 (1 self)
Specification of the linear predictor for a generalised linear model requires determining which variables to include. We consider Bayesian strategies for performing this variable selection. In particular we focus on approaches based on the Gibbs sampler. Such approaches may be implemented using the publicly available software BUGS. We illustrate the methods using a simple example. BUGS code is provided in an appendix.
1 Introduction
In a Bayesian analysis of a generalised linear model, model uncertainty may be incorporated coherently by specifying prior probabilities for plausible models and calculating posterior probabilities using
$$f(m \mid y) = \frac{f(m)\, f(y \mid m)}{\sum_{m \in M} f(m)\, f(y \mid m)}, \qquad m \in M \tag{1.1}$$
where $m$ denotes the model, $M$ is the set of all models under consideration, $f(m)$ is the prior probability of model $m$ and $f(y \mid m, \beta_m)$ the likelihood of the data $y$ under model $m$. The observed data $y$ contribute to the posterior model probabilities through $f(y \mid m)$, the marginal likelihood calculated...
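As a small worked illustration of equation (1.1), the posterior model probabilities can be computed from prior model probabilities and marginal likelihoods as sketched below. The function name and the three numerical values are hypothetical and chosen only for illustration; in practice the marginal likelihoods would come from the analysis itself. Working on the log scale with a shift by the maximum is an implementation choice made here to avoid underflow, not something stated in the abstract.

```python
import math

def posterior_model_probs(log_prior, log_marginal):
    """Return f(m|y) for each model m, given log f(m) and log f(y|m)."""
    # Numerator of (1.1) on the log scale: log f(m) + log f(y|m).
    log_joint = [lp + lm for lp, lm in zip(log_prior, log_marginal)]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_joint)
    weights = [math.exp(v - shift) for v in log_joint]
    # Denominator of (1.1): the sum over all models in M.
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: three models with equal prior probability.
log_prior = [math.log(1 / 3)] * 3
log_marginal = [-104.2, -101.7, -102.9]   # made-up log f(y|m) values
probs = posterior_model_probs(log_prior, log_marginal)
```

With equal priors, the ranking of the models follows the marginal likelihoods, so the second model receives the largest posterior probability in this made-up example.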
Spatial dynamic factor analysis
, 2006
Cited by 6 (1 self)
A new class of space-time models derived from standard dynamic factor models is proposed. The temporal dependence is modeled by latent factors while the spatial dependence is modeled by the factor loadings. Factor analytic arguments are used to help identify temporal components that summarize most of the spatial variation of a given region. The temporal evolution of the factors is described in a number of forms to account for different aspects of time variation such as trend and seasonality. The spatial dependence is incorporated into the factor loadings by a combination of deterministic and stochastic elements thus giving them more flexibility and generalizing previous approaches. The new structure implies nonseparable space-time variation to observables, despite its conditionally independent nature, while reducing the overall dimensionality, and hence complexity, of the problem. The number of factors is treated as another unknown parameter and fully Bayesian inference is performed via a reversible jump Markov Chain Monte Carlo algorithm. The new class of models is tested against one synthetic dataset and applied to pollution data obtained from the Clean Air Status and Trends Network (CASTNet). Our factor model exhibited better predictive performance when compared to benchmark models, while capturing important aspects of spatial and temporal behavior of the data.
Bayesian Input Variable Selection Using Posterior Probabilities and Expected Utilities
, 2002
Cited by 5 (1 self)
We consider the input variable selection in complex Bayesian hierarchical models. Our goal is to find a model with the smallest number of input variables having statistically or practically at least the same expected utility as the full model with all the available inputs. A good estimate for the expected utility can be computed using cross-validation predictive densities. In the case of input selection and a large number of input combinations, the computation of the cross-validation predictive densities for each model easily becomes computationally prohibitive. We propose to use the posterior probabilities obtained via variable dimension MCMC methods to find out potentially useful input combinations, for which the final model choice and assessment is done using the expected utilities.
P.: Identification of DNA Regulatory Motifs Using Bayesian Variable Selection Suite
- Bioinformatics
Cited by 4 (0 self)
Motivation: Understanding the mechanisms that determine gene expression regulation is an important and challenging problem. A common approach consists of identifying DNA-binding sites from a collection of co-regulated genes and their nearby non-coding DNA sequences. Here, we consider a regression model that linearly relates gene expression levels to a sequence matching score of nucleotide patterns. We use Bayesian models and stochastic search techniques to select transcription factor binding site candidates, as an alternative to stepwise regression procedures used by other investigators.
Results: We demonstrate through simulated data the improved performance of the Bayesian variable selection method compared to the stepwise procedure. We then analyze and discuss the results from experiments involving well studied pathways of S. cerevisiae and S. pombe. We identify regulatory motifs known to be related to the experimental conditions considered. Some of our selected motifs are also in agreement with recent findings by other researchers. In addition, our results include some novel motifs that constitute promising sets for further assessment.
Availability: The Matlab code for running the Bayesian variable selection method may be obtained from the corresponding author.
Contact:
Bioinformatics Advance Access published April 29, 2004. © Oxford University Press 2004; all rights reserved.

