Results 1–10 of 20
Bayesian Adaptive Sampling for Variable Selection and Model Averaging
Cited by 10 (4 self)
Abstract
For the problem of model choice in linear regression, we introduce a Bayesian adaptive sampling algorithm (BAS) that samples models without replacement from the space of models. For problems that permit enumeration of all models, BAS is guaranteed to enumerate the model space in 2^p iterations, where p is the number of potential variables under consideration. For larger problems where sampling is required, we provide conditions under which BAS provides perfect samples without replacement. When the sampling probabilities in the algorithm are the marginal variable inclusion probabilities, BAS may be viewed as sampling models “near” the median probability model of Barbieri and Berger. As marginal inclusion probabilities are not known in advance, we discuss several strategies to estimate the marginal inclusion probabilities adaptively within BAS. We illustrate the performance of the algorithm using simulated and real data and show that BAS can outperform Markov chain Monte Carlo methods. The algorithm is implemented in the R package BAS, available on CRAN.
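The adaptive-sampling idea above can be caricatured in a few lines: draw inclusion indicators from current estimates of the marginal inclusion probabilities, skip any model already visited (BAS proper uses a tree structure to guarantee sampling without replacement), and re-estimate the probabilities from the models scored so far. Everything here is an illustrative simplification; the BIC-style `score`, the clipping bounds, and the skip-duplicates shortcut are stand-ins, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(gamma, X, y):
    # BIC-style model score; a stand-in for the exact marginal likelihood
    n, k = len(y), int(gamma.sum())
    if k == 0:
        rss = float(y @ y)
    else:
        Xg = X[:, gamma.astype(bool)]
        beta, *_ = np.linalg.lstsq(Xg, y, rcond=None)
        rss = float(np.sum((y - Xg @ beta) ** 2))
    return -0.5 * n * np.log(rss / n) - 0.5 * k * np.log(n)

def bas_sketch(X, y, n_iter=50):
    p = X.shape[1]
    rho = np.full(p, 0.5)                  # initial inclusion probabilities
    seen, models, scores = set(), [], []
    for _ in range(n_iter):
        gamma = (rng.random(p) < rho).astype(int)
        if tuple(gamma) in seen:           # BAS proper never repeats a model;
            continue                       # this sketch merely skips duplicates
        seen.add(tuple(gamma))
        models.append(gamma)
        scores.append(score(gamma, X, y))
        # re-estimate marginal inclusion probabilities from models scored so far
        w = np.exp(np.array(scores) - max(scores))
        w /= w.sum()
        rho = np.clip(np.array(models).T @ w, 0.05, 0.95)
    return np.array(models), np.array(scores), rho

# toy data: only the first two of five candidate predictors matter
X = rng.standard_normal((60, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * rng.standard_normal(60)
models, scores, rho = bas_sketch(X, y)
best = models[int(np.argmax(scores))]
```

The key adaptive step is the last line of the loop, which replaces the sampling probabilities with posterior-weighted inclusion frequencies over the models visited so far.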
Bayesian Variable Selection in Structured High-Dimensional Covariate Spaces with Applications in Genomics
Journal of the American Statistical Association, 2010
Cited by 7 (0 self)
Abstract
We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) the dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model-building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences.
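As a rough illustration of an Ising prior on the model space, the sketch below scores 0/1 inclusion vectors on a linear-chain graph; the hyperparameters `a` (sparsity penalty) and `b` (neighbour coupling) are illustrative values, not the paper's calibration. With `b > 0`, selecting contiguous covariates is rewarded over scattered selections of the same size.

```python
import numpy as np

def ising_log_prior(gamma, edges, a=-1.0, b=0.5):
    # unnormalized log Ising prior on 0/1 inclusion indicators:
    # a penalizes model size, b rewards selecting graph neighbours together
    # (a and b are illustrative values, not the paper's calibration)
    gamma = np.asarray(gamma)
    pairs = sum(gamma[j] * gamma[k] for j, k in edges)
    return a * gamma.sum() + b * pairs

# linear chain on six covariates: neighbours are (0,1), (1,2), ..., (4,5)
chain = [(j, j + 1) for j in range(5)]
contiguous = [1, 1, 1, 0, 0, 0]   # three adjacent covariates selected
scattered = [1, 0, 1, 0, 1, 0]    # same size, but no selected neighbours
```

Under this prior the contiguous selection receives strictly higher prior mass than the scattered one, which is exactly the structural information the abstract describes incorporating.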
Feature Selection via Block-Regularized Regression
Cited by 4 (0 self)
Abstract
Identifying covarying causal elements in a very high-dimensional feature space with internal structure, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole-genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered and possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but grouped as contiguous blocks from a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markovian process along the feature sequence that “activates” the regression coefficients in a coupled fashion. We describe a sampling-based learning algorithm and demonstrate the performance of our method on simulated and biological data for marker identification under WGA.
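The coupled "activation" mechanism described above can be mimicked by a two-state first-order Markov chain along the feature ordering, with Laplace-distributed coefficients for the active features; the transition probabilities below are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

def markov_activation(p, stay=0.9, start=0.2):
    # first-order Markov chain of 0/1 activation states along the ordering;
    # a high 'stay' probability yields contiguous active blocks
    s = np.empty(p, dtype=int)
    s[0] = rng.random() < start
    for j in range(1, p):
        s[j] = s[j - 1] if rng.random() < stay else 1 - s[j - 1]
    return s

def draw_coefficients(active, scale=1.0):
    # sparsity-enforcing Laplace (double-exponential) prior on active coefficients
    beta = np.zeros(len(active))
    on = active.astype(bool)
    beta[on] = rng.laplace(0.0, scale, size=int(on.sum()))
    return beta

active = markov_activation(200)
beta = draw_coefficients(active)
# number of contiguous active blocks (runs of 1s)
runs = int(np.sum(np.diff(np.concatenate([[0], active, [0]])) == 1))
```

Because the chain strongly prefers staying in its current state, the nonzero coefficients come in contiguous blocks rather than at random positions, matching the model's stated goal.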
Dependency networks for genome-wide data
Cited by 3 (0 self)
Abstract
We describe a new stochastic search algorithm for linear regression models called the bounded-mode stochastic search (BMSS). We make use of BMSS to perform variable selection and classification, as well as to construct sparse dependency networks. Furthermore, we show how to determine genetic networks from genome-wide data that involve any combination of continuous and discrete variables. We illustrate our methodology with several simulated and real-world datasets.
Bayesian variable selection for logistic models using auxiliary mixture sampling
Journal of Computational and Graphical Statistics, 2008
Cited by 3 (0 self)
Rao-Blackwellization for Bayesian Variable Selection and Model Averaging in Linear and Binary Regression: A Novel Data Augmentation Approach
Cited by 2 (0 self)
Abstract
Choosing the subset of covariates to use in regression or generalized linear models is a ubiquitous problem. The Bayesian paradigm addresses the problem of model uncertainty by considering models corresponding to all possible subsets of the covariates, where the posterior distribution over models is used to select models or combine them via Bayesian model averaging (BMA). Although conceptually straightforward, BMA is often difficult to implement in practice, since either the number of covariates is too large for enumeration of all subsets, calculations cannot be done analytically, or both. For orthogonal designs with the appropriate choice of prior, the posterior probability of any model can be calculated without having to enumerate the entire model space, at a cost that scales linearly with the number of predictors, p. In this article we extend this idea to a much broader class of non-orthogonal design matrices. We propose a novel method which augments the observed non-orthogonal design by at most p new rows to obtain a design matrix with orthogonal columns, and generate the “missing” response variables in a data augmentation algorithm. We show that our data augmentation approach keeps the original posterior distribution of interest unaltered, and develop methods to construct Rao-Blackwellized estimates of several quantities of interest, including posterior model probabilities of any model, which may not be available from an ordinary Gibbs sampler. Our method can be used for BMA in linear regression and binary regression with non-orthogonal design matrices in conjunction with independent “spike and slab” priors with a continuous prior component that is a Cauchy or other heavy-tailed distribution that may be represented as a scale mixture of normals. We provide simulated and real examples to illustrate the methodology. Supplemental materials for the manuscript are available online.
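The row-augmentation construction can be sketched directly: choose c at least as large as the top eigenvalue of XᵀX and append Δ with ΔᵀΔ = cI − XᵀX (here via a Cholesky factor), so the augmented design has exactly orthogonal columns. The choice c = λ_max + 1 below is one convenient option, not the paper's recommendation.

```python
import numpy as np

rng = np.random.default_rng(2)

def orthogonal_augmentation(X):
    # append p rows Delta with Delta.T @ Delta = c*I - X.T @ X, so that the
    # augmented design satisfies X_aug.T @ X_aug = c*I exactly
    p = X.shape[1]
    XtX = X.T @ X
    c = np.linalg.eigvalsh(XtX)[-1] + 1.0   # any c > largest eigenvalue works
    L = np.linalg.cholesky(c * np.eye(p) - XtX)
    Delta = L.T                             # Delta.T @ Delta = L @ L.T
    return np.vstack([X, Delta]), c

X = rng.standard_normal((30, 4))
X_aug, c = orthogonal_augmentation(X)
G = X_aug.T @ X_aug                         # should be c times the identity
```

With the columns orthogonalized, per-column (one-dimensional) posterior computations become independent, which is what makes the linear-in-p cost quoted in the abstract attainable.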
Modeling uncertainty in macroeconomic growth determinants using Gaussian graphical models
2009
Cited by 1 (1 self)
Abstract
Model uncertainty has become a central focus of policy discussion surrounding the determinants of economic growth. Over 140 regressors have been employed in growth empirics due to the proliferation of several new growth theories in the past two decades. Recently, Bayesian model averaging (BMA) has been employed to address model uncertainty and to provide clear policy implications by identifying robust growth determinants. The BMA approaches were, however, limited to linear regression models that abstract from possible dependencies embedded in the covariance structures of growth determinants. The recent empirical growth literature has developed jointness measures to highlight such dependencies. We address model uncertainty and covariate dependencies in a comprehensive Bayesian framework that allows for structural learning in linear regressions and Gaussian graphical models. A common prior specification across the entire comprehensive framework provides consistency. Gaussian graphical models allow for a principled analysis of dependency structures, which lets us generate a much more parsimonious set of fundamental growth determinants. Our empirics are based on a prominent growth dataset with 41 potential economic factors that has been utilized in numerous previous analyses to account for model uncertainty as well as jointness.
Compatibility of prior specifications across linear models
Statistical Science, 2008
Cited by 1 (1 self)
Abstract
Bayesian model comparison requires the specification of a prior distribution on the parameter space of each candidate model. In this connection two concerns arise: on the one hand, the elicitation task rapidly becomes prohibitive as the number of models increases; on the other hand, numerous prior specifications can only exacerbate the well-known sensitivity to prior assignments, thus producing less dependable conclusions. Within the subjective framework, both difficulties can be counteracted by linking priors across models in order to achieve simplification and compatibility; we discuss links with related objective approaches. Given an encompassing, or full, model together with a prior on its parameter space, we review and summarize a few procedures for deriving priors under a submodel, namely marginalization, conditioning, and Kullback–Leibler projection. These techniques are illustrated and discussed with reference to variable selection in linear models adopting a conventional g-prior; comparisons with existing standard approaches are provided. Finally, the relative merits of each procedure are evaluated through simulated and real data sets. Key words and phrases: Bayes factor, compatible prior, conjugate prior, g-prior, hypothesis testing, Kullback–Leibler projection, nested model, variable selection.
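For concreteness, under Zellner's g-prior the Bayes factor of a k-predictor model against the intercept-only null has the closed form (1+g)^((n−k−1)/2) [1 + g(1−R²)]^(−(n−1)/2). The sketch below evaluates its logarithm, defaulting to the unit-information choice g = n, which is a common convention, not necessarily the paper's.

```python
import numpy as np

def g_prior_log_bf(X, y, g=None):
    # log Bayes factor of {intercept + columns of X} against the intercept-only
    # null under Zellner's g-prior; default g = n is the unit-information prior
    n, k = X.shape
    g = float(n) if g is None else g
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    r2 = 1.0 - np.sum((yc - Xc @ beta) ** 2) / np.sum(yc ** 2)
    return 0.5 * (n - k - 1) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 2))
y = 1.5 * X[:, 0] + rng.standard_normal(100)   # only the first column matters
bf_signal = g_prior_log_bf(X[:, :1], y)        # true predictor alone
bf_full = g_prior_log_bf(X, y)                 # adds a pure-noise predictor
```

The (1+g)^((n−k−1)/2) factor is how the g-prior penalizes model size, which is the sort of cross-model prior behaviour the compatibility procedures in the abstract are designed to control.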
Bayesian Restoration of Digital Images Employing Markov Chain Monte Carlo: A Review
2006
Abstract
A review of Bayesian restoration of digital images based on Monte Carlo techniques is presented. The topics covered include likelihood, prior, and posterior distributions; Poisson, binary-symmetric-channel, and Gaussian-channel models of the likelihood; Ising and Potts spin models of the prior; restoration of an image through posterior maximization; statistical estimation of the true image from posterior ensembles; Markov chain Monte Carlo methods; and cluster algorithms.
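The posterior-sampling machinery the review covers can be illustrated with a single-site Gibbs sampler for a ±1 image under an Ising prior and a binary-symmetric-channel likelihood; the coupling strength, flip rate, and sweep count below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)

def gibbs_denoise(noisy, beta=1.5, flip=0.1, sweeps=5):
    # single-site Gibbs sampler: Ising prior with coupling beta plus a
    # binary-symmetric-channel likelihood with flip probability `flip`;
    # pixels take values in {-1, +1}
    x = noisy.copy()
    h, w = x.shape
    llr = 0.5 * np.log((1 - flip) / flip)   # data half-log-likelihood-ratio
    for _ in range(sweeps):
        for i in range(h):
            for j in range(w):
                nb = 0
                if i > 0:     nb += x[i - 1, j]
                if i < h - 1: nb += x[i + 1, j]
                if j > 0:     nb += x[i, j - 1]
                if j < w - 1: nb += x[i, j + 1]
                # conditional log-odds of x_ij = +1 given neighbours and data
                logit = 2 * (beta * nb + llr * noisy[i, j])
                prob = 1.0 / (1.0 + np.exp(-logit))
                x[i, j] = 1 if rng.random() < prob else -1
    return x

# toy truth: left half -1, right half +1, corrupted by a 10% binary channel
truth = np.where(np.arange(16)[None, :] < 8, -1, 1) * np.ones((16, 1), dtype=int)
noise = np.where(rng.random(truth.shape) < 0.1, -truth, truth)
restored = gibbs_denoise(noise)
```

Each sweep resamples every pixel from its full conditional, so the Ising smoothing term removes isolated flipped pixels while the likelihood term keeps the sample anchored to the observed data.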
Orthogonal Data Augmentation for Bayesian Model Averaging
Abstract
Choosing the subset of covariates to use in regression or generalized linear models is a ubiquitous problem. The Bayesian paradigm can easily deal with this problem of model uncertainty by considering models corresponding to all possible subsets of the covariates, where the posterior distribution over models is used to select models or combine them via Bayesian model averaging (BMA). Although conceptually straightforward, BMA is often difficult to implement in practice, since either the number of covariates is too large, or calculations cannot be done analytically, or both. For orthogonal designs with the appropriate choice of prior, the posterior probability of any model can be calculated without having to enumerate the entire model space. In this article we propose a novel method which augments the observed non-orthogonal design by new rows to obtain a design matrix with orthogonal columns. We show that our data augmentation approach keeps the original posterior distribution of interest unaltered, and develop methods to construct Rao-Blackwellized estimates of several quantities of interest, including posterior model probabilities, which may not be available from an ordinary Gibbs sampler. The method can be used for BMA in linear regression with Cauchy or other heavy-tailed priors that may be represented as a scale mixture of normals, as well as binary regression. We provide simulated and real examples to illustrate the methodology. Supplemental materials for the manuscript are available online.