Results 1 - 10
of
13
Bayesian Model Averaging for Linear Regression Models
- Journal of the American Statistical Association
, 1997
"... We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem in ..."
Abstract
-
Cited by 133 (12 self)
- Add to MetaCart
We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of
Approaches for Bayesian variable selection
- Statistica Sinica
, 1997
"... Abstract: This paper describes and compares various hierarchical mixture prior formulations of variable selection uncertainty in normal linear regression models. These include the nonconjugate SSVS formulation of George and McCulloch (1993), as well as conjugate formulations which allow for analytic ..."
Abstract
-
Cited by 75 (4 self)
- Add to MetaCart
Abstract: This paper describes and compares various hierarchical mixture prior formulations of variable selection uncertainty in normal linear regression models. These include the nonconjugate SSVS formulation of George and McCulloch (1993), as well as conjugate formulations which allow for analytical simplification. Hyperparameter settings which base selection on practical significance, and the implications of using mixtures with point priors are discussed. Computational methods for posterior evaluation and exploration are considered. Rapid updating methods are seen to provide feasible methods for exhaustive evaluation using Gray Code sequencing in moderately sized problems, and fast Markov Chain Monte Carlo exploration in large problems. Estimation of normalization constants is seen to provide improved posterior estimates of individual model probabilities and the total visited probability. Various procedures are illustrated on simulated sample problems and on a real problem concerning the construction of financial index tracking portfolios.
Benchmark Priors for Bayesian Model Averaging
- FORTHCOMING IN THE JOURNAL OF ECONOMETRICS
, 2001
"... In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequ ..."
Abstract
-
Cited by 61 (3 self)
- Add to MetaCart
In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequences. Here we focus on the practically relevant situation where we need to entertain a (large) number of sampling models and we have (or wish to use) little or no subjective prior information. We aim at providing an “automatic” or “benchmark” prior structure that can be used in such cases. We focus on the Normal linear regression model with uncertainty in the choice of regressors. We propose a partly noninformative prior structure related to a Natural Conjugate g-prior specification, where the amount of subjective information requested from the user is limited to the choice of a single scalar hyperparameter g0j. The consequences of different choices for g0j are examined. We investigate theoretical properties, such as consistency of the implied Bayesian procedure. Links with classical information criteria are provided. More importantly, we examine the finite sample implications of several choices of g0j in a simulation study. The use of the MC3 algorithm of Madigan and York (1995), combined with efficient coding in Fortran, makes it feasible to conduct large simulations. In addition to posterior criteria, we shall also compare the predictive performance of different priors. A classic example concerning the economics of crime will also be provided and contrasted with results in the literature. The main findings of the paper will lead us to propose a “benchmark” prior specification in a linear regression context with model uncertainty.
The practical implementation of Bayesian model selection
- Institute of Mathematical Statistics
, 2001
"... In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is r ..."
Abstract
-
Cited by 48 (2 self)
- Add to MetaCart
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.
Bayesian model averaging
- STAT.SCI
, 1999
"... Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to over-con dent inferences and decisions tha ..."
Abstract
-
Cited by 29 (0 self)
- Add to MetaCart
Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to over-con dent inferences and decisions that are more risky than one thinks they are. Bayesian model averaging (BMA) provides a coherent mechanism for accounting for this model uncertainty. Several methods for implementing BMA haverecently emerged. We discuss these methods and present anumber of examples. In these examples, BMA provides improved out-of-sample predictive performance. We also provide a catalogue of
On the Relationship Between Markov Chain Monte Carlo Methods for Model Uncertainty
- JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
, 2001
"... This article considers Markov chain computational methods for incorporating uncertainty about the dimension of a parameter when performing inference within a Bayesian setting. A general class of methods is proposed for performing such computations, based upon a product space representation of the ..."
Abstract
-
Cited by 20 (3 self)
- Add to MetaCart
This article considers Markov chain computational methods for incorporating uncertainty about the dimension of a parameter when performing inference within a Bayesian setting. A general class of methods is proposed for performing such computations, based upon a product space representation of the problem which is similar to that of Carlin and Chib. It is shown that all of the existing algorithms for incorporation of model uncertainty into Markov chain Monte Carlo (MCMC) can be derived as special cases of this general class of methods. In particular, we show that the popular reversible jump method is obtained when a special form of Metropolis--Hastings (M--H) algorithm is applied to the product space. Furthermore, the Gibbs sampling method and the variable selection method are shown to derive straightforwardly from the general framework. We believe that these new relationships between methods, which were until now seen as diverse procedures, are an important aid to the understanding of MCMC model selection procedures and may assist in the future development of improved procedures. Our discussion also sheds some light upon the important issues of "pseudo-prior" selection in the case of the Carlin and Chib sampler and choice of proposal distribution in the case of reversible jump. Finally, we propose efficient reversible jump proposal schemes that take advantage of any analytic structure that may be present in the model. These proposal schemes are compared with a standard reversible jump scheme for the problem of model order uncertainty in autoregressive time series, demonstrating the improvements which can be achieved through careful choice of proposals
Discovering and Reconciling Value Conflicts for Numerical Data Integration
, 2001
"... The built-up in Information Technology capital fueled by the Internet and cost-e#ectiveness of new telecommunications technologies has led to a proliferation of information systems that are in dire need to exchange information but incapable of doing so due to the lack of semantic interoperability. I ..."
Abstract
-
Cited by 14 (4 self)
- Add to MetaCart
The built-up in Information Technology capital fueled by the Internet and cost-e#ectiveness of new telecommunications technologies has led to a proliferation of information systems that are in dire need to exchange information but incapable of doing so due to the lack of semantic interoperability. It is now evident that physical connectivity (the ability to exchange bits and bytes) is no longer adequate: the integration of data from autonomous and heterogeneous systems calls for the prior identification and resolution of semantic conflicts that may be present. Unfortunately, this requires the system integrator to sift through the data from disparate systems in a painstaking manner. We suggest that this process can be partially automated by presenting a methodology and technique for the discovery of potential semantic conflicts as well as the underlying data transformation needed to resolve the conflicts. Our methodology begins by classifying data value conflicts into two categories: context independent and context dependent. While context independent conflicts are usually caused by unexpected errors, the context dependent conflicts are primarily a result of the heterogeneity of underlying data sources. To facilitate data integration, data value conversion rules are proposed to describe the quantitative relationships among data values involving context dependent conflicts. A general approach is proposed to discover data value conversion rules from the data. The approach consists of the five major steps: relevant attribute analysis, candidate model selection, conversion function generation, conversion function selection and conversion rule formation. It is being implemented in a prototype system, DIRECT, for business data using statistics based techniques. Preliminary stu...
Bayesian Variable Selection Using the Gibbs Sampler
, 2000
"... Specification of the linear predictor for a generalised linear model requires determining which variables to include. We consider Bayesian strategies for performing this variable selection. In particular we focus on approaches based on the Gibbs sampler. Such approaches may be implemented using the ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
Specification of the linear predictor for a generalised linear model requires determining which variables to include. We consider Bayesian strategies for performing this variable selection. In particular we focus on approaches based on the Gibbs sampler. Such approaches may be implemented using the publically available software BUGS. We illustrate the methods using a simple example. BUGS code is provided in an appendix. 1 Introduction In a Bayesian analysis of a generalised linear model, model uncertainty may be incorporated coherently by specifying prior probabilities for plausible models and calculating posterior probabilities using f(mjy) = f(m)f(yjm) P m2M f(m)f(y jm) ; m 2 M (1.1) where m denotes the model, M is the set of all models under consideration, f (m) is the prior probability of model m and f (yjm; fi m ) the likelihood of the data y under model m. The observed data y contribute to the posterior model probabilities through f(yjm), the marginal likelihood calculated...
Direct: a system for mining data value conversion rules from disparate sources, Decision Support Systems
, 2001
"... may be quoted without explicit permission, provided that full credit including © notice is given to the source. This paper also can be downloaded without charge from the ..."
Abstract
-
Cited by 7 (4 self)
- Add to MetaCart
may be quoted without explicit permission, provided that full credit including © notice is given to the source. This paper also can be downloaded without charge from the

