Bayesian Model Averaging for Linear Regression Models
 Journal of the American Statistical Association
, 1997
Cited by 184 (13 self)
We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of
Model Selection and Accounting for Model Uncertainty in Linear Regression Models
, 1993
Cited by 47 (6 self)
We consider the problems of variable selection and accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. The complete Bayesian solution to this problem involves averaging over all possible models when making inferences about quantities of interest. This approach is often not practical. In this paper we offer two alternative approaches. First we describe a Bayesian model selection algorithm called "Occam's Window" which involves averaging over a reduced set of models. Second, we describe a Markov chain Monte Carlo approach which directly approximates the exact solution. Both these model averaging procedures provide better predictive performance than any single model which might reasonably have been selected. In the extreme case where there are many candidate predictors but there is no relationship between any of them and the response, standard variable selection procedures often choose some subset of variables that yields a high R² and a highly significant overall F value. We refer to this unfortunate phenomenon as "Freedman's Paradox" (Freedman, 1983). In this situation, Occam's Window usually indicates the null model as the only one to be considered, or else a small number of models including the null model, thus largely resolving the paradox.
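The idea of averaging over a reduced set of models can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it assumes equal prior model probabilities, approximates posterior model probabilities from BIC values (the abstract does not mention BIC), and applies only the simplest form of an Occam's Window rule, dropping models that are more than `cutoff` times less probable than the best one. The function names, the BIC values, and the cutoff of 20 are all hypothetical.

```python
import math

def posterior_model_probs(bic_by_model):
    """Approximate posterior model probabilities from BIC (assumed approximation)."""
    best = min(bic_by_model.values())
    # Subtract the best BIC before exponentiating to avoid underflow.
    weights = {m: math.exp(-0.5 * (b - best)) for m, b in bic_by_model.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

def occams_window(probs, cutoff=20.0):
    """Keep models within a factor `cutoff` of the most probable one, then renormalize."""
    top = max(probs.values())
    kept = {m: p for m, p in probs.items() if p * cutoff >= top}
    total = sum(kept.values())
    return {m: p / total for m, p in kept.items()}

def averaged_prediction(preds_by_model, probs):
    """Model-averaged prediction: sum of per-model predictions weighted by probability."""
    return sum(probs[m] * preds_by_model[m] for m in probs)

# Hypothetical BIC values for four candidate predictor subsets.
bic = {(): 100.0, ("x1",): 92.0, ("x1", "x2"): 93.0, ("x2",): 110.0}
window = occams_window(posterior_model_probs(bic))
# Hypothetical point predictions from the two models that survive the window.
prediction = averaged_prediction({("x1",): 3.0, ("x1", "x2"): 3.4}, window)
```

With these made-up numbers, only the two models containing `x1` survive the window, and the averaged prediction falls between their individual predictions.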
Media bias and reputation
, 2005
Cited by 38 (3 self)
A Bayesian consumer who is uncertain about the quality of an information source will infer that the source is of higher quality when its reports conform to the consumer’s prior expectations. We use this fact to build a model of media bias in which firms slant their reports toward the prior beliefs of their customers in order to build a reputation for quality. Bias emerges in our model even though it can make all market participants worse off. The model predicts that bias will be less severe when consumers receive independent evidence on the true state of the world, and that competition between independently owned news outlets can reduce bias. We present a variety of empirical evidence consistent with these predictions. JEL classification: L82, L10, D83
Assessing the calibration of naive Bayes’ posterior estimates
, 2000
Cited by 24 (1 self)
In this paper, we give evidence that the posterior distribution of Naive Bayes goes to zero or one exponentially with document length. While exponential change may be expected as new bits of information are added, adding new words does not always correspond to new information. Essentially as a result of its independence assumption, the estimates grow too quickly. We investigate one parametric family that attempts to downweight the growth rate. The parameters of this family are estimated using a maximum likelihood scheme, and the results are evaluated.
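The saturation effect is easy to reproduce on a toy two-class model. The sketch below is an illustration under stated assumptions: the word likelihoods are invented, and the `scale` parameter is a hypothetical way of damping the log-odds, not the parametric family studied in the paper.

```python
import math

def nb_posterior(words, loglik_pos, loglik_neg, prior_pos=0.5, scale=1.0):
    """Posterior P(pos | words) for a two-class naive Bayes model.

    `scale` multiplies the summed log-likelihood ratio; scale=1.0 is plain
    naive Bayes, and scale<1.0 is a hypothetical damping of the growth rate.
    """
    log_odds = math.log(prior_pos / (1.0 - prior_pos))
    log_odds += scale * sum(loglik_pos[w] - loglik_neg[w] for w in words)
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

# Invented per-word likelihoods for the two classes.
loglik_pos = {"goal": math.log(0.6), "vote": math.log(0.4)}
loglik_neg = {"goal": math.log(0.4), "vote": math.log(0.6)}

# Repeating one mildly informative word adds no new information,
# yet the posterior odds grow as 1.5 ** n.
p1 = nb_posterior(["goal"] * 1, loglik_pos, loglik_neg)
p10 = nb_posterior(["goal"] * 10, loglik_pos, loglik_neg)
p10_damped = nb_posterior(["goal"] * 10, loglik_pos, loglik_neg, scale=0.1)
```

Here a single occurrence gives a posterior of 0.6, ten repetitions push it above 0.98, and scaling the log-odds by 0.1 brings the ten-word document back to 0.6.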
The effects of averaging subjective probability estimates between and within judges
 Journal of Experimental Psychology: Applied
, 2000
Cited by 17 (3 self)
The average probability estimate of J > 1 judges is generally better than its components. Two studies test 3 predictions regarding averaging that follow from theorems based on a cognitive model of the judges and idealizations of the judgment situation. Prediction 1 is that the average of conditionally pairwise independent estimates will be highly diagnostic, and Prediction 2 is that the average of dependent estimates (differing only by independent error terms) may be well calibrated. Prediction 3 contrasts between- and within-subject averaging. Results demonstrate the predictions' robustness by showing the extent to which they hold as the information conditions depart from the ideal and as J increases. Practical consequences are that (a) substantial improvement can be obtained with as few as 2 to 6 judges and (b) the decision maker can estimate the nature of the expected improvement by considering the information conditions. On many occasions, experts are required to provide decision makers or policymakers with subjective probability estimates of uncertain events (Morgan & Henrion, 1990). The extensive literature (e.g., Harvey, 1997; McClelland & Bolger, 1994) on the topic shows that in general, but with clear exceptions, subjective
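A small simulation illustrates why an average of noisy estimates tends to score better than its components. It is a sketch under simplifying assumptions that are not taken from the paper: every judge reports the true event probability plus independent Gaussian error, clipped to [0, 1], and accuracy is measured with the Brier score. All names and parameter values are hypothetical.

```python
import random

def brier(forecasts, outcomes):
    """Mean squared difference between probability forecasts and 0/1 outcomes."""
    return sum((f - y) ** 2 for f, y in zip(forecasts, outcomes)) / len(outcomes)

def simulate(num_judges=4, num_events=2000, noise_sd=0.2, seed=0):
    """Return (mean Brier score of individual judges, Brier score of their average)."""
    rng = random.Random(seed)
    truths = [rng.random() for _ in range(num_events)]
    outcomes = [1 if rng.random() < p else 0 for p in truths]
    # Each judge sees the true probability through independent noise.
    judges = [
        [min(1.0, max(0.0, p + rng.gauss(0.0, noise_sd))) for p in truths]
        for _ in range(num_judges)
    ]
    averaged = [sum(column) / num_judges for column in zip(*judges)]
    individual = sum(brier(j, outcomes) for j in judges) / num_judges
    return individual, brier(averaged, outcomes)

individual_score, averaged_score = simulate()
```

Because squared error is convex, the Brier score of the averaged forecast can never exceed the mean Brier score of the individual judges; how large the gain is depends on the noise level and the number of judges.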
The expected value of information and the probability of surprise
 Risk Anal
, 1999
Cited by 11 (1 self)
Risk assessors attempting to use probabilistic approaches to describe uncertainty often find themselves in a data-sparse situation: available data are only partially relevant to the parameter of interest, so one needs to adjust empirical distributions, use explicit judgmental distributions, or collect new data. In determining whether or not to collect additional data, whether by measurement or by elicitation of experts, it is useful to consider the expected value of the additional information. The expected value of information depends on the prior distribution used to represent current information; if the prior distribution is too narrow, in many risk-analytic cases the calculated expected value of information will be biased downward. The well-documented tendency toward overconfidence, including the neglect of potential surprise, suggests this bias may be substantial. We examine the expected value of information, including the role of surprise, test for bias in estimating the expected value of information, and suggest procedures to guard against overconfidence and underestimation of the expected value of information when developing prior distributions and when combining distributions obtained from multiple experts. The methods are illustrated with applications to potential carcinogens in food, commercial energy demand, and global climate change. KEY WORDS: Probability; uncertainty; data; risk assessment.
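The downward bias from a too-narrow prior can be seen in a toy decision problem. The sketch below computes the expected value of perfect information for a discrete prior; the two actions, the loss table, and both priors are invented for illustration and do not come from the paper.

```python
def expected_value_of_perfect_information(prior, losses):
    """EVPI for a discrete problem.

    prior:  {state: probability}
    losses: {action: {state: loss}}
    """
    # Best single action chosen now, under the prior alone.
    loss_without_info = min(
        sum(prior[s] * loss[s] for s in prior) for loss in losses.values()
    )
    # Best action chosen after the state is revealed, averaged over the prior.
    loss_with_info = sum(
        prior[s] * min(loss[s] for loss in losses.values()) for s in prior
    )
    return loss_without_info - loss_with_info

# Hypothetical losses: "act" has a fixed cost, "wait" costs more as the state worsens.
losses = {
    "act": {0: 1.0, 1: 1.0, 2: 1.0},
    "wait": {0: 0.0, 1: 1.0, 2: 2.0},
}
wide_prior = {0: 0.25, 1: 0.5, 2: 0.25}
narrow_prior = {0: 0.05, 1: 0.9, 2: 0.05}  # overconfident: little room for surprise

evpi_wide = expected_value_of_perfect_information(wide_prior, losses)
evpi_narrow = expected_value_of_perfect_information(narrow_prior, losses)
```

Both priors have the same mean, but the wide prior gives an EVPI of 0.25 while the narrow one gives about 0.05, so an overconfident analyst would undervalue collecting more data.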
Enhancing the Predictive Performance of Bayesian Graphical Models
 Communications in Statistics – Theory and Methods
, 1995
Cited by 7 (4 self)
Both knowledge-based systems and statistical models are typically concerned with making predictions about future observables. Here we focus on assessment of predictive performance and provide two techniques for improving the predictive performance of Bayesian graphical models. First, we present Bayesian model averaging, a technique for accounting for model uncertainty. Second, we describe a technique for eliciting a prior distribution for competing models from domain experts. We explore the predictive performance of both techniques in the context of a urological diagnostic problem. KEYWORDS: Prediction; Bayesian graphical model; Bayesian network; Decomposable model; Model uncertainty; Elicitation. 1 Introduction Both statistical methods and knowledge-based systems are typically concerned with combining information from various sources to make inferences about prospective measurements. Inevitably, to combine information, we must make modeling assumptions. It follows that we should car...
On Optimal Sequential Prediction for General Processes
 IEEE Transactions on Information Theory
, 2001
Cited by 7 (1 self)
In the stochastic sequential prediction problem, the elements of a random process $X_1, X_2, \ldots \in \mathbb{R}$ are successively revealed to a forecaster. At each time $t$ the forecaster makes a prediction $F_t$ of $X_t$ based only on $X_1, \ldots, X_{t-1}$; when $X_t$ is revealed, the forecaster incurs a loss $\ell(F_t, X_t)$. This paper considers several aspects of the sequential prediction problem for unbounded, nonstationary processes under $p$th power loss, $1 < p < \infty$. In the first part of the paper it is shown that Bayes prediction schemes are Cesaro optimal under general conditions, that Cesaro optimal prediction schemes are unique in a natural sense, and that Cesaro optimality is equivalent to a form of weak calibration. Extensions of the existence and uniqueness results to generalized prediction, and prediction from observations with additive noise, are established.
Two-stage dynamic signal detection: A theory of choice, decision time, and confidence
, 2010
Cited by 7 (0 self)
The 3 most often-used performance measures in the cognitive and decision sciences are choice, response or decision time, and confidence. We develop a random walk/diffusion theory, 2-stage dynamic signal detection (2DSD) theory, that accounts for all 3 measures using a common underlying process. The model uses a drift diffusion process to account for choice and decision time. To estimate confidence, we assume that evidence continues to accumulate after the choice. Judges then interrupt the process to categorize the accumulated evidence into a confidence rating. The model explains all known interrelationships between the 3 indices of performance. Furthermore, the model also accounts for the distributions of each variable in both a perceptual and general knowledge task. The dynamic nature of the model also reveals the moderating effects of time pressure on the accuracy of choice and confidence. Finally, the model specifies the optimal solution for giving the fastest choice and confidence rating for a given level of choice and confidence accuracy. Judges are found to act in a manner consistent with the optimal solution when making confidence judgments.
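The two stages described in the abstract can be mimicked with a discretized random walk. This is a rough sketch, not the authors' model specification: the discretization, the symmetric thresholds, the fixed number of post-decision steps, the confidence criteria, and every parameter name and value are assumptions made for illustration.

```python
import random

def simulate_2dsd(drift=2.0, threshold=1.0, post_steps=50, dt=0.01,
                  criteria=(-0.5, 0.0, 0.5, 1.0), rng=None):
    """Simulate one trial; returns (choice, decision_time, confidence_rating)."""
    rng = rng or random.Random()
    step_sd = dt ** 0.5  # unit diffusion coefficient (assumed)
    evidence, steps = 0.0, 0
    # Stage 1: accumulate evidence until it crosses +threshold or -threshold.
    while abs(evidence) < threshold:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        steps += 1
    choice = 1 if evidence > 0 else -1
    # Stage 2: evidence keeps accumulating after the choice is made.
    for _ in range(post_steps):
        evidence += drift * dt + rng.gauss(0.0, step_sd)
    # Categorize the evidence, signed in favor of the choice, into a rating
    # by counting how many criteria it exceeds.
    signed = choice * evidence
    confidence = sum(signed > c for c in criteria)
    return choice, steps * dt, confidence

rng = random.Random(1)
trials = [simulate_2dsd(rng=rng) for _ in range(200)]
accuracy = sum(choice == 1 for choice, _, _ in trials) / len(trials)
```

With a positive drift, the upper boundary is the correct response, so `accuracy` is the proportion of correct choices; post-decision evidence that drifts back toward the opposite boundary yields a low confidence rating.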
Interpreting and Unifying Outlier Scores
Cited by 6 (2 self)
Outlier scores provided by different outlier models differ widely in their meaning, range, and contrast between different outlier models and, hence, are not easily comparable or interpretable. We propose a unification of outlier scores provided by various outlier models and a translation of the arbitrary “outlier factors” to values in the range [0, 1] interpretable as values describing the probability of a data object being an outlier. As an application, we show that this unification facilitates enhanced ensembles for outlier detection.
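One way to map arbitrary outlier factors into [0, 1] is to standardize them and pass them through the Gaussian error function. The abstract does not say which transformation the authors use, so the mapping below is an assumed example; the function name and the sample scores are hypothetical.

```python
import math
import statistics

def gaussian_scaling(scores):
    """Map raw outlier scores to [0, 1], assuming higher raw scores mean 'more outlying'.

    Scores at or below the mean map to 0; scores far above the mean approach 1.
    """
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        return [0.0] * len(scores)  # no contrast between objects
    return [
        max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2.0))))
        for s in scores
    ]

raw_scores = [1.0, 1.1, 0.9, 1.0, 5.0]  # invented outlier factors
unified = gaussian_scaling(raw_scores)
```

Once scores from different detectors share the same [0, 1] scale, they can be combined, for example by averaging per object, which is the kind of ensemble use the abstract alludes to.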