## Statistical Methods for Eliciting Probability Distributions (2005)

Venue: Journal of the American Statistical Association

Citations: 52 (2 self)

### BibTeX

@ARTICLE{Garthwaite05statisticalmethods,

author = {Paul H. Garthwaite and Joseph B. Kadane and Anthony O'Hagan},

title = {Statistical Methods for Eliciting Probability Distributions},

journal = {Journal of the American Statistical Association},

year = {2005},

pages = {680--700}

}

### Abstract

Elicitation is a key task for subjectivist Bayesians. While skeptics hold that it cannot (or perhaps should not) be done, in practice it brings statisticians closer to their clients and subject-matter-expert colleagues. This paper reviews the state of the art, reflecting the experience of statisticians informed by the fruits of a long line of psychological research into how people represent uncertain information cognitively, and how they respond to questions about that information. In a discussion of the elicitation process, the first issue to address is what it means for an elicitation to be successful, i.e., what criteria should be employed? Our answer is that a successful elicitation faithfully represents the opinion of the person being elicited. It is not necessarily “true” in some objectivistic sense, and cannot be judged that way. We see elicitation as simply part of the process of statistical modeling. Indeed, in a hierarchical model it is ambiguous at which point the likelihood ends and the prior begins. Thus the same kinds of judgment that inform statistical modeling in general also inform elicitation of prior distributions.

### Citations

1620 | Judgment under uncertainty: Heuristics and biases - Kahneman, Slovic, et al. - 1982 |

547 | Availability: A Heuristic for Judging Frequency and Probability - Tversky, Kahneman - 1973 |

Citation Context: ...n by their third letter (e.g. park, bird, wire, ...). Hence, most people judge that “r” is more likely to be the first letter of a word, rather than the third letter, although the reverse is true (Tversky and Kahneman, 1973). Recall is also affected by factors such as familiarity, salience and recency, and newsworthy events also impact disproportionately on our memory, so you might overestimate the probability of a p...

489 | Uncertainty: a guide to dealing with uncertainty in quantitative risk and policy analysis - Morgan, Henrion - 1990 |

Citation Context: ... line whose endpoints are 0 and 1 (for probabilities) or 0–100% for proportion. Probability wheels (Spetzler and Stael von Holstein 1975) are another visual aid that has been used with some success (Morgan and Henrion 1990, p. 126). In its simplest form, a probability wheel comprises a round pie-shaped disc of one color that is partly covered by a “slice” of a different color and a pointer. The size of the slice can be ...

402 | On the psychology of prediction - Kahneman, Tversky - 1973 |

Citation Context: ...which Mr. X is representative of these stereotypes. They completely ignore base rates, such as the relative number of salesmen to librarians, and assign a high probability to Mr. X being a librarian (Kahneman and Tversky, 1973). Similar results have been obtained by Hammerton (1975), and Nisbett et al. (1976). Another commonly used heuristic is judgement by availability. This is used when a person estimates the frequency o...

376 | Influence diagrams - Howard, Matheson - 1984 |

Citation Context: ... to influence how people consider probabilities have also been explored, such as asking them to suggest scenarios that would lead to an unlikely event (Slovic and Fischhoff 1977), influence diagrams (Howard and Matheson 1984), getting subjects to think carefully about the substantive details of each judgment (Koriat, Lichtenstein, and Fischhoff 1980), and disaggregating an implicit hypothesis into its constituent hypothes...

293 | Decision Analysis: Introductory Lectures on Choices Under Uncertainty - Raiffa - 1968 |

Citation Context: ...eristics have an effect on how an expert views a problem and the assessments that are elicited. Visual aids to help people quantify their opinions have been tried, such as urns full of colored balls (Raiffa 1968), light pens on colored screens (Barclay and Randall 1975), or simply asking assessors to mark a point on a line whose endpoints are 0 and 1 (for probabilities) or 0–100% for proportion. Probability ...

284 | The growth of logical thinking: From childhood to adolescence - Inhelder, Piaget - 1958 |

Citation Context: ...ailable data, sometimes basing their judgments on just the proportion of time that the positive outcome for one of the binary variables occurred with a positive outcome for the other (Smedslund 1963; Inhelder and Piaget 1958; Jenkins and Ward 1965; Ward and Jenkins 1965). Statistically, it is important to distinguish between eliciting an expert’s beliefs about a population correlation coefficient and eliciting the value ...

218 | Combining probability distributions: a critique and an annotated bibliography - Genest, Zidek - 1986 |

Citation Context: ...rt’s opinion that a certain set has zero probability implies that the pool must also assign zero probability to that set. (See Genest and Zidek 1986 for a wide-ranging discussion of these issues.) Both linear and logarithmic pools allow the assignment of different weights to the experts, which can be used to give more weight to experts whose prob...

207 | Calibration of probabilities: The state of the art in 1980 - Lichtenstein, Fischhoff, et al. - 1982 |

Citation Context: ... of them actually occur. Then the thought is that when the person announces p as his or her probability of some event, knowing better, the user of this information has g(p) as his or her probability (Lichtenstein et al. 1982). Such a program has the following flaw. Suppose that the person being elicited is faced with a coin that he or she believes to be fair, and hence announces p = 1/2 as the elicited probability of “ta...

193 | Belief in the law of small numbers - Tversky, Kahneman - 1971 |

Citation Context: ...o tailed). You now have cause to run an additional group of ten subjects. What do you think the probability is that the results will be significant, by a one-tailed test, separately for this group?” (Tversky and Kahneman, 1971, p. 105) The median answer from the two groups was 0.85. However, if one assumes a non-informative prior distribution for the mean before the first sample was taken, then the true probability is only 0...

183 | Support Theory: A Nonextensional Representation of Subjective Probability - Tversky, Koehler - 1994 |

149 | A progress report on the training of probability assessors - Alpert, Raiffa - 1969 |

Citation Context: ...feel there is only a 1% probability the true answer would exceed your estimate. (b) Make a low estimate such that you feel there is only a 1% probability the true answer would be below this estimate (Alpert and Raiffa 1969, pp. 16–17). It should have been somewhat of a “surprise” to a subject to find the true value of a quantity falling outside an interval; 43% of all assessments produced such surprises. This informati...

132 | Experts in Uncertainty: Opinion and Subjective - Cooke - 1991 |

128 | Comparison of Bayesian and regression approaches to the study of information processing - Slovic, Lichtenstein - 1971 |

124 | On Narrow Norms and Vague Heuristics: A Reply to Kahneman and Tversky - Gigerenzer - 1996 |

Citation Context: ...omplete it within 5 years?” rather than as single (one-shot) events, like “if a new Ph.D. student is picked at random, what is the probability that he or she will complete the Ph.D. within 5 years?” (Gigerenzer 1996; Koehler 1996). As noted earlier, a good elicitation method should yield a probability distribution that accurately reflects the expert’s opinion, but this is hard to check. A pragmatic alternative i...

110 | Reasons for confidence - Koriat, Lichtenstein, et al. - 1980 |

Citation Context: ...t would lead to an unlikely event (Slovic and Fischhoff, 1977), influence diagrams (Howard and Matheson, 1984), getting subjects to think carefully about the substantive details of each judgement (Koriat et al., 1980), and disaggregating an implicit hypothesis into its constituent hypotheses (Johnson et al., 1993). As an example of the effect of disaggregation, Fischhoff, Slovic and Lichtenstein (1978) questioned...

96 | Framing, probability distortions, and insurance decisions - Johnson, Hershey, et al. - 1993 |

Citation Context: ... Matheson, 1984), getting subjects to think carefully about the substantive details of each judgement (Koriat et al., 1980), and disaggregating an implicit hypothesis into its constituent hypotheses (Johnson et al., 1993). As an example of the effect of disaggregation, Fischhoff, Slovic and Lichtenstein (1978) questioned experts (car mechanics) about the probable reasons for a car not starting. The experts assessed t...

75 | Predictive Model Selection - Laud, Ibrahim - 1995 |

Citation Context: ...e or she then states the updated median of their absolute difference. The approach of I&L uses assessments of the mean and variance of the precision (σ⁻²) to determine ω and δ, and in related work (Laud and Ibrahim 1995), I&L used assessments of the median and the 95th percentile of the distribution of the precision. Oman did not elicit ω and δ, and restricted his posterior analysis to inferences that depend only on...

71 | Kendall’s Advanced Theory of Statistics, Volume 2b: Bayesian Inference - O’Hagan - 1994 |

70 | Eliciting and analyzing expert judgment: A practical guide - Meyer, Booker - 1991 |

Citation Context: ...echniques commonly require novel assessment tasks. An appreciation of the strategies people use to quantify their opinions can give an indication of how (and how well) these tasks might be performed (Meyer and Booker, 2001). One commonly used heuristic is judgement by representativeness. This is applicable for questions of the form: What is the probability that an object A belongs to a class B? What is the probability ...

69 | The use of nonlinear, noncompensatory models - Einhorn - 1970 |

Citation Context: ...used to forecast a second sample of predictions. The forecasts produced in this way were only slightly less accurate than those produced by a model actually based on the second sample of predictions (Einhorn 1971; Slovic and Lichtenstein 1968; Wiggins and Hoffman 1968). Experiments also show that, provided that cues are monotonically related to the predicted variable, a simple linear combination of main effec...

69 | Judgment of contingency between responses and outcomes - Jenkins, Ward - 1965 |

Citation Context: ...basing their judgements on just the proportion of time the positive outcome for one of the binary variables occurred with a positive outcome for the other (Smedslund, 1963; Inhelder and Piaget, 1958; Jenkins and Ward, 1965; Ward and Jenkins, 1965). Statistically, it is important to distinguish between eliciting an expert’s beliefs about a population correlation coefficient and eliciting the value of the correlation in ...

66 | Variants of Uncertainty - Kahneman, Tversky - 1982 |

61 | Interactive elicitation of opinion for a normal linear model - Kadane, Dickey, et al. - 1980 |

54 | Group consensus probability distributions: a critical survey - French - 1985 |

54 | The relative importance of probabilities and payoffs in risk taking - Slovic, Lichtenstein - 1968 |

Citation Context: ...st a second sample of predictions. The forecasts produced in this way were only slightly less accurate than those produced by a model actually based on the second sample of predictions (Einhorn 1971; Slovic and Lichtenstein 1968; Wiggins and Hoffman 1968). Experiments also show that, provided that cues are monotonically related to the predicted variable, a simple linear combination of main effects will do a remarkably good j...

52 | Fault trees: Sensitivity of estimated failure probabilities to problem representation. Journal of Experimental Psychology: Human Perception and Performance - Fischhoff, Slovic, et al. - 1978 |

52 | Measuring the vague meanings of probability terms - Wallsten, Budescu, et al. - 1986 |

Citation Context: ...the probabilities that different people attach to the same phrase, and the context also affects the probability that a person associates with a phrase (Lichtenstein and Newman 1967; Beyth-Marom 1982; Wallsten et al. 1986). The response mode in which subjects are asked to give assessments also affects judgments. For example, Gigerenzer (1996) found that numeric expressions that are formally equivalent, such as frequen...

51 | Man as an intuitive statistician - Peterson, Beach - 1967 |

Citation Context: ...rform well, and sample sizes, sequence lengths, and prior probabilities have been varied. Some of these changes have influenced the degree of conservatism, but they have not eliminated it (see, e.g., Peterson and Beach 1967, pp. 32–33, for a review of such experiments). In some experiments, however, the basic experimental situation has been modified so as to make it more complex, and in these more complex situations, co...

50 | I knew it would happen: Remembered probabilities of once-future things. Organizational Behavior and Human Performance - Fischhoff, Beyth - 1975 |

50 | Eliciting expert beliefs in substantial practical applications - O’Hagan - 1998 |

49 | A probabilistic analysis of the Sacco and Vanzetti evidence - Kadane, Schum - 1996 |

46 | An Introduction to Bayesian Inference and Decision - Winkler - 1972 |

45 | Social Choice and Individual - Arrow - 1963 |

45 | Scoring rules for continuous probability distributions - Matheson, Winkler - 1976 |

44 | Separating probability elicitation from utilities - Kadane, Winkler - 1988 |

Citation Context: ...attention (and grants) depends on how urgent society perceives the issues the expert studies to be. Hence such an expert has an incentive to emphasize the dangers. (For more on this kind of bias, see Kadane and Winkler 1988.) In Section 5 we consider the case of multiple experts, where often the desire is to combine the expertise of several people. In such a case it is sensible to try to ensure that the experts’ knowled...

42 | An overview of robust Bayesian analysis (with Discussion) - Berger - 1994 |

41 | Ratio scales and category scales for a dozen perceptual continua - Stevens, Galanter - 1957 |

Citation Context: ...t any form of prior distribution. Several experiments have investigated subjects’ capability to judge sample proportions (Erlick 1964; Nash 1964; Pitz 1965, 1966; Shuford 1961; Simpson and Voss 1961; Stevens and Galanter 1957). In these experiments, binary data were displayed to subjects for a limited period of time, and the subjects were then asked to estimate one of the sample proportions. For example, Shuford (1961) pr...

41 | Correlations and copulas for decision and risk analysis - Clemen, Reilly - 1999 |

40 | Probability encoding in decision analysis - Spetzler, Stael von Holstein - 1975 |

39 | Quantifying prior opinion - Diaconis, Ylvisaker - 1985 |

39 | Decision analysis expert use - Morris - 1974 |

38 | The assessment of prior distributions in Bayesian analysis - Winkler - 1967 |

Citation Context: ...e value of sample information. The quantile method tends to yield distributions that are again too tight, but slightly less tight than the PDF method and much less tight than the HFS and EPS methods (Winkler 1967). On this basis, the quantile method seems preferable, and it also seems to be the method of choice when judged by scoring rules. (Scoring rules are discussed in Section 4.3.) Some experiments have e...

35 | Marginalization and linear opinion pools - McConway - 1981 |

34 | Cognitive processes and the assessment of subjective probability distributions - Hogarth - 1975 |

32 | Encoding subjective probabilities: A psychological and psychometric review - Wallsten, Budescu - 1983 |

31 | On the psychology of experimental surprises - Slovic, Fischhoff - 1977 |

Citation Context: ...’s probability for some specified event. Efforts to influence how people consider probabilities have also been explored, such as asking them to suggest scenarios that would lead to an unlikely event (Slovic and Fischhoff, 1977), influence diagrams (Howard and Matheson, 1984), getting subjects to think carefully about the substantive details of each judgement (Koriat et al., 1980), and disaggregating an implicit hypothes...

30 | Reconciliation of discrete probability distributions - Lindley - 1985 |

30 | Empirical scaling of common verbal phrases associated with numerical probabilities - Lichtenstein, Newman - 1967 |

Citation Context: ...ically. Unfortunately, there is considerable variation in the probabilities different people attach to the same phrase, and the context also affects the probability a person associates with a phrase (Lichtenstein and Newman, 1967; Beyth-Marom, 1982; Wallsten et al., 1986). The response mode in which subjects are asked to give assessments also affects judgements. For example, Gigerenzer (1996) found that numeric expressions...

29 | How probable is probable? A numerical translation of verbal probability expressions - Beyth-Marom - 1982 |

Citation Context: ...able variation in the probabilities that different people attach to the same phrase, and the context also affects the probability that a person associates with a phrase (Lichtenstein and Newman 1967; Beyth-Marom 1982; Wallsten et al. 1986). The response mode in which subjects are asked to give assessments also affects judgments. For example, Gigerenzer (1996) found that numeric expressions that are formally equiv...