
## Bayesian Data Analysis (1995)

Citations: | 2115 - 60 self |

### Citations

1873 |
Information Theory, Inference, and Learning Algorithms
- MacKay
- 2003
Citation Context ...check the fit of the model. There are many arguments which make such an approach compelling. Without entering into philosophical and epistemological arguments on the nature of Science (Jeffreys 1939, =-=MacKay 2002-=-, Jaynes 2003), we briefly state what we view as the main practical appealing features of introducing a prior probability on θ. First such an approach allows to incorporate prior information in a natu... |

1831 | Statistical Decision Theory and Bayesian Analysis (Springer Series in Statistics) - Berger - 1993 |

1752 | Bayes Factors
- Kass, Raftery
- 1995
Citation Context ...terior distribution than under the alternative, which is a very intuitive answer. To constrain the impact of the prior probabilities, a different quantity is usually adopted, namely the Bayes factor (=-=Kass and Raftery 1995-=-), which is defined by Jeffreys (1939), Jaynes (2003) as $B_{01} = \frac{\pi(\Theta_0 \mid x)}{\pi(\Theta_1 \mid x)} \Big/ \frac{\pi(\Theta_0)}{\pi(\Theta_1)} = \frac{\int_{\Theta_0} f(x \mid \theta)\,\pi_0(\theta)\,d\theta}{\int_{\Theta_1} f(x \mid \theta)\,\pi_1(\theta)\,d\theta}$. Note that the posterior odds can be recovered from the Bayes... |

815 | Discrete Multivariate Analysis: Theory and Practice - Bishop, Fienberg, et al. - 1975 |

605 | Games and decisions - Luce, Raiffa - 1957 |

555 | Marginal Likelihood from the Gibbs Output - Chib - 1995 |

544 | Bayesian model selection in social research - Raftery - 1995 |

411 | Prior distributions for variance parameters in hierarchical models (comment on article by - Gelman - 2006 |

368 |
Statistical Decision Functions
- Wald
- 1950
Citation Context ...1983, Berger 1985, Robert 2001). Such extensions are justified for a variety of reasons, ranging from topological coherence—limits of Bayesian procedures often partake of their optimality properties (=-=Wald 1950-=-) and should therefore be included in the range of possible procedures—to robustness—a measure with an infinite mass is much more robust than a true probability distribution with a large variance—and ... |

361 | Model selection and accounting for model uncertainty in graphical models using Occam’s window - Madigan, Raftery - 1994 |

329 | Posterior predictive assessment of model fitness via realized discrepancies - Gelman, Meng, et al. - 1996 |

271 | The selection of prior distributions by formal rules - Kass, Wasserman - 1996 |

233 |
The intrinsic Bayes factor for model selection and prediction
- Berger, Pericchi
- 1996
Citation Context ...bed in Section 4, because the Bayes factor is simply not defined under improper priors, for any sample size. Solutions have been proposed, akin to cross-validation techniques in the classical domain (=-=Berger and Pericchi 1996-=-, Berger et al. 1998a), but they are somehow too ad-hoc to convince the entire community (and obviously beyond). In some situations, when parameters shared by both models have the same meaning in each... |

200 |
Reference posterior distributions for Bayesian inference (with discussion)
- Bernardo
- 1979
Citation Context ...sian statistics). In particular, in the one-dimensional parameter case, the Jeffreys prior is also the matching prior (see Robert 2001, Chapters 3 and 8), and the reference prior defined by Bernardo (=-=Bernardo 1979-=-, Clarke and Barron 1990). For instance, when $P_\theta$ is a location family, i.e. when $f(x \mid \theta) = g(x - \theta)$, the Fisher information is constant and thus the Jeffreys prior is $\pi(\theta) = 1$. Note that in many cases ... |

190 | Theory of Probability - de Finetti - 1974 |

157 | Bayesian nonparametrics - Ghosh, Ramamoorthi - 2003 |

153 | Sampling and Bayes’ Inference in Scientific Modelling and Robustness - Box - 1980 |

139 | Information-theoretic asymptotics of Bayes methods
- Clarke, Barron
- 1990
Citation Context ...). In particular, in the one-dimensional parameter case, the Jeffreys prior is also the matching prior (see Robert 2001, Chapters 3 and 8), and the reference prior defined by Bernardo (Bernardo 1979, =-=Clarke and Barron 1990-=-). For instance, when $P_\theta$ is a location family, i.e. when $f(x \mid \theta) = g(x - \theta)$, the Fisher information is constant and thus the Jeffreys prior is $\pi(\theta) = 1$. Note that in many cases like the above the Jeffr... |

134 | On the consistency of Bayes estimates - Diaconis, Freedman - 1986 |

112 | and Empirical Bayes Methods for Data Analysis. 2nd ed, Boca Raton - Carlin, Louis, et al. - 2000 |

112 |
Information Theory and
- Jaynes
- 1957
Citation Context ... of the model. There are many arguments which make such an approach compelling. Without entering into philosophical and epistemological arguments on the nature of Science (Jeffreys 1939, MacKay 2002, =-=Jaynes 2003-=-), we briefly state what we view as the main practical appealing features of introducing a prior probability on θ. First such an approach allows to incorporate prior information in a natural way in th... |

87 | Likelihood Methods in Statistics - Severini - 2000 |

68 |
Bayes Theory
- Hartigan
- 1983
Citation Context ...∞, since, provided that $\int_\Theta f(x \mid \theta)\,\pi(\theta)\,d\nu(\theta) < +\infty$ (3) almost everywhere (in $x$), the quantity (2) is still well-defined as a probability density as when using a regular posterior probability as prior (=-=Hartigan 1983-=-, Berger 1985, Robert 2001). Such extensions are justified for a variety of reasons, ranging from topological coherence—limits of Bayesian procedures often partake of their optimality properties (Wald... |

51 | On the Bernstein-von Mises theorem with infinite-dimensional parameters. Ann Statist 27(4):1119–1140 - Freedman - 1999 |

48 | Bayes factors and marginal distributions in invariant situations
- Berger, Pericchi, et al.
- 1998
Citation Context ...the Bayes factor is simply not defined under improper priors, for any sample size. Solutions have been proposed, akin to cross-validation techniques in the classical domain (Berger and Pericchi 1996, =-=Berger et al. 1998-=-a), but they are somehow too ad-hoc to convince the entire community (and obviously beyond). In some situations, when parameters shared by both models have the same meaning in each of the models, an i... |

48 | Formal Rules for Selecting Prior Distributions: A Review and Annotated Bibliography
- Kass, Wasserman
- 1994
Citation Context ...be replaced by the less judgemental reference prior denomination, we nonetheless follow suit and use it in the following subsections, since it is the most common denomination found in the literature (=-=Kass and Wasserman 1996-=-). 2.3 Non informative priors Non informative priors are expected to be flat distributions, possibly improper. An apparently natural way of constructing such priors would be to consider a uniform prio... |

48 |
The Bayesian Choice. 2nd ed
- Robert
- 2001
Citation Context ...iously put to use in a debate about Bayesian statistics published as Robert (2010). analysis. In the following, we refrain from embarking upon philosophical discussions about the nature of knowledge (=-=Robert 2001-=-, Chapter 10) and the possibility of induction (Popper and Miller 1983), opting instead for a mathematically sound presentation of a statistical methodology. We indeed believe that the most convincing... |

43 | Estimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniques - Frühwirth-Schnatter - 2004 |

40 |
On formulae for confidence points based on integrals of weighted likelihoods
- Welch, Peers
- 1963
Citation Context ...confidence sets C. In a wide generality, they further attain good frequentist coverage in the sense that $P_\theta(\theta \in C) = 1 - \alpha + O(n^{-1/2})$ for most prior distributions $\pi$, where n denotes the sample size (=-=Welch and Peers 1963-=-, Robert 2001, Chapter 5). Credible regions however suffer from a lack of invariance to changes of parameterisation, i.e. if θ is a given parameterisation of interest and $C^\pi_\alpha$ is the HPD region constr... |

35 | Bayesian dynamic modeling of latent trait distributions - Dunson - 2006 |

30 |
On the development of the reference prior method
- Berger, Bernardo
- 1991
Citation Context ...sional normal vector. The ultimate attempt to define a non informative prior is in our opinion Bernardo’s (1979) definition through the information theoretical device of Kullback divergence (see also =-=Berger and Bernardo 1992-=- or Berger et al. 2009). The idea is to split the parameter into groups say $(\theta_{(1)}, \ldots, \theta_{(p)})$ where $\theta_{(1)}$ is more interesting than $\theta_{(2)}$, which is more interesting than $\theta_{(3)}$ and so on. This can be seen ... |

29 |
Introducing Monte Carlo methods with R
- Robert, Casella
- 2010
Citation Context ...isused—as about any other methodology— and while Bayesian simulation seems stuck in an infinite regress of inferential uncertainty (Gelman 2008), there exist enough convergence assessment techniques (=-=Robert and Casella 2009-=-) to ensure a reasonable confidence about the approximation provided by those simulation methods. Thus, as rightly stressed by Bernardo (2008), the discussion of computational issues should not be all... |

23 | Bayesian Core - Marin, Robert - 2007 |

20 |
Theory of Probability, 1st ed
- Jeffreys
- 1939
Citation Context ...pothesis or to check the fit of the model. There are many arguments which make such an approach compelling. Without entering into philosophical and epistemological arguments on the nature of Science (=-=Jeffreys 1939-=-, MacKay 2002, Jaynes 2003), we briefly state what we view as the main practical appealing features of introducing a prior probability on θ. First such an approach allows to incorporate prior informat... |

19 |
Estimation of accuracy in testing
- Hwang, Casella, et al.
- 1992
Citation Context ... words, a hypothesis that may be true may be rejected because it had not predicted observable results that have not occurred, especially when considering that p-values may be inadmissible estimators (=-=Hwang et al. 1992-=-). From a decisional perspective— with which the frequentist properties should relate—, a classical Neyman-Pearson-Fisher procedure is never evaluated in terms of the consequences of rejecting the nul... |

18 | A bayesian approach to the selection and testing of mixture models - Berkhof, Mechelen, et al. - 2003 |

18 | Bernstein-von Mises theorem for linear functionals of the density. Ann. Statist., forthcoming - Rivoirard, Rousseau - 2012 |

16 |
Statistical Hypothesis Testing in Intraspecific Phylogeography: Nested Clade Phylogeographical Analysis vs. Approximate Bayesian Computation. Molecular Ecology 18: 319–331
- Templeton
Citation Context ... imply a subsequent action towards the choice of an alternative model. Therefore, complaining that having a high relative probability does not mean that a hypothesis is true or supported by the data (=-=Templeton 2008-=-), simply because the Bayesian approach is relative in that it posits two or more alternative hypotheses and tests their relative fits to some observed statistics (Templeton 2008), is missing the main... |

13 | Hierarchical logistic regression models for imputation of unresolved enumeration status in undercount estimation (with discussion) - Belin, Diffendal, et al. - 1993 |

12 | Importance sampling methods for Bayesian discrimination between embedded models. ArXiv e-prints - Marin, Robert - 2009 |

12 | Computational methods for Bayesian model choice
- Robert, Wraith
Citation Context ...f $(\theta, \sigma^2)$ for the normal model, $x_1, \ldots, x_n \sim \mathcal{N}(\theta, \sigma^2)$ with $\bar{x} = 0$, $s^2 = 1$ and $n = 10$, under Jeffreys’ prior, along with the pointwise approximation to the 10% HPD region (in darker hues) (Source: =-=Robert and Wraith 2009-=-). interest, say ψ within θ = (ψ, λ), where ψ is the parameter of interest and λ is the nuisance parameter. Dealing with nuisance parameters is quite problematic in a frequentist framework, whether on... |

10 |
Objections to Bayesian statistics
- Gelman
Citation Context ...ayesian data analysis, emphasising that it is a method for summarising uncertainty and making estimates and predictions using probability statements conditional on observed data and an assumed model (=-=Gelman 2008-=-)—which makes it valuable and useful in Statistics, Econometrics, and Biostatistics, among other fields. We first describe the basic elements of Bayesian ∗C.P. Robert is Professor of Statistics at Uni... |

9 | Bayesian inference on mixtures of distributions
- Lee, Marin, et al.
- 2008
Citation Context ...h example is when the prior is $\pi(\theta) = \exp(+\theta^2)$ and the observations are iid Cauchy. Another and less anecdotic example occurs in mixture models, under exchangeable improper priors on the components (=-=Lee et al. 2008-=-). 1.3 Bayesian decision theory As a general modus vivendi, let us first stress that inference as a whole is meaningless unless it is evaluated. The evaluation of a statistical procedure, i.e. determi... |

8 |
Natural induction: An objective Bayesian approach
- Berger, Bernardo, et al.
- 2009
Citation Context ...timate attempt to define a non informative prior is in our opinion Bernardo’s (1979) definition through the information theoretical device of Kullback divergence (see also Berger and Bernardo 1992 or =-=Berger et al. 2009-=-). The idea is to split the parameter into groups say $(\theta_{(1)}, \ldots, \theta_{(p)})$ where $\theta_{(1)}$ is more interesting than $\theta_{(2)}$, which is more interesting than $\theta_{(3)}$ and so on. This can be seen as a generalisation of... |

7 | Approximating the marginal likelihood in mixture models, Recherche - Marin, Robert - 2008 |

7 |
The impossibility of inductive probability
- Popper, Miller
- 1983
Citation Context ...lished as Robert (2010). analysis. In the following, we refrain from embarking upon philosophical discussions about the nature of knowledge (Robert 2001, Chapter 10) and the possibility of induction (=-=Popper and Miller 1983-=-), opting instead for a mathematically sound presentation of a statistical methodology. We indeed believe that the most convincing arguments for adopting a Bayesian version of data analyses are in the... |

6 |
Approximations to the Bayes factor in model selection problems and consistency issues
- Berger, Ghosh, et al.
- 2003
Citation Context ...ally to a likelihood ratio with a penalty of the form $d^* \log n^* / 2$ where $d^*$ and $n^*$ can be viewed as the effective dimension of the model and the effective number of observations, respectively, see (=-=Berger et al. 2003-=-, Chambaz and Rousseau 2008). The Bayes factor therefore offers the major interest that it does not require to compute a complexity measure (or penalty term)—in other words, to define what is d∗ and w... |

6 |
Theory of Probability revisited (with discussion). Statist. Science
- Robert, Chopin, et al.
- 2009
Citation Context ...rmation matrix and |i(θ)| denotes its determinant. The above construction is obviously invariant per reparameterisation and has many other interesting features specially in one-dimensional setups (see =-=Robert et al. 2009-=- for a reassessment of Jeffreys’ impact on Bayesian statistics). In particular, in the one-dimensional parameter case, the Jeffreys prior is also the matching prior (see Robert 2001, Chapters 3 and 8)... |

5 | Bounds for Bayesian order identification with application to mixtures
- Chambaz, Rousseau
- 2008
Citation Context ... ratio with a penalty of the form $d^* \log n^* / 2$ where $d^*$ and $n^*$ can be viewed as the effective dimension of the model and the effective number of observations, respectively, see (Berger et al. 2003, =-=Chambaz and Rousseau 2008-=-). The Bayes factor therefore offers the major interest that it does not require to compute a complexity measure (or penalty term)—in other words, to define what is $d^*$ and what is $n^*$—, which often is ... |

5 |
Invariant HPD credible sets and MAP estimators
- Druilhet, Marin
- 2007
Citation Context ...nterest and $C^\pi_\alpha$ is the HPD region constructed as above, then if $\eta = g(\theta)$ is another parameterisation, $g(C^\pi_\alpha) = \{\eta = g(\theta);\ \theta \in C^\pi_\alpha\}$ is not necessarily the HPD region for the η parameterisation (see =-=Druilhet and Marin 2007-=- for a detailed analysis of this phenomenon). 4 Nuisance parameters : integrated likelihood In many applied problems, one is only interested in some components of the parameter, the remaining part of ... |

5 | Approximating interval hypotheses: p-values and Bayes factors - Rousseau - 2007 |

4 |
Estimation of quadratic functions: reference priors for noncentrality parameters
- Berger, Philippe, et al.
- 1998
Citation Context ...the Bayes factor is simply not defined under improper priors, for any sample size. Solutions have been proposed, akin to cross-validation techniques in the classical domain (Berger and Pericchi 1996, =-=Berger et al. 1998-=-a), but they are somehow too ad-hoc to convince the entire community (and obviously beyond). In some situations, when parameters shared by both models have the same meaning in each of the models, an i... |

4 | The elimination of nuisance parameters - Liseo - 2006 |

3 |
A history of the Metropolis–Hastings algorithm
- Hitchcock
- 2003
Citation Context ...g been derided for providing optimal answers that could not be computed. With the advent of early Monte Carlo methods, of personal computers, and, more recently, of more powerful Monte Carlo methods (=-=Hitchcock 2003-=-), the pendulum appears to have switched to the other extreme and Bayesian methods seem to quickly move to elaborate computation (Gelman 2008), a feature that does not make them less suspicious: a sim... |

3 | Comment on article by Gelman - Senn - 2008 |

3 |
Comment on an article by Gelman
- Wasserman
- 2008
Citation Context ...put, not to show that we can get similar answers to those of a least square analysis since, else, if the Bayes estimator has good frequency behaviour then we might as well use the frequentist method (=-=Wasserman 2008-=-). (While computing issues are addressed in the following Chapter, we stress that all items in the table of Figure 3 are obtained via closed form formulae.) The major criticism addressed to the Bayesi... |

2 | Occam’s razor and Bayesian analysis - Jaynes - 1983 |

2 | Comment on article by Gelman - Bernardo - 2008 |

2 | Modern Bayesian inference: Foundations and objective methods - Bernardo |

2 | Bayesian estimation of movement probabilities in open populations using hidden Markov chains - Dupuis - 1995 |

1 | Bayesian model building by pure thought - Gelman - 1996 |

1 | Some principles and examples. Statist. Sinica 6 215–232. MR1379058 - Gelman - 2003 |

1 | Introduction to Probability and Statistics from a Bayesian Viewpoint. Cambridge Univ - Ser - 1965 |

1 | On the relevance of the Bayesian approach to Statistics. The Review of Economic Analysis, arXiv:0909.5365 - Robert - 2010 |

1 | A mixture approach to Bayesian goodness of fit - Dey, Sun, et al. - 2002 |