Convergence rates of posterior distributions
- Ann. Statist., 2000
Cited by 26 (8 self)
We consider the asymptotic behavior of posterior distributions and Bayes estimators for infinite-dimensional statistical models. We give general results on the rate of convergence of the posterior measure. These are applied to several examples, including priors on finite sieves, log-spline models, Dirichlet processes and interval censoring.
Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities
- Ann. Statist., 2001
Cited by 23 (9 self)
We study the rates of convergence of the maximum likelihood estimator (MLE) and posterior distribution in density estimation problems, where the densities are location or location-scale mixtures of normal distributions with the scale parameter lying between two positive numbers. The true density is also assumed to lie in this class with the true mixing distribution either compactly supported or having sub-Gaussian tails. We obtain bounds for Hellinger bracketing entropies for this class, and from these bounds, we deduce the convergence rates of (sieve) MLEs in Hellinger distance. The rate turns out to be (log n)^κ / √n, where κ ≥ 1 is a constant that depends on the type of mixtures and the choice of the sieve. Next, we consider a Dirichlet mixture of normals as a prior on the unknown density. We estimate the prior probability of a certain Kullback-Leibler type neighborhood and then invoke a general theorem that computes the posterior convergence rate in terms of the growth rate of the Hellinger entropy and the concentration rate of the prior. The posterior distribution is also seen to converge at the rate (log n)^κ / √n, where κ now depends on the tail behavior of the base measure of the Dirichlet process.
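To give a feel for the rate quoted in this abstract, the following minimal sketch compares (log n)^κ / √n with the parametric rate 1/√n. The function names and the values of κ are illustrative assumptions; the abstract only states that κ ≥ 1 and does not give specific values.

```python
import math


def mixture_rate(n: int, kappa: float) -> float:
    """Rate (log n)^kappa / sqrt(n) as stated in the abstract."""
    return math.log(n) ** kappa / math.sqrt(n)


def parametric_rate(n: int) -> float:
    """Usual parametric rate 1 / sqrt(n), shown for comparison."""
    return 1.0 / math.sqrt(n)


if __name__ == "__main__":
    # kappa values below are hypothetical; the abstract only says kappa >= 1.
    for n in (10**2, 10**4, 10**6):
        for kappa in (1.0, 1.5):
            ratio = mixture_rate(n, kappa) / parametric_rate(n)
            print(f"n={n:>8} kappa={kappa}: rate={mixture_rate(n, kappa):.5f}, "
                  f"ratio to 1/sqrt(n)={ratio:.2f}")
```

The ratio to the parametric rate is exactly (log n)^κ, so the stated rate is slower than 1/√n only by a logarithmic factor.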
Spline adaptation in extended linear models
- Statistical Science, 2002
Cited by 10 (2 self)
Abstract. In many statistical applications, nonparametric modeling can provide insight into the features of a dataset that are not obtainable by other means. One successful approach involves the use of (univariate or multivariate) spline spaces. As a class, these methods have inherited much from classical tools for parametric modeling. For example, stepwise variable selection with spline basis terms is a simple scheme for locating knots (breakpoints) in regions where the data exhibit strong, local features. Similarly, candidate knot configurations (generated by this or some other search technique) are routinely evaluated with traditional selection criteria like AIC or BIC. In short, strategies typically applied in parametric model selection have proved useful in constructing flexible, low-dimensional models for nonparametric problems. Until recently, greedy, stepwise procedures were most frequently suggested in the literature. Research into Bayesian variable selection, however, has given rise to a number of new spline-based methods that primarily rely on some form of Markov chain Monte Carlo to identify promising knot locations. In this paper, we consider various alternatives to greedy, deterministic schemes, and present a Bayesian framework for studying adaptation in the context of an extended linear model (ELM). Our major test cases are Logspline density estimation and (bivariate) Triogram regression models. We selected these because they illustrate a number of computational and methodological issues concerning model adaptation that arise in ELMs.
On Posterior Consistency of Survival Models
- Ann. Statist., 1999
Cited by 8 (1 self)
Ghosh and Ramamoorthi (1995) studied the posterior consistency for survival models and showed that the posterior was consistent when the prior on the distribution of survival times was the Dirichlet process prior. In this paper, we study the posterior consistency of survival models with neutral to the right process priors, which include Dirichlet process priors. A set of sufficient conditions for the posterior consistency with neutral to the right process priors is given. Interestingly, not all the neutral to the right process priors have consistent posteriors, but most of the popular priors such as Dirichlet processes, beta processes and gamma processes have consistent posteriors. With a class of priors which includes beta processes, a necessary and sufficient condition for the consistency is also established. An interesting counterintuitive phenomenon is found. Suppose there are two priors centered at the true parameter value with finite variances. Surprisingly, the posterior with s...
Convergence rates of posterior distributions for noniid observations
- Ann. Statist., 2007
Cited by 8 (1 self)
We consider the asymptotic behavior of posterior distributions and Bayes estimators based on observations which are required to be neither independent nor identically distributed. We give general results on the rate of convergence of the posterior measure relative to distances derived from a testing criterion. We then specialize our results to independent, nonidentically distributed observations, Markov processes, stationary Gaussian time series and the white noise model. We apply our general results to several examples of infinite-dimensional statistical models including nonparametric regression with normal errors, binary regression, Poisson regression, an interval censoring model, Whittle estimation of the spectral density of a time series and a nonlinear autoregressive model. ... : θ ∈ Θ) be a sequence of statistical experiments with observations X^(n), where the parameter set Θ is arbitrary and n is an indexing parameter, usually the sample size. We put a prior distribution Π_n on θ ∈ Θ and study the rate of convergence of the posterior
Misspecification in infinite-dimensional Bayesian statistics
- Annals of Statistics, 2006
Cited by 5 (0 self)
We consider the asymptotic behavior of posterior distributions if the model is misspecified. Given a prior distribution and a random sample from a distribution P0, which may not be in the support of the prior, we show that the posterior concentrates its mass near the points in the support of the prior that minimize the Kullback–Leibler divergence with respect to P0. An entropy condition and a prior-mass condition determine the rate of convergence. The method is applied to several examples, with special interest for infinite-dimensional models. These include Gaussian mixtures, nonparametric regression and parametric models.
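The concentration result described in this abstract can be illustrated with a toy parametric case. The sketch below is an assumption-laden illustration, not the paper's method: data come from an Exponential(1) distribution P0, the working model is N(θ, 1) with a N(0, 100) prior, and the Kullback-Leibler minimizer is then θ* = E[X] = 1 under P0. All function names and parameter values are hypothetical.

```python
import random
import statistics


def normal_posterior(data, prior_var=100.0):
    """Conjugate posterior for theta under the working model N(theta, 1)
    with a N(0, prior_var) prior. Returns (posterior mean, posterior variance)."""
    n = len(data)
    xbar = statistics.fmean(data)
    post_var = 1.0 / (n + 1.0 / prior_var)
    post_mean = post_var * n * xbar
    return post_mean, post_var


def simulate(n, seed=0):
    """Draw n points from Exponential(1), which no N(theta, 1) can match,
    and return the posterior under the misspecified normal model."""
    rng = random.Random(seed)
    data = [rng.expovariate(1.0) for _ in range(n)]
    return normal_posterior(data)


if __name__ == "__main__":
    # KL(P0 || N(theta, 1)) is minimized at theta* = E[X] = 1 for Exponential(1).
    for n in (100, 1_000, 20_000):
        mean, var = simulate(n)
        print(f"n={n:>6}: posterior mean={mean:.4f}, posterior sd={var ** 0.5:.4f}")
```

In this toy setting the posterior mean approaches 1 and the posterior standard deviation shrinks like 1/√n, even though the true distribution is outside the model.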
Zanten, Bayesian inference with rescaled Gaussian process priors, Electron
- Mathematics Institute University of Warwick Coventry
Cited by 5 (1 self)
Abstract: We use rescaled Gaussian processes as prior models for functional parameters in nonparametric statistical models. We show how the rate of contraction of the posterior distributions depends on the scaling factor. In particular, we exhibit rescaled Gaussian process priors yielding posteriors that contract around the true parameter at optimal convergence rates. To derive our results we establish bounds on small deviation probabilities for smooth stationary Gaussian processes.
A large deviation principle for Dirichlet posteriors
- 1999
Cited by 3 (0 self)
Let X_k be a sequence of independent and identically distributed random variables taking values in a compact metric space Ω and consider the problem of estimating the law of X_1 in a Bayesian framework. A conjugate family of priors for non-parametric Bayesian inference is the Dirichlet process priors popularized by Ferguson. We prove that if the prior distribution is Dirichlet, then the sequence of posterior distributions satisfies a large deviation principle, and give an explicit expression for the rate function. As an application, we obtain an asymptotic formula for the predictive probability of ruin in the classical gambler's ruin problem. 1. Introduction. Let X be a Hausdorff topological space with Borel σ-algebra B, and let μ_n be a sequence of probability measures on (X, B). A rate function is a nonnegative lower semicontinuous function on X. We say that the sequence μ_n satisfies the large deviation principle (LDP) with rate function I, if for all B ∈ B, −inf...
Learning bounds for a generalized family of Bayesian posterior distributions
- In NIPS 03, 2004
Cited by 3 (1 self)
In this paper we obtain convergence bounds for the concentration of Bayesian posterior distributions (around the true distribution) using a novel method that simplifies and enhances previous results. Based on the analysis, we also introduce a generalized family of Bayesian posteriors, and show that the convergence behavior of these generalized posteriors is completely determined by the local prior structure around the true distribution. This important and surprising robustness property does not hold for the standard Bayesian posterior in that it may not concentrate when there exist “bad” prior structures even at places far away from the true distribution.
Sampling for Bayesian Computation With Large Datasets
- 2003
Cited by 2 (0 self)
Multilevel models are extremely useful in handling large hierarchical datasets. However, computation can be a challenge, both in storage and CPU time per iteration of Gibbs sampler or other Markov chain Monte Carlo algorithms. We propose a computational strategy based on sampling the data, computing separate posterior distributions based on each sample, and then combining these to get a consensus posterior inference. With hierarchical data structures, we perform cluster sampling into subsets with the same structures as the original data. This reduces the number of parameters as well as sample size for each separate model fit. We illustrate with examples from climate modeling and newspaper marketing.
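The split-and-combine idea in this abstract can be sketched on a much simpler problem. The abstract does not say how the separate posteriors are combined, so the precision-weighted rule below is an assumption, as are the flat prior, the known-variance normal model (a stand-in for the multilevel models the paper treats), and all function names.

```python
import random
import statistics


def subset_posterior(y, sigma2):
    """Flat-prior posterior for a normal mean with known variance sigma2:
    N(mean(y), sigma2 / len(y)). Returns (mean, variance)."""
    return statistics.fmean(y), sigma2 / len(y)


def consensus(posteriors):
    """Assumed combination rule: precision-weighted average of
    independent normal posteriors given as (mean, variance) pairs."""
    precisions = [1.0 / var for _, var in posteriors]
    total = sum(precisions)
    mean = sum(p * m for p, (m, _) in zip(precisions, posteriors)) / total
    return mean, 1.0 / total


def split(data, k):
    """Partition data into k disjoint subsets by striding."""
    return [data[i::k] for i in range(k)]


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(2.0, 1.0) for _ in range(1000)]
    parts = split(data, 4)
    combined = consensus([subset_posterior(p, 1.0) for p in parts])
    full = subset_posterior(data, 1.0)
    print("consensus:", combined)
    print("full data:", full)
```

In this conjugate toy case the consensus posterior matches the full-data posterior up to floating-point error; for the hierarchical models discussed in the abstract the combination is generally only approximate.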

