Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities
 Ann. Statist
, 2001
Cited by 36 (10 self)
We study the rates of convergence of the maximum likelihood estimator (MLE) and posterior distribution in density estimation problems, where the densities are location or location-scale mixtures of normal distributions with the scale parameter lying between two positive numbers. The true density is also assumed to lie in this class with the true mixing distribution either compactly supported or having sub-Gaussian tails. We obtain bounds for Hellinger bracketing entropies for this class, and from these bounds, we deduce the convergence rates of (sieve) MLEs in Hellinger distance. The rate turns out to be (log n)^κ / √n, where κ ≥ 1 is a constant that depends on the type of mixtures and the choice of the sieve. Next, we consider a Dirichlet mixture of normals as a prior on the unknown density. We estimate the prior probability of a certain Kullback-Leibler type neighborhood and then invoke a general theorem that computes the posterior convergence rate in terms of the growth rate of the Hellinger entropy and the concentration rate of the prior. The posterior distribution is also seen to converge at the rate (log n)^κ / √n, where κ now depends on the tail behavior of the base measure of the Dirichlet process.
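To get a feel for the rate (log n)^κ / √n quoted in this abstract, the following minimal sketch compares it with the parametric rate 1/√n for a few sample sizes. The function names and the chosen values of κ are illustrative assumptions; the abstract only states that κ ≥ 1 and does not give specific values.

```python
import math


def mixture_rate(n: int, kappa: float = 1.0) -> float:
    """Return (log n)^kappa / sqrt(n), the form of the rate quoted in the abstract.

    kappa is an illustrative input here; the abstract only says kappa >= 1.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log n > 0")
    return math.log(n) ** kappa / math.sqrt(n)


def parametric_rate(n: int) -> float:
    """Return 1 / sqrt(n), the usual parametric rate, for comparison."""
    return 1.0 / math.sqrt(n)


if __name__ == "__main__":
    # The logarithmic factor slows convergence only mildly relative to 1/sqrt(n).
    for n in (10**2, 10**4, 10**6):
        print(n, parametric_rate(n), mixture_rate(n, 1.0), mixture_rate(n, 2.0))
```

For any fixed κ the ratio to the parametric rate is (log n)^κ, which grows without bound but far more slowly than any positive power of n.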
The interplay of Bayesian and frequentist analysis
 Statist. Sci
, 2004
Cited by 30 (0 self)
Statistics has struggled for nearly a century over the issue of whether the Bayesian or frequentist paradigm is superior. This debate is far from over and, indeed, should continue, since there are fundamental philosophical and pedagogical issues at stake. At the methodological level, however, the fight has become considerably muted, with the recognition that each approach has a great deal to contribute to statistical practice and each is actually essential for full development of the other approach. In this article, we embark upon a rather idiosyncratic walk through some of these issues. Key words and phrases: Admissibility; Bayesian model checking; conditional frequentist; confidence intervals; consistency; coverage; design; hierarchical models; nonparametric
Posterior convergence rates of Dirichlet mixtures at smooth densities
 Ann. Statist
, 2007
Cited by 23 (6 self)
We study the rates of convergence of the posterior distribution for Bayesian density estimation with Dirichlet mixtures of normal distributions as the prior. The true density is assumed to be twice continuously differentiable. The bandwidth is given a sequence of priors which is obtained by scaling a single prior by an appropriate order. In order to handle this problem, we derive a new general rate theorem by considering a countable covering of the parameter space whose prior probabilities satisfy a summability condition together with certain individual bounds on the Hellinger metric entropy. We apply this new general theorem on posterior convergence rates by computing bounds for Hellinger (bracketing) entropy numbers for the involved class of densities, the error in the approximation of a smooth density by normal mixtures and the concentration rate of the prior. The best obtainable rate of convergence of the posterior turns out to be equivalent to the well-known frequentist rate for integrated mean squared error n^(-2/5) up to a logarithmic factor.
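For context, the n^(-2/5) benchmark in this abstract matches the classical bias-variance trade-off for kernel density estimation at a twice-differentiable density. The sketch below is the standard textbook heuristic, not a derivation from the paper; the bandwidth h and the constants C_1, C_2 are notation assumed here for illustration. Note that n^(-2/5) is the rate of the root of the integrated mean squared error.

```latex
\mathrm{MISE}(h) \approx C_1 h^{4} + \frac{C_2}{n h},
\qquad
h^{\ast} \asymp n^{-1/5},
\qquad
\mathrm{MISE}(h^{\ast}) \asymp n^{-4/5},
\qquad
\sqrt{\mathrm{MISE}(h^{\ast})} \asymp n^{-2/5}.
```

The squared-bias term C_1 h^4 and the variance term C_2/(nh) are of the same order exactly when h is of order n^(-1/5), which gives the stated rate.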
Posterior consistency of Gaussian process prior for nonparametric binary regression
Cited by 21 (2 self)
Consider binary observations whose response probability is an unknown smooth function of a set of covariates. Suppose that a prior on the response probability function is induced by a Gaussian process mapped to the unit interval through a link function. In this paper we study consistency of the resulting posterior distribution. If the covariance kernel has derivatives up to a desired order and the bandwidth parameter of the kernel is allowed to take arbitrarily small values, we show that the posterior distribution is consistent in the L1-distance. As an auxiliary result to our proofs, we show that, under certain conditions, a Gaussian process assigns positive probabilities to the uniform neighborhoods of a continuous function. This result may be of independent interest in the literature for small ball probabilities of Gaussian processes.
Convergence rates of posterior distributions for noniid observations
 Ann. Statist
, 2007
Cited by 20 (3 self)
We consider the asymptotic behavior of posterior distributions and Bayes estimators based on observations which are required to be neither independent nor identically distributed. We give general results on the rate of convergence of the posterior measure relative to distances derived from a testing criterion. We then specialize our results to independent, nonidentically distributed observations, Markov processes, stationary Gaussian time series and the white noise model. We apply our general results to several examples of infinite-dimensional statistical models including nonparametric regression with normal errors, binary regression, Poisson regression, an interval censoring model, Whittle estimation of the spectral density of a time series and a nonlinear autoregressive model. ...: θ ∈ Θ) be a sequence of statistical experiments with observations X^(n), where the parameter set Θ is arbitrary and n is an indexing parameter, usually the sample size. We put a prior distribution Π_n on θ ∈ Θ and study the rate of convergence of the posterior
Zanten. Rates of contraction of posterior distributions based on Gaussian process priors. The Annals of Statistics
Cited by 19 (3 self)
We derive rates of contraction of posterior distributions on nonparametric or semiparametric models based on Gaussian processes. The rate of contraction is shown to depend on the position of the true parameter relative to the reproducing kernel Hilbert space of the Gaussian process and the small ball probabilities of the Gaussian process. We determine these quantities for a range of examples of Gaussian priors and in several statistical settings. For instance, we consider the rate of contraction of the posterior distribution based on sampling from a smooth density model when the prior models the log density as a (fractionally integrated) Brownian motion. We also consider regression with Gaussian errors and smooth classification under a logistic or probit link function combined with various priors.
Dirichlet Process Mixtures of Generalized Linear Models
Cited by 13 (1 self)
We propose Dirichlet Process mixtures of Generalized Linear Models (DPGLMs), a new method of nonparametric regression that accommodates continuous and categorical inputs and models a response variable locally by a generalized linear model. We give conditions for the existence and asymptotic unbiasedness of the DPGLM regression mean function estimate; we then give a practical example for when those conditions hold. We evaluate DPGLM on several data sets, comparing it to modern methods of nonparametric regression including regression trees and Gaussian processes.
On rates of convergence for posterior distributions in infinite-dimensional
 Ann. Statist
, 2007
Cited by 12 (0 self)
This paper introduces a new approach to the study of rates of convergence for posterior distributions. It is a natural extension of a recent approach to the study of Bayesian consistency. In particular, we improve on current rates of convergence for models including the mixture of Dirichlet process model and the random Bernstein polynomial model.
Misspecification in infinite-dimensional Bayesian statistics
 Annals of Statistics
, 2006
Cited by 10 (0 self)
We consider the asymptotic behavior of posterior distributions if the model is misspecified. Given a prior distribution and a random sample from a distribution P0, which may not be in the support of the prior, we show that the posterior concentrates its mass near the points in the support of the prior that minimize the Kullback–Leibler divergence with respect to P0. An entropy condition and a prior-mass condition determine the rate of convergence. The method is applied to several examples, with special interest for infinite-dimensional models. These include Gaussian mixtures, nonparametric regression and parametric models.
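The concentration at the Kullback–Leibler minimizer described in this abstract can be illustrated in a simple parametric toy case. The setup below is an assumption made for illustration and is not taken from the paper: the data come from a Laplace distribution, while the model is N(θ, 1) with a conjugate normal prior. For this model the divergence from P0 equals a constant plus E(X − θ)²/2, so its minimizer is the mean of P0, and the conjugate posterior should pile up there.

```python
import math
import random

# Illustrative setup (assumed, not from the paper): P0 is Laplace with mean 1.5,
# which is not a member of the working model N(theta, 1).
TRUE_MEAN = 1.5
rng = random.Random(0)


def draw_laplace() -> float:
    """Draw from Laplace(TRUE_MEAN, 1): the difference of two Exp(1) draws is Laplace(0, 1)."""
    return TRUE_MEAN + rng.expovariate(1.0) - rng.expovariate(1.0)


def normal_location_posterior(data, prior_mean=0.0, prior_var=100.0):
    """Posterior mean and variance of theta for the model N(theta, 1)
    under a N(prior_mean, prior_var) prior (standard conjugate update)."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n)
    post_mean = post_var * (prior_mean / prior_var + sum(data))
    return post_mean, post_var


data = [draw_laplace() for _ in range(20_000)]

if __name__ == "__main__":
    # The posterior mean approaches the KL minimizer (the mean of P0),
    # and the posterior standard deviation shrinks like 1/sqrt(n).
    for n in (100, 1_000, 20_000):
        mean, var = normal_location_posterior(data[:n])
        print(n, round(mean, 3), round(math.sqrt(var), 4))
```

The posterior here is confident about θ even though no value of θ makes the model correct; it is the location of concentration, not the fit of the model, that the Kullback–Leibler criterion determines.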
DYNAMICS OF BAYESIAN UPDATING WITH DEPENDENT DATA AND MISSPECIFIED MODELS
, 2009
Cited by 10 (2 self)
Recent work on the convergence of posterior distributions under Bayesian updating has established conditions under which the posterior will concentrate on the truth, if the latter has a perfect representation within the support of the prior, and under various dynamical assumptions, such as the data being independent and identically distributed or Markovian. Here I establish sufficient conditions for the convergence of the posterior distribution in nonparametric problems even when all of the hypotheses are wrong, and the data-generating process has a complicated dependence structure. The main dynamical assumption is the generalized asymptotic equipartition (or “Shannon-McMillan-Breiman”) property of information theory. I derive a kind of large deviations principle for the posterior measure, and discuss the advantages of predicting using a combination of models known to be wrong. An appendix sketches connections between the present results and the “replicator dynamics” of evolutionary theory.