Results 1-10 of 10
The consistency of the BIC Markov order estimator.
Cited by 55 (3 self)
The Bayesian Information Criterion (BIC) estimates the order of a Markov chain (with finite alphabet $A$) from observation of a sample path $x_1, x_2, \ldots, x_n$, as that value $k = \hat{k}$ that minimizes the sum of the negative logarithm of the $k$th order maximum likelihood and the penalty term $\frac{|A|^k (|A|-1)}{2} \log n$. We show that $\hat{k}$ equals the correct order of the chain, eventually almost surely as $n \to \infty$, thereby strengthening earlier consistency results that assumed an a priori bound on the order. A key tool is a strong ratio-typicality result for Markov sample paths. We also show that the Bayesian estimator or minimum description length estimator, of which the BIC estimator is an approximation, fails to be consistent for the uniformly distributed i.i.d. process. AMS 1991 subject classification: Primary 62F12, 62M05; Secondary 62F13, 60J10. Key words and phrases: Bayesian Information Criterion, order estimation, ratio-typicality, Markov chains. 1 Supported in part by a joint N...
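The criterion described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `max_log_likelihood` and `bic_order` are hypothetical, the likelihood is conditioned on the first k symbols as a simplification, and the search is capped at an assumed `max_order` purely for practicality (the paper's result is precisely that no a priori bound is required).

```python
import math
from collections import Counter

def max_log_likelihood(path, k):
    """Log of the k-th order maximum likelihood of `path`,
    conditioned on its first k symbols (a simplifying assumption)."""
    ctx_next = Counter()  # counts of (length-k context, next symbol)
    ctx = Counter()       # counts of each length-k context
    for i in range(k, len(path)):
        c = tuple(path[i - k:i])
        ctx_next[(c, path[i])] += 1
        ctx[c] += 1
    # ML transition probabilities are the empirical ratios n(c, a) / n(c).
    return sum(n * math.log(n / ctx[c]) for (c, _), n in ctx_next.items())

def bic_order(path, alphabet_size, max_order):
    """Return the k in 0..max_order minimizing
    -log ML_k + |A|^k (|A| - 1) / 2 * log n  (smallest k on ties)."""
    n = len(path)

    def score(k):
        penalty = alphabet_size ** k * (alphabet_size - 1) / 2 * math.log(n)
        return -max_log_likelihood(path, k) + penalty

    return min(range(max_order + 1), key=score)

# A strictly alternating binary path is deterministic given one past symbol.
print(bic_order([0, 1] * 100, alphabet_size=2, max_order=3))  # prints 1
```

On the alternating path the first-order likelihood is already 1, so higher orders only add penalty, while order 0 pays roughly n log 2 in likelihood. That trade-off is what the penalty term is designed to balance.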
Rates of Convergence of Posterior Distributions
, 1998
Cited by 47 (0 self)
We compute the rate at which the posterior distribution concentrates around the true parameter value. The spaces we work in are quite general and include infinite dimensional cases. The rates are driven by two quantities: the size of the space, as measured by metric entropy or bracketing entropy, and the degree to which the prior concentrates in a small ball around the true parameter. We apply the results to several examples. In some cases, natural priors give suboptimal rates of convergence and better rates can be obtained by using sieve-based priors such as those introduced by Zhao (1993, 1998). AMS 1990 classification: Primary 62A15; Secondary 62E20, 62G15. KEYWORDS: Bayesian inference, asymptotic inference, nonparametric models, sieves. 1 Introduction. Nonparametric Bayesian methods have become quite popular lately, largely because of advances in computing; see Dey, Mueller and Sinha (1998) for a recent account. Because of their growing popularity, it is important to understand ...
Asymptotic normality of posterior distributions in high-dimensional linear models, Bernoulli 5
, 1999
Cited by 21 (5 self)
We study consistency and asymptotic normality of posterior distributions of the natural parameter for an exponential family when the dimension of the parameter grows with the sample size. Under certain growth restrictions on the dimension, we show that the posterior distributions concentrate in neighbourhoods of the true parameter and can be approximated by an appropriate normal distribution.
Consistency issues in Bayesian Nonparametrics
 In Asymptotics, Nonparametrics and Time Series: A Tribute
, 1998
Consistency of Bayes estimates for nonparametric regression: normal theory
 Bernoulli
, 1998
Cited by 9 (1 self)
UNAWARENESS, PRIORS AND POSTERIORS
Cited by 4 (0 self)
Abstract. This note contains first thoughts on awareness of unawareness in a simple dynamic context where a decision situation is repeated over time. The main consequence of increasing awareness is that the model the decision maker uses, and the prior which it contains, becomes richer over time. The decision maker is prepared for this change, and we show that if a projection-consistency axiom is satisfied, unawareness does not affect the value of her estimate of a payoff-relevant conditional probability (although it may weaken confidence in such an estimate). Probability-zero events, however, pose a challenge to this axiom, and if it fails, even estimate values will be different if the decision maker takes unawareness into account. In examining the evolution of knowledge about a relevant variable through time, we distinguish between transition from uncertainty to certainty and transition from unawareness to certainty directly, and argue that new knowledge may cause posteriors to jump more if it is also new awareness. Some preliminary considerations on convergence of estimates are included.
Consistency issues in Bayesian Nonparametrics
 In Asymptotics, Nonparametrics and Time Series: A Tribute
, 1998
Cited by 2 (0 self)
this paper we are mainly concerned with consistency of the posterior. Informally, the posterior is said to be consistent at a true value $\theta_0$ if the following holds: Suppose $X_1, X_2, \ldots, X_n$ indeed arise from $P_{\theta_0}$, then the posterior converges to the degenerate probability $\delta_{\theta_0}$. Alternatively, with $P_{\theta_0}$-probability 1, the posterior probability of any neighborhood $U$ of $\theta_0$ converges to 1. Why would a Bayesian be interested in consistency? Think of an experiment in which an experimenter generates observations from a known (to the experimenter) distribution. The observations are presented to a Bayesian. It would be embarrassing if, even with large data, the Bayesian fails to come close to finding the mechanism used by the experimenter. Consistency can be thought of as a validation of the Bayesian method. It can also be interpreted as requiring that the data, at least eventually, overrides the prior opinion. Alternatively, two Bayesians, with two different priors, presented with the same data eventually agree. A result of this kind relating "merging of opinions" and posterior consistency is discussed in Diaconis and Freedman [86a]. In fact, Diaconis and Freedman [86a] (henceforth abbreviated as DF) and the ensuing discussions contain a wealth of material pertaining to posterior consistency. An early result in posterior consistency is due to Doob [48], who showed that posterior consistency obtains on a set of prior measure 1. This result does not settle the question of consistency for a particular $\theta_0$ of interest. In smooth finite dimensional problems, different methods show (for example Berk [66]) that consistency obtains at all parameter points. Freedman [63] exhibits a prior and points of inconsistency for the infinite cell multinomial. He also showed that this p...
Asymptotic Properties of Nonparametric Bayesian Procedures
, 1997
This chapter provides a brief review of some large sample frequentist properties of nonparametric Bayesian procedures. The review is not comprehensive, but rather, is meant to give a simple, heuristic introduction to some of the main concepts. We mainly focus on consistency but we touch on a few other issues as well. 1 Introduction. Nonparametric Bayesian procedures present a paradox in the Bayesian paradigm. On the one hand, they are most useful when we don't have precise information. On the other hand, they require huge amounts of prior information because nonparametric procedures involve high dimensional if not infinite dimensional parameter spaces. The usual hope that the data will dominate the prior was dashed by Freedman (1963, 1965) and then Diaconis and Freedman (1986), who showed that putting mass in weak neighborhoods of the true distribution does not guarantee that the posterior accumulates in weak neighborhoods. Interest in properties like consistency derives from our desire tha...
Consistency of the BIC Order Estimator
, 1999
We announce two results on the problem of estimating the order of a Markov chain from observation of a sample path. First is that the Bayesian Information Criterion (BIC) leads to an almost surely consistent estimator. Second is that the Bayesian minimum description length estimator, of which the BIC estimator is an approximation, fails to be consistent for the uniformly distributed i.i.d. process. A key tool is a strong ratio-typicality result for empirical $k$-block distributions. Complete proofs are given in the authors' article to appear in The Annals of Statistics. 1. Introduction. Let $\mathcal{M}_k$ denote the class of Markov chains of order at most $k$, with values drawn from a finite set $A$, and let $\mathcal{M} = \bigcup_{k=0}^{\infty} \mathcal{M}_k$. An important problem is to estimate the order of a Markov chain from observation of a finite sample path. A popular method is the so-called Bayesian Information Criterion (BIC), first introduced by Schwarz [12], which gives the estimator defined by $\hat{k}_{\mathrm{BIC}} = \hat{k}_{\mathrm{BIC}}(x_1^n)$ ...