Learning to be Bayesian without supervision
- in Adv. Neural Information Processing Systems (NIPS*06), 2007
Cited by 13 (6 self)
Bayesian estimators are defined in terms of the posterior distribution. Typically, this is written as the product of the likelihood function and a prior probability density, both of which are assumed to be known. But in many situations, the prior density is not known, and is difficult to learn from data since one does not have access to uncorrupted samples of the variable being estimated. We show that for a wide variety of observation models, the Bayes least squares (BLS) estimator may be formulated without explicit reference to the prior. Specifically, we derive a direct expression for the estimator, and a related expression for the mean squared estimation error, both in terms of the density of the observed measurements. Each of these prior-free formulations allows us to approximate the estimator given a sufficient amount of observed data. We use the first form to develop practical nonparametric approximations of BLS estimators for several different observation processes, and the second form to develop a parametric family of estimators for use in the additive Gaussian noise case. We examine the empirical performance of these estimators as a function of the amount of observed data.
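As an illustration of what a prior-free BLS estimator can look like, the sketch below uses the well-known identity for additive Gaussian noise (often attributed to Miyasawa and Tweedie): E[x | y] = y + sigma^2 * d/dy log p(y), where p is the density of the measurements. The abstract does not state the paper's own expression, so this is an assumed stand-in, and the function names and test values are hypothetical. The check uses a Gaussian prior only because the answer is then known in closed form.

```python
import math

def bls_from_log_marginal(log_p_y, y, sigma, h=1e-4):
    """Estimate E[x | y] for y = x + N(0, sigma^2) from the measurement density alone.

    Uses y + sigma^2 * d/dy log p_Y(y), with the derivative taken by a
    central finite difference of step h (an illustrative choice).
    """
    score = (log_p_y(y + h) - log_p_y(y - h)) / (2.0 * h)
    return y + sigma ** 2 * score

def gaussian_log_pdf(y, var):
    """Log density of a zero-mean Gaussian with variance var."""
    return -0.5 * math.log(2.0 * math.pi * var) - y * y / (2.0 * var)

# Assumed test case: x ~ N(0, prior_var), so the measurements are N(0, prior_var + sigma^2).
prior_var, sigma = 4.0, 1.0
marginal_var = prior_var + sigma ** 2
y = 2.5

# The estimator only ever sees the density of the measurements, never the prior.
estimate = bls_from_log_marginal(lambda t: gaussian_log_pdf(t, marginal_var), y, sigma)

# Closed-form posterior mean for the Gaussian prior, used only as a reference.
closed_form = y * prior_var / marginal_var
```

In practice the log density would be approximated from observed samples (for example with a histogram or kernel estimate) rather than supplied analytically as it is here.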
Eaton's Markov chain, its conjugate partner and P-admissibility
- Annals of Statistics, 1999
Cited by 6 (6 self)
Suppose that $X$ is a random variable with density $f(x|\theta)$ and that $\pi(\theta|x)$ is a proper posterior corresponding to an improper prior $\nu(\theta)$. The prior is called P-admissible if the generalized Bayes estimator of every bounded function of $\theta$ is almost-$\nu$-admissible under squared error loss. Eaton (1992) showed that recurrence of the Markov chain with transition density $R(\eta|\theta) = \int \pi(\eta|x) f(x|\theta)\,dx$ is a sufficient condition for P-admissibility of $\nu(\theta)$. We show that Eaton's Markov chain is recurrent if and only if its conjugate partner, with transition density
Learning least squares estimators without assumed priors or supervision
, 2009
The two standard methods of obtaining a least-squares optimal estimator are (1) Bayesian estimation, in which one assumes a prior distribution on the true values and combines this with a model of the measurement process to obtain an optimal estimator, and (2) supervised regression, in which one optimizes a parametric estimator over a training set containing pairs of corrupted measurements and their associated true values. But many real-world systems do not have access to either supervised training examples or a prior model. Here, we study the problem of obtaining an optimal estimator given a measurement process with known statistics, and a set of corrupted measurements of random values drawn from an unknown prior. We develop a general form of nonparametric empirical Bayesian estimator that is written as a direct function of the measurement density, with no explicit reference to the prior. We study the observation conditions under which such “prior-free” estimators may be obtained, and we derive specific forms for a variety of different corruption processes. Each of these prior-free estimators may also be used to express the mean squared estimation error as an expectation over the measurement density, thus generalizing Stein’s unbiased risk estimator (SURE) which provides such an expression for the additive Gaussian noise case. Minimizing this expression over measurement samples provides an “unsupervised
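The idea of minimizing a risk estimate over measurement samples can be sketched for the additive Gaussian case using the classical SURE formula for soft thresholding. The choice of soft thresholding as the parametric family, the threshold grid, and the synthetic sparse signal are all assumptions made for illustration; they are not taken from the paper. Note that the clean values are used only to generate the noisy data and never enter the threshold selection.

```python
import random

def soft_threshold(y, t):
    """Shrink y toward zero by t, returning zero when |y| <= t."""
    if y > t:
        return y - t
    if y < -t:
        return y + t
    return 0.0

def sure_soft_threshold(ys, t, sigma):
    """Stein's unbiased estimate of the total squared error of soft thresholding.

    Assumes each y is the true value plus independent N(0, sigma^2) noise.
    """
    n = len(ys)
    clipped = sum(min(y * y, t * t) for y in ys)
    n_zeroed = sum(1 for y in ys if abs(y) <= t)
    return n * sigma ** 2 + clipped - 2.0 * sigma ** 2 * n_zeroed

def pick_threshold(ys, sigma, grid):
    """Return the grid threshold with the smallest SURE value."""
    return min(grid, key=lambda t: sure_soft_threshold(ys, t, sigma))

# Hypothetical data: a sparse signal observed in unit-variance Gaussian noise.
random.seed(0)
sigma = 1.0
clean = [5.0 if random.random() < 0.1 else 0.0 for _ in range(2000)]
noisy = [x + random.gauss(0.0, sigma) for x in clean]

# Select the threshold from the noisy measurements alone, then denoise.
grid = [0.1 * k for k in range(41)]
best_t = pick_threshold(noisy, sigma, grid)
denoised = [soft_threshold(y, best_t) for y in noisy]
```

Because the risk estimate depends only on the measurements and the known noise level, no clean training pairs and no prior model are needed to tune the estimator.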
MINIMAX ESTIMATION WITH THRESHOLDING AND ITS APPLICATION TO WAVELET ANALYSIS
, 2005
Many statistical practices involve choosing between a full model and reduced models where some coefficients are reduced to zero. Data were used to select a model with estimated coefficients. Is it possible to do so and still come up with an estimator always better than the traditional estimator based on the full model? The James–Stein estimator is such an estimator, having a property called minimaxity. However, the estimator considers only one reduced model, namely the origin. Hence it reduces no coefficient estimator to zero or every coefficient estimator to zero. In many applications including wavelet analysis, it is more desirable to reduce to zero only the estimators smaller than a threshold, which this paper calls thresholding. Is it possible to construct estimators of this kind that are minimax? In this paper, we construct such minimax estimators which perform thresholding. We apply our recommended estimator to wavelet analysis and show that it performs the best among the well-known estimators aiming simultaneously at estimation and model selection. Some of our estimators are also shown to be asymptotically optimal.
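The contrast drawn above between James–Stein shrinkage and thresholding can be made concrete with a small sketch. It shows the positive-part James–Stein estimator, which either keeps every coefficient nonzero or sets all of them to zero, next to plain hard thresholding, which zeroes only the small coefficients. Hard thresholding is used here purely for contrast; it is not the minimax thresholding estimator constructed in the paper, whose form the abstract does not give. The function names and sample vectors are hypothetical, and the James–Stein formula assumes at least three coefficients with known noise level sigma.

```python
def james_stein_positive_part(y, sigma=1.0):
    """Positive-part James-Stein shrinkage of y toward the origin (needs len(y) >= 3)."""
    p = len(y)
    norm_sq = sum(v * v for v in y)
    if norm_sq == 0.0:
        return [0.0] * p
    # One common shrinkage factor for all coefficients, clipped at zero.
    shrink = max(0.0, 1.0 - (p - 2) * sigma ** 2 / norm_sq)
    return [shrink * v for v in y]

def hard_threshold(y, t):
    """Keep coefficients larger than t in magnitude and zero the rest."""
    return [v if abs(v) > t else 0.0 for v in y]

# Hypothetical coefficient vectors: one with two large entries, one uniformly small.
y_large = [4.0, 0.1, -0.2, 3.0]
y_small = [0.5, 0.1, -0.2, 0.3]

js_large = james_stein_positive_part(y_large)  # every entry shrunk, none zeroed
js_small = james_stein_positive_part(y_small)  # every entry zeroed
ht_large = hard_threshold(y_large, 1.0)        # only the small entries zeroed
```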

