Results 1-8 of 8
Learning to be Bayesian without supervision
 in Adv. Neural Information Processing Systems (NIPS*06), 2007
Abstract

Cited by 19 (8 self)
Bayesian estimators are defined in terms of the posterior distribution. Typically, this is written as the product of the likelihood function and a prior probability density, both of which are assumed to be known. But in many situations, the prior density is not known, and is difficult to learn from data since one does not have access to uncorrupted samples of the variable being estimated. We show that for a wide variety of observation models, the Bayes least squares (BLS) estimator may be formulated without explicit reference to the prior. Specifically, we derive a direct expression for the estimator, and a related expression for the mean squared estimation error, both in terms of the density of the observed measurements. Each of these prior-free formulations allows us to approximate the estimator given a sufficient amount of observed data. We use the first form to develop practical nonparametric approximations of BLS estimators for several different observation processes, and the second form to develop a parametric family of estimators for use in the additive Gaussian noise case. We examine the empirical performance of these estimators as a function of the amount of observed data.
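For the additive Gaussian noise case, the prior-free form of the BLS estimator can be written as x̂(y) = y + σ² d/dy log p_Y(y), an identity often attributed to Miyasawa, so a nonparametric approximation only needs an estimate of the measurement density. The sketch below is illustrative, not the paper's implementation: the Gaussian kernel density estimate, bandwidth, and sample sizes are arbitrary choices.

```python
import numpy as np

def bls_estimate(y, samples, sigma, bandwidth=0.3):
    """Prior-free BLS estimate for additive Gaussian noise.

    Uses the identity E[x | y] = y + sigma^2 * d/dy log p_Y(y),
    approximating the measurement density p_Y with a Gaussian
    kernel density estimate over the observed `samples`.
    """
    d = y - samples                            # distances to observed measurements
    w = np.exp(-0.5 * (d / bandwidth) ** 2)    # Gaussian kernel weights
    p = w.mean()                               # density estimate (up to a constant)
    dp = (-d / bandwidth**2 * w).mean()        # its derivative at y (same constant)
    return y + sigma**2 * dp / p               # the constant cancels in dp / p

# Toy setup: x ~ N(0, 1), y = x + n with n ~ N(0, sigma^2).
rng = np.random.default_rng(0)
sigma = 0.5
x = rng.normal(0.0, 1.0, 20000)
y_obs = x + rng.normal(0.0, sigma, 20000)

# For this Gaussian prior the exact BLS estimator is y / (1 + sigma^2),
# which lets us check the prior-free approximation at a test point.
y0 = 1.0
approx = bls_estimate(y0, y_obs, sigma)
exact = y0 / (1 + sigma**2)
```

Note that the estimator never touches the clean values `x`: only the corrupted measurements `y_obs` and the known noise level enter the computation, which is the point of the prior-free formulation.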
Generalized SURE for exponential families: Applications to regularization
 IEEE Trans. on Signal Processing
, 2009
Eaton's Markov chain, its conjugate partner and P-admissibility
 Annals of Statistics
, 1999
Abstract

Cited by 6 (5 self)
Suppose that X is a random variable with density f(x|θ) and that ν(θ|x) is a proper posterior corresponding to an improper prior ν(θ). The prior is called P-admissible if the generalized Bayes estimator of every bounded function of θ is almost-admissible under squared error loss. Eaton (1992) showed that recurrence of the Markov chain with transition density R(η|θ) = ∫ ν(η|x) f(x|θ) dx is a sufficient condition for P-admissibility of ν(θ). We show that Eaton's Markov chain is recurrent if and only if its conjugate partner, with transition density
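As a concrete illustration of Eaton's construction (this example is not from the paper): take the one-dimensional normal location model X | θ ~ N(θ, 1) with the flat improper prior ν(θ) ∝ 1, so the posterior is N(x, 1). One step of Eaton's chain draws x from the model and then a fresh θ from the posterior, which composes into a simple random walk.

```python
import numpy as np

rng = np.random.default_rng(1)

def eaton_step(theta, rng):
    """One step of Eaton's chain for the normal location model
    X | theta ~ N(theta, 1) under the flat improper prior:
    draw x from the model, then a new theta from the
    posterior nu(theta | x) = N(x, 1)."""
    x = rng.normal(theta, 1.0)   # sample from f(x | theta)
    return rng.normal(x, 1.0)    # sample from nu(theta | x)

# The composed kernel adds two independent N(0, 1) draws, so the
# chain is a random walk with increment variance 2: recurrent in one
# dimension, matching Eaton's sufficient condition for
# P-admissibility of the flat prior in this model.
increments = np.array([eaton_step(0.0, rng) for _ in range(50000)])
increment_var = increments.var()  # should be close to 2
```

In three or more dimensions the analogous random walk is transient, which lines up with the Stein phenomenon for the multivariate normal mean.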
Optimal estimation: Prior free methods and physiological application
 Ph.D. dissertation, Courant Institute of Mathematical Sciences
, 2007
Abstract

Cited by 2 (2 self)
First and foremost, I would like to thank my advisors, Eero Simoncelli and Dan Tranchina. Dan supervised my work on cortical modeling, and his insight and advice were extremely helpful in carrying out the bulk of the work of Chapter 1. He also had many useful comments about the remainder of the material in the thesis. Over the years, I have learned a lot about computational neuroscience in general from discussions with him. Eero supervised my work on prior-free methods and applications, which make up the substance of Chapters 2-4. His intuition, insight and ideas were crucial in helping me progress in this line of research, and more importantly, in obtaining useful results. I also learned a lot from him about image processing, statistics and computational neuroscience, amongst other things. I would like to thank my third reader, Charlie Peskin, for his input to my thesis and defense and helpful discussions about the material. I would also like to thank Mehryar Mohri for being on my committee and for some useful discussions about VC-type bounds for regression. As well, I would like to thank Francesca Chiaromonte for being on my committee, and for helpful discussions and comments about the material in the thesis. It was good to have a statistician’s point of view on the work. I would like to thank Bob Shapley for his helpful input, and for information about contrast-dependent summation area. I would also like to thank him for letting me sit in on his “new view” class about visual cortex, where I read some very useful papers. I would like to thank members of the Laboratory for Computational Vision for helpful comments and discussions along the way. I would also like to thank LCV alumni Liam Paninski and Jonathan Pillow, who both had some particularly useful comments about the prior-free methods. I would also like to thank the various people at Courant, too numerous to mention, who have provided help along the way.
Improving on the maximum likelihood estimators of the means in Poisson decomposable graphical models
, 2005
Abstract

Cited by 2 (2 self)
The METR technical reports are published as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.
Learning least squares estimators without assumed priors or supervision
, 2009
Abstract

Cited by 2 (1 self)
The two standard methods of obtaining a least-squares optimal estimator are (1) Bayesian estimation, in which one assumes a prior distribution on the true values and combines this with a model of the measurement process to obtain an optimal estimator, and (2) supervised regression, in which one optimizes a parametric estimator over a training set containing pairs of corrupted measurements and their associated true values. But many real-world systems do not have access to either supervised training examples or a prior model. Here, we study the problem of obtaining an optimal estimator given a measurement process with known statistics, and a set of corrupted measurements of random values drawn from an unknown prior. We develop a general form of nonparametric empirical Bayesian estimator that is written as a direct function of the measurement density, with no explicit reference to the prior. We study the observation conditions under which such “prior-free” estimators may be obtained, and we derive specific forms for a variety of different corruption processes. Each of these prior-free estimators may also be used to express the mean squared estimation error as an expectation over the measurement density, thus generalizing Stein’s unbiased risk estimator (SURE) which provides such an expression for the additive Gaussian noise case. Minimizing this expression over measurement samples provides an “unsupervised
Least Squares Estimation Without Priors or Supervision (communicated by Konrad Paul Kording)
Abstract
Selection of an optimal estimator typically relies on either supervised training samples (pairs of measurements and their associated true values) or a prior probability model for the true values. Here, we consider the problem of obtaining a least squares estimator given a measurement process with known statistics (i.e., a likelihood function) and a set of unsupervised measurements, each arising from a corresponding true value drawn randomly from an unknown distribution. We develop a general expression for a nonparametric empirical Bayes least squares (NEBLS) estimator, which expresses the optimal least squares estimator in terms of the measurement density, with no explicit reference to the unknown (prior) density. We study the conditions under which such estimators exist and derive specific forms for a variety of different measurement processes. We further show that each of these NEBLS estimators may be used to express the mean squared estimation error as an expectation over the measurement density alone, thus generalizing Stein’s unbiased
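One way to see the NEBLS identity at work is to check it against the direct posterior mean on a toy model where both sides are computable exactly. The two-point prior below is an illustrative choice, not an example from the article: for x ∈ {−1, +1} with equal probability and y = x + N(0, σ²), the expression y + σ² d/dy log p_Y(y) should reproduce E[x | y] exactly.

```python
import numpy as np

def phi(z, sigma):
    """Gaussian density N(0, sigma^2) evaluated at z."""
    return np.exp(-0.5 * (z / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def posterior_mean(y, sigma):
    """Direct Bayes least squares estimate for the two-point prior
    x in {-1, +1} with equal probability, y = x + N(0, sigma^2)."""
    w_plus, w_minus = phi(y - 1, sigma), phi(y + 1, sigma)
    return (w_plus - w_minus) / (w_plus + w_minus)

def prior_free(y, sigma, h=1e-5):
    """The same estimate via the measurement density alone:
    y + sigma^2 * d/dy log p_Y(y), derivative taken numerically."""
    p = lambda t: 0.5 * (phi(t - 1, sigma) + phi(t + 1, sigma))
    dlogp = (np.log(p(y + h)) - np.log(p(y - h))) / (2 * h)
    return y + sigma**2 * dlogp

y0, sigma = 0.7, 0.8
direct = posterior_mean(y0, sigma)
via_density = prior_free(y0, sigma)
```

For this prior both routes reduce analytically to tanh(y/σ²), so the two numbers agree up to the finite-difference error; in the NEBLS setting, p_Y itself would instead be estimated from unsupervised measurements.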
Generalized SURE for Exponential Families: Applications to Regularization
Abstract
Stein’s unbiased risk estimate (SURE) was proposed by Stein for the independent, identically distributed (iid) Gaussian model in order to derive estimates that dominate least-squares (LS). In recent years, the SURE criterion has been employed in a variety of denoising problems for choosing regularization parameters that minimize an estimate of the mean-squared error (MSE). However, its use has been limited to the iid case, which precludes many important applications. In this paper we begin by deriving a SURE counterpart for general, not necessarily iid distributions from the exponential family. This enables extending the SURE design technique to a much broader class of problems. Based on this generalization we suggest a new method for choosing regularization parameters in penalized LS estimators. We then demonstrate its superior performance over the conventional generalized cross validation approach and the discrepancy method in the context of image deblurring and deconvolution. The SURE technique can also be used to design estimates without predefining their structure. However, allowing for too many free parameters impairs the performance of the resulting estimates. To address this inherent tradeoff we propose a regularized SURE objective. Based on this design criterion, we derive a wavelet denoising strategy that is similar in spirit to the standard soft-threshold approach but can lead to improved MSE performance.
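As a sketch of the iid Gaussian baseline that this paper generalizes: for the soft-threshold estimator, the SureShrink-style criterion is SURE(t) = nσ² − 2σ²·#{i : |y_i| ≤ t} + Σ_i min(y_i², t²), and minimizing it over t selects a threshold without access to the clean signal. The signal, noise level, and threshold grid below are illustrative choices, not taken from the paper.

```python
import numpy as np

def soft_threshold(y, t):
    """Soft-thresholding estimator: shrink each coordinate toward 0 by t."""
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def sure_soft(y, t, sigma=1.0):
    """Stein's unbiased estimate of the total risk E||x_hat - mu||^2
    for the soft-threshold estimator under y_i = mu_i + N(0, sigma^2)."""
    n = y.size
    return (n * sigma**2
            - 2 * sigma**2 * np.sum(np.abs(y) <= t)
            + np.sum(np.minimum(y**2, t**2)))

# Choose the threshold by minimizing SURE over a grid; the clean
# values mu are used only afterwards, to evaluate the actual MSE.
rng = np.random.default_rng(2)
mu = np.concatenate([np.zeros(900), 5.0 * np.ones(100)])  # sparse signal
y = mu + rng.normal(0.0, 1.0, mu.size)

grid = np.linspace(0.0, 5.0, 101)
risks = [sure_soft(y, t) for t in grid]
t_best = grid[int(np.argmin(risks))]
mse_sure = np.mean((soft_threshold(y, t_best) - mu) ** 2)
mse_zero = np.mean((soft_threshold(y, 0.0) - mu) ** 2)  # no thresholding
```

The SURE-selected threshold should cut the MSE well below the no-thresholding baseline on such a sparse signal; the paper's contribution is extending this kind of data-driven parameter selection beyond the iid Gaussian model to general exponential families.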