Results 1–10 of 23
Learning to be Bayesian without supervision
 in Adv. Neural Information Processing Systems (NIPS*06), 2007
Abstract

Cited by 19 (8 self)
Bayesian estimators are defined in terms of the posterior distribution. Typically, this is written as the product of the likelihood function and a prior probability density, both of which are assumed to be known. But in many situations, the prior density is not known, and is difficult to learn from data since one does not have access to uncorrupted samples of the variable being estimated. We show that for a wide variety of observation models, the Bayes least squares (BLS) estimator may be formulated without explicit reference to the prior. Specifically, we derive a direct expression for the estimator, and a related expression for the mean squared estimation error, both in terms of the density of the observed measurements. Each of these prior-free formulations allows us to approximate the estimator given a sufficient amount of observed data. We use the first form to develop practical nonparametric approximations of BLS estimators for several different observation processes, and the second form to develop a parametric family of estimators for use in the additive Gaussian noise case. We examine the empirical performance of these estimators as a function of the amount of observed data.
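For the additive Gaussian noise case mentioned in the abstract, the prior-free form of the BLS estimator is the classical Miyasawa/Tweedie identity, E[x | y] = y + σ² d/dy log p(y), where p is the density of the noisy measurements. A minimal sketch of how such an estimator can be approximated from observed data alone follows; the toy bimodal signal, the kernel density estimate, and the bandwidth are all hypothetical choices:

```python
import math
import random

random.seed(0)
SIGMA = 0.5   # known standard deviation of the additive Gaussian noise
N = 20000     # number of noisy measurements available

# Hypothetical prior (unknown to the estimator): a bimodal Gaussian mixture.
def draw_signal():
    mode = -2.0 if random.random() < 0.5 else 2.0
    return random.gauss(mode, 0.3)

ys = [draw_signal() + random.gauss(0.0, SIGMA) for _ in range(N)]

# Kernel density estimate of the measurement density p(y) and its derivative.
H = 0.2  # bandwidth: a rough, hypothetical rule-of-thumb choice

def kde(y):
    return sum(math.exp(-0.5 * ((y - yi) / H) ** 2) for yi in ys) / (
        N * H * math.sqrt(2.0 * math.pi))

def kde_deriv(y):
    return sum(
        -(y - yi) / H ** 2 * math.exp(-0.5 * ((y - yi) / H) ** 2) for yi in ys
    ) / (N * H * math.sqrt(2.0 * math.pi))

def bls_estimate(y):
    # Prior-free BLS estimator for additive Gaussian noise:
    #   E[x | y] = y + sigma^2 * d/dy log p(y)
    return y + SIGMA ** 2 * kde_deriv(y) / kde(y)

# A measurement at 1.5 should be pulled toward the mode at +2.
print(round(bls_estimate(1.5), 2))
```

Note that no samples of the clean signal are used anywhere: the estimator is built entirely from the noisy measurements.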
Prior Information and Uncertainty in Inverse Problems
, 2001
Abstract

Cited by 12 (5 self)
Solving any inverse problem requires understanding the uncertainties in the data to know what it means to fit the data. We also need methods to incorporate data-independent prior information to eliminate unreasonable models that fit the data. Both of these issues involve subtle choices that may significantly influence the results of inverse calculations. The specification of prior information is especially controversial. How does one quantify information? What does it mean to know something about a parameter a priori? In this tutorial we discuss Bayesian and frequentist methodologies that can be used to incorporate information into inverse calculations. In particular we show that apparently conservative Bayesian choices, such as representing interval constraints by uniform probabilities (as is commonly done when using genetic algorithms, for example) may lead to artificially small uncertainties. We also describe tools from statistical decision theory that can be used to...
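The abstract's warning about uniform priors on interval constraints can be illustrated with a toy Monte Carlo: placing independent uniform priors on many bounded parameters implies a prior on any aggregate quantity that is far tighter than the hard bounds alone justify. A sketch under assumed toy numbers (100 parameters, each constrained to [0, 1]):

```python
import math
import random

random.seed(1)
N_PARAMS = 100   # parameters, each hard-constrained to the interval [0, 1]
TRIALS = 5000

# The hard constraints alone only say the sum lies in [0, N_PARAMS].
# Independent uniform priors instead concentrate the sum near N_PARAMS / 2
# with standard deviation sqrt(N_PARAMS / 12), about 2.9 here.
sums = [sum(random.random() for _ in range(N_PARAMS)) for _ in range(TRIALS)]
mean = sum(sums) / TRIALS
std = math.sqrt(sum((s - mean) ** 2 for s in sums) / TRIALS)

print(f"hard bound on the sum: [0, {N_PARAMS}]")
print(f"implied prior on the sum: mean ~ {mean:.1f}, std ~ {std:.1f}")
```

The "conservative" choice thus quietly rules out most of the interval the constraints actually allow.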
Empirical Bayes least squares estimation without an explicit prior. NYU Courant Inst.
, 2007
Abstract

Cited by 4 (4 self)
Bayesian estimators are commonly constructed using an explicit prior model. In many applications, one does not have such a model, and it is difficult to learn since one does not have access to uncorrupted measurements of the variable being estimated. In many cases however, including the case of contamination with additive Gaussian noise, the Bayesian least squares estimator can be formulated directly in terms of the distribution of noisy measurements. We demonstrate the use of this formulation in removing noise from photographic images. We use a local approximation of the noisy measurement distribution by exponentials over adaptively chosen intervals, and derive an estimator from this approximate distribution. We demonstrate through simulations that this adaptive Bayesian estimator performs as well or better than previously published estimators based on simple prior models.
Optimal estimation: Prior free methods and physiological application
 Ph.D. dissertation, Courant Institute of Mathematical Sciences
, 2007
Abstract

Cited by 2 (2 self)
First and foremost, I would like to thank my advisors, Eero Simoncelli and Dan Tranchina. Dan supervised my work on cortical modeling, and his insight and advice were extremely helpful in carrying out the bulk of the work of Chapter 1. He also had many useful comments about the remainder of the material in the thesis. Over the years, I have learned a lot about computational neuroscience in general from discussions with him. Eero supervised my work on prior-free methods and applications, which make up the substance of Chapters 2–4. His intuition, insight and ideas were crucial in helping me progress in this line of research, and more importantly, in obtaining useful results. I also learned a lot from him about image processing, statistics and computational neuroscience, amongst other things. I would like to thank my third reader, Charlie Peskin, for his input to my thesis and defense and helpful discussions about the material. I would also like to thank Mehryar Mohri for being on my committee and for some useful discussions about VC-type bounds for regression. As well, I would like to thank Francesca Chiaromonte for being on my committee, and for helpful discussions and comments about the material in the thesis. It was good to have a statistician’s point of view on the work. I would like to thank Bob Shapley for his helpful input, and for information about contrast-dependent summation area. I would also like to thank him for letting me sit in on his “new view” class about visual cortex, where I read some very useful papers. I would like to thank members of the Laboratory for Computational Vision for helpful comments and discussions along the way. I would also like to thank LCV alumni Liam Paninski and Jonathan Pillow, who both had some particularly useful comments about the prior-free methods. I would also like to thank the various people at Courant, too numerous to mention, who have provided help along the way.
Learning least squares estimators without assumed priors or supervision
, 2009
Abstract

Cited by 2 (1 self)
The two standard methods of obtaining a least-squares optimal estimator are (1) Bayesian estimation, in which one assumes a prior distribution on the true values and combines this with a model of the measurement process to obtain an optimal estimator, and (2) supervised regression, in which one optimizes a parametric estimator over a training set containing pairs of corrupted measurements and their associated true values. But many real-world systems do not have access to either supervised training examples or a prior model. Here, we study the problem of obtaining an optimal estimator given a measurement process with known statistics, and a set of corrupted measurements of random values drawn from an unknown prior. We develop a general form of nonparametric empirical Bayesian estimator that is written as a direct function of the measurement density, with no explicit reference to the prior. We study the observation conditions under which such “prior-free” estimators may be obtained, and we derive specific forms for a variety of different corruption processes. Each of these prior-free estimators may also be used to express the mean squared estimation error as an expectation over the measurement density, thus generalizing Stein’s unbiased risk estimator (SURE), which provides such an expression for the additive Gaussian noise case. Minimizing this expression over measurement samples provides an “unsupervised ...
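For the additive Gaussian case the abstract refers to, SURE expresses the mean squared error of an estimator f using the noisy measurements alone: (1/n) Σ [(f(yᵢ) − yᵢ)² + 2σ² f′(yᵢ)] − σ². A sketch of the resulting unsupervised tuning, using a soft-threshold estimator on a hypothetical sparse signal (signal model, threshold grid, and sample size are illustrative choices):

```python
import math
import random

random.seed(2)
SIGMA = 1.0   # known noise standard deviation
N = 5000

# Hypothetical sparse signal: mostly zeros, occasionally large.
xs = [0.0 if random.random() < 0.9 else random.gauss(0.0, 5.0) for _ in range(N)]
ys = [x + random.gauss(0.0, SIGMA) for x in xs]

def soft(y, lam):
    """Soft-threshold estimator f(y); f'(y) is 1 where |y| > lam, else 0."""
    return math.copysign(max(abs(y) - lam, 0.0), y)

def sure(lam):
    # Stein's unbiased risk estimate of the MSE of soft(., lam),
    # computed from the noisy measurements alone:
    #   (1/n) sum[(f(y) - y)^2 + 2 sigma^2 f'(y)] - sigma^2
    acc = 0.0
    for y in ys:
        acc += (soft(y, lam) - y) ** 2
        acc += 2.0 * SIGMA ** 2 * (1.0 if abs(y) > lam else 0.0)
    return acc / N - SIGMA ** 2

# "Unsupervised" tuning: pick the threshold minimizing SURE.
best = min((i * 0.1 for i in range(40)), key=sure)

# Sanity check against the true (normally unavailable) MSE.
true_mse = sum((soft(y, best) - x) ** 2 for x, y in zip(xs, ys)) / N
print(f"lambda = {best:.1f}, SURE = {sure(best):.3f}, true MSE = {true_mse:.3f}")
```

The SURE value at the chosen threshold tracks the true MSE closely, even though the clean values never enter the tuning step.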
Assessment of Spatial Variation of Risks in Small Populations
Abstract

Cited by 1 (0 self)
Often environmental hazards are assessed by examining the spatial variation of disease-specific mortality or morbidity rates. These rates, when estimated for small local populations, can have a high degree of random variation or uncertainty associated with them. If those rate estimates are used to prioritize environmental cleanup actions or to allocate resources, then those decisions may be influenced by this high degree of uncertainty. Unfortunately, the effect of this uncertainty is not to add “random noise” into the decision-making process, but to systematically bias action toward the smallest populations, where uncertainty is greatest and where extreme high and low rate deviations are most likely to be manifest by chance. We present a statistical procedure for adjusting rate estimates for differences in variability due to differentials in local area population sizes. Such adjustments produce rate estimates for areas that have better properties than the unadjusted rates for use in making statistically based decisions about the entire set of areas. Examples are provided for county variation in bladder, stomach, and lung cancer mortality rates for U.S. white males for the period 1970 to 1979.
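One standard adjustment of the kind described is empirical Bayes shrinkage: each area's raw rate is pulled toward an overall rate, with more shrinkage for smaller populations. A minimal sketch assuming a Poisson-gamma model; the prior parameters and the area data here are hypothetical stand-ins for values that would be fitted to the full set of areas:

```python
# Hypothetical deaths and person-years for three areas of very different size.
AREAS = {"A": (3, 1_000), "B": (30, 20_000), "C": (180, 150_000)}

# Assumed Gamma(ALPHA, BETA) prior (shape/rate) on the true rate; in practice
# these would be estimated from the whole collection of areas.
ALPHA, BETA = 2.0, 1_500.0   # prior mean ALPHA / BETA ~ 1.33 per 1000

def eb_rate(deaths, person_years, alpha=ALPHA, beta=BETA):
    # Poisson-gamma posterior mean: a weighted average of the raw rate and
    # the prior mean, with less weight on the data for small populations.
    w = person_years / (person_years + beta)
    return w * (deaths / person_years) + (1.0 - w) * (alpha / beta)

for name, (deaths, py) in AREAS.items():
    raw = deaths / py
    adj = eb_rate(deaths, py)
    print(f"{name}: raw {1000 * raw:.2f}/1000 -> adjusted {1000 * adj:.2f}/1000")
```

The tiny area "A" is shrunk strongly toward the overall level, while the large area "C" keeps essentially its raw rate, which is exactly the variance adjustment the abstract describes.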
Bayesian Joint Estimation of Binomial Proportions
, 2004
Abstract
Testing the hypothesis H that k > 1 binomial parameters are equal and jointly estimating these parameters are related problems. A Bayesian argument can simultaneously answer these inference questions: to test the hypothesis H, the posterior probability λ = λ(H | x) of H given the experimental data x can be used; to estimate each binomial parameter, their Bayesian estimates under H and its complement H̄ are combined, with weights λ and 1 − λ, respectively.
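The λ-weighted combination described above can be sketched directly. Assuming Beta(a, b) priors on the proportions (the Beta(1, 1) default and the prior probability of H are hypothetical choices), λ follows from the Beta-binomial marginal likelihoods under H (one shared p) and under its complement (independent p's):

```python
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def joint_binomial(xs, ns, a=1.0, b=1.0, prior_h=0.5):
    """Posterior probability lam of H (all p_i equal) and combined estimates."""
    # Log marginal likelihoods under H (one shared p) and under its
    # complement (independent p_i), each with Beta(a, b) priors; the
    # binomial coefficients are common to both and cancel in the ratio.
    successes, failures = sum(xs), sum(n - x for x, n in zip(xs, ns))
    lm_h = log_beta(successes + a, failures + b) - log_beta(a, b)
    lm_hbar = sum(log_beta(x + a, n - x + b) - log_beta(a, b)
                  for x, n in zip(xs, ns))
    odds = math.exp(lm_h - lm_hbar) * prior_h / (1.0 - prior_h)
    lam = odds / (1.0 + odds)
    # Bayes estimates under each hypothesis, combined with weights lam, 1 - lam.
    p_h = (successes + a) / (sum(ns) + a + b)
    ests = [lam * p_h + (1.0 - lam) * (x + a) / (n + a + b)
            for x, n in zip(xs, ns)]
    return lam, ests

lam, ests = joint_binomial([12, 14], [40, 40])
print(f"P(H | x) = {lam:.2f}, estimates = {[round(e, 3) for e in ests]}")
```

With similar observed proportions, λ is well above one half and the two estimates are pulled toward the pooled proportion.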
Advisor
, 1996
Abstract
Statistical contributions are made with applications in the area of occupational epidemiology. In particular, methodology is developed within the framework of reasonable models for shift-long exposure that take into account potentially important sources of variability (e.g., between-worker or day-to-day variability), while also maintaining the well-supported lognormality assumption. We consider estimation of key population parameters of the distribution of repeated shift-long exposure measurements on workers in plants or factories. Assuming balanced data, uniformly minimum variance unbiased (UMVU) estimators for these parameters are presented under two exposure models. Under one of these models, we study in detail the efficiency of the UMVU estimator for the mean with respect to logical competitors (such as the MLE). The prediction of mean exposure for individual workers is also considered, with emphasis on mean squared error of prediction (MSEP). Theoretical and simulation studies compare the MSEPs of reasonable candidate predictors. For occupational exposure assessment, we pursue a hypothesis testing strategy emphasizing worker-specific mean exposure as a key predictor of long-term adverse health effects.
Summary
Abstract
In this study we illustrate a Maximum Entropy (ME) methodology for modeling incomplete information and learning from repeated samples. The basis for this method has its roots in information theory (Shannon, 1948) and builds on the classical maximum entropy work of Jaynes (1957). We illustrate the use of this approach, describe how to impose restrictions on the estimator, and how to examine the sensitivity of ME estimates to the parameter and error bounds. Our objective is to show how empirical measures of the value of information for microeconomic models can be estimated in the maximum entropy view. Keywords: Generalized Maximum Entropy, Generalized Cross Entropy
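Jaynes' original illustration of the method, recovering a distribution on die faces from a mean constraint, shows the basic machinery: the ME solution has exponential-family form p_i ∝ exp(t·i), with the multiplier t chosen so the constraint holds. A minimal sketch (the bisection solver and the loaded-die mean of 4.5 are illustrative choices):

```python
import math

def maxent_die(target_mean, lo=-5.0, hi=5.0, iters=100):
    """Maximum-entropy distribution on die faces 1..6 given a mean constraint.

    The ME solution has exponential-family form p_i proportional to
    exp(t * i); bisection finds the multiplier t matching the constraint.
    """
    faces = range(1, 7)

    def mean_at(t):
        w = [math.exp(t * i) for i in faces]
        return sum(i * wi for i, wi in zip(faces, w)) / sum(w)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_at(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    w = [math.exp(t * i) for i in faces]
    z = sum(w)
    return [wi / z for wi in w]

# Jaynes' loaded-die illustration: observed mean 4.5 instead of 3.5.
p = maxent_die(4.5)
print([round(pi, 3) for pi in p])
```

Of all distributions with the observed mean, this is the one that assumes nothing else, which is the sense in which ME models incomplete information.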
REGULARIZED LEARNING WITH FEATURE NETWORKS
Abstract
First and foremost, I would like to thank my academic advisor, Lyle Ungar. Lyle