Results 1 - 7 of 7
Comparison of Approximate Methods for Handling Hyperparameters
 Neural Computation
Cited by 67 (1 self)

Abstract
I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models which include unknown hyperparameters such as regularization constants and noise levels. In the 'evidence framework' the model parameters are integrated over, and the resulting evidence is maximized over the hyperparameters. The optimized ...
Bayesian and Regularization Methods for Hyperparameter Estimation in Image Restoration
 IEEE Trans. Image Processing, 1999
Cited by 64 (26 self)

Abstract
In this paper, we propose the application of the hierarchical Bayesian paradigm to the image restoration problem. We derive expressions for the iterative evaluation of the two hyperparameters by applying the evidence and maximum a posteriori (MAP) analyses within the hierarchical Bayesian paradigm. We show analytically that the analysis provided by the evidence approach is more realistic and appropriate than the MAP approach for the image restoration problem. We furthermore study the relationship between the evidence approach and an iterative approach resulting from the set-theoretic regularization approach for estimating the two hyperparameters, or their ratio, defined as the regularization parameter. Finally, the proposed algorithms are tested experimentally.
Hyperparameters: optimize, or integrate out?
 in Maximum Entropy and Bayesian Methods, Santa Barbara, 1996
Cited by 18 (4 self)

Abstract
I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models which include unknown hyperparameters such as regularization constants. In the 'evidence framework' the model parameters are integrated over, and the resulting evidence is maximized over the hyperparameters. The optimized hyperparameters are used to define a Gaussian approximation to the posterior distribution. In the alternative 'MAP' method, the true posterior probability is found by integrating over the hyperparameters. The true posterior is then maximized over the model parameters, and a Gaussian approximation is made. The similarities of the two approaches, and their relative merits, are discussed, and comparisons are made with the ideal hierarchical Bayesian solution. In moderately ill-posed problems, integration over hyperparameters yields a probability distribution with a skew peak, which causes significant biases to arise in the MAP method. In contrast, the evidence framework is shown to introduce negligible predictive error under straightforward conditions. General lessons are drawn concerning the distinctive properties of inference in many dimensions.
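The evidence framework summarized in this abstract can be sketched for a linear-Gaussian toy model: integrate out the weights, then maximize the evidence over the prior precision via a fixed-point update. This is a minimal illustration under stated assumptions (MacKay-style updates, known noise precision `beta`, synthetic data); the variable names and data are illustrative, not taken from the paper.

```python
import numpy as np

# Toy linear-Gaussian model: y = X w + noise, prior w ~ N(0, alpha^-1 I).
rng = np.random.default_rng(0)
N, D = 50, 10
X = rng.normal(size=(N, D))
w_true = rng.normal(size=D)
beta = 25.0                                   # assumed known noise precision
y = X @ w_true + rng.normal(scale=beta ** -0.5, size=N)

# Evidence-framework fixed point: alpha <- gamma / ||m||^2, where
# gamma = sum_i lambda_i / (lambda_i + alpha) counts well-determined
# parameters (lambda_i are eigenvalues of beta X^T X).
eigvals = np.linalg.eigvalsh(beta * (X.T @ X))
alpha = 1.0                                   # initial regularization constant
for _ in range(100):
    A = alpha * np.eye(D) + beta * (X.T @ X)  # posterior precision
    m = beta * np.linalg.solve(A, X.T @ y)    # posterior mean of the weights
    gamma = np.sum(eigvals / (eigvals + alpha))
    alpha = gamma / (m @ m)                   # evidence-maximizing update

print(f"estimated alpha = {alpha:.3f} (effective parameters gamma = {gamma:.2f})")
```

The optimized `alpha` then defines the Gaussian posterior approximation N(m, A^-1) used for prediction, as described in the abstract.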
Spline-Based Adaptive Resolution Image
Abstract
Bayesian probability theory allows one to infer an image given data constraints, prior knowledge, and background information. One ingredient of the background information is usually paid little attention, namely the image grid, i.e. the points on which the desired image is reconstructed. In many problems an equidistant mesh is used. The choice of the image grid can, however, strongly influence the reconstruction: if the grid is too coarse, accuracy is wasted; if it is too fine, artificial structures due to ringing and noise fitting can show up. In order to achieve the best resolution supported by the data, we include the grid in the Bayesian analysis and allow for locally varying resolution. We applied our procedure to one-dimensional problems. The image is reconstructed on the image grid and interpolated in the interstitial regions by cubic splines. The Bayesian analysis contains two competing tendencies: the data constraints tend towards a fine grid, as it allows the misfit to be reduced, while Ockham's factor favors a coarse grid to keep the image "simple". The Bayesian solution represents a trade-off between the two trends and leads to results which are significantly improved over those obtained by fixed-grid approaches: overfitting is eliminated and ringing is strongly suppressed, while sharp structures are improved considerably. We applied the adaptive-resolution idea to different types of problems, such as deconvolution and density estimation. For both applications we present a representative physical problem.
Adaptive Kernels And Occam's Razor In Inversion
 in MAXENT96 - Proceedings of the Maximum Entropy Conference, 1996
Abstract
Following the adaptive-kernel methods in density estimation theory, we extend the quantified maximum entropy concept by locally varying correlations. The smoothing properties of the adaptive kernels are determined self-consistently in the framework of Bayesian probability theory, with Occam's razor as the driving force for the simplest model consistent with the data. The power of the adaptive-kernel approach, the suppression of ringing and noise fitting, is demonstrated on the density estimation problem of the Old Faithful geyser, on a mock data set representing common physical inversion problems, and on a real-world problem.
ON THE IMPORTANCE OF α MARGINALIZATION IN MAXIMUM ENTROPY
 in Maximum Entropy and Bayesian Methods, 1996
Abstract
The correct entropic prior, computed by marginalization over the regularization parameter α, is used to invert photoemission data and to restore the famous "Susie" image. Comparison with the conventional maximum entropy procedure shows less overfitting of noise and demonstrates the residual ringing which is intrinsic to ill-posed inversion problems. An improvement to the steepest descent approximation reveals the reason for the overfitting. On top of that, the correct treatment of the regularization parameter is vital for the existence of the continuum limit of MaxEnt.
Evidence Integrals
 1996
Abstract
Evidence integrals are a key ingredient of quantified maximum entropy (QME). They allow one to evaluate the hyperparameters, which in turn determine the amount of regularization, the noise level, confidence intervals, etc. Consequently, the exact evaluation of these multidimensional integrals is of central importance to the theory. Since the conventional 'steepest descent' (SD) approximation fails frequently, a more refined approach is needed. Using diagrammatic techniques, we derive a correction factor to the SD result which is readily incorporated in existing QME codes and which provides significantly improved results.
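As a rough illustration of why a correction to steepest descent can matter, the following sketch compares the SD (Laplace) approximation of a one-dimensional evidence-style integral Z = ∫ exp(-E(a)) da against direct numerical quadrature. The toy energy `E`, with a quartic (non-Gaussian) term, is an illustrative assumption, not the QME functional treated in the paper.

```python
import numpy as np

def E(a):
    # Toy energy: Gaussian part plus a quartic term that SD ignores.
    return 0.5 * (a - 2.0) ** 2 + 0.1 * (a - 2.0) ** 4

# Direct quadrature on a fine grid (tails are negligible at the endpoints).
a = np.linspace(-6.0, 10.0, 160001)
Z_numeric = np.sum(np.exp(-E(a))) * (a[1] - a[0])

# Steepest descent: expand E to second order around its minimum a0 = 2,
# giving Z ≈ exp(-E(a0)) * sqrt(2*pi / E''(a0)).
a0 = 2.0
h = 1e-4
E2 = (E(a0 + h) - 2.0 * E(a0) + E(a0 - h)) / h ** 2   # finite-difference E''
Z_laplace = np.exp(-E(a0)) * np.sqrt(2.0 * np.pi / E2)

print(f"numeric Z = {Z_numeric:.4f}, steepest-descent Z = {Z_laplace:.4f}")
```

The quartic term narrows the peak, so plain SD overestimates Z here; a correction factor of the kind the abstract describes would absorb exactly this kind of discrepancy.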