## Optimal denoising in redundant representations (2008)


Venue: IEEE Trans. Image Process.

Citations: 7 (2 self)

### BibTeX

@ARTICLE{Raphan08optimaldenoising,

author = {Martin Raphan and Eero P. Simoncelli},

title = {Optimal denoising in redundant representations},

journal = {IEEE TRANS. IMAGE PROCESS},

year = {2008},

pages = {1342--1352}

}

### Abstract

Image denoising methods are often designed to minimize mean-squared error (MSE) within the subbands of a multiscale decomposition. However, most high-quality denoising results have been obtained with overcomplete representations, for which minimization of MSE in the subband domain does not guarantee optimal MSE performance in the image domain. We prove that, despite this suboptimality, the expected image-domain MSE resulting from applying estimators to subbands that are made redundant through spatial replication of basis functions (e.g., cycle spinning) is always less than or equal to that resulting from applying the same estimators to the original nonredundant representation. In addition, we show that it is possible to further exploit overcompleteness by jointly optimizing the subband estimators for image-domain MSE. We develop an extended version of Stein’s unbiased risk estimate (SURE) that allows us to perform this optimization adaptively, for each observed noisy image. We demonstrate this methodology using a new class of estimator formed from linear combinations of localized “bump” functions that are applied either pointwise or on local neighborhoods of subband coefficients. We show through simulations that the performance of these estimators applied to overcomplete subbands and optimized for image-domain MSE is substantially better than that obtained when they are optimized within each subband. This performance is, in turn, substantially better than that obtained when they are optimized for use on a nonredundant representation.
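The cycle-spinning claim in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation: it assumes a 1-D signal, a one-level orthonormal Haar transform, and a plain soft-threshold estimator in place of the paper's "bump" functions, and all function names and parameter values are hypothetical. It checks only the per-realization convexity bound (the error of the averaged estimate is no larger than the average error of the shifted estimates), which is related to, but weaker than, the expected-MSE statement in the abstract.

```python
import numpy as np

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length 1-D signal."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def soft_threshold(c, t):
    """Illustrative shrinkage estimator (stand-in for the paper's estimators)."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def denoise_shifted(y, shift, t):
    """Denoise in the Haar basis circularly shifted by `shift` samples."""
    approx, detail = haar_forward(np.roll(y, -shift))
    return np.roll(haar_inverse(approx, soft_threshold(detail, t)), shift)

def denoise_cycle_spun(y, t, shifts=(0, 1)):
    """Average the estimates over all distinct shifts of a one-level transform."""
    return np.mean([denoise_shifted(y, s, t) for s in shifts], axis=0)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Synthetic test data (assumed for illustration only).
rng = np.random.default_rng(0)
clean = np.repeat(rng.normal(size=32), 8)   # piecewise-constant, length 256
sigma = 0.5
noisy = clean + sigma * rng.normal(size=clean.size)
t = 1.5 * sigma                             # arbitrary threshold choice

mse_per_shift = [mse(denoise_shifted(noisy, s, t), clean) for s in (0, 1)]
mse_spun = mse(denoise_cycle_spun(noisy, t), clean)
print("per-shift MSE:", mse_per_shift)
print("cycle-spun MSE:", mse_spun)
```

Because squared error is convex, `mse_spun` can never exceed the mean of `mse_per_shift` for any signal, noise draw, or threshold; how much smaller it is depends on the data.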

### Citations

804 | De-noising by soft-thresholding - Donoho - 1995 |

678 | Adapting to unknown smoothness via wavelet shrinkage
- Donoho, Johnstone
- 1995
Citation Context ...impact on the quality of denoising results. Multiscale decompositions are a typical choice, and both empirical Bayes methods [3], [5], [8], and SURE adaptive methods have been used to optimize scalar [31], [16]–[18], [22] and joint [15] estimators for application to subbands of multiscale decompositions. Empirical evidence indicates that redundant (overcomplete) multiscale representations are more eff... |

427 | Shiftable multi-scale transforms
- Simoncelli, Freeman, et al.
- 1992
Citation Context ... inverse, and in particular for all tight frames. As such, we have also examined the behavior of SUREbumps when jointly optimizing estimators applied to a tight frame known as the “steerable pyramid” [23]. The representation is overcomplete by a factor of roughly , where is the number of orientation bands utilized. In our tests, we used , which ... [running header, p. 1348: IEEE Transactions on Image Processing, vol. 17, no. ...] |

351 | Image denoising using scale mixtures of gaussians in the wavelet domain
- Portilla, Strela, et al.
- 2003
Citation Context ...ators that are applied independently to each transform coefficient (e.g., [1]–[8]), or are applied to neighborhoods of coefficients at adjacent spatial positions and/or from other subbands (e.g., [9]–[12]). Manuscript received August 20, 2007; revised March 27, 2008. First published June 24, 2008; last published July 11, 2008 (projected). The associate editor coordinating the review of this manuscript... |

289 | Estimation of the Mean of a Multivariate Normal Distribution
- Stein
- 1981
Citation Context ...sen to best account for the observed noisy image (typically, by maximizing likelihood) [3]. A parametric estimator may also be adaptively optimized by minimizing Stein’s unbiased risk estimate (SURE) [14], which provides an approximation of the mean squared error (MSE) as a function of the observed noisy data. Assuming there is enough data to capture the statistics of a given image, adaptive methods w... |

281 | Curvelets—a surprisingly effective nonadaptive representation for objects with edges
- Candès, Donoho
- 2000
Citation Context ...sform is a tight frame, for which . This includes orthogonal, cycle-spun and undecimated wavelet transforms, as well as other overcomplete decompositions such as the steerable pyramid [23], curvelets [24], or complex dual-tree wavelets [25]. In this situation, the estimate is computed by transforming the original signal, applying an estimator in the transform domain, and then inverse transforming with Now consider a cycle-spun decomposi... |

267 | Image denoising via sparse and redundant representations over learned dictionaries - Elad, Aharon - 2006 |

263 | A review of image denoising algorithms, with a new one, Multiscale Modeling and Simulation 4 (2
- Buades, Coll, et al.
- 2005
Citation Context ...iterature has demonstrated that substantial improvements can be achieved with estimators that operate on the surrounding “context” of multiscale coefficients (e.g., [9], [10], [11], [12], [15], [26], [27], [28]). In general, the surrounding neighborhood can include coefficients within the same subband, as well as coefficients in other subbands, and the neighborhoods are generally overlapping (i.e., ea... |

257 | Complex wavelets for shift invariant analysis and filtering of signals
- Kingsbury
- 2001
Citation Context ...This includes orthogonal, cycle-spun and undecimated wavelet transforms, as well as other overcomplete decompositions such as the steerable pyramid [23], curvelets [24], or complex dual-tree wavelets [25]. In this situation, the estimate is computed by transforming the original signal, applying an estimator in the transform domain, and then inverse transforming with Now consider a cycle-spun decomposi... |

226 | Fields of experts: a framework for learning image priors
- Roth, Black
- 2005
Citation Context ...l noise levels. Although they do not quite achieve the performance level of current state-of-the-art methods (e.g., [28]), the results are roughly comparable to many recent results (e.g., [12], [32], [33], [34]), especially at low to moderate levels of noise. Fig. 10 shows example denoised images. Increases in redundancy and image domain optimization are both seen to improve visual quality. VI. DISCUS... |

218 | Image denoising by sparse 3d transform-domain collaborative filtering
- Dabov, Foi, et al.
- 2007
Citation Context ...ure has demonstrated that substantial improvements can be achieved with estimators that operate on the surrounding “context” of multiscale coefficients (e.g., [9], [10], [11], [12], [15], [26], [27], [28]). In general, the surrounding neighborhood can include coefficients within the same subband, as well as coefficients in other subbands, and the neighborhoods are generally overlapping (i.e., each coe... |

203 | Noise removal via Bayesian wavelet coring
- Simoncelli, Adelson
- 1996
Citation Context ...enoised. For example, an “empirical Bayes” estimator may be derived from a prior density whose parameters are chosen to best account for the observed noisy image (typically, by maximizing likelihood) [3]. A parametric estimator may also be adaptively optimized by minimizing Stein’s unbiased risk estimate (SURE) [14], which provides an approximation of the mean squared error (MSE) as a function of the... |

201 | Wavelet thresholding via a Bayesian approach
- Abramovich, Sapatinas, et al.
- 1998
Citation Context ...t is less well understood, the choice of linear transform also has an impact on the quality of denoising results. Multiscale decompositions are a typical choice, and both empirical Bayes methods [3], [5], [8], and SURE adaptive methods have been used to optimize scalar [31], [16]–[18], [22] and joint [15] estimators for application to subbands of multiscale decompositions. Empirical evidence indicate... |

196 | Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen 69
- Haar
- 1910
Citation Context ...t the number of parameters that must be optimized, we limit the representation to four bumps, as illustrated in Fig. 1. To test our methodology, we used decompositions based on a separable Haar basis [30], consisting of local averages and differences of adjacent local averages. These are the simplest ... Fig. 3. Comparison of denoi... |

194 | Image compression via joint statistical characterization in the wavelet domain
- Buccigrossi, Simoncelli
- 1999
Citation Context ...as been found to produce substantial increases in performance [11], [12], [15], although a significant portion of these increases may be obtained through the use of larger spatial neighborhoods [35], [36]. Perhaps more importantly, we believe there is room for substantial improvement in the design of the denoising functions. The examples in this article used a fixed linear family of “bump” functions, ... |

180 | Spatially adaptive wavelet thresholding with context modeling for image denoising
- Chang, Yu, et al.
- 2000
Citation Context ...n functions. However, recent literature has demonstrated that substantial improvements can be achieved with estimators that operate on the surrounding “context” of multiscale coefficients (e.g., [9], [10], [11], [12], [15], [26], [27], [28]). In general, the surrounding neighborhood can include coefficients within the same subband, as well as coefficients in other subbands, and the neighborhoods are g... |

173 | Adaptive Bayesian wavelet shrinkage - Chipman, Kolaczyk, et al. - 1997 |

151 | Low-complexity image denoising based on statistical modeling of wavelet coefficients
- Mihcak, Kozintsev, et al.
- 1999
Citation Context ...operators that are applied independently to each transform coefficient (e.g., [1]–[8]), or are applied to neighborhoods of coefficients at adjacent spatial positions and/or from other subbands (e.g., [9]–[12]). ... |

136 | Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency
- Sendur, Selesnick
- 2002
Citation Context ...tions. However, recent literature has demonstrated that substantial improvements can be achieved with estimators that operate on the surrounding “context” of multiscale coefficients (e.g., [9], [10], [11], [12], [15], [26], [27], [28]). In general, the surrounding neighborhood can include coefficients within the same subband, as well as coefficients in other subbands, and the neighborhoods are general... |

115 | Nonlinear wavelet shrinkage with Bayes rules and Bayes factors - Vidakovic - 1998 |

92 | Sparse code shrinkage: Denoising of nongaussian data by maximum likelihood estimation
- Hyvarinen
- 1999
Citation Context ...ion of small coefficients, along with preservation (or even boosting) of coefficients of moderate magnitude. That is, we may view the estimator as performing a type of sparsification, as suggested in [7] and [21]: lower-amplitude coefficients are suppressed, but medium-amplitude coefficients are boosted in order to compensate for the loss of signal energy. The boosting of mid-amplitude coefficients i... |

89 | Bayesian denoising of visual images in the wavelet domain - Simoncelli - 1999 |

88 | Why simple shrinkage is still relevant for redundant representations
- Elad
- 2006
Citation Context ...age domain. Recent work provides an interesting explanation for this phenomenon by interpreting shrinkage in overcomplete representations as the first iteration of a Basis Pursuit denoising algorithm [21]. In this paper, we prove that application of denoising functions to subbands made overcomplete through cycle spinning or elimination of decimation is guaranteed to be no worse in MSE (and is in pract... |

79 | Universal discrete denoising: Known channel - Weissman, Ordentlich, et al. - 2003 |

72 | Translation-invariant de-noising, in Wavelets and Statistics
- Coifman, Donoho
- 1995
Citation Context ...ors for application to subbands of multiscale decompositions. Empirical evidence indicates that redundant (overcomplete) multiscale representations are more effective than orthonormal representations [19], [20]. This fact is somewhat mysterious, since the estimators are generally optimized for MSE within individual subbands, which (for a redundant basis) is not the same as the MSE in the image domain.... |

69 | Information-theoretic analysis of interscale and intrascale dependencies between image wavelet coefficients
- Liu, Moulin
Citation Context ...hood has been found to produce substantial increases in performance [11], [12], [15], although a significant portion of these increases may be obtained through the use of larger spatial neighborhoods [35], [36]. Perhaps more importantly, we believe there is room for substantial improvement in the design of the denoising functions. The examples in this article used a fixed linear family of “bump” funct... |

68 | Optimal Spatial Adaptation for Patch-Based Image Denoising
- Kervrann, Boulanger
- 2006
Citation Context ...e levels. Although they do not quite achieve the performance level of current state-of-the-art methods (e.g., [28]), the results are roughly comparable to many recent results (e.g., [12], [32], [33], [34]), especially at low to moderate levels of noise. Fig. 10 shows example denoised images. Increases in redundancy and image domain optimization are both seen to improve visual quality. VI. DISCUSSION I... |

51 | Wavelet-based image estimation: an empirical Bayes approach using Jeffreys’ noninformative prior
- Figueiredo, Nowak
- 2001
Citation Context ...erting the linear transform to obtain the denoised image. Estimation functions generally take the form of “shrinkage” operators that are applied independently to each transform coefficient (e.g., [1]–[8]), or are applied to neighborhoods of coefficients at adjacent spatial positions and/or from other subbands (e.g., [9]–[12]). ... |

51 | A new SURE approach to image denoising: Interscale orthonormal wavelet thresholding
- Luisier, Blu, et al.
- 2007
Citation Context ...ng results. Multiscale decompositions are a typical choice, and both empirical Bayes methods [3], [5], [8], and SURE adaptive methods have been used to optimize scalar [31], [16]–[18], [22] and joint [15] estimators for application to subbands of multiscale decompositions. Empirical evidence indicates that redundant (overcomplete) multiscale representations are more effective than orthonormal represen... |

37 | information-theoretic, adaptive image filtering for image restoration - Awate, Whitaker, et al. - 2006 |

30 | The SURE-LET approach to image denoising
- Blu, Luisier
- 2007
Citation Context ...he quality of denoising results. Multiscale decompositions are a typical choice, and both empirical Bayes methods [3], [5], [8], and SURE adaptive methods have been used to optimize scalar [31], [16]–[18], [22] and joint [15] estimators for application to subbands of multiscale decompositions. Empirical evidence indicates that redundant (overcomplete) multiscale representations are more effective than... |

26 | Image denoising using local contextual hidden Markov model
- Fan, Xia
- 2001
Citation Context ...cent literature has demonstrated that substantial improvements can be achieved with estimators that operate on the surrounding “context” of multiscale coefficients (e.g., [9], [10], [11], [12], [15], [26], [27], [28]). In general, the surrounding neighborhood can include coefficients within the same subband, as well as coefficients in other subbands, and the neighborhoods are generally overlapping (i.... |

22 | Building robust wavelet estimators for multicomponent images using Stein’s principle
- Benazza-Benyahia, Pesquet
- 2005
Citation Context ...applied to orthogonal wavelet subbands [31]. It has also been used in conjunction with cycle-spun wavelets [19], alternative pointwise nonlinear estimators [16], two-component Gaussian mixture models [17], and for an interscale contextual estimator [15], and, most recently, to optimize a 2-parameter [18] and a multiparameter [22] scalar subband estimator in the image domain. A. SURE for Correlated Gau... |

19 | Learning to be Bayesian without supervision
- Raphan, Simoncelli
- 2007
Citation Context ... without explicit reference to either or . We have recently shown that this concept may be generalized to several types of non-Gaussian noise, as well as a variety of nonadditive corruption processes [29]. The expression in the curly braces of (5), known (up to an additive constant) as Stein’s unbiased risk estimate (SURE), may be evaluated on a single observation to produce an unbiased estimate of the MSE... |

16 | A joint inter- and intrascale statistical model for Bayesian wavelet based image denoising
- Pižurica, Philips, et al.
- 2002
Citation Context ... at all noise levels. Although they do not quite achieve the performance level of current state-of-the-art methods (e.g., [28]), the results are roughly comparable to many recent results (e.g., [12], [32], [33], [34]), especially at low to moderate levels of noise. Fig. 10 shows example denoised images. Increases in redundancy and image domain optimization are both seen to improve visual quality. VI. ... |

14 | A new wavelet estimator for image denoising - Pesquet, Leporini - 1997 |

11 | Digital techniques for reducing television noise
- Rossi
- 1978
Citation Context ... inverting the linear transform to obtain the denoised image. Estimation functions generally take the form of “shrinkage” operators that are applied independently to each transform coefficient (e.g., [1]–[8]), or are applied to neighborhoods of coefficients at adjacent spatial positions and/or from other subbands (e.g., [9]–[12]). ... |

8 | A discriminative approach for wavelet denoising
- Hel-Or, Shaked
- 2008
Citation Context ...e chosen/optimized for performance over image ensembles. In fact, some methods rely entirely on the latter, optimizing over a large set of training images to select a single universal denoiser (e.g., [13], [33]). In our implementation, some prior information is implicitly included through the design of the bumps. Prior information could be more explicitly incorporated in the deterministic choice of a ... |

2 | Optimal denoising in redundant bases, presented at the ...
- Raphan, Simoncelli
- 2007
Citation Context ...lity of denoising results. Multiscale decompositions are a typical choice, and both empirical Bayes methods [3], [5], [8], and SURE adaptive methods have been used to optimize scalar [31], [16]–[18], [22] and joint [15] estimators for application to subbands of multiscale decompositions. Empirical evidence indicates that redundant (overcomplete) multiscale representations are more effective than ortho... |