## Nonlinear Extraction of Independent Components of Natural Images Using Radial Gaussianization (2009)

Citations: 14 (3 self)

### BibTeX

@MISC{Lyu09nonlinearextraction,
    author = {Siwei Lyu and Eero P. Simoncelli},
    title = {Nonlinear Extraction of Independent Components of Natural Images Using Radial Gaussianization},
    year = {2009}
}

### Abstract

We consider the problem of efficiently encoding a signal by transforming it to a new representation whose components are statistically independent. A widely studied linear solution, known as independent component analysis (ICA), exists for the case when the signal is generated as a linear transformation of independent nongaussian sources. Here, we examine a complementary case, in which the source is nongaussian and elliptically symmetric. In this case, no invertible linear transform suffices to decompose the signal into independent components, but we show that a simple nonlinear transformation, which we call radial gaussianization (RG), is able to remove all dependencies. We then examine this methodology in the context of natural image statistics. We first show that distributions of spatially proximal bandpass filter responses are better described as elliptical than as linearly transformed independent sources. Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to either nearby pairs or blocks of bandpass filter responses is significantly greater than that achieved by ICA. Finally, we show that the RG transformation may be closely approximated by divisive normalization, which has been used to model the nonlinear response properties of visual neurons.
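The radial gaussianization step described in the abstract can be illustrated with a minimal sketch. It assumes the data are samples from a zero-mean-able elliptically symmetric density with a full-rank covariance: the samples are whitened, and each sample's radius is then remapped so that the radii follow the chi distribution that a standard Gaussian in `d` dimensions would produce. The function name `radial_gaussianize` and the rank-based estimate of the radial distribution are assumptions made for illustration; they are not taken from the paper, which may estimate the radial mapping differently.

```python
import numpy as np
from scipy.stats import chi


def radial_gaussianize(x):
    """Illustrative radial gaussianization of samples x with shape (n, d), d >= 2.

    Hypothetical helper: whitens the data, then rescales each sample so that
    the radii match the radial distribution of a d-dimensional standard Gaussian.
    Assumes the sample covariance is full rank.
    """
    n, d = x.shape
    xc = x - x.mean(axis=0)

    # Symmetric (ZCA) whitening from the eigendecomposition of the covariance.
    cov = np.cov(xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    xw = xc @ whitener

    # Radius of each whitened sample.
    r = np.linalg.norm(xw, axis=1)

    # Empirical radial CDF from ranks, kept strictly inside (0, 1).
    # Assumption: a rank-based estimate stands in for the true radial CDF.
    ranks = np.argsort(np.argsort(r)) + 1
    u = ranks / (n + 1.0)

    # Target radii: quantiles of the chi distribution with d degrees of freedom.
    r_target = chi.ppf(u, df=d)

    # Rescale each sample along its own direction; directions are unchanged.
    scale = r_target / np.maximum(r, 1e-12)
    return xw * scale[:, None]
```

Because only the radius is changed, the transform keeps each whitened sample's direction. For elliptically symmetric data this is the reason the result can approach a spherical Gaussian, whose components are independent.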

### Citations

2392 | A theory for multiresolution signal decomposition: The wavelet representation - Mallat |
Citation Context ...eatures such as object boundaries) had long advocated the use of banks of local oriented filters for representation and analysis of image data (Koenderink, 1984; Granlund, 1978; Adelson et al., 1987; Mallat, 1989). Despite the success of ICA methods in providing a fundamental motivation for the use of localized oriented filters, there are a number of simple observations that indicate inconsistencies in the in...

2054 | An Introduction to Probability Theory and Its Applications, Vol 1, 3rd ed - Feller - 1968 |
Citation Context ... local data onto a random direction should result in a density that becomes more Gaussian as the neighborhood size increases, in accordance with a generalized version of the Central Limit Theorem (Feller, 1968). A recent quantitative study (Bethge, 2006) showed that the oriented band-pass filters obtained through ICA optimization lead to a surprisingly small improvement in terms of reduction in multi-infor...

2050 | Principal Component Analysis - Jolliffe - 1989 |
Citation Context ...rix in terms of an orthogonal matrix of eigenvectors, $U$, and a diagonal matrix of eigenvalues, $\Lambda$, such that $\Sigma = U \Lambda U^T$. The classical solution, generally known as principal components analysis (PCA) (Jolliffe, 2002), transforms the data with the orthogonal eigenvector matrix, $\vec{y} = U^T \vec{x}$, resulting in a Gaussian density with a diagonal covariance matrix containing the eigenvalues. 2.3 Whitening and ZCA. The diagon...

1688 | Atomic decomposition by basis pursuit - Chen, Donoho, et al. - 1999 |

1523 | Independent Component Analysis - Hyvärinen, Oja |

1369 | Independent component analysis, a new concept - Comon - 1994 |
Citation Context ...tical properties of the source (typically, the second-order statistics, augmented with a higher-order set of marginal measurements). ICA methods have shown success in blind signal separation problems (Comon, 1994), and in deriving bases for natural signals (Olshausen and Field, 1996; van der Schaaf and van Hateren, 1996; Bell and Sejnowski, 1997; Lewicki, 2002). As with PCA, the ICA transformations may be com...

1060 | Matching pursuits with time-frequency dictionaries - Mallat, Zhang - 1993 |

1021 | The Laplacian Pyramid as a compact image code - Burt, Adelson - 1983 |
Citation Context ...s, but are typically not pure sinusoids due to the non-uniqueness of the eigenvalues. Starting in the 1980s, researchers began to notice striking non-Gaussian behaviors of bandpass filter responses (Burt and Adelson, 1981; Field, 1987), and this led to an influential set of results obtained by using newly developed ICA methodologies to exploit these behaviors (Olshausen and Field, 1996; van der Schaaf and van Hateren,...

932 | Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature - Olshausen - 1996 |
Citation Context ...er statistics, augmented with a higher-order set of marginal measurements). ICA methods have shown success in blind signal separation problems (Comon, 1994), and in deriving bases for natural signals (Olshausen and Field, 1996; van der Schaaf and van Hateren, 1996; Bell and Sejnowski, 1997; Lewicki, 2002). As with PCA, the ICA transformations may be computed for nearly any source, but they are only guaranteed to eliminate ...

661 | The structure of images - Koenderink - 1984 |
Citation Context ... observations, and partly by a desire to capture image features such as object boundaries) had long advocated the use of banks of local oriented filters for representation and analysis of image data (Koenderink, 1984; Granlund, 1978; Adelson et al., 1987; Mallat, 1989). Despite the success of ICA methods in providing a fundamental motivation for the use of localized oriented filters, there are a number of simple ...

608 | Relations between the statistics of natural images and the response properties of cortical cells - Field - 1987 |
Citation Context ... pure sinusoids due to the non-uniqueness of the eigenvalues. Starting in the 1980s, researchers began to notice striking non-Gaussian behaviors of bandpass filter responses (Burt and Adelson, 1981; Field, 1987), and this led to an influential set of results obtained by using newly developed ICA methodologies to exploit these behaviors (Olshausen and Field, 1996; van der Schaaf and van Hateren, 1996; Bell a...

603 | Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Research 37 - Olshausen, Field - 1997 |

524 | Fast and robust fixed-point algorithms for ICA - Hyvärinen - 1999 |

494 | Entropy-based algorithms for best basis selection, Information Theory - Coifman, Wickerhauser - 1992 |

483 | The “independent components” of natural scenes are edge filters - Bell, Sejnowski - 1997 |
Citation Context ...urements). ICA methods have shown success in blind signal separation problems (Comon, 1994), and in deriving bases for natural signals (Olshausen and Field, 1996; van der Schaaf and van Hateren, 1996; Bell and Sejnowski, 1997; Lewicki, 2002). As with PCA, the ICA transformations may be computed for nearly any source, but they are only guaranteed to eliminate dependencies when the assumed linear mixture of independent ...

427 | Some informational aspects of visual perception - Attneave - 1954 |
Citation Context ...ents of the signal may be manipulated, transmitted or stored more efficiently. It has been proposed that this principle also plays an important role in the formation of biological perceptual systems (Attneave, 1954; Barlow, 1961). The problem of deriving an appropriate transformation for a given source, based on the statistics of observed samples, has been studied for more than a century. The classical solution...

357 | Image denoising using a scale mixture of Gaussians in the wavelet domain - Portilla, Strela, et al. |
Citation Context ...ar dependencies (Zetzsche and Krieger, 1999; Wainwright and Simoncelli, 2000; Huang and Mumford, 1999; Parra et al., 2001; Hyvärinen et al., 2000; Srivastava et al., 2002; Sendur and Selesnick, 2002; Portilla et al., 2003; Teh et al., 2003; Gehler and Welling, 2006). Here, we consider the factorization problem for the class of elliptically symmetric densities (ESDs). For this source model, we prove that linear transfo...

324 | Possible principles underlying the transformation of sensory messages - Barlow - 1961 |

262 | Learning overcomplete representations - Lewicki, Sejnowski - 2000 |

259 | Normalization of cell responses in cat striate cortex - Heeger - 1992 |

231 | The steerable pyramid: A flexible architecture for multi-scale derivative computation - Simoncelli, Freeman - 1995 |
Citation Context ...airs are easy to visualize, and can serve as an intuitive reference when we later extend to the multi-dimensional pixel blocks. (Footnote 2: Specifically, we use one subband of a non-oriented steerable pyramid (Simoncelli and Freeman, 1995). Footnote 3: All images are available from Javier Portilla’s web page at http://www.io.csic.es/PagsPers/JPortilla/denoise/.) The top row of Fig. 4 (labeled “raw”) shows example contour plots of the joint hi...

229 | Natural image statistics and neural representation - Simoncelli, Olshausen - 2001 |
Citation Context ... solving problems in image processing, and in understanding the design and functionality of biological visual systems. The problem has been studied for more than fifty years (see (Ruderman, 1996) or (Simoncelli and Olshausen, 2001) for reviews). Early analysis, developed in the television engineering community, concentrated on second-order characterization of local pixel statistics. If one assumes translation-invariance...

201 | Statistics of natural images and models - Huang, Mumford - 1999 |
Citation Context ...tistics have proposed the use of spherically or elliptically symmetric non-Gaussian densities, whose components exhibit clear dependencies (Zetzsche and Krieger, 1999; Wainwright and Simoncelli, 2000; Huang and Mumford, 1999; Parra et al., 2001; Hyvärinen et al., 2000; Srivastava et al., 2002; Sendur and Selesnick, 2002; Portilla et al., 2003; Teh et al., 2003; Gehler and Welling, 2006). Here, we consider the factorizati...

201 | Noise removal via Bayesian wavelet coring - Simoncelli, Adelson - 1996 |
Citation Context ... and $\vec{x}_{\mathrm{ica}}$, it has been observed that the marginals can be well fitted by the generalized Laplacian family (also known as the generalized Gaussians, or stretched exponential densities) (Mallat, 1989; Simoncelli and Adelson, 1996; Huang and Mumford, 1999): $p(x; p, s) = \frac{p}{2 s \Gamma(1/p)} \exp\left(-\left(\frac{|x|}{s}\right)^{p}\right)$ (25), which is determined by the shape parameter $p$ and scale $s$. This suggests that we can estimate the marginal entropy with a p...

193 | Image compression via joint statistical characterization in the wavelet domain - Buccigrossi, Simoncelli - 1999 |
Citation Context ...nd Sejnowski, 1997; van der Schaaf and van Hateren, 1996). But the responses of such linear filters exhibit striking dependencies (Wegmann and Zetzsche, 1990; Zetzsche et al., 1993; Simoncelli, 1997; Buccigrossi and Simoncelli, 1999a), and although dependency between these responses is reduced compared to the original pixels (Zetzsche and Schönecker, 1987), such reduction is relatively small (Bethge, 2006). A number of recent at...

192 | High-order contrasts for independent component analysis - Cardoso - 1999 |
Citation Context ...l scalar sources. The procedure for recovering the inverse transformation matrix, $M^{-1}$, and the original factorial source from data $\vec{x}$ is known as Independent Components Analysis (ICA) (Comon, 1994; Cardoso, 1999). For our purposes here, we assume $M$ is square and invertible, although the ICA methodology may be generalized to arbitrary matrices. The ICA computation can be better understood by expanding $M$ in te...

171 | Elements of information theory (2nd ed.) - Cover, Thomas - 2006 |

167 | Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces - Hyvärinen, Hoyer |
Citation Context ...Hyvärinen et al., 2000, e.g., ). In addition, several recent approaches for unsupervised learning of image structures arrive at related local descriptions. Specifically, independent subspace analysis (Hyvärinen and Hoyer, 2000), topographical ICA (Hyvärinen et al., 2000), and hierarchical scale mixture models (Karklin and Lewicki, 2005) each assume that image data are generated from linearly transformed densities which are...

158 | Symmetric multivariate and related distributions - Fang, Kotz, et al. - 1990 |
Citation Context ...tically Symmetric Densities. The family of elliptically symmetric random vectors $\vec{x} \in \mathbb{R}^d$ are densities of the form: $p(\vec{x}) = \frac{1}{\alpha |\Sigma|^{1/2}} f\left(-\frac{1}{2} \vec{x}^T \Sigma^{-1} \vec{x}\right)$ (3), where $\Sigma$ is a positive definite matrix (Fang et al., 1990). When $\vec{x}$ has finite second-order statistics, $\Sigma$ is a multiple of the covariance matrix. With $\Sigma$ fixed, $p(\vec{x})$ is completely determined by the generating function $f(\cdot): \mathbb{R}^+ \cup \{0\} \mapsto \mathbb{R}^+ \cup \{0\}$, which ha...

158 | Statistics of natural images: Scaling in the woods - Ruderman, Bialek - 1994 |

148 | The statistics of natural images - Ruderman - 1996 |
Citation Context ...entral importance in solving problems in image processing, and in understanding the design and functionality of biological visual systems. The problem has been studied for more than fifty years (see (Ruderman, 1996) or (Simoncelli and Olshausen, 2001) for reviews). Early analysis, developed in the television engineering community, concentrated on second-order characterization of local pixel statistics. I...

147 | Scale mixtures of normal distributions - Andrews, Mallows - 1974 |

138 | Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency - Sendur, Selesnick - 2002 |
Citation Context ...whose components exhibit clear dependencies (Zetzsche and Krieger, 1999; Wainwright and Simoncelli, 2000; Huang and Mumford, 1999; Parra et al., 2001; Hyvärinen et al., 2000; Srivastava et al., 2002; Sendur and Selesnick, 2002; Portilla et al., 2003; Teh et al., 2003; Gehler and Welling, 2006). Here, we consider the factorization problem for the class of elliptically symmetric densities (ESDs). For this source model, we pr...

138 | Statistical models for images: compression, restoration and synthesis - Simoncelli - 1997 |
Citation Context ...ield, 1996; Bell and Sejnowski, 1997; van der Schaaf and van Hateren, 1996). But the responses of such linear filters exhibit striking dependencies (Wegmann and Zetzsche, 1990; Zetzsche et al., 1993; Simoncelli, 1997; Buccigrossi and Simoncelli, 1999a), and although dependency between these responses is reduced compared to the original pixels (Zetzsche and Schönecker, 1987), such reduction is relatively small (Be...

134 | Natural signal statistics and sensory gain control - Schwartz, Simoncelli - 2001 |
Citation Context ...filter coefficients than does ICA. Finally, we show that divisive normalization, which has previously been shown empirically to reduce higher-order dependencies in multi-scale image representations (Schwartz and Simoncelli, 2001; Wainwright et al., 2002; Malo et al., 2000b; Valerio and Navarro, 2003a; Gluckman, 2006; Lyu and Simoncelli, 2007), can approximate the RG transform. Thus, RG provides a more principled justificatio...

129 | Perceptual image distortion - Teo, Heeger - 1994 |
Citation Context ...earities in the responses of mammalian cortical neurons (Heeger, 1992; Geisler and Albrecht, 1992), and nonlinear masking phenomena in human visual perception (Foley, 1994; Watson and Solomon, 1997; Teo and Heeger, 1994). Statistically, it’s been shown that locally dividing bandpass-filtered images by local standard deviation can produce approximately Gaussian marginal distributions (Ruderman, 1996), and that a weigh...

125 | Kernel PCA and de-noising in feature spaces - Mika, Schölkopf, et al. - 1998 |
Citation Context ...ivisive normalization, RG is derived as an optimal procedure for a specific family of density models. There are several nonlinear methods for dependency removal in the literature. Kernel PCA methods (Mika et al., 1999) operate by nonlinearly transforming the data to a space where PCA is used to remove any remaining dependencies. The concept is quite general, but success relies on choosing nonlinear kernel function...

120 | Scale mixtures of Gaussians and the statistics of natural images - Wainwright, Simoncelli |
Citation Context ...attempts to model local image statistics have proposed the use of spherically or elliptically symmetric non-Gaussian densities, whose components exhibit clear dependencies (Zetzsche and Krieger, 1999; Wainwright and Simoncelli, 2000; Huang and Mumford, 1999; Parra et al., 2001; Hyvärinen et al., 2000; Srivastava et al., 2002; Sendur and Selesnick, 2002; Portilla et al., 2003; Teh et al., 2003; Gehler and Welling, 2006). Here, we...

114 | Towards a theory of early visual processing - Atick, Redlich - 1990 |

109 | Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex - Hateren, Ruderman - 1998 |

101 | Estimating mutual information - Kraskov, Stögbauer, et al. |
Citation Context ...ca and $\vec{x}_{\mathrm{rg}}$ on pairs of band-pass filtered responses separated by distances ranging from 1 to 32 samples. Here, the MI was computed using a recent non-parametric method based on the order statistics (Kraskov et al., 2004). This approach belongs to the class of “binless” estimators of entropy and mutual information, which alleviates the strong bias and variance intrinsic to the more traditional binning (i.e., “plug-in”...

101 | Principal component analysis (2nd ed.) - Jolliffe - 2002 |

94 | Efficient coding of natural sounds - Lewicki - 2002 |
Citation Context ...e shown success in blind signal separation problems (Comon, 1994), and in deriving bases for natural signals (Olshausen and Field, 1996; van der Schaaf and van Hateren, 1996; Bell and Sejnowski, 1997; Lewicki, 2002). As with PCA, the ICA transformations may be computed for nearly any source, but they are only guaranteed to eliminate dependencies when the assumed linear mixture of independent sources model i...

90 | Model of visual contrast gain control and pattern masking - Watson, Solomon - 1997 |
Citation Context ...een used to explain nonlinearities in the responses of mammalian cortical neurons (Heeger, 1992; Geisler and Albrecht, 1992), and nonlinear masking phenomena in human visual perception (Foley, 1994; Watson and Solomon, 1997; Teo and Heeger, 1994). Statistically, it’s been shown that locally dividing bandpass-filtered images by local standard deviation can produce approximately Gaussian marginal distributions (Ruderman, 1...

90 | Predictive coding: A fresh view of inhibition in the retina - Srinivasan, Laughlin, et al. - 1982 |

87 | Orthogonal pyramid transforms for image coding - Adelson, Simoncelli, et al. - 1987 |
Citation Context ...ire to capture image features such as object boundaries) had long advocated the use of banks of local oriented filters for representation and analysis of image data (Koenderink, 1984; Granlund, 1978; Adelson et al., 1987; Mallat, 1989). Despite the success of ICA methods in providing a fundamental motivation for the use of localized oriented filters, there are a number of simple observations that indicate inconsisten...

87 | Random cascades on wavelet trees and their use in modeling and analyzing natural imagery - Wainwright, Simoncelli, et al. - 2001 |

84 | Nonlinear independent component analysis: Existence and uniqueness results - Hyvärinen, Pajunen - 1999 |
Citation Context ...oblem of selecting a transformation that maps a source signal drawn from a known density to a new representation whose individual components are statistically independent is highly under-constrained (Hyvärinen and Pajunen, 1999). Indeed, even when one specifies a particular target density, there are an infinite number of transformations that can map a random variable associated with the input density into one associated wit...

84 | Human luminance pattern-vision mechanisms: Masking experiments require a new model - Foley - 1994 |

81 | Elements of Information Theory. Wiley-Interscience - Cover, Thomas - 1991 |
Citation Context ... Multi-information. We quantify the statistical dependency for multi-variate sources using the multi-information (MI) (Studeny and Vejnarova, 1998), which is defined as the Kullback-Leibler divergence (Cover and Thomas, 2006) between the joint distribution and the product of its marginals: $I(\vec{x}) = D_{\mathrm{KL}}\left(p(\vec{x}) \,\Big\|\, \prod_{k} p(x_k)\right) = \sum_{k=1}^{d} H(x_k) - H(\vec{x})$ (1), where $H(\vec{x})$ is the differential entropy of $\vec{x}$, and $H(x_k)$ denotes the di...