## Regularization on graphs with function-adapted diffusion process (2006)

Citations: 23 (5 self)

### BibTeX

@MISC{Szlam06regularizationon,

author = {Arthur D. Szlam and Mauro Maggioni and Ronald R. Coifman and Zoubin Ghahramani},

title = {Regularization on graphs with function-adapted diffusion process},

year = {2006}

}

### Abstract

Harmonic analysis and diffusion on discrete data have been shown to lead to state-of-the-art algorithms for machine learning tasks, especially in the context of semi-supervised and transductive learning. The success of these algorithms rests on the assumption that the function(s) to be studied (learned, interpolated, etc.) are smooth with respect to the geometry of the data. In this paper we present a method for modifying the given geometry so the function(s) to be studied are smoother with respect to the modified geometry, and thus more amenable to treatment using harmonic analysis methods. Among the many possible applications, we consider the problems of image denoising and transductive classification. In both settings, our approach improves on standard diffusion-based methods.
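As a rough illustration of the idea in the abstract, the sketch below builds a diffusion operator whose edge weights depend on both the point positions and the values of the function being studied, so that diffusion mixes less across large jumps in the function. The Gaussian weight form, the parameters `sigma_x` and `sigma_f`, and the helper names `function_adapted_kernel` and `diffuse` are assumptions made for illustration; they are not taken from this page and may differ from the construction in the paper.

```python
import numpy as np


def function_adapted_kernel(X, f, sigma_x=1.0, sigma_f=1.0):
    """Row-stochastic diffusion matrix from points X (n, d) and values f (n,).

    Assumed weight form (illustrative only):
        W[i, j] = exp(-|X_i - X_j|^2 / sigma_x - (f_i - f_j)^2 / sigma_f)
    """
    X = np.asarray(X, dtype=float)
    f = np.asarray(f, dtype=float)
    # Pairwise squared distances between points (geometry of the data).
    d_x = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Pairwise squared differences of function values (function adaptation).
    d_f = (f[:, None] - f[None, :]) ** 2
    W = np.exp(-d_x / sigma_x - d_f / sigma_f)
    # Normalize each row so that K is an averaging (Markov) operator.
    return W / W.sum(axis=1, keepdims=True)


def diffuse(K, f, t=1):
    """Apply the averaging operator K to f a total of t times."""
    g = np.asarray(f, dtype=float)
    for _ in range(t):
        g = K @ g
    return g


# Toy example: a noisy step function sampled on [0, 1].
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 50)[:, None]
clean = (X[:, 0] > 0.5).astype(float)
noisy = clean + 0.05 * rng.standard_normal(50)

K = function_adapted_kernel(X, noisy, sigma_x=0.01, sigma_f=0.1)
smoothed = diffuse(K, noisy, t=3)
```

Because each row of `K` is a probability vector, every smoothed value is a weighted average of the input values. Raising `sigma_f` weakens the dependence on the function, and in the limit the operator reduces to a purely geometric diffusion.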

### Citations

2590 | Normalized cuts and image segmentation - Shi, Malik - 1997 |

1097 | On spectral clustering: Analysis and an algorithm - Ng, Jordan, et al. - 2001 |

734 | Laplacian Eigenmaps for Dimensionality Reduction and - Belkin |

660 | The structure of images - Koenderink - 1984 |

539 | Scale-Space Filtering - Witkin - 1983 |

Citation Context: ...mensional heat kernel, and $K^t I$ would be the classical smoothing of $I$ induced by the Euclidean two-dimensional heat kernel, associated with the classical Gaussian scale space (we refer the reader to Witkin, 1983; Koenderink, 1984; Lindeberg, 1994, and references therein). In our context $K^t$ is associated with a scale space induced by $G(I)$, which is thus a nonlinear scale spac...

470 | Scale-Space Theory in Computer Vision - Lindeberg - 1994 |

Citation Context: ... of $I$ induced by the Euclidean two-dimensional heat kernel, associated with the classical Gaussian scale space (we refer the reader to Witkin, 1983; Koenderink, 1984; Lindeberg, 1994, and references therein). In our context $K^t$ is associated with a scale space induced by $G(I)$, which is thus a nonlinear scale space (in the sense that it depends on the original...

338 | The Rapid Evaluation of Potential Fields in Particle Systems - Greengard - 1988 |

Citation Context: ...he weights, but a global $\sigma$ in the self-tuning weights corresponds to some location-dependent choice of $\sigma$ in the standard exponential weights. Footnote 3: However, methods of Fast Multipole of Fast Gauss type (Greengard and Rokhlin, 1988) may make it possible to work with dense matrices implicitly, with complexity proportional to the number of points. See Raykar et al. (2005) for a recent reference with applications to machine learni...

289 | Locality-sensitive hashing scheme based on p-stable distributions - Datar, Immorlica, et al. - 2004 |

287 | A tutorial on spectral clustering - Luxburg |

280 | Semi-Supervised Learning - Chapelle, Schölkopf, et al. - 2006 |

Citation Context: .... (ii) $\alpha_i = \lambda_i^t$ for some $t > 0$; this corresponds to setting $\tilde{f} = K^t(f)$, that is, kernel smoothing on the data set, with a data-dependent kernel (Smola and Kondor, 2003; Zhou and Schölkopf, 2005; Chapelle et al., 2006). (iii) $\alpha_i = P(\lambda_i)$, for some polynomial (or rational function) $P$, generalizing (ii). See, for example, Maggioni and Mhaskar (2007). As mentioned, one can interpret $K^t f$ as evolving a heat equation on...

263 | A review of image denoising algorithms, with a new one, Multiscale Modeling and Simulation 4 (2 - Buades, Coll, et al. - 2005 |

257 | On clusterings: Good, bad and spectral - Kannan, Vempala, et al. - 2004 |

227 | Translation-invariant de-noising - Coifman, Donoho - 1995 |

Citation Context: ...facts, we could embed pixels $x \in Q$ into $\mathbb{R}^{d+2}$ by $x \mapsto (\alpha x, f_1(x), \ldots, f_d(x))$. In other words we interpret $(f_i(x))_{i=1,\ldots,d}$ as a feature vector at $x$. This method is an alternative to "cycle spinning" (Coifman and Donoho, 1995), that is, simply averaging the different denoisings. In practice, we have found that a better choice of feature vector is $f_{\sigma(1)}(x), \ldots, f_{\sigma(d)}(x)$, where $\sigma$ is a random permutation of $\{1, \ldots, d\}$ depen...

206 | Partially labeled classification with Markov random walks - Szummer, Jaakkola - 2006 |

188 | Self-tuning spectral clustering - Zelnik-Manor, Perona - 2004 |

Citation Context: ...g in very general geometries. These ideas have been applied to a wide range of tasks in the design of computer networks, in parallel computation, clustering (Ng et al., 2001; Belkin and Niyogi, 2001; Zelnik-Manor and Perona, 2004; Kannan et al., 2004; Coifman and Maggioni, 2007), manifold learning (Bérard et al., 1994; Belkin and Niyogi, 2001; Lafon, 2004; Coifman et al., 2005a; Coifman and Lafon, 2006a), image segmentation (...

169 | Kernels and regularization on graphs - Smola, Kondor |

Citation Context: ...tion (with band I), see for example Belkin (2003). (ii) $\alpha_i = \lambda_i^t$ for some $t > 0$; this corresponds to setting $\tilde{f} = K^t(f)$, that is, kernel smoothing on the data set, with a data-dependent kernel (Smola and Kondor, 2003; Zhou and Schölkopf, 2005; Chapelle et al., 2006). (iii) $\alpha_i = P(\lambda_i)$, for some polynomial (or rational function) $P$, generalizing (ii). See, for example, Maggioni and Mhaskar (2007). As mentioned, one ca...

156 | Semi-supervised learning on Riemannian manifolds - Belkin, Niyogi - 2004 |

Citation Context: ...ogi, 2001; Lafon, 2004; Coifman et al., 2005a; Coifman and Lafon, 2006a), image segmentation (Shi and Malik, 2000), classification (Coifman and Maggioni, 2007), regression and function approximation (Belkin and Niyogi, 2004; Mahadevan and Maggioni, 2005; Mahadevan et al., 2006; Mahadevan and Maggioni, 2007; Coifman and Maggioni, 2007). 2.4 Regularization by Diffusion: It is often useful to find the smoothest function $\tilde{f}$ ...

156 | Diffusion maps - Coifman, Lafon - 2006 |

156 | Spectral relaxation for k-means clustering - Zha, He, et al. - 2001 |

Citation Context: ...ysis of a data set, modeled as a graph or a manifold, can be developed by considering a natural random walk $K$ on it (Chung, 1997; Szummer and Jaakkola, 2001; Ng et al., 2001; Belkin and Niyogi, 2001; Zha et al., 2001; Lafon, 2004; Coifman et al., 2005a,b). The random walk allows to construct diffusion operators on the data set, as well as associated basis functions. For an initial condition $\delta_x$, $K^t \delta_x(y)$ represen...

149 | Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps - Coifman, Lafon, et al. |

126 | Diffusion kernels on graphs and other discrete structures - Kondor, Lafferty - 2002 |

Citation Context: ...nalyses, fits very well in the transductive learning framework. In several papers a diffusion process constructed on $X$ has been used for finding $F$ directly (Zhou and Schölkopf, 2005; Zhu et al., 2003; Kondor and Lafferty, 2002) and indirectly, by using adapted basis functions on $X$ constructed from the diffusion, such as the eigenfunctions of the Laplacian (Coifman and Lafon, 2006a,b; Lafon, 2004; Coifman et al., 2005a,b; B...

101 | Towards a theoretical foundation for Laplacian-based manifold methods - Belkin, Niyogi - 2005 |

99 | Ideal denoising in an orthonormal basis chosen from a library of bases - Donoho, Johnstone - 1994 |

Citation Context: ...esponds at equilibrium to (iv) $\alpha_i = \beta/(1 + \beta - \lambda_i)$. One can also consider families of nonlinear mollifiers, of the form $\tilde{f} = \sum_i m(\langle f, \psi_i \rangle)\,\psi_i$, where for example $m$ is a (soft-)thresholding function (Donoho and Johnstone, 1994). In fact, $m$ may be made even dependent on $i$. While these techniques are classical and well-understood in Euclidean space (mostly in view of applications to signal processing), it is only recently th...

95 | Image Processing And Analysis - Chan, Shen - 2005 |

95 | Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning and data set parameterization - Lafon, Lee |

91 | A Fast Two-Dimensional Median Filtering Algorithm - Huang, Yang, et al. - 1979 |

75 | Using manifold structure for partially labeled classification. Advances in neural information processing systems (NIPS) (Vol. 15 - Belkin, Niyogi - 2003 |

73 | Diffusion wavelets - Coifman, Maggioni - 2006 |

Citation Context: ...cale geometry. $K$, its powers, and the special bases associated to it, such as its eigenfunctions (Belkin and Niyogi, 2003a; Coifman et al., 2005a; Coifman and Lafon, 2006a) or its diffusion wavelets (Coifman and Maggioni, 2006) can be used to study the geometry of and analyze functions on the data set. Among other things, "diffusion analysis" allows us to introduce a notion of smoothness in discrete settings that preserves...

73 | Diffusion Maps and Geometric Harmonics - Lafon - 2004 |

Citation Context: ..., modeled as a graph or a manifold, can be developed by considering a natural random walk $K$ on it (Chung, 1997; Szummer and Jaakkola, 2001; Ng et al., 2001; Belkin and Niyogi, 2001; Zha et al., 2001; Lafon, 2004; Coifman et al., 2005a,b). The random walk allows to construct diffusion operators on the data set, as well as associated basis functions. For an initial condition $\delta_x$, $K^t \delta_x(y)$ represents the probab...

61 | Weighted median filters: A tutorial - Yin, Yang, et al. - 1996 |

57 | Problems of Learning on Manifolds - Belkin - 2003 |

57 | A deterministic strongly polynomial algorithm for matrix scaling and approximate permanents - Linial, Samorodnitsky, et al. |

Citation Context: ...or example one could consider the heat kernel $e^{-tL}$ where $L$ is defined in (3) below, see also Chung (1997), or a bi-Markov matrix similar to $W$ (Sinkhorn, 1964; Sinkhorn and Knopp, 1967; Soules, 1991; Linial et al., 1998; Shashua et al., 2005; Zass and Shashua, 2005). In general $K$ is not column-stochastic, but the operation $fK$ of multiplication on the right by a (row) vector can be thought of as a diffusion of the vector $f$. This filte...

49 | SUSAN: a new approach to low level image processing - Smith, Brady - 1995 |

41 | From graph to manifold laplacian: the convergence rate - Singer |

41 | The multiscale structure of nondifferentiable image manifolds - Wakin, Donoho, et al. - 2005 |

Citation Context: ... family of smooth diffeomorphisms of $[0,1] \times [0,1]$, the set of images obtained under the family of diffeomorphisms is not necessarily a (differentiable) manifold (see Donoho and Grimes 2002, and also Wakin et al. 2005). However, if the image does not have edges, then the family of morphed images is a manifold. We do the following: 1. Choose 100 points as labeled. Each of the benchmark data sets of Chapelle et a...

39 | From graphs to manifolds – weak and strong pointwise consistency of graph Laplacians - Hein, Audibert, von Luxburg - 2005 |

38 | When does isomap recover the natural parameterization of families of articulated images - Donoho, Grimes - 2002 |

Citation Context: ...harp edges and considers a smooth family of smooth diffeomorphisms of $[0,1] \times [0,1]$, the set of images obtained under the family of diffeomorphisms is not necessarily a (differentiable) manifold (see Donoho and Grimes 2002, and also Wakin et al. 2005). However, if the image does not have edges, then the family of morphed images is a manifold. We do the following: 1. Choose 100 points as labeled. Each of the benchmar...

38 | Concerning nonnegative matrices and doubly stochastic matrices - Sinkhorn, Knopp - 1967 |

Citation Context: ... ways of defining averaging operators. For example one could consider the heat kernel $e^{-tL}$ where $L$ is defined in (3) below, see also Chung (1997), or a bi-Markov matrix similar to $W$ (Sinkhorn, 1964; Sinkhorn and Knopp, 1967; Soules, 1991; Linial et al., 1998; Shashua et al., 2005; Zass and Shashua, 2005). In general $K$ is not column-stochastic, but the operation $fK$ of multiplication on the right by a (row) vector can ...

34 | Geometric Harmonics: A Novel Tool for Multiscale Out-of-Sample Extension of Empirical Functions - Coifman, Lafon - 2006 |

34 | Multi-way clustering using super-symmetric non-negative tensor factorization - Shashua, Zass, et al. |

Citation Context: ...consider the heat kernel $e^{-tL}$ where $L$ is defined in (3) below, see also Chung (1997), or a bi-Markov matrix similar to $W$ (Sinkhorn, 1964; Sinkhorn and Knopp, 1967; Soules, 1991; Linial et al., 1998; Shashua et al., 2005; Zass and Shashua, 2005). In general $K$ is not column-stochastic, but the operation $fK$ of multiplication on the right by a (row) vector can be thought of as a diffusion of the vector $f$. This filte...

32 | Value function approximation with Diffusion Wavelets and Laplacian Eigenfunctions - Mahadevan, Maggioni - 2006 |

Citation Context: ...erences therein). It has recently inspired several algorithms in clustering, classification and learning (Belkin and Niyogi, 2003a, 2004; Lafon, 2004; Coifman et al., 2005a; Coifman and Lafon, 2006a; Mahadevan and Maggioni, 2005; Lafon and Lee, to appear, 2006; Maggioni and Mhaskar, 2007). 2.3 Harmonic Analysis: The eigenfunctions $\{\psi_i\}$ of $K$, satisfying $K\psi_i = \lambda_i \psi_i$, are related, via multiplication by $D^{-1/2}$, to the eigenf...

29 | A unifying approach to hard and probabilistic clustering - Zass, Shashua - 2005 |

Citation Context: ...el $e^{-tL}$ where $L$ is defined in (3) below, see also Chung (1997), or a bi-Markov matrix similar to $W$ (Sinkhorn, 1964; Sinkhorn and Knopp, 1967; Soules, 1991; Linial et al., 1998; Shashua et al., 2005; Zass and Shashua, 2005). In general $K$ is not column-stochastic, but the operation $fK$ of multiplication on the right by a (row) vector can be thought of as a diffusion of the vector $f$. This filter can be iterated severa...

28 | Regularization on discrete spaces - Zhou, Schölkopf - 2005 |

Citation Context: ...ng the data set and functions on it intrinsically has led to novel algorithms with state-of-the-art performance in various problems in machine learning (Szummer and Jaakkola, 2001; Zhu et al., 2003; Zhou and Schölkopf, 2005; Belkin and Niyogi, 2003a; Mahadevan and Maggioni, 2007; Maggioni and Mhaskar, 2007). They are based on the construction of a diffusion, or an averaging operator $K$ on the data set, dependent on its l...

25 | Manifold parametrizations by eigenfunctions of the Laplacian and heat kernels - Jones, Maggioni, et al. |

25 | PDE’s Based Regularization of Multivalued Images and Applications - Tschumperlé - 2002 |

24 | Noise cleaning by iterated local averaging - Davis, Rosenfeld - 1978 |

17 | Fast direct policy evaluation using multiscale analysis of Markov diffusion processes - Maggioni, Mahadevan - 2005 |

Citation Context: ...ifman and Lafon, 2006a,b; Lafon, 2004; Coifman et al., 2005a,b; Belkin and Niyogi, 2003b; Maggioni and Mhaskar, 2007), or diffusion wavelets (Coifman and Maggioni, 2006; Mahadevan and Maggioni, 2007; Maggioni and Mahadevan, 2006; Mahadevan and Maggioni, 2005; Maggioni and Mahadevan, 2005). We will try to modify the geometry of the unlabeled data so that $F$ is as smooth as possible with respect to the modified geometry. We wil...

17 | Fast computation of sums of Gaussians in high dimensions - Raykar, Yang, et al. - 2005 |

13 | Diffusion polynomial frames on metric measure spaces - Maggioni, Mhaskar - 2008 |

8 | Biorthogonal diffusion wavelets for multiscale representations on manifolds and graphs, August 2005 - Maggioni, Jr, et al. |