Results 1  10
of
664
Regularization Theory and Neural Networks Architectures
 Neural Computation
, 1995
"... We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Ba ..."
Abstract

Cited by 311 (31 self)
 Add to MetaCart
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
Regularization networks and support vector machines
 Advances in Computational Mathematics
, 2000
"... Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization a ..."
Abstract

Cited by 267 (33 self)
 Add to MetaCart
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.
A generalized Gaussian image model for edgepreserving MAP estimation
 IEEE Trans. on Image Processing
, 1993
"... Absfrucf We present a Markov random field model which allows realistic edge modeling while providing stable maximum a posteriori MAP solutions. The proposed model, which we refer to as a generalized Gaussian Markov random field (GGMRF), is named for its similarity to the generalized Gaussian distri ..."
Abstract

Cited by 238 (34 self)
 Add to MetaCart
Absfrucf We present a Markov random field model which allows realistic edge modeling while providing stable maximum a posteriori MAP solutions. The proposed model, which we refer to as a generalized Gaussian Markov random field (GGMRF), is named for its similarity to the generalized Gaussian distribution used in robust detection and estimation. The model satisifies several desirable analytical and computational properties for MAP estimation, including continuous dependence of the estimate on the data, invariance of the character of solutions to scaling of data, and a solution which lies at the unique global minimum of the U posteriori loglikeihood function. The GGMRF is demonstrated to be useful for image reconstruction in lowdosage transmission tomography. I.
An equivalence between sparse approximation and Support Vector Machines
 A.I. Memo 1606, MIT Arti cial Intelligence Laboratory
, 1997
"... This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: aipublications/15001999/AIM1606.ps.Z This paper shows a relationship between two di erent approximation techniques: the Support Vector Machines (SVM), proposed by V.Vapnik (1995), ..."
Abstract

Cited by 203 (7 self)
 Add to MetaCart
This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: aipublications/15001999/AIM1606.ps.Z This paper shows a relationship between two di erent approximation techniques: the Support Vector Machines (SVM), proposed by V.Vapnik (1995), and a sparse approximation scheme that resembles the Basis Pursuit DeNoising algorithm (Chen, 1995 � Chen, Donoho and Saunders, 1995). SVM is a technique which can be derived from the Structural Risk Minimization Principle (Vapnik, 1982) and can be used to estimate the parameters of several di erent approximation schemes, including Radial Basis Functions, algebraic/trigonometric polynomials, Bsplines, and some forms of Multilayer Perceptrons. Basis Pursuit DeNoising is a sparse approximation technique, in which a function is reconstructed by using a small number of basis functions chosen from a large set (the dictionary). We show that, if the data are noiseless, the modi ed version of Basis Pursuit DeNoising proposed in this paper is equivalent to SVM in the following sense: if applied to the same data set the two techniques give the same solution, which is obtained by solving the same quadratic programming problem. In the appendix we also present a derivation of the SVM technique in the framework of regularization theory, rather than statistical learning theory, establishing a connection between SVM, sparse approximation and regularization theory.
A Theory of Networks for Approximation and Learning
 Laboratory, Massachusetts Institute of Technology
, 1989
"... Learning an inputoutput mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, t ..."
Abstract

Cited by 195 (24 self)
 Add to MetaCart
Learning an inputoutput mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. Wedevelop a theoretical framework for approximation based on regularization techniques that leads to a class of threelayer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the wellknown Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods suchasParzen windows and potential functions and to several neural network algorithms, suchas Kanerva's associative memory,backpropagation and Kohonen's topology preserving map. They also haveaninteresting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
On Edge Detection
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1984
"... Edge detection is the process that attempts to characterize the intensity changes in the image in terms of the physical processes that have originated them. A critical, intermediate goal of edge detection is the detection and characterization of significant intensity changes. This paper discusses th ..."
Abstract

Cited by 176 (6 self)
 Add to MetaCart
Edge detection is the process that attempts to characterize the intensity changes in the image in terms of the physical processes that have originated them. A critical, intermediate goal of edge detection is the detection and characterization of significant intensity changes. This paper discusses this part of the edge d6tection problem. To characterize the types of intensity changes derivatives of different types, and possibly different scales, are needed. Thus, we consider this part of edge detection as a problem in numerical differentiation.
Finite Element Methods for Active Contour Models and Balloons for 2D and 3D Images
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1991
"... The use of energyminimizing curves, known as "snakes" to extract features of interest in images has been introduced by Kass, Witkin and Terzopoulos [23]. A balloon model was introduced in [12] as a way to generalize and solve some of the problems encountered with the original method. We present a 3 ..."
Abstract

Cited by 162 (22 self)
 Add to MetaCart
The use of energyminimizing curves, known as "snakes" to extract features of interest in images has been introduced by Kass, Witkin and Terzopoulos [23]. A balloon model was introduced in [12] as a way to generalize and solve some of the problems encountered with the original method. We present a 3D generalization of the balloon model as a 3D deformable surface, which evolves in 3D images. It is deformed under the action of internal and external forces attracting the surface toward detected edgels by means of an attraction potential. We also show properties of energyminimizing surfaces concerning their relationship with 3D edge points. To solve the minimization problem for a surface, two simplified approaches are shown first, defining a 3D surface as a series of 2D planar curves. Then, after comparing Finite Element Method and Finite Difference Method in the 2D problem, we solve the 3D model using the Finite Element Method yielding greater stability and faster convergence. We have a...
Probing the Pareto frontier for basis pursuit solutions
, 2008
"... The basis pursuit problem seeks a minimum onenorm solution of an underdetermined leastsquares problem. Basis pursuit denoise (BPDN) fits the leastsquares problem only approximately, and a single parameter determines a curve that traces the optimal tradeoff between the leastsquares fit and the ..."
Abstract

Cited by 160 (2 self)
 Add to MetaCart
The basis pursuit problem seeks a minimum onenorm solution of an underdetermined leastsquares problem. Basis pursuit denoise (BPDN) fits the leastsquares problem only approximately, and a single parameter determines a curve that traces the optimal tradeoff between the leastsquares fit and the onenorm of the solution. We prove that this curve is convex and continuously differentiable over all points of interest, and show that it gives an explicit relationship to two other optimization problems closely related to BPDN. We describe a rootfinding algorithm for finding arbitrary points on this curve; the algorithm is suitable for problems that are large scale and for those that are in the complex domain. At each iteration, a spectral gradientprojection method approximately minimizes a leastsquares problem with an explicit onenorm constraint. Only matrixvector operations are required. The primaldual solution of this problem gives function and derivative information needed for the rootfinding method. Numerical experiments on a comprehensive set of test problems demonstrate that the method scales well to large problems.
Consistency of the group lasso and multiple kernel learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2007
"... We consider the leastsquare regression problem with regularization by a block 1norm, i.e., a sum of Euclidean norms over spaces of dimensions larger than one. This problem, referred to as the group Lasso, extends the usual regularization by the 1norm where all spaces have dimension one, where it ..."
Abstract

Cited by 156 (27 self)
 Add to MetaCart
We consider the leastsquare regression problem with regularization by a block 1norm, i.e., a sum of Euclidean norms over spaces of dimensions larger than one. This problem, referred to as the group Lasso, extends the usual regularization by the 1norm where all spaces have dimension one, where it is commonly referred to as the Lasso. In this paper, we study the asymptotic model consistency of the group Lasso. We derive necessary and sufficient conditions for the consistency of group Lasso under practical assumptions, such as model misspecification. When the linear predictors and Euclidean norms are replaced by functions and reproducing kernel Hilbert norms, the problem is usually referred to as multiple kernel learning and is commonly used for learning from heterogeneous data sources and for non linear variable selection. Using tools from functional analysis, and in particular covariance operators, we extend the consistency results to this infinite dimensional case and also propose an adaptive scheme to obtain a consistent model estimate, even when the necessary condition required for the non adaptive scheme is not satisfied.
Prior Learning and Gibbs ReactionDiffusion
, 1997
"... This article addresses two important themes in early visual computation: rst it presents a novel theory for learning the universal statistics of natural images { a prior model for typical cluttered scenes of the world { from a set of natural images, second it proposes a general framework of designi ..."
Abstract

Cited by 149 (18 self)
 Add to MetaCart
This article addresses two important themes in early visual computation: rst it presents a novel theory for learning the universal statistics of natural images { a prior model for typical cluttered scenes of the world { from a set of natural images, second it proposes a general framework of designing reactiondiusion equations for image processing. We start by studying the statistics of natural images including the scale invariant properties, then generic prior models were learned to duplicate the observed statistics, based on the minimax entropy theory studied in two previous papers. The resulting Gibbs distributions have potentials of the form U(I; ; S) = P K I)(x; y)) with S = fF g being a set of lters and = f the potential functions. The learned Gibbs distributions con rm and improve the form of existing prior models such as lineprocess, but in contrast to all previous models, inverted potentials (i.e. (x) decreasing as a function of jxj) were found to be necessary. We nd that the partial dierential equations given by gradient descent on U(I; ; S) are essentially reactiondiusion equations, where the usual energy terms produce anisotropic diusion while the inverted energy terms produce reaction associated with pattern formation, enhancing preferred image features. We illustrate how these models can be used for texture pattern rendering, denoising, image enhancement and clutter removal by careful choice of both prior and data models of this type, incorporating the appropriate features. Song Chun Zhu is now with the Computer Science Department, Stanford University, Stanford, CA 94305, and David Mumford is with the Division of Applied Mathematics, Brown University, Providence, RI 02912. This work started when the authors were at ...