On the relationship between generalization error, hypothesis complexity, and sample complexity for radial basis functions (1996)

by P Niyogi, F Girosi
Venue: Neural Computation
Results 1 - 10 of 30

Regularization Theory and Neural Networks Architectures

by Federico Girosi, Michael Jones, Tomaso Poggio - Neural Computation, 1995
Cited by 257 (30 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...

Regularization networks and support vector machines

by Theodoros Evgeniou, Massimiliano Pontil, Tomaso Poggio - Advances in Computational Mathematics, 2000
Cited by 215 (28 self)
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.

Shape quantization and recognition with randomized trees

by Yali Amit, Donald Geman - Neural Computation, 1997
Cited by 126 (15 self)
We explore a new approach to shape recognition based on a virtually infinite family of binary features ("queries") of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement of several local topographic codes ("tags") which are in themselves too primitive and common to be informative about shape. All the discriminating power derives from relative angles and distances among the tags. The important attributes of the queries are (i) a natural partial ordering corresponding to increasing structure and complexity; (ii) semi-invariance, meaning that most shapes of a given class will answer the same way to two queries which are successive in the ordering; and (iii) stability, since the queries are not based on distinguished points and substructures. No classifier based on the full feature set can be evaluated and it is impossible to determine a priori which arrangements are informative. Our approach is to select informative features and build tree classifiers at the same time by inductive learning. In effect, each tree provides an approximation to the full posterior where the features

A nonparametric approach to pricing and hedging derivative securities via learning networks

by James M. Hutchinson, Andrew W. Lo, Tomaso Poggio - Journal of Finance, 1994
Cited by 84 (4 self)

The mathematics of learning: Dealing with data

by Tomaso Poggio, Steve Smale - Notices of the American Mathematical Society, 2003
Cited by 79 (11 self)
Learning is key to developing systems tailored to a broad range of data analysis and information extraction tasks. We outline the mathematical foundations of learning theory and describe a key algorithm of it.

A unified framework for Regularization Networks and Support Vector Machines

by Theodoros Evgeniou, Massimiliano Pontil, 1999
Cited by 40 (11 self)
This report describes research done at the Center for Biological & Computational Learning and the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. This research was sponsored by the National Science Foundation under contract No. IIS-9800032, the Office of Naval Research under contract No. N00014-93-1-0385 and contract No. N00014-95-1-0600. Partial support was also provided by Daimler-Benz AG, Eastman Kodak, Siemens Corporate Research, Inc., ATR and AT&T. Contents: 1 Introduction; 2 Overview of statistical learning theory; 2.1 Uniform Convergence and the Vapnik-Chervonenkis bound; 2.2 The method of Structural Risk Minimization; 2.3 ε-uniform convergence and the V_γ dimension; 2.4 Overview of our approach; 3 Reproducing Kernel Hilbert Spaces: a brief overview; 4 Regularization Networks; 4.1 Radial Basis Functions; 4.2 Regularization, generalized splines and kernel smoothers; 4.3 Dual representation of Regularization Networks; 4.4 From regression to ...; 5 Support vector machines; 5.1 SVM in RKHS; 5.2 From regression to ...; 6 SRM for RNs and SVMs; 6.1 SRM for SVM Classification; 6.1.1 Distribution dependent bounds for SVMC; 7 A Bayesian Interpretation of Regularization and SRM?; 7.1 Maximum A Posteriori Interpretation of ...; 7.2 Bayesian interpretation of the stabilizer in the RN and SVM functionals; 7.3 Bayesian interpretation of the data term in the Regularization and SVM functionals; 7.4 Why a MAP interpretation may be misleading; Connections between SVMs and Sparse Ap...

Incorporating Prior Information in Machine Learning by Creating Virtual Examples

by P. Niyogi, F. Girosi, T. Poggio - Proceedings of the IEEE, 1998
Cited by 36 (2 self)
One of the key problems in supervised learning is the insufficient size of the training set. The natural way for an intelligent learner to counter this problem and successfully generalize is to exploit prior information that may be available about the domain or that can be learned from prototypical examples. We discuss the notion of using prior knowledge by creating virtual examples and thereby expanding the effective training set size. We show that in some contexts, this idea is mathematically equivalent to incorporating the prior knowledge as a regularizer, suggesting that the strategy is well-motivated. The process of creating virtual examples in real world pattern recognition tasks is highly non-trivial. We provide demonstrative examples from object recognition and speech recognition to illustrate the idea. 1 Learning from Examples Recently, machine learning techniques have become increasingly popular as an alternative to knowledge-based approaches to artificial intelligence pro...

Survey of Neural Transfer Functions

by Wlodzislaw Duch, Norbert Jankowski - Neural Computing Surveys, 1999
Cited by 33 (19 self)
The choice of transfer functions may strongly influence complexity and performance of neural networks. Although sigmoidal transfer functions are the most common, there is no a priori reason why models based on such functions should always provide optimal decision borders. A large number of alternative transfer functions has been described in the literature. A taxonomy of activation and output functions is proposed, and advantages of various non-local and local neural transfer functions are discussed. Several less-known types of transfer functions and new combinations of activation/output functions are described. Universal transfer functions, parametrized to change from localized to delocalized type, are of greatest interest. Other types of neural transfer functions discussed here include functions with activations based on non-Euclidean distance measures, bicentral functions, formed from products or linear combinations of pairs of sigmoids, and extensions of such functions making rotations...

Learning and Approximation Capabilities of Adaptive Spline Activation Function Neural Networks

by Lorenzo Vecci, Francesco Piazza, Aurelio Uncini - Neural Networks, 1998
Cited by 25 (17 self)
In this paper, we study the theoretical properties of a new kind of artificial neural network, which is able to adapt its activation functions by varying the control points of a Catmull-Rom cubic spline. Most of all, we are interested in generalization capability, and we can show that our architecture presents several advantages. First of all, it can be seen as a sub-optimal realization of the additive spline based model obtained by the regularization theory. Besides, simulations confirm that the special learning mechanism allows the network's free parameters to be used very effectively, keeping their total number lower than in networks with sigmoidal activation functions. Other notable properties are a shorter training time and a reduced hardware complexity, due to the surplus in the number of neurons. © 1998 Elsevier Science Ltd. All rights reserved. Keywords: Spline neural networks; Multilayer perceptron; Generalized sigmoidal functions; Adaptive activation functions...

Support vector machine soft margin classifiers: Error analysis

by Di-rong Chen, Qiang Wu, Yiming Ying, Ding-xuan Zhou - Journal of Machine Learning Research, 2004
Cited by 15 (9 self)
The purpose of this paper is to provide a PAC error analysis for the q-norm soft margin classifier, a support vector machine classification algorithm. It consists of two parts: regularization error and sample error. While many techniques are available for treating the sample error, much less is known for the regularization error and the corresponding approximation error for reproducing kernel Hilbert spaces. We are mainly concerned about the regularization error. It is estimated for general distributions by a K-functional in weighted L^q spaces. For weakly separable distributions (i.e., the margin may be zero) satisfactory convergence rates are provided by means of separating functions. A projection operator is introduced, which leads to better sample error estimates especially for small complexity kernels. The misclassification error is bounded by the V-risk associated with a general class of loss functions V. The difficulty of bounding the offset is overcome. Polynomial kernels and Gaussian kernels are used to demonstrate the main results. The choice of the regularization parameter plays an important role in our analysis.