Results 1–10 of 62
Regularization networks and support vector machines
Advances in Computational Mathematics, 2000
Abstract

Cited by 366 (38 self)
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.
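As a concrete illustration of the regularization-network view in the abstract above, the following is a minimal sketch (not the authors' code; all function names and parameter values are illustrative) of approximating a function from sparse data with a Gaussian radial basis function kernel, by solving the regularized linear system (K + λnI)c = y:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Gaussian radial basis function kernel k(x, y) = exp(-|x - y|^2 / (2 sigma^2))
    return np.exp(-(X[:, None] - Y[None, :]) ** 2 / (2.0 * sigma**2))

def fit_regularization_network(x, y, lam=1e-6, sigma=0.3):
    # Solve (K + lam * n * I) c = y; the fitted function is f(t) = sum_i c_i k(t, x_i).
    n = len(x)
    K = rbf_kernel(x, x, sigma)
    c = np.linalg.solve(K + lam * n * np.eye(n), y)
    return lambda t: rbf_kernel(np.atleast_1d(np.asarray(t, dtype=float)), x, sigma) @ c

# Approximate a smooth function from 30 samples.
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2.0 * np.pi * x)
f = fit_regularization_network(x, y)
print(f(0.25))  # close to sin(pi/2) = 1
```

The regularization parameter λ trades off fidelity to the data against smoothness of the solution, which is the trade-off both formulations reviewed in the paper share.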
On the mathematical foundations of learning
Bulletin of the American Mathematical Society, 2002
Abstract

Cited by 329 (12 self)
The problem of learning is arguably at the very core of the problem of intelligence, both biological and artificial. — T. Poggio and C.R. Shelton
An equivalence between sparse approximation and Support Vector Machines
A.I. Memo 1606, MIT Artificial Intelligence Laboratory, 1997
Abstract

Cited by 248 (7 self)
This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: ai-publications/1500-1999/AIM-1606.ps.Z. This paper shows a relationship between two different approximation techniques: the Support Vector Machines (SVM), proposed by V. Vapnik (1995), and a sparse approximation scheme that resembles the Basis Pursuit De-Noising algorithm (Chen, 1995; Chen, Donoho and Saunders, 1995). SVM is a technique which can be derived from the Structural Risk Minimization Principle (Vapnik, 1982) and can be used to estimate the parameters of several different approximation schemes, including Radial Basis Functions, algebraic/trigonometric polynomials, B-splines, and some forms of Multilayer Perceptrons. Basis Pursuit De-Noising is a sparse approximation technique in which a function is reconstructed by using a small number of basis functions chosen from a large set (the dictionary). We show that, if the data are noiseless, the modified version of Basis Pursuit De-Noising proposed in this paper is equivalent to SVM in the following sense: if applied to the same data set, the two techniques give the same solution, which is obtained by solving the same quadratic programming problem. In the appendix we also present a derivation of the SVM technique in the framework of regularization theory, rather than statistical learning theory, establishing a connection between SVM, sparse approximation and regularization theory.
A unified framework for Regularization Networks and Support Vector Machines, 1999
Abstract

Cited by 57 (12 self)
This report describes research done at the Center for Biological & Computational Learning and the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. This research was sponsored by the National Science Foundation under contract No. IIS-9800032, the Office of Naval Research under contract No. N00014-93-1-0385 and contract No. N00014-95-1-0600. Partial support was also provided by Daimler-Benz AG, Eastman Kodak, Siemens Corporate Research, Inc., ATR and AT&T.

Contents:
1 Introduction
2 Overview of statistical learning theory
  2.1 Uniform convergence and the Vapnik-Chervonenkis bound
  2.2 The method of Structural Risk Minimization
  2.3 ε-uniform convergence and the Vγ dimension
  2.4 Overview of our approach
3 Reproducing Kernel Hilbert Spaces: a brief overview
4 Regularization Networks
  4.1 Radial Basis Functions
  4.2 Regularization, generalized splines and kernel smoothers
  4.3 Dual representation of Regularization Networks
  4.4 From regression to classification
5 Support vector machines
  5.1 SVM in RKHS
  5.2 From regression to classification
6 SRM for RNs and SVMs
  6.1 SRM for SVM Classification
    6.1.1 Distribution dependent bounds for SVMC
7 A Bayesian Interpretation of Regularization and SRM?
  7.1 Maximum A Posteriori Interpretation of ...
  7.2 Bayesian interpretation of the stabilizer in the RN and SVM functionals
  7.3 Bayesian interpretation of the data term in the Regularization and SVM functionals
  7.4 Why a MAP interpretation may be misleading
Connections between SVMs and Sparse Ap...
ON THE NUMERICAL EVALUATION OF FREDHOLM DETERMINANTS
Abstract

Cited by 43 (6 self)
Some significant quantities in mathematics and physics are most naturally expressed as the Fredholm determinant of an integral operator, most notably many of the distribution functions in random matrix theory. Though their numerical values are of interest, there is no systematic numerical treatment of Fredholm determinants to be found in the literature. Instead, the few numerical evaluations that are available rely on eigenfunction expansions of the operator, if expressible in terms of special functions, or on alternative, numerically more straightforwardly accessible analytic expressions, e.g., in terms of Painlevé transcendents, that have masterfully been derived in some cases. In this paper we close the gap in the literature by studying projection methods and, above all, a simple, easily implementable, general method for the numerical evaluation of Fredholm determinants that is derived from the classical Nyström method for the solution of Fredholm equations of the second kind. Using Gauss–Legendre or Clenshaw–Curtis as the underlying quadrature rule, we prove that the approximation error essentially behaves like the quadrature error for the sections of the kernel. In particular, we get exponential convergence for analytic kernels, which are typical in random matrix theory. The application of the method to the distribution functions of the Gaussian unitary ensemble (GUE), in the bulk and the edge scaling limit, is discussed in detail. After extending the method to systems of integral operators, we evaluate the two-point correlation functions of the more recently studied Airy and Airy 1 processes. Key words: Fredholm determinant, Nyström's method, projection method, trace class operators, random ...
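The quadrature-based method summarized above reduces det(I − K) to an m × m determinant built from a Nyström discretization of the operator. A minimal sketch along those lines (an illustration under stated assumptions, not the paper's reference implementation; the function name is ours):

```python
import numpy as np

def fredholm_det(kernel, a, b, m):
    # Approximate det(I - K) for an integral operator with kernel K(x, y) on [a, b],
    # using m-point Gauss-Legendre quadrature and the symmetrized Nystrom matrix
    # with entries delta_ij - sqrt(w_i) K(x_i, x_j) sqrt(w_j).
    x, w = np.polynomial.legendre.leggauss(m)   # nodes and weights on [-1, 1]
    x = 0.5 * (b - a) * x + 0.5 * (b + a)       # map nodes to [a, b]
    w = 0.5 * (b - a) * w
    sw = np.sqrt(w)
    A = np.eye(m) - sw[:, None] * kernel(x[:, None], x[None, :]) * sw[None, :]
    return np.linalg.det(A)

# Rank-one check: K(x, y) = x * y on [0, 1] gives det(I - K) = 1 - 1/3 = 2/3.
print(fredholm_det(lambda x, y: x * y, 0.0, 1.0, 20))
```

For analytic kernels the error inherits the exponential convergence of the quadrature rule, which is why modest values of m already give many correct digits.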
The law of the supremum of a stable Lévy process with no negative jumps, 2006
Abstract

Cited by 33 (3 self)
Let X = (Xt)t≥0 be a stable Lévy process of index α ∈ (1, 2) with no negative jumps, and let St = sup 0≤s≤t Xs denote its running supremum for t > 0. We show that the density function ft of St can be characterized as the unique solution to ...
On integral equations arising in the first-passage problem for Brownian motion
Research Report No. 421, Dept. Theoret. Statist. Aarhus, 2001
Abstract

Cited by 26 (5 self)
Let (Bt)t≥0 be a standard Brownian motion started at zero, let g: (0, ∞) → ℝ be a continuous function satisfying g(0+) ≥ 0, let τ = inf{t > 0 : Bt ≥ g(t)} be the first-passage time of B over g, and let F denote the distribution function of τ. Then the following system of integral equations is satisfied: t^(n/2) H_n(g(t)/√t) = ∫_0^t ...
A Theory of Robust Long-Run Variance Estimation, 2004
Abstract

Cited by 21 (6 self)
The paper studies the robustness of long-run variance estimators employed for conducting Wald-type tests in standard time series models. It is shown that all long-run variance estimators that are consistent for the variance of Gaussian White Noise lack robustness in the sense that they yield arbitrary results for some underlying process that satisfies a Functional Central Limit Theorem. An analytical measure of robustness of long-run variance estimators is suggested that captures the degree of this fragility. A family of inconsistent long-run variance estimators is derived that optimally trades off this measure of robustness against efficiency. A minor modification of these optimal estimators leads to asymptotically F-distributed test statistics under the null hypothesis, so that robust large sample inference can be conducted very similarly to well-understood small sample Gaussian inference.
Predicting the Ultimate Supremum of a Stable Lévy Process with No Negative Jumps
Abstract

Cited by 15 (7 self)
Given a stable Lévy process X = (Xt)0≤t≤T of index α ∈ (1, 2) with no negative jumps, and letting St = sup 0≤s≤t Xs denote its running supremum for t ∈ [0, T], we consider the optimal prediction problem V = inf 0≤τ≤T E(ST − Xτ)^p where the infimum is taken over all stopping times τ of X, and the error parameter p ∈ (1, α) is given and fixed. Reducing the optimal prediction problem to a fractional free-boundary problem of Riemann-Liouville type, and finding an explicit solution to the latter, we show that there exists α∗ ∈ (1, 2) (equal to 1.57 approximately) and a strictly increasing function p∗: (α∗, 2) → (1, 2) satisfying p∗(α∗+) = 1, p∗(2−) = 2 and p∗(α) < α for α ∈ (α∗, 2) such that for every α ∈ (α∗, 2) and p ∈ (1, p∗(α)) the following stopping time is optimal: τ∗ = inf{t ∈ [0, T] : St − Xt ≥ z∗(T − t)^(1/α)} where z∗ ∈ (0, ∞) is the unique root to a transcendental equation (with parameters α and p). Moreover, if either α ∈ (1, α∗) or p ∈ (p∗(α), α) then it is not optimal to stop at t ∈ [0, T) when St − Xt is sufficiently large. The existence of the breakdown points α∗ and p∗(α) stands in sharp contrast with the Brownian motion case (formally corresponding to α = 2), and the phenomenon itself may be attributed to the interplay between the jump structure (admitting a transition from lighter to heavier tails) and the individual preferences (represented by the error parameter p).
An Efficient Numerical Algorithm For Cracks Partly In Frictionless Contact, 2000
Abstract

Cited by 7 (6 self)
An algorithm for a loaded crack partly in frictionless contact is presented. The problem is nonlinear in the sense that the equations of linear elasticity are supplemented by certain contact inequalities. The location of a priori unknown contact zones and the solutions to the field equations must be determined simultaneously. The algorithm is based on a rapidly converging sequence of relaxed Fredholm integral equations of the second kind in which the contact problem is viewed as a perturbation of a non-contacting crack problem. The algorithm exhibits great stability and speed. The numerical results are orders of magnitude more accurate than those of previous investigators.