Results 1-10 of 51
On the mathematical foundations of learning
Bulletin of the American Mathematical Society, 2002
Cited by 223 (12 self)
The problem of learning is arguably at the very core of the problem of intelligence, both biological and artificial. T. Poggio and C.R. Shelton
Data compression and harmonic analysis
IEEE Trans. Inform. Theory, 1998
Cited by 140 (24 self)
In this paper we review some recent interactions between harmonic analysis and data compression. The story goes back of course to Shannon’s R(D) theory...
Statistical performance of support vector machines
Ann. Statist., 2008
Cited by 42 (8 self)
The support vector machine (SVM) algorithm is well known to the computer learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of concentration theory and empirical processes. Our main result builds on the observation made by other authors that the SVM can be viewed as a statistical regularization procedure. From this point of view, it can also be interpreted as a model selection principle using a penalized criterion. It is then possible to adapt general methods related to model selection in this framework to study two important points: (1) what is the minimum penalty and how does it compare to the penalty actually used in the SVM algorithm; (2) is it possible to obtain “oracle inequalities” in that setting, for the specific loss function used in the SVM algorithm? We show that the answer to the latter question is positive and provides relevant insight to the former. Our result shows that it is possible to obtain fast rates of convergence for SVMs.
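As an illustration of the "penalized criterion" reading of the SVM mentioned in this abstract, the following minimal sketch evaluates an empirical hinge risk plus a squared-norm penalty. It assumes a linear decision function and a penalty of the form lam * ||w||^2; the function names and the parameter `lam` are hypothetical choices for this example, and the kernel-based setting studied in the paper is not reproduced here.

```python
from typing import Sequence, Tuple

# A sample is (feature vector, label), with label in {-1, +1}.
Sample = Tuple[Sequence[float], int]


def hinge_loss(margin: float) -> float:
    """Hinge loss max(0, 1 - margin), where margin = y * f(x)."""
    return max(0.0, 1.0 - margin)


def penalized_hinge_risk(
    w: Sequence[float], b: float, data: Sequence[Sample], lam: float
) -> float:
    """Empirical hinge risk of f(x) = <w, x> + b plus the penalty lam * ||w||^2.

    Assumption for this sketch: a linear classifier and a squared
    Euclidean-norm penalty stand in for the kernel setting.
    """
    empirical = 0.0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        empirical += hinge_loss(y * score)
    empirical /= len(data)
    penalty = lam * sum(wi * wi for wi in w)
    return empirical + penalty


if __name__ == "__main__":
    toy = [([2.0], 1), ([-2.0], -1)]
    # A larger weight separates the points with margin but pays more penalty.
    print(penalized_hinge_risk([1.0], 0.0, toy, lam=0.1))
    print(penalized_hinge_risk([0.25], 0.0, toy, lam=0.1))
```

The two calls show the trade-off the criterion encodes: shrinking the weight lowers the penalty term but raises the hinge term once the margins fall below 1.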
Approximation, Metric Entropy and Small Ball Estimates for Gaussian Measures
Ann. Probab., 1999
Cited by 42 (19 self)
A precise link proved by J. Kuelbs and W. V. Li relates the small ball behavior of a Gaussian measure μ on a Banach space E with the metric entropy behavior of K_μ, the unit ball of the RKHS of μ in E. We remove the main regularity assumption imposed on the unknown function in the link. This enables the application of tools and results from functional analysis to small ball problems and leads to small ball estimates of general algebraic type as well as to new estimates for concrete Gaussian processes. Moreover, we show that the small ball behavior of a Gaussian process is also tightly connected with the speed of approximation by "finite rank" processes.
Abbreviated title: Metric Entropy and Small Ball Estimates
Keywords: Gaussian process, small deviation, metric entropy, approximation number.
AMS 1991 Subject Classifications: Primary: 60G15; Secondary: 60F99, 47D50, 47G10.
Supported in part by NSF
1 Introduction
Let μ denote a centered Gaussian measure on a real separable B...
Sickel: Optimal approximation of elliptic problems by linear and nonlinear mappings III
Triebel, Function Spaces, Entropy Numbers, Differential Operators, 1996
Cited by 17 (5 self)
We study the optimal approximation of the solution of an operator equation A(u) = f by four types of mappings: a) linear mappings of rank n; b) n-term approximation with respect to a Riesz basis; c) approximation based on linear information about the right-hand side f; d) continuous mappings. We consider worst-case errors, where f is an element of the unit ball of a Sobolev or Besov space B^r_q(L_p(Ω)) and Ω ⊂ R^d is a bounded Lipschitz domain; the error is always measured in the H^s-norm. The respective widths are the linear widths (or approximation numbers), the nonlinear widths, the Gelfand widths, and the manifold widths. As a technical tool, we also study the Bernstein numbers. Our main results are the following. If p ≥ 2 then the order of convergence is the same for all four classes of approximations. In particular, the best linear approximations are of the same order as the best nonlinear ones. The best linear approximation can be quite difficult to realize as a numerical algorithm since the optimal Galerkin space usually depends on the operator and on the shape of the domain Ω. For p < 2 there is a difference: nonlinear approximations are better than linear ones. However, in this case, it turns out that linear information about the right-hand side f is again optimal. Our main theoretical tool is the best n-term approximation with respect to an optimal Riesz basis and related nonlinear widths. These general results are used to study the Poisson equation in a polygonal domain. It turns out that best n-term wavelet approximation is (almost) optimal. The main results of
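To make the contrast between linear and nonlinear (n-term) approximation in this abstract concrete, here is a small sketch under a simplifying assumption: the basis is orthonormal rather than a general Riesz basis, so the approximation error is the Euclidean norm of the discarded coefficients. The function names are hypothetical and the example does not reproduce the paper's operator-equation setting.

```python
import math
from typing import List, Sequence, Tuple


def best_n_term(coeffs: Sequence[float], n: int) -> Tuple[List[float], float]:
    """Nonlinear approximation: keep the n largest-magnitude coefficients.

    Assumes an orthonormal basis, so the error is the l2 norm of
    the coefficients that were set to zero.
    """
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:n])
    approx = [c if i in keep else 0.0 for i, c in enumerate(coeffs)]
    error = math.sqrt(sum(c * c for i, c in enumerate(coeffs) if i not in keep))
    return approx, error


def first_n_term(coeffs: Sequence[float], n: int) -> Tuple[List[float], float]:
    """Linear approximation: always keep the first n coefficients."""
    approx = [c if i < n else 0.0 for i, c in enumerate(coeffs)]
    error = math.sqrt(sum(c * c for c in coeffs[n:]))
    return approx, error


if __name__ == "__main__":
    coeffs = [0.1, 3.0, 0.2, 4.0]
    print(best_n_term(coeffs, 2))
    print(first_n_term(coeffs, 2))
```

Because the kept index set adapts to the input, the n-term error is never larger than the error of the fixed first-n projection in this orthonormal setting.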
Some Limiting Embeddings in Weighted Function Spaces and Related Entropy Numbers
1997
Cited by 6 (3 self)
The paper deals with weighted function spaces of type B^s_{p,q}(R^n, w(x)) and F^s_{p,q}(R^n, w(x)), where w(x) is a weight function of at most polynomial growth. Of special interest are weight functions of type w(x) = (1 + |x|^2)^{α/2} (log(2 + |x|))^μ with α ≥ 0 and μ ∈ R. Our main result deals with estimates for the entropy numbers of compact embeddings between spaces of this type; more precisely, we may extend and tighten some of our previous results in [12].
AMS Subject Classification: 46E35
Key Words: weighted function spaces, compact embeddings, entropy numbers
Contents
Introduction
1 Weighted embeddings - the non-limiting case
2 Limiting embeddings, entropy numbers
2.1 Estimates from above, an approach via duality arguments
2.2 Estimates from above, an approach via approximation numbers
2.3 Estimates...
Metric Entropy of High Dimensional Distributions
2007
Cited by 6 (4 self)
Let F_d be the collection of all d-dimensional probability distribution functions on [0, 1]^d, d ≥ 2. The metric entropy of F_d under the L_2([0, 1]^d) norm is studied. The exact rate is obtained for d = 1, 2 and bounds are given for d > 3. Connections with small deviation probability for Brownian sheets under the sup-norm are established.
Optimal Rates for Regularized Least Squares Regression
2009
Cited by 6 (2 self)
We establish a new oracle inequality for kernel-based, regularized least squares regression methods, which uses the eigenvalues of the associated integral operator as a complexity measure. We then use this oracle inequality to derive learning rates for these methods. Here, it turns out that these rates are independent of the exponent of the regularization term. Finally, we show that our learning rates are asymptotically optimal whenever, e.g., the kernel is continuous and the input space is a compact metric space.
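For readers unfamiliar with the estimator this abstract refers to, the sketch below fits a kernel-based regularized least squares regressor in one dimension. It is only an illustration under stated assumptions: a Gaussian kernel, a squared RKHS-norm penalty (exponent 2), and the standard closed form alpha = (K + n*lam*I)^{-1} y. The names `gaussian_kernel`, `solve_linear_system`, and `fit_regularized_least_squares` are hypothetical, and the oracle inequality and learning rates from the paper are not implemented.

```python
import math
from typing import Callable, List, Sequence


def gaussian_kernel(x: float, z: float, width: float = 1.0) -> float:
    """Gaussian kernel on the real line (an assumed choice of kernel)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_regularized_least_squares(
    xs: Sequence[float],
    ys: Sequence[float],
    lam: float,
    kernel: Callable[[float, float], float] = gaussian_kernel,
) -> Callable[[float], float]:
    """Minimize (1/n) sum (y_i - f(x_i))^2 + lam * ||f||_H^2 over the RKHS.

    The minimizer is f(x) = sum_i alpha_i k(x_i, x) with
    alpha = (K + n * lam * I)^{-1} y.
    """
    n = len(xs)
    system = [
        [kernel(xs[i], xs[j]) + (n * lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    alpha = solve_linear_system(system, list(ys))

    def predict(x: float) -> float:
        return sum(a * kernel(xi, x) for a, xi in zip(alpha, xs))

    return predict


if __name__ == "__main__":
    f = fit_regularized_least_squares([0.0, 1.0, 2.0], [0.0, 1.0, 0.0], lam=0.01)
    print(f(1.0))
```

A very small `lam` makes the fit nearly interpolate the training data, while a large `lam` shrinks the predictor toward zero.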
An Exotic Minimal Banach Space of Functions
2002
Cited by 4 (1 self)
This note describes a new Banach space B_0 of square integrable functions on R having many interesting invariance properties. In fact, the Fourier transform, time-frequency shifts, and L²-normalized dilations act isometrically on it. For its definition, we make use of a general construction principle for minimal invariant spaces. We demonstrate a variety of properties following immediately from this principle. Furthermore, we give a number of different characterizations, including various atomic decompositions, as well as natural necessary and sufficient conditions for an L²-function to belong to this new space. It turns out that this new space is somewhat exotic, since it is neither rearrangement invariant nor solid.