Risk bounds for Statistical Learning
Cited by 40 (1 self)
We propose a general theorem providing upper bounds for the risk of an empirical risk minimizer (ERM). We essentially focus on the binary classification framework. We extend Tsybakov’s analysis of the risk of an ERM under margin-type conditions by using concentration inequalities for conveniently weighted empirical processes. This allows us to deal with other ways of measuring the "size" of a class of classifiers than entropy with bracketing as in Tsybakov’s work. In particular, we derive new risk bounds for the ERM when the classification rules belong to some VC-class under margin conditions and discuss the optimality of those bounds in a minimax sense.
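As a concrete illustration of empirical risk minimization in binary classification, the following minimal Python sketch picks, from the class of one-dimensional threshold classifiers (a simple VC class), the rule with the smallest empirical 0-1 risk. The function names `empirical_risk` and `erm_threshold` and the choice of classifier class are assumptions made for this example; they do not come from the paper, and the sketch does not implement its margin-based risk bounds.

```python
def empirical_risk(threshold, xs, ys):
    """Fraction of sample points misclassified by the rule x -> 1 if x >= threshold else 0."""
    errors = sum(1 for x, y in zip(xs, ys) if (1 if x >= threshold else 0) != y)
    return errors / len(xs)


def erm_threshold(xs, ys):
    """Return a threshold minimizing the empirical 0-1 risk (hypothetical helper).

    Only the observed points and +infinity need to be tried, because the
    empirical risk is piecewise constant between consecutive sample points.
    """
    candidates = sorted(set(xs)) + [float("inf")]
    return min(candidates, key=lambda t: empirical_risk(t, xs, ys))
```

For instance, calling `erm_threshold([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1])` searches five candidate thresholds and returns one that separates the two labels on this toy sample.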
Generalized Quantile Processes Based on Multivariate Depth Functions, with Applications in Nonparametric Multivariate Analysis
, 2001
Cited by 8 (5 self)
Statistical depth functions are being used increasingly in nonparametric multivariate data analysis. In a broad treatment of depth-based methods, Liu, Parelius and Singh (1999) include several devices for visualizing selected multivariate distributional characteristics by one-dimensional curves constructed in terms of given depth functions. Here we show how these tools may be represented as special depth-based cases of generalized quantile functions introduced by Einmahl and Mason (1992). By specializing results of the latter authors to the depth-based case, we develop an easily applied general result on convergence of sample depth-based generalized quantile processes to a Brownian bridge. As applications, we obtain the asymptotic behavior of sample versions of depth-based curves for "scale" and "kurtosis" introduced by Liu, Parelius and Singh. The kurtosis curve is actually a Lorenz curve designed to measure heaviness of tails of a multivariate distribution. We also obtain the asymptotic distribution of the quantile process of the sample depth values.
Uniform ergodic theorems for dynamical systems under VC entropy conditions
 Proc. Probab. Banach Spaces IX (Sandbjerg
, 1993
Cited by 3 (2 self)
Necessary and sufficient conditions are given for the uniform convergence over an arbitrary index set in von Neumann’s mean and Birkhoff’s pointwise ergodic theorem. Three different types of conditions already known from probability theory are investigated. Firstly it is shown that the property of being eventually totally
Kullback-Leibler Constrained Estimation of Probability Measures
, 1988
Cited by 3 (0 self)
We consider two estimation problems in this paper. In the first, we observe $X_1, \ldots, X_n$ i.i.d. $P_0$, where it is assumed known that $E_{P_0} T = a$. In the second, we observe $X_1, \ldots, X_n \in \mathbb{R}^a$ and $\frac{1}{m}\sum_{i=1}^{m} T(Y_i) \in \mathbb{R}^b$, where $X_1, \ldots, X_n, Y_1, \ldots, Y_m$ are i.i.d. $P_0$, $T$ is some Borel measurable function and $n/m \to \lambda$ as $n \wedge m \to \infty$. In both situations we consider the problem of estimating the probability measure $P_0$, uniformly over a class of sets $C$. The estimators considered here are based on the minimization of the Kullback-Leibler divergence from certain collections of probability measures to the empirical measure of the $X_i$'s. We show that these estimators are consistent and asymptotically efficient.
A Local Maximal Inequality under Uniform Entropy
Cited by 3 (1 self)
Abstract: We derive an upper bound for the mean of the supremum of the empirical process indexed by a class of functions that are known to have variance bounded by a small constant δ. The bound is expressed in the uniform entropy integral of the class at δ. The bound yields a rate of convergence of minimum contrast estimators when applied to the modulus of continuity of the contrast functions.
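For reference, the uniform entropy integral is commonly defined as below. This is the standard textbook form, given here as an assumption about the notation since the abstract does not spell it out: $F$ is an envelope function for the class $\mathcal{F}$, $N(\cdot)$ is a covering number, and the supremum runs over finitely discrete probability measures $Q$ with $\|F\|_{Q,2} > 0$.

```latex
J(\delta, \mathcal{F}, L_2)
  = \sup_{Q} \int_0^{\delta}
    \sqrt{1 + \log N\bigl(\varepsilon \|F\|_{Q,2},\, \mathcal{F},\, L_2(Q)\bigr)}
    \, d\varepsilon
```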
A Central Limit Theorem for $\ell^\infty$-Valued Martingale Difference Arrays and Its Application
, 1996
A central limit theorem for $\ell^\infty(T)$-valued martingale difference arrays is given, where $T$ is a non-empty set. As its application, the asymptotic behavior of log-likelihood ratio random fields in general statistical models is derived. 1 Introduction. Let $T$ be a non-empty set. We denote by $\ell^\infty(T)$ the space of bounded, $\mathbb{R}$-valued functions defined on $T$, and equip it with the sup-norm $\|\cdot\|_T$. For every $n \in \mathbb{N}$, let $B^n = (\Omega^n, \mathcal{F}^n, \mathbf{F}^n, P^n)$ be a stochastic base, where $(\Omega^n, \mathcal{F}^n, P^n)$ is a probability space and $\mathbf{F}^n = \{\mathcal{F}^n_i\}_{i \in \mathbb{N}_0}$ is a non-decreasing sequence of sub-$\sigma$-fields of $\mathcal{F}^n$ indexed by $\mathbb{N}_0 = \{0\} \cup \mathbb{N}$. Here we make a definition. Definition 1. $\{M^n_i\}_{i \in \mathbb{N}} = \{(M^n_i(t) \mid t \in T)\}_{i \in \mathbb{N}}$ is an $\ell^\infty(T)$-valued martingale difference array on $B^n$ if (i) $M^n_i$ is a mapping from $\Omega^n$ to $\ell^\infty(T)$ for every $i \in \mathbb{N}$; (ii) $\{M^n_i(t)\}_{i \in \mathbb{N}}$ is an $\mathbb{R}$-valued martingale difference array on $B^n$ for every $t \in T$. It is required in (ii) above...
Convergence rate of deformable models and image recalage FLEMMARD
, 2001
We study deformable models (e.g., snakes) and image registration (recalage), e.g., for radiotherapy, in the framework of learning theory. First, let us recall the problem of image registration. Let $I$ be an image and $J$ an image to register onto $I$, i.e., one to which we apply $g \in G$ such that $g(J) \simeq I$, the dissimilarity being encoded by $L(g(J), I)$. For physical reasons, we assume that some $g \in G$ are less likely than others; this is encoded by a regularization term, so we minimize $L(g(J), I) + R(g)$. This restricts $g$ to a subfamily $G' \subset G$. Usually, $g$ is reached by gradient descent. In the sequel, in order to have homogeneous notation, we write $F$ and $L$ (again) such that $L(f, I) = L(g(J), I)$, without loss of generality ($F = \{g(J) \mid g \in G\}$). Let us now consider deformable models. Consider a set $F$ of possible models (say, possible positions of a body, e.g., a tumor, in a picture) and an image $I$, and look for $f \in F$ such that $f$ is probably close to the position of the body encoded in $I$. Deformable models are typically (in 2D) encoded by lists of points. Usually, an initial point is chosen (automatically, as in [14], or manually by the user). Then a gradient descent is used, based upon the empirical similarity between the image $I$ and the model $f$, plus a regularization term. This principle rests on two hypotheses: 1. The gradient descent nearly finds an empirically optimal point. 2. The empirical values are related to the "real" ones, the real ones being the values resulting from grids of infinite precision without noise. The first point depends upon convexity results (see FLEMMARD for a study in the case of deformable models). The second point is very similar to problems handled in [4, 27, 30]. Unfortunately, results based upon VC-theory do not apply here, because VC-dimension is usually infin...
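The minimization of a dissimilarity term plus a regularization term by gradient descent can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' method: the transformation is reduced to a plain parameter vector, the quadratic loss and penalty are stand-ins for the image dissimilarity and the regularizer, and the names `regularized_gradient_descent`, `grad_loss`, `grad_reg`, `lam`, `step` and `iters` are hypothetical.

```python
def regularized_gradient_descent(grad_loss, grad_reg, g0, lam=1.0, step=0.1, iters=200):
    """Minimize loss(g) + lam * reg(g) by fixed-step gradient descent.

    grad_loss and grad_reg take a list of floats and return the gradient
    of the loss and of the regularizer as lists of the same length.
    """
    g = list(g0)
    for _ in range(iters):
        gl = grad_loss(g)
        gr = grad_reg(g)
        # One descent step on the penalized objective.
        g = [gi - step * (a + lam * b) for gi, a, b in zip(g, gl, gr)]
    return g


# Toy stand-ins (assumptions): loss(g) = ||g - target||^2, reg(g) = ||g||^2.
target = [2.0, -4.0]


def grad_quadratic_loss(g):
    return [2.0 * (gi - ti) for gi, ti in zip(g, target)]


def grad_quadratic_reg(g):
    return [2.0 * gi for gi in g]


solution = regularized_gradient_descent(grad_quadratic_loss, grad_quadratic_reg, [0.0, 0.0])
```

With `lam = 1.0`, the penalized objective in this toy case has the closed-form minimizer `target / 2`, which gives a simple check that the descent converged. In this example the regularizer pulls the solution toward zero, which loosely mirrors how the penalty favors the more plausible transformations.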
Learning nonindependent sequences of examples. System identification, control and stabilization
Very extended version of papers published in the proceedings of EFTF 2001 and ICNF 2001 (J.M. Friedt/O. Teytaud/M. Planat/D. Gillet). Many recent works consider practical applications of neural networks for control. Only a few papers have been devoted to the application of the theoretical part of learning to control (their main results are recalled here). This paper provides: notations and definitions for system identification, stabilization and control; an as-extensive-as-possible survey of results about ergodic, stationary or chaotic time series (learning theory with temporal dependencies); historical introductions to areas of science which intersect system identification, stabilization or control: statistical physics, stochastic dynamics of deterministic systems, empirical processes and VC-theory, fuzzy logic, neural networks and related learning tools, Markov models; practical illustrations and classical benchmarks, with references for practical algorithms; theoretical open problems in system identification, stabilization and control.