Risk bounds for Statistical Learning
Cited by 25 (1 self)
We propose a general theorem providing upper bounds for the risk of an empirical risk minimizer (ERM). We essentially focus on the binary classification framework. We extend Tsybakov's analysis of the risk of an ERM under margin-type conditions by using concentration inequalities for conveniently weighted empirical processes. This allows us to deal with other ways of measuring the "size" of a class of classifiers than entropy with bracketing as in Tsybakov's work. In particular we derive new risk bounds for the ERM when the classification rules belong to some VC-class under margin conditions and discuss the optimality of those bounds in a minimax sense.
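As a purely illustrative aside (not taken from the paper), empirical risk minimization over a simple VC class can be sketched as follows. The class used here, one-dimensional threshold rules x -> 1 if x >= t else 0, and the function name `erm_threshold` are assumptions chosen for the example; the sketch says nothing about margin conditions or the risk bounds themselves.

```python
def erm_threshold(xs, ys):
    """Empirical risk minimizer over threshold rules x -> 1 if x >= t else 0.

    Hypothetical helper for illustration: xs are real-valued inputs,
    ys are labels in {0, 1}. Returns (threshold, empirical 0-1 risk).
    """
    n = len(xs)
    # It suffices to try each distinct sample point, plus +infinity
    # (the rule that predicts 0 everywhere).
    candidates = sorted(set(xs)) + [float("inf")]
    best_t, best_risk = None, float("inf")
    for t in candidates:
        errors = sum((1 if x >= t else 0) != y for x, y in zip(xs, ys))
        risk = errors / n
        if risk < best_risk:  # strict: keep the smallest minimizing threshold
            best_t, best_risk = t, risk
    return best_t, best_risk


if __name__ == "__main__":
    print(erm_threshold([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1]))
```

On separable data the minimizer reaches zero empirical risk; on noisy labels it returns the smallest threshold among those with the fewest training errors.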
Generalized Quantile Processes Based on Multivariate Depth Functions, with Applications in Nonparametric Multivariate Analysis
2001
Cited by 8 (5 self)
Statistical depth functions are being used increasingly in nonparametric multivariate data analysis. In a broad treatment of depth-based methods, Liu, Parelius and Singh (1999) include several devices for visualizing selected multivariate distributional characteristics by one-dimensional curves constructed in terms of given depth functions. Here we show how these tools may be represented as special depth-based cases of generalized quantile functions introduced by Einmahl and Mason (1992). By specializing results of the latter authors to the depth-based case, we develop an easily applied general result on convergence of sample depth-based generalized quantile processes to a Brownian bridge. As applications, we obtain the asymptotic behavior of sample versions of depth-based curves for "scale" and "kurtosis" introduced by Liu, Parelius and Singh. The kurtosis curve is actually a Lorenz curve designed to measure heaviness of tails of a multivariate distribution. We also obtain the asymptotic distribution of the quantile process of the sample depth values.
Kullback-Leibler Constrained Estimation of Probability Measures
1988
Cited by 2 (0 self)
We consider two estimation problems in this paper. In the first, we observe $X_1, \ldots, X_n$ i.i.d. $P_0$, where it is assumed known that $E_{P_0} T = a$. In the second, we observe $X_1, \ldots, X_n \in \mathbb{R}^a$ and $\frac{1}{m} \sum_{i=1}^{m} T(Y_i) \in \mathbb{R}^b$, where $X_1, \ldots, X_n, Y_1, \ldots, Y_m$ are i.i.d. $P_0$, $T$ is some Borel measurable function and $n/m \to \lambda$ as $n \wedge m \to \infty$. In both situations we consider the problem of estimating the probability measure $P_0$, uniformly over a class of sets $\mathcal{C}$. The estimators considered here are based on the minimization of the Kullback-Leibler divergence from certain collections of probability measures to the empirical measure of the $X_i$'s. We show that these estimators are consistent and asymptotically efficient.
A Local Maximal Inequality under Uniform Entropy
Cited by 1 (0 self)
Abstract: We derive an upper bound for the mean of the supremum of the empirical process indexed by a class of functions that are known to have variance bounded by a small constant δ. The bound is expressed in the uniform entropy integral of the class at δ. The bound yields a rate of convergence of minimum contrast estimators when applied to the modulus of continuity of the contrast functions.
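For orientation, the uniform entropy integral mentioned in this abstract is usually defined as below. This is the standard textbook form, stated here as an assumption since the abstract does not spell it out: $F$ denotes an envelope function of the class $\mathcal{F}$, $N(\cdot)$ a covering number, and the supremum runs over finitely discrete probability measures $Q$.

```latex
J(\delta, \mathcal{F}, L_2)
  = \sup_{Q} \int_{0}^{\delta}
    \sqrt{1 + \log N\bigl(\varepsilon \|F\|_{Q,2},\, \mathcal{F},\, L_2(Q)\bigr)}
    \; d\varepsilon
```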
A Central Limit Theorem for Linfinity-Valued Martingale Difference Arrays and Its Application
1996
A central limit theorem for $\ell^\infty(T)$-valued martingale difference arrays is given, where $T$ is a non-empty set. As its application, the asymptotic behavior of log-likelihood ratio random fields in general statistical models is derived. 1 Introduction. Let $T$ be a non-empty set. We denote by $\ell^\infty(T)$ the space of bounded, $\mathbb{R}$-valued functions defined on $T$, and equip it with the sup-norm $\|\cdot\|_T$. For every $n \in \mathbb{N}$, let $\mathbf{B}^n = (\Omega^n, \mathcal{F}^n, \mathbf{F}^n, P^n)$ be a stochastic base, where $(\Omega^n, \mathcal{F}^n, P^n)$ is a probability space and $\mathbf{F}^n = \{\mathcal{F}^n_i\}_{i \in \mathbb{N}_0}$ is a non-decreasing sequence of sub-$\sigma$-fields of $\mathcal{F}^n$ indexed by $\mathbb{N}_0 = \{0\} \cup \mathbb{N}$. Here we make a definition. Definition 1. $\{M^n_i\}_{i \in \mathbb{N}} = \{(M^n_i(t) \mid t \in T)\}_{i \in \mathbb{N}}$ is an $\ell^\infty(T)$-valued martingale difference array on $\mathbf{B}^n$ if (i) $M^n_i$ is a mapping from $\Omega^n$ to $\ell^\infty(T)$ for every $i \in \mathbb{N}$; (ii) $\{M^n_i(t)\}_{i \in \mathbb{N}}$ is an $\mathbb{R}$-valued martingale difference array on $\mathbf{B}^n$ for every $t \in T$. It is required in (ii) above...
Convergence rate of deformable models and image recalage FLEMMARD
2001
We study deformable models (e.g., snakes) and image registration ("recalage", e.g., for radiotherapy) in the framework of learning theory. First, let us recall the problem of image registration. Let $I$ be an image and $J$ an image to register onto $I$, i.e., one to which we apply $g \in G$ such that $g(J) \simeq I$, the dissimilarity being encoded by $L(g(J), I)$. For physical reasons, we assume that some $g \in G$ are less likely than others; this is encoded by a regularization term, so we minimize $L(g(J), I) + R(g)$. This restricts $g$ to a subfamily $G' \subset G$. Usually, $g$ is reached by gradient descent. In the sequel, in order to have homogeneous notation, we write $F$ and $L$ (again) such that $L(f, I) = L(g(J), I)$, without loss of generality ($F = \{g(J) : g \in G\}$). Let us now consider deformable models. Consider a set $F$ of possible models (say, possible positions of a body, e.g. a tumor, in a picture) and an image $I$, and look for $f \in F$ such that $f$ is probably close to the position of the body encoded in $I$. Deformable models are typically (in 2D) encoded by lists of points. Usually, an initial point is chosen (automatically, as in [14], or manually by the user). Then a gradient descent is used, based upon the empirical similarity between the image $I$ and the model $f$, plus a regularization term. This principle rests on two hypotheses: 1. The gradient descent nearly finds an empirically optimal point. 2. The empirical values are related to the "real" ones, the real ones being the values that would result from grids of infinite precision without noise. The first point depends upon convexity results (see FLEMMARD for a study in the case of deformable models). The second point is very similar to problems handled in [4, 27, 30]. Unfortunately, results based upon VC-theory do not apply here, because VC-dimension is usually infin...
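The scheme described in this abstract, gradient descent on a data term plus a regularization term, can be illustrated with a minimal toy sketch. Everything below is an assumption made for the example and not the paper's method: the model f is a list of numbers, the data term is the squared distance to a 1-D "image", the regularizer penalizes squared differences between neighbouring points, and the names `objective`, `gradient`, `gradient_descent`, `lam` and `step` are hypothetical.

```python
def objective(f, image, lam):
    """Toy version of L(f, I) + R(f): squared error plus a smoothness penalty."""
    data = sum((fi - yi) ** 2 for fi, yi in zip(f, image))
    smooth = sum((f[i + 1] - f[i]) ** 2 for i in range(len(f) - 1))
    return data + lam * smooth


def gradient(f, image, lam):
    """Exact gradient of `objective` with respect to each coordinate of f."""
    n = len(f)
    grad = [2.0 * (f[i] - image[i]) for i in range(n)]
    for i in range(n - 1):
        d = 2.0 * lam * (f[i + 1] - f[i])
        grad[i] -= d       # derivative of (f[i+1] - f[i])^2 w.r.t. f[i]
        grad[i + 1] += d   # derivative of (f[i+1] - f[i])^2 w.r.t. f[i+1]
    return grad


def gradient_descent(image, lam=0.5, step=0.1, iters=200):
    """Plain fixed-step gradient descent, started from the all-zero model."""
    f = [0.0] * len(image)
    for _ in range(iters):
        g = gradient(f, image, lam)
        f = [fi - step * gi for fi, gi in zip(f, g)]
    return f


if __name__ == "__main__":
    image = [0.0, 1.0, 0.0, 1.0, 0.0]
    print(gradient_descent(image))
```

Because this toy objective is a strictly convex quadratic, a small fixed step converges to the unique minimizer; the abstract's first hypothesis concerns precisely whether such convergence holds in the non-toy setting.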
Learning non-independent sequences of examples. System identification, control and stabilization
Very extended version of papers published in the proceedings of EFTF 2001 and ICNF 2001 (J.M. Friedt/O. Teytaud/M. Planat/D. Gillet). Many recent works consider practical applications of neural networks for control. Only a few papers have been devoted to applying the theoretical part of learning to control (their main results are recalled here). This paper provides: notations and definitions for system identification, stabilization and control; a survey, as extensive as possible, of results about ergodic, stationary or chaotic time series (learning theory with temporal dependencies); historical introductions to areas of science which intersect system identification, stabilization or control (statistical physics, stochastic dynamics of deterministic systems, empirical processes and VC-theory, fuzzy logic, neural networks and related learning tools, Markov models); practical illustrations and classical benchmarks, with references for practical algorithms; and theoretical open problems in system identification, stabilization and control.
unknown title
The uniform empirical process central limit theorem (CLT) and law of the iterated logarithm (LIL) for i.i.d. observations have been the subject of extensive study, but much less is known for dependent observations. Recent work which improves on this situation includes [2], [3], [4], [7] and [8], where

