Results 1–10 of 13
High-dimensional data analysis: The curses and blessings of dimensionality
 AMS CONFERENCE ON MATH CHALLENGES OF THE 21ST CENTURY
, 2000
Abstract

Cited by 168 (0 self)
The coming century is surely the century of data. A combination of blind faith and serious purpose makes our society invest massively in the collection and processing of data of all kinds, on scales unimaginable until recently. Hyperspectral Imagery, Internet Portals, Financial tick-by-tick data, and DNA Microarrays are just a few of the better-known sources, feeding data in torrential streams into scientific and business databases worldwide. In traditional statistical data analysis, we think of observations of instances of particular phenomena (e.g. instance ↔ human being), these observations being a vector of values we measured on several variables (e.g. blood pressure, weight, height,...). In traditional statistical methodology, we assumed many observations and a few, well-chosen variables. The trend today is towards more observations but even more so, to radically larger numbers of variables – voracious, automatic, systematic collection of hyperinformative detail about each observed instance. We are seeing examples where the observations gathered on individual instances are curves, or spectra, or images, or
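The concentration phenomenon behind the "curses and blessings" can be seen in a few lines of code. A minimal sketch, stdlib only; the unit-cube sampling and the sample sizes are illustrative assumptions, not from the abstract:

```python
import math
import random

def pairwise_distance_spread(n_points, dim, rng):
    """Relative spread (std/mean) of pairwise Euclidean distances
    between points drawn uniformly from the unit cube [0, 1]^dim."""
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [
        math.dist(pts[i], pts[j])
        for i in range(n_points) for j in range(i + 1, n_points)
    ]
    mean = sum(dists) / len(dists)
    var = sum((d - mean) ** 2 for d in dists) / len(dists)
    return math.sqrt(var) / mean

rng = random.Random(0)
low = pairwise_distance_spread(100, 2, rng)      # 2-D: distances vary widely
high = pairwise_distance_spread(100, 1000, rng)  # 1000-D: distances concentrate
```

In high dimension the pairwise distances cluster tightly around their mean, which is exactly why naive nearest-neighbor reasoning degrades as the number of variables grows.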
Empirical margin distributions and bounding the generalization error of combined classifiers
 Ann. Statist
, 2002
Abstract

Cited by 160 (11 self)
Dedicated to A.V. Skorohod on his seventieth birthday. We prove new probabilistic upper bounds on generalization error of complex classifiers that are combinations of simple classifiers. Such combinations could be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The bounds are in terms of the empirical distribution of the margin of the combined classifier. They are based on the methods of the theory of Gaussian and empirical processes (comparison inequalities, symmetrization method, concentration inequalities) and they improve previous results of Bartlett (1998) on bounding the generalization error of neural networks in terms of ℓ1-norms of the weights of neurons and of Schapire, Freund, Bartlett and Lee (1998) on bounding the generalization error of boosting. We also obtain rates of convergence in Lévy distance of the empirical margin distribution to the true margin distribution uniformly over the classes of classifiers and prove the optimality of these rates.
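The central object in these bounds, the empirical margin distribution of a voting classifier, is easy to compute. A minimal sketch; the threshold "stumps" and the toy data are invented for illustration and this is only the quantity the bounds are stated in, not the paper's proof technique:

```python
def margins(voters, xs, ys):
    """Empirical margins y * f(x) of a voting classifier
    f(x) = average of base classifiers h_t(x) taking values in {-1, +1}."""
    out = []
    for x, y in zip(xs, ys):
        votes = [h(x) for h in voters]
        out.append(y * sum(votes) / len(votes))
    return out

def margin_cdf(ms, delta):
    """Fraction of training points with margin <= delta: the empirical
    margin distribution that enters the generalization bound."""
    return sum(m <= delta for m in ms) / len(ms)

# Toy example: three threshold stumps voting on 1-D points (illustrative only).
stumps = [lambda x, t=t: 1 if x > t else -1 for t in (0.2, 0.5, 0.8)]
xs = [0.1, 0.3, 0.6, 0.9]
ys = [-1, -1, 1, 1]
ms = margins(stumps, xs, ys)
```

A bound of this type trades off `margin_cdf(ms, delta)` against a complexity term that shrinks as the margin level `delta` grows.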
Lectures on the central limit theorem for empirical processes
 Probability and Banach Spaces
, 1986
Abstract

Cited by 135 (9 self)
Concentration inequalities are used to derive some new inequalities for ratio-type suprema of empirical processes. These general inequalities are used to prove several new limit theorems for ratio-type suprema and to recover a number of the results from [1] and [2]. As a statistical application, an oracle inequality for nonparametric regression is obtained via ratio bounds.
Rates of convergence and adaption over Besov spaces under pointwise risk
 STATISTICA SINICA
, 2003
Abstract

Cited by 12 (1 self)
Function estimation over the Besov spaces under pointwise ℓr (1 ≤ r < ∞) risks is considered. Minimax rates of convergence are derived using a constrained risk inequality and wavelets. Adaptation under pointwise risks is also considered. Sharp lower bounds on the cost of adaptation are obtained and are shown to be attainable by a wavelet estimator. The results demonstrate important differences between the minimax properties under pointwise and global risk measures. The minimax rates and adaptation for estimating derivatives under pointwise risks are also presented. A general ℓr-risk oracle inequality is developed for the proofs of the main results.
Two-Stage Model Selection Procedures in Partially Linear Regression
 Canadian Journal of Statistics
, 2004
Abstract

Cited by 6 (2 self)
This paper proposes a two-stage estimation procedure in the partially linear model Y = f₀(T) + Xᵀβ₀ + ε. We give a general rule for consistent estimation of the location of the nonzero components of β₀, which is shown to be compatible with minimax adaptive estimation of f₀ over Besov balls when specialized to penalized least squares estimation. The proofs are based on a new type of oracle inequality.
Penalty Choices and Consistent Covariate Selection in Semiparametric Regression
 Department of Statistics, Florida State University, Tallahassee. (http://stat.fsu.edu/recentreports.html)
, 2002
Abstract

Cited by 2 (2 self)
This paper presents a new method for estimation in semiparametric regression models of the type Yᵢ = Xᵢᵀβ + f(Tᵢ) + Wᵢ, i = 1, …, n, based on a model selection technique which allows simultaneous estimation of the parametric and nonparametric components within a collection of finite dimensional models, using a penalized least squares criterion. We devise a new type of penalization, tailored to semiparametric models, for which we can consistently estimate the subset of nonzero coefficients of the linear part. Moreover, the selected estimator of the linear component is asymptotically normal.
A Model Selection Approach to Semiparametric Regression
, 2000
Abstract

Cited by 2 (2 self)
This paper presents a new method for estimation in semiparametric regression models, based on a model selection technique. This method allows simultaneous estimation of the parametric and nonparametric part within a collection of finite dimensional models, using a penalized criterion. It extends the method of sieve estimators proposed by Barron, Birgé and Massart (1999) for nonparametric regression, and also the method suggested by Chen (1988), for semiparametric regression, in which only one sieve is used. The estimator of the nonparametric part is shown to be minimax adaptive. The estimator of the parametric component, which now depends on the random index of the selected model, is shown to be asymptotically normal.

1. Introduction. The partially linear regression model is a semiparametric extension of the linear regression model. Assume that the covariates are partitioned in two groups, X and T. Then, the regression function is assumed to be the sum of a linear function of X and an arbitrary (possibly nonlinear) function of T. Thus, the observations consist of i.i.d. pairs ((X₁, T₁), Y₁), …, ((Xₙ, Tₙ), Yₙ), where each Yᵢ
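The model-selection idea above, minimizing a penalized least squares criterion over a collection of finite dimensional models, can be sketched in a toy form. This is a generic illustration (nested polynomial models with a fixed penalty per parameter), not the estimator or the penalty of the paper:

```python
def fit_poly_rss(xs, ys, degree):
    """Residual sum of squares of the least-squares polynomial of the given
    degree, computed via the normal equations and Gaussian elimination."""
    d = degree + 1
    # Normal equations A c = b for the monomial basis 1, x, ..., x^degree.
    A = [[sum(x ** (i + j) for x in xs) for j in range(d)] for i in range(d)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        piv = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * d
    for i in reversed(range(d)):
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, d))) / A[i][i]
    fitted = [sum(c * x ** i for i, c in enumerate(coef)) for x in xs]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))

def select_model(xs, ys, max_degree, pen):
    """Pick the degree minimizing the penalized criterion
    RSS + pen * (number of parameters)."""
    return min(range(max_degree + 1),
               key=lambda k: fit_poly_rss(xs, ys, k) + pen * (k + 1))
```

On data that is exactly linear, the criterion correctly rejects both the too-small constant model (large RSS) and the over-parameterized higher-degree models (extra penalty with no RSS gain).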
Aide-Memoire. High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality
, 2000
Abstract

Cited by 2 (0 self)
The coming century is surely the century of data. A combination of blind faith and serious purpose makes our society invest massively in the collection and processing of data of all kinds, on scales unimaginable until recently. Hyperspectral Imagery, Internet Portals, Financial tick-by-tick data, and DNA Microarrays are just a few of the better-known sources, feeding data in torrential streams into scientific and business databases worldwide. In traditional statistical data analysis, we think of observations of instances of particular phenomena (e.g. instance ↔ human being), these observations being a vector of values we measured on several variables (e.g. blood pressure, weight, height,...). In traditional statistical methodology, we assumed many observations and a few, well-chosen variables. The trend today is towards more observations but even more so, to radically larger numbers of variables – voracious, automatic, systematic collection of hyperinformative detail about each observed instance. We are seeing examples where the observations gathered on individual instances are curves, or spectra, or images, or
Reduction Of Correlated Noise Using A Library Of Orthonormal Bases
Abstract

Cited by 1 (0 self)
We study the application of a library of orthonormal bases to the reduction of correlated Gaussian noise. A joint condition on the library and the noise covariance is derived which ensures that simple thresholding in an adaptively chosen basis yields an estimation error within a logarithmic factor of the ideal risk. In the model example of a wavelet packet library and stationary noise the condition can be translated into a reverse Hölder inequality on the power spectrum.

1. Introduction. Consider signals given as vectors in ℝ^N, and assume that a finite library L of orthonormal bases is available. The total collection of w ∈ B, B ∈ L, defines a dictionary D of M vectors. If L consists of only one basis, we have M = N. In general M ≤ N|L|, where |L| is the number of bases in L. For the examples we have in mind, however, it holds that M ≪ N|L|. For wavelet packet libraries one has M = N(1 + log₂ N) while |L| > 1.5^N [CW]. Let a clean signal s be corrupted by additive noise z so t...
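Thresholding coefficients in an orthonormal basis, the operation whose risk the paper analyzes, can be sketched with the single Haar basis. Note the assumptions: one fixed basis rather than a basis chosen adaptively from a library, and hard thresholding of whatever coefficients are small, so this illustrates only the mechanics, not the paper's adaptive scheme:

```python
import math

def haar(signal):
    """Orthonormal Haar transform of a signal of length 2^k."""
    coeffs = []
    s = list(signal)
    while len(s) > 1:
        avgs = [(a + b) / math.sqrt(2) for a, b in zip(s[0::2], s[1::2])]
        diffs = [(a - b) / math.sqrt(2) for a, b in zip(s[0::2], s[1::2])]
        coeffs = diffs + coeffs  # coarse-to-fine ordering
        s = avgs
    return s + coeffs

def inverse_haar(coeffs):
    """Invert the orthonormal Haar transform."""
    s = coeffs[:1]
    detail = coeffs[1:]
    while detail:
        d, detail = detail[:len(s)], detail[len(s):]
        nxt = []
        for a, w in zip(s, d):
            nxt += [(a + w) / math.sqrt(2), (a - w) / math.sqrt(2)]
        s = nxt
    return s

def denoise(signal, thresh):
    """Hard-threshold Haar coefficients: keep only those with |c| > thresh."""
    c = haar(signal)
    kept = [v if abs(v) > thresh else 0.0 for v in c]
    return inverse_haar(kept)
```

Because the basis is orthonormal, zeroing a coefficient changes the reconstruction's squared error by exactly that coefficient's square, which is what makes coefficient-wise thresholding a proxy for the ideal (oracle) keep-or-kill rule.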