Results 1  10
of
13
SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR
 SUBMITTED TO THE ANNALS OF STATISTICS
, 2007
"... We exhibit an approximate equivalence between the Lasso estimator and Dantzig selector. For both methods we derive parallel oracle inequalities for the prediction risk in the general nonparametric regression model, as well as bounds on the ℓp estimation loss for 1 ≤ p ≤ 2 in the linear model when th ..."
Abstract

Cited by 472 (11 self)
 Add to MetaCart
We exhibit an approximate equivalence between the Lasso estimator and Dantzig selector. For both methods we derive parallel oracle inequalities for the prediction risk in the general nonparametric regression model, as well as bounds on the ℓp estimation loss for 1 ≤ p ≤ 2 in the linear model when the number of variables can be much larger than the sample size.
Model selection via testing: an alternative to (penalized) maximum likelihood estimators
, 2003
"... This paper is devoted to the description and study of a family of estimators, that we shall call T estimators (T for tests), for minimax estimation and model selection. Their construction is based on former ideas about deriving estimators from some families of tests due to Le Cam (1973 and 1975) ..."
Abstract

Cited by 65 (7 self)
 Add to MetaCart
This paper is devoted to the description and study of a family of estimators, that we shall call T estimators (T for tests), for minimax estimation and model selection. Their construction is based on former ideas about deriving estimators from some families of tests due to Le Cam (1973 and 1975) and Birge (1983, 1984a and b) and about complexity based model selection from Barron and Cover (1991). It is
Learning by mirror averaging
 The Annals of Statistics
"... Given a finite collection of estimators or classifiers, we study the problem of model selection type aggregation, that is, we construct a new estimator or classifier, called aggregate, which is nearly as good as the best among them with respect to a given risk criterion. We define our aggregate by a ..."
Abstract

Cited by 53 (8 self)
 Add to MetaCart
Given a finite collection of estimators or classifiers, we study the problem of model selection type aggregation, that is, we construct a new estimator or classifier, called aggregate, which is nearly as good as the best among them with respect to a given risk criterion. We define our aggregate by a simple recursive procedure which solves an auxiliary stochastic linear programming problem related to the original nonlinear one and constitutes a special case of the mirror averaging algorithm. We show that the aggregate satisfies sharp oracle inequalities under some general assumptions. The results are applied to several problems including regression, classification and density estimation. 1. Introduction. Several
Sequential procedures for aggregating arbitrary estimators of a conditional mean
, 2005
"... In this paper we describe and analyze a sequential procedure for aggregating linear combinations of a finite family of regression estimates, with particular attention to linear combinations having coefficients in the generalized simplex. The procedure is based on exponential weighting, and has a com ..."
Abstract

Cited by 39 (1 self)
 Add to MetaCart
(Show Context)
In this paper we describe and analyze a sequential procedure for aggregating linear combinations of a finite family of regression estimates, with particular attention to linear combinations having coefficients in the generalized simplex. The procedure is based on exponential weighting, and has a computationally tractable approximation. Analysis of the procedure is based in part on techniques from the sequential prediction of nonrandom sequences. Here these techniques are applied in a stochastic setting to obtain cumulative loss bounds for the aggregation procedure. From the cumulative loss bounds we derive an oracle inequality for the aggregate estimator for an unbounded response having a suitable moment generating function. The inequality shows that the risk of the aggregate estimator is less than the risk of the best candidate linear combination in the generalized simplex, plus a complexity term that depends on the size of the coefficient set. The inequality readily yields convergence rates for aggregation over the unit simplex that are within logarithmic factors of known minimax bounds. Some preliminary results on model selection are also presented.
Sparsity regret bounds for individual sequences in online linear regression
 JMLR Workshop and Conference Proceedings, 19 (COLT 2011 Proceedings):377–396
, 2011
"... We consider the problem of online linear regression on arbitrary deterministic sequences when the ambient dimension d can be much larger than the number of time rounds T. We introduce the notion of sparsity regret bound, which is a deterministic online counterpart of recent risk bounds derived in th ..."
Abstract

Cited by 17 (0 self)
 Add to MetaCart
We consider the problem of online linear regression on arbitrary deterministic sequences when the ambient dimension d can be much larger than the number of time rounds T. We introduce the notion of sparsity regret bound, which is a deterministic online counterpart of recent risk bounds derived in the stochastic setting under a sparsity scenario. We prove such regret bounds for an onlinelearning algorithm called SeqSEW and based on exponential weighting and datadriven truncation. In a second part we apply a parameterfree version of this algorithm to the stochastic setting (regression model with random design). This yields risk bounds of the same flavor as in Dalalyan and Tsybakov (2012a) but which solve two questions left open therein. In particular our risk bounds are adaptive (up to a logarithmic factor) to the unknown variance of the noise if the latter is Gaussian. We also address the regression model with fixed design.
Mirror averaging with sparsity priors
 SUBMITTED TO THE BERNOULLI
, 2012
"... We consider the problem of aggregating the elements of a possibly infinite dictionary for building a decision procedure that aims at minimizing a given criterion. Along with the dictionary, an independent identically distributed training sample is available, on which the performance of a given proce ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
(Show Context)
We consider the problem of aggregating the elements of a possibly infinite dictionary for building a decision procedure that aims at minimizing a given criterion. Along with the dictionary, an independent identically distributed training sample is available, on which the performance of a given procedure can be tested. In a fairly general setup, we establish an oracle inequality for the Mirror Averaging aggregate with any prior distribution. By choosing an appropriate prior, we apply this oracle inequality in the context of prediction under sparsity assumption for the problems of regression with random design, density estimation and binary classification.
Optimal rates of aggregation in classification
, 2005
"... In the same spirit as Tsybakov [2003], we define the optimality of an aggregation procedure in the problem of classification. Using an aggregate with exponential weights, we obtain an optimal rate of convex aggregation for the hinge risk under the margin assumption. Moreover we obtain an optimal rat ..."
Abstract

Cited by 4 (3 self)
 Add to MetaCart
In the same spirit as Tsybakov [2003], we define the optimality of an aggregation procedure in the problem of classification. Using an aggregate with exponential weights, we obtain an optimal rate of convex aggregation for the hinge risk under the margin assumption. Moreover we obtain an optimal rate of model selection aggregation under the margin assumption for the excess Bayes risk.
On aggregation of nonparametric estimators
, 2009
"... We study the problem of aggregation of arbitrary estimators from a given family F with respect to Lp–risks. The proposed method is based on comparison of empirical estimates of certain linear functionals with estimates induced by the family F. We derive oracle inequalities and show that they are uni ..."
Abstract
 Add to MetaCart
We study the problem of aggregation of arbitrary estimators from a given family F with respect to Lp–risks. The proposed method is based on comparison of empirical estimates of certain linear functionals with estimates induced by the family F. We derive oracle inequalities and show that they are unimprovable in some sense.