Results 1 – 10 of 10
A Stochastic Gradient Method with an Exponential Convergence Rate for Strongly Convex Optimization with Finite Training Sets. arXiv preprint arXiv:1202.6258, 2012
Abstract
Cited by 76 (11 self)
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training objective and reducing the testing objective quickly.
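The gradient-memory idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' reference implementation: it assumes a least-squares objective (1/n)·Σᵢ(aᵢᵀx − bᵢ)², and the function name, step size, and iteration count are choices made here for the example.

```python
import numpy as np

def sag(A, b, steps=1000, lr=0.01, seed=0):
    """Sketch of a stochastic gradient method with gradient memory.

    Minimizes (1/n) * sum_i (a_i^T x - b_i)^2 by keeping one stored
    gradient per example and stepping along their running average.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    memory = np.zeros((n, d))   # last-seen gradient of each f_i
    avg = np.zeros(d)           # mean of the stored gradients
    for _ in range(steps):
        i = rng.integers(n)
        g = 2.0 * (A[i] @ x - b[i]) * A[i]   # fresh gradient of f_i
        avg += (g - memory[i]) / n           # refresh the average in O(d)
        memory[i] = g                        # overwrite the stored gradient
        x -= lr * avg                        # step along the averaged direction
    return x
```

Unlike plain stochastic gradient descent, the update direction uses one fresh gradient plus n − 1 remembered ones, which is what removes the need for a decaying step size on strongly convex sums.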
Smoothness, low noise and fast rates. In NIPS, 2010
Abstract
Cited by 31 (11 self)
We establish an excess risk bound of Õ(H·R_n² + √(H·L*)·R_n) for ERM with an H-smooth loss function and a hypothesis class with Rademacher complexity R_n, where L* is the best risk achievable by the hypothesis class. For typical hypothesis classes where R_n = √(R/n), this translates to a learning rate of Õ(RH/n) in the separable (L* = 0) case and Õ(RH/n + √(L*·RH/n)) more generally. We also provide similar guarantees for online and stochastic convex optimization of a smooth non-negative objective.
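Typeset, the bound quoted above reads as follows; this is a reconstruction from the snippet, and the symbol L(ĥ) for the risk of the empirical minimizer is notation introduced here, not taken from the snippet itself:

```latex
L(\hat{h}) - L^{*} \;\le\; \tilde{O}\!\left( H R_n^{2} + \sqrt{H L^{*}}\, R_n \right),
\qquad\text{and, for } R_n = \sqrt{R/n},\qquad
\tilde{O}\!\left( \frac{RH}{n} + \sqrt{\frac{L^{*} R H}{n}} \right).
```

In the separable case L* = 0 the second term vanishes, which gives the faster Õ(RH/n) rate mentioned in the abstract.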
θ-MRF: Capturing spatial and semantic structure in the parameters for scene understanding. In NIPS, 2011
Abstract
Cited by 13 (9 self)
For most scene understanding tasks (such as object detection or depth estimation), the classifiers need to consider contextual information in addition to the local features. We can capture such contextual information by taking as input the features/attributes from all the regions in the image. However, this contextual dependence also varies with the spatial location of the region of interest, and we therefore need a different set of parameters for each spatial location. This results in a very large number of parameters. In this work, we model the independence properties between the parameters for each location and for each task by defining a Markov Random Field (MRF) over the parameters. In particular, two sets of parameters are encouraged to have similar values if they are spatially close or semantically close. Our method is, in principle, complementary to other ways of capturing context, such as those that use a graphical model over the labels instead. In extensive evaluation over two different settings, of multi-class object detection and of multiple scene understanding tasks (scene categorization, depth estimation, geometric labeling), our method beats the state-of-the-art methods in all four tasks.
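The coupling described above, where parameters at spatially or semantically close locations are encouraged to agree, can be written as a quadratic smoothness penalty over an edge set. The sketch below is illustrative only; the function name and the exact squared-difference form are assumptions for the example, not the paper's actual potential functions.

```python
import numpy as np

def smoothness_penalty(theta, edges, weight=1.0):
    """Penalty encouraging parameter vectors joined by an edge to agree.

    theta: (num_locations, d) matrix of per-location parameter vectors.
    edges: list of (i, j) index pairs marking spatially or semantically
           close locations.
    Returns weight * sum over edges of ||theta_i - theta_j||^2.
    """
    return weight * sum(np.sum((theta[i] - theta[j]) ** 2) for i, j in edges)
```

Adding such a term to the training objective shrinks neighboring parameter sets toward each other, which is one simple way to fight the parameter blow-up the abstract describes.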
Multi-task regression using minimal penalties, 2011
Abstract
Cited by 7 (3 self)
In this paper we study the kernel multiple ridge regression framework, which we refer to as multi-task regression, using penalization techniques. The theoretical analysis of this problem shows that the key element for an optimal calibration is the covariance matrix of the noise between the different tasks. We present a new algorithm to estimate this covariance matrix, based on the concept of minimal penalty, which was previously used in the single-task regression framework to estimate the variance of the noise. We show, in a non-asymptotic setting and under mild assumptions on the target function, that this estimator converges towards the covariance matrix. Plugging this estimator into the corresponding ideal penalty then leads to an oracle inequality. We illustrate the behavior of our algorithm on synthetic examples.
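The single-task building block that this entry generalizes is ordinary kernel ridge regression, which can be sketched in a few lines. This is a generic textbook form, not the paper's multi-task estimator; the function name and the n·λ scaling of the ridge are conventions chosen for the example.

```python
import numpy as np

def kernel_ridge(K, y, lam):
    """Single-task kernel ridge regression.

    Solves (K + n * lam * I) alpha = y for the dual coefficients alpha;
    predictions on the training points are then K @ alpha.
    """
    n = K.shape[0]
    return np.linalg.solve(K + n * lam * np.eye(n), y)
```

In the multi-task setting of the abstract, the scalar noise variance calibrating λ is replaced by a full between-task noise covariance matrix, which is exactly the object the minimal-penalty estimator targets.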
On the Interaction between Norm and Dimensionality: Multiple Regimes in Learning
Abstract
Cited by 5 (4 self)
A learning problem might have several measures of complexity (e.g., norm and dimensionality) that affect the generalization error. What is the interaction between these complexities? Dimension-free learning theory bounds and parametric asymptotic analyses each provide a partial picture of the full learning curve. In this paper, we use high-dimensional asymptotics on two classical problems, mean estimation and linear regression, to explore the learning curve more completely. We show that these curves exhibit multiple regimes, where in each regime the excess risk is controlled by a subset of the problem complexities.
ENS; Sierra Project-team, 2012
Abstract
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
regression. Submitted to the Electronic Journal of Statistics, 2013. <hal-00846715>
Abstract
between multi-task and single-task oracle risks in kernel ridge regression
Multi-task Regression using Minimal Penalties
LEARNING CONTEXTUAL INFORMATION FOR HOLISTIC SCENE UNDERSTANDING, 2012
Abstract
One of the primary goals in computer vision is holistic scene understanding, which involves many subtasks, such as depth estimation, scene categorization, saliency detection, object detection, event categorization, etc. Each of these tasks explains some aspect of a particular scene, and in order to fully understand a scene, we would need to solve each of these subtasks. In the human visual system, the subtasks are often coupled together: one task can leverage the output of another task as contextual information for its own decision, and can also feed useful information back to the other tasks. In this thesis, our goal is to design computational algorithms that perform multiple scene understanding tasks in a collaborative way, as humans do. In our algorithm design, we consider a two-layer cascade of classifiers, which are repeated instantiations of the original tasks, with the output of the first layer fed into the second layer as input. To better optimize the second-layer outputs, we propose three algorithms, which result in capturing contextual information at multiple levels, ranging from contextual interactions between dif ...