Results 1  10
of
3,096
Sparse Learning for Stochastic Composite Optimization
"... In this paper, we focus on Stochastic Composite Optimization (SCO) for sparse learning that aims to learn a sparse solution. Although many SCO algorithms have been developed for sparse learning with an optimal convergence rate O(1/T), they often fail to deliver sparse solutions at the end either b ..."
Abstract
 Add to MetaCart
In this paper, we focus on Stochastic Composite Optimization (SCO) for sparse learning that aims to learn a sparse solution. Although many SCO algorithms have been developed for sparse learning with an optimal convergence rate O(1/T), they often fail to deliver sparse solutions at the end either
Efficient sparse coding algorithms
 In NIPS
, 2007
"... Sparse coding provides a class of algorithms for finding succinct representations of stimuli; given only unlabeled input data, it discovers basis functions that capture higherlevel features in the data. However, finding sparse codes remains a very difficult computational problem. In this paper, we ..."
Abstract

Cited by 445 (14 self)
 Add to MetaCart
present efficient sparse coding algorithms that are based on iteratively solving two convex optimization problems: an L1regularized least squares problem and an L2constrained least squares problem. We propose novel algorithms to solve both of these optimization problems. Our algorithms result in a
Pegasos: Primal Estimated subgradient solver for SVM
"... We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a singl ..."
Abstract

Cited by 542 (20 self)
 Add to MetaCart
We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a
Training Linear SVMs in Linear Time
, 2006
"... Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for highdimensional sparse data commonly encountered in applications like text classification, wordsense disambiguation, and drug design. These applications involve a large number of examples n ..."
Abstract

Cited by 549 (6 self)
 Add to MetaCart
Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for highdimensional sparse data commonly encountered in applications like text classification, wordsense disambiguation, and drug design. These applications involve a large number of examples n
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
, 2001
"... Variable selection is fundamental to highdimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. In this article, penalized ..."
Abstract

Cited by 948 (62 self)
 Add to MetaCart
functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing
Distortion invariant object recognition in the dynamic link architecture
 IEEE TRANSACTIONS ON COMPUTERS
, 1993
"... We present an object recognition system based on the Dynamic Link Architecture, which is an extension to classical Artificial Neural Networks. The Dynamic Link Architecture exploits correlations in the finescale temporal structure of cellular signals in order to group neurons dynamically into hig ..."
Abstract

Cited by 637 (80 self)
 Add to MetaCart
are represented by sparse graphs, whose vertices are labeled by a multiresolution description in terms of a local power spectrum, and whose edges are labeled by geometrical distance vectors. Object recognition can be formulated as elastic graph matching, which is performed here by stochastic optimization of a
Benchmarking Least Squares Support Vector Machine Classifiers
 NEURAL PROCESSING LETTERS
, 2001
"... In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LSSVMs), a least squares cost function is proposed so as to obtain a linear set of eq ..."
Abstract

Cited by 476 (46 self)
 Add to MetaCart
stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LSSVM classifiers with linear, polynomial and radial basis function
Support vector machine learning for interdependent and structured output spaces
 In ICML
, 2004
"... Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernelbased methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs suchas multiple depe ..."
Abstract

Cited by 450 (20 self)
 Add to MetaCart
dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits
Online learning for matrix factorization and sparse coding
, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to ad ..."
Abstract

Cited by 330 (31 self)
 Add to MetaCart
to adapt it to specific data. Variations of this problem include dictionary learning in signal processing, nonnegative matrix factorization and sparse principal component analysis. In this paper, we propose to address these tasks with a new online optimization algorithm, based on stochastic approximations
Policy gradient methods for reinforcement learning with function approximation.
 In NIPS,
, 1999
"... Abstract Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly repres ..."
Abstract

Cited by 439 (20 self)
 Add to MetaCart
, but has several limitations. First, it is oriented toward finding deterministic policies, whereas the optimal policy is often stochastic, selecting different actions with specific probabilities (e.g., see In this paper we explore an alternative approach to function approximation in RL. Rather than
Results 1  10
of
3,096