Results 1–10 of 603,649
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
"... We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ɛ²) iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total ..."
Cited by 541 (21 self)
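The single-example step the excerpt describes can be sketched as follows. This is a minimal Pegasos-style reconstruction (step size η_t = 1/(λt), hinge-loss subgradient, toy data of my own choosing), not the authors' reference code:

```python
import numpy as np

def pegasos_train(X, y, lam=0.1, n_iters=1000, seed=0):
    """Stochastic subgradient descent on the primal SVM objective.

    Each iteration touches a single randomly drawn training example,
    as in the excerpt. Labels y must be in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(1, n_iters + 1):
        i = rng.integers(len(X))
        eta = 1.0 / (lam * t)          # step size shrinks as 1/(lambda * t)
        if y[i] * (X[i] @ w) < 1.0:    # margin violated: hinge subgradient active
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                          # only the regularizer contributes
            w = (1 - eta * lam) * w
    return w

# Toy linearly separable data (second feature acts as a bias term).
X = np.array([[2.0, 1.0], [1.5, 1.0], [-2.0, 1.0], [-1.0, 1.0]])
y = np.array([1, 1, -1, -1])
w = pegasos_train(X, y)
```

The 1/(λt) schedule is what gives the Õ(1/ɛ) iteration bound quoted above; with a fixed step size the classical Ω(1/ɛ²) analyses apply instead.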
descent
, 2003
"... general inefficiency of batch training for gradient ..."
Cited by 1 (0 self)
Variational algorithms for approximate Bayesian inference
, 2003
"... The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coherent way, avoids overfitting problems, and provides a principled basis for selecting between alternative models. Unfortunately the computations required are usually intractable. This thesis presents ..."
"... theorems are presented to pave the road for automated VB derivation procedures in both directed and undirected graphs (Bayesian and Markov networks, respectively). Chapters 3–5 derive and apply the VB EM algorithm to three commonly-used and important models: mixtures of factor analysers, linear dynamical ..."
Cited by 440 (9 self)
Computational Models of Sensorimotor Integration
Science, 1997
"... The sensorimotor integration system can be viewed as an observer attempting to estimate its own state and the state of the environment by integrating multiple sources of information. We describe a computational framework capturing this notion, and some specific models of integration and adaptati ..."
"... information from visual and auditory systems is integrated so as to reduce the variance in localization. (2) The effects of a remapping in the relation between visual and auditory space can be predicted from a simple learning rule. (3) The temporal propagation of errors in estimating the hand ..."
Cited by 419 (12 self)
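The variance reduction from integrating visual and auditory cues is usually modeled as inverse-variance (maximum-likelihood) weighting of Gaussian estimates. A minimal sketch, with the function name `fuse` and the specific numbers being illustrative assumptions of mine, not taken from the paper:

```python
def fuse(mu_v, var_v, mu_a, var_a):
    """Optimally combine visual and auditory position estimates (Gaussian case).

    The fused variance var_v*var_a/(var_v+var_a) is always at most the
    smaller of the two input variances -- the variance reduction the
    excerpt refers to.
    """
    w_v = var_a / (var_v + var_a)        # weight grows as the *other* cue gets noisier
    mu = w_v * mu_v + (1 - w_v) * mu_a
    var = var_v * var_a / (var_v + var_a)
    return mu, var

# Noisy visual cue at 1.0 (variance 4.0), sharper auditory cue at 2.0 (variance 1.0).
mu, var = fuse(mu_v=1.0, var_v=4.0, mu_a=2.0, var_a=1.0)
```

Here the fused estimate sits closer to the more reliable auditory cue, and its variance (0.8) is below both inputs.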
Learning by Online Gradient Descent
Journal of Physics A, 1995
"... We study online gradient-descent learning in multilayer networks analytically and numerically. The training is based on randomly drawn inputs and their corresponding outputs as defined by a target rule. In the thermodynamic limit we derive deterministic differential equations for the order parameters ..."
Cited by 20 (2 self)
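The setting in the excerpt — a student network trained online on fresh random inputs labeled by a fixed teacher rule — can be sketched numerically. The architecture (a small soft-committee machine), sizes, and learning rate below are my own illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 20, 3          # input dimension and hidden units (illustrative sizes)

# Teacher ("target rule"): a fixed random two-layer network.
W_teacher = rng.standard_normal((K, N)) / np.sqrt(N)
teacher = lambda x: np.tanh(W_teacher @ x).sum()

# Student with the same architecture, trained online: each step uses
# one freshly drawn random input, never revisiting past examples.
W = rng.standard_normal((K, N)) / np.sqrt(N)
eta = 0.05
errs = []
for step in range(20000):
    x = rng.standard_normal(N)
    h = W @ x
    err = np.tanh(h).sum() - teacher(x)
    errs.append(0.5 * err**2)
    # Gradient of the per-example squared error w.r.t. the student weights.
    W -= eta * err * ((1 - np.tanh(h)**2)[:, None] * x[None, :])

early, late = np.mean(errs[:500]), np.mean(errs[-500:])
```

Averaging the per-example error over windows plays the role of the generalization error whose deterministic evolution the paper derives in the thermodynamic limit N → ∞.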
Residual Algorithms: Reinforcement Learning with Function Approximation
In Proceedings of the Twelfth International Conference on Machine Learning, 1995
"... A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that these algorithms can easily become unstable when implemented directly with a general function-approximation system, such as a sigmoidal multilayer perceptron, a radial-basis-function system, a memory-based learning system, or even a linear function-approximation system. A new class of algorithms, residual gradient algorithms, is proposed, which perform gradient descent on the mean squared Bellman residual, guaranteeing ..."
Cited by 307 (6 self)
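Gradient descent on the squared Bellman residual, as described in the excerpt, can be sketched on a toy problem. The three-state chain, step size, and tabular features below are my own illustrative assumptions; the key line is the update, which differentiates through both V(s) and V(s'), unlike plain TD(0):

```python
import numpy as np

# A tiny deterministic 3-state chain under a fixed policy: 0 -> 1 -> 2 -> 0,
# with reward 1 for leaving state 1 and discount gamma.
n_states, gamma, alpha = 3, 0.9, 0.5
next_state = {0: 1, 1: 2, 2: 0}
reward = {0: 0.0, 1: 1.0, 2: 0.0}
phi = np.eye(n_states)               # tabular features: linear case, V(s) = w[s]
w = np.zeros(n_states)

for _ in range(5000):                # repeated sweeps over all transitions
    for s in range(n_states):
        s2 = next_state[s]
        # Bellman residual: delta = r + gamma*V(s') - V(s)
        delta = reward[s] + gamma * (phi[s2] @ w) - (phi[s] @ w)
        # Residual-gradient step: the true gradient of 0.5*delta^2 w.r.t. w.
        # It adjusts BOTH V(s) and V(s'), which is what buys convergence.
        w -= alpha * delta * (gamma * phi[s2] - phi[s])
```

On this deterministic chain the residuals are driven to zero, so w matches the exact values V(s) = r(s) + γV(next(s)); with stochastic transitions, residual-gradient methods need the double-sampling correction discussed in the paper.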
Gradient descent learning in and out of equilibrium
, 2000
"... Relations between the off thermal equilibrium dynamical process of online learning and the thermally equilibrated offline learning are studied for potential gradient descent learning. The approach of Opper to study online Bayesian algorithms is extended to potential based or maximum like ..."
Reinforcement Learning Through Gradient Descent
, 1999
"... Reinforcement learning is often done using parameterized function approximators to store value functions. Algorithms are typically developed for lookup tables, and then applied to function approximators by using backpropagation. This can lead to algorithms diverging on very small, simple MDPs and Markov chains, even with linear function approximators and epoch-wise training. These algorithms are also very difficult to analyze, and difficult to combine with other algorithms. A series of new families of algorithms are derived based on stochastic gradient descent. Since they are derived from first ..."
The Variational Formulation of the Fokker-Planck Equation
SIAM J. Math. Anal., 1999
"... The Fokker-Planck equation, or forward Kolmogorov equation, describes the evolution of the probability density for a stochastic process associated with an Ito stochastic differential equation. It pertains to a wide variety of time-dependent systems in which randomness plays a role. In this paper, ..."
"... that the dynamics may be regarded as a gradient flux, or a steepest descent, for the free energy wi ..."
Cited by 281 (22 self)
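The "gradient flux, or steepest descent" reading in the excerpt is made precise by the paper's discrete minimizing-movement scheme: up to notation, each time step of length h solves

```latex
\rho^{(k+1)} \in \operatorname*{arg\,min}_{\rho}\;
\frac{1}{2h}\, W_2^2\!\left(\rho^{(k)}, \rho\right) + F(\rho),
\qquad
F(\rho) = \int \Psi(x)\,\rho(x)\,dx
\;+\; \beta^{-1} \int \rho(x)\log\rho(x)\,dx,
```

where W_2 is the quadratic Wasserstein distance, Ψ is the drift potential, and β⁻¹ the noise strength. As h → 0 the interpolated iterates converge to the solution of the Fokker-Planck equation ∂ρ/∂t = div(ρ∇Ψ) + β⁻¹Δρ, exhibiting the dynamics as steepest descent of the free energy F in the Wasserstein metric.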
Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation
"... off-policy learning, linear function approximation Sutton, Szepesvári and Maei (2009) recently introduced the first temporal-difference learning algorithm compatible with both linear function approximation and off-policy training, and whose complexity scales only linearly in the size of the function ..."