Optimization Flow Control, I: Basic Algorithm and Convergence
 IEEE/ACM TRANSACTIONS ON NETWORKING
, 1999
Abstract

Cited by 690 (64 self)
We propose an optimization approach to flow control where the objective is to maximize the aggregate source utility over their transmission rates. We view network links and sources as processors of a distributed computation system to solve the dual problem using a gradient projection algorithm
Learning Long-Term Dependencies with Gradient Descent is Difficult
 TO APPEAR IN THE SPECIAL ISSUE ON RECURRENT NETWORKS OF THE IEEE TRANSACTIONS ON NEURAL NETWORKS
Abstract

Cited by 374 (35 self)
in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a tradeoff between efficient learning by gradient descent and latching on information
A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm
 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS
, 1993
Abstract

Cited by 917 (34 self)
A new learning algorithm for multilayer feedforward networks, RPROP, is proposed. To overcome the inherent disadvantages of pure gradient descent, RPROP performs a local adaptation of the weight updates according to the behaviour of the error function. In substantial difference to other adaptive
The particle swarm: Explosion, stability, and convergence in a multidimensional complex space
 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
Abstract

Cited by 822 (10 self)
’s convergence tendencies. Some results of the particle swarm optimizer, implementing modifications derived from the analysis, suggest methods for altering the original algorithm in ways that eliminate problems and increase the optimization power of the particle swarm
Algorithms for Nonnegative Matrix Factorization
 In NIPS
, 2001
Abstract

Cited by 1230 (5 self)
The algorithms can also be interpreted as diagonally rescaled gradient descent, where the rescaling factor is optimally chosen to ensure convergence.
Gradient flows in metric spaces and in the space of probability measures
 LECTURES IN MATHEMATICS ETH ZÜRICH, BIRKHÄUSER VERLAG
, 2005
Sequence labeling · Stochastic gradient descent
"... Periodic step-size adaptation in second-order gradient ..."
Mean shift, mode seeking, and clustering
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1995
Abstract

Cited by 620 (0 self)
seeking process on a surface constructed with a “shadow” kernel. For Gaussian kernels, mean shift is a gradient mapping. Convergence is studied for mean shift iterations. Cluster analysis is treated as a deterministic problem of finding a fixed point of mean shift that characterizes the data. Applications
Online Learning with Kernels
, 2003
Abstract

Cited by 2807 (126 self)
use of these methods in an online setting suitable for real-time applications. In this paper we consider online learning in a Reproducing Kernel Hilbert Space. By considering classical stochastic gradient descent within a feature space, and the use of some straightforward tricks, we develop simple
Equivariant Adaptive Source Separation
 IEEE Trans. on Signal Processing
, 1996
Abstract

Cited by 448 (9 self)
Source separation consists in recovering a set of independent signals when only mixtures with unknown coefficients are observed. This paper introduces a class of adaptive algorithms for source separation which implements an adaptive version of equivariant estimation and is henceforth called EASI
