Results 1–10 of 65
Convergence of a stochastic approximation version of the EM algorithm
1997
Cited by 158 (15 self)
The Expectation Maximization (EM) algorithm is a powerful computational technique for locating maxima of functions...
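The stochastic approximation EM (SAEM) idea behind this entry can be illustrated on a toy latent-variable model of our own choosing (the model, constants, and variable names below are illustrative, not the paper's): the E-step expectation is replaced by simulating the latent data, and the sufficient statistic is smoothed with Robbins-Monro step sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (ours, for illustration): Z_i ~ N(theta, 1) is latent and we
# observe Y_i = Z_i + eps_i with eps_i ~ N(0, 1). The posterior of Z_i given
# Y_i and theta is N((Y_i + theta)/2, 1/2), and the complete-data M-step
# maximizer is simply the mean of the Z_i.
theta_true = 2.0
n = 500
z = rng.normal(theta_true, 1.0, n)
y = z + rng.normal(0.0, 1.0, n)

theta, s = 0.0, 0.0
for k in range(1, 2001):
    # S-step: simulate the latent data from its current conditional law.
    z_sim = rng.normal((y + theta) / 2.0, np.sqrt(0.5))
    # SA-step: Robbins-Monro averaging of the sufficient statistic,
    # with gains gamma_k = 1/k (sum infinite, sum of squares finite).
    gamma = 1.0 / k
    s = s + gamma * (z_sim.mean() - s)
    # M-step: maximize the complete-data likelihood at the averaged statistic.
    theta = s

print(round(theta, 2))  # close to theta_true (the MLE here is the mean of y)
```

The fixed point of this recursion is the observed-data MLE, which for this toy model is the sample mean of y.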
Adaptive Stochastic Approximation by the Simultaneous Perturbation Method
2000
Cited by 87 (4 self)
Stochastic approximation (SA) has long been applied for problems of minimizing loss functions or root finding with noisy input information. As with all stochastic search algorithms, there are adjustable algorithm coefficients that must be specified, and that can have a profound effect on algorithm performance. It is known that choosing these coefficients according to an SA analog of the deterministic Newton-Raphson algorithm provides an optimal or near-optimal form of the algorithm. However, directly determining the required Hessian matrix (or Jacobian matrix for root finding) to achieve this algorithm form has often been difficult or impossible in practice. This paper presents a general adaptive SA algorithm that is based on a simple method for estimating the Hessian matrix, while concurrently estimating the primary parameters of interest. The approach applies in both the gradient-free optimization (Kiefer-Wolfowitz) and root-finding/stochastic gradient-based (Robbins-Monro) settings, and is based on the "simultaneous perturbation (SP)" idea introduced previously. The algorithm requires only a small number of loss function or gradient measurements per iteration, independent of the problem dimension, to adaptively estimate the Hessian and parameters of primary interest. Aside from introducing the adaptive SP approach, this paper presents practical implementation guidance, asymptotic theory, and a nontrivial numerical evaluation. Also included is a discussion and numerical analysis comparing the adaptive SP approach with the iterate-averaging approach to accelerated SA.
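The simultaneous perturbation idea underlying this paper can be sketched in its basic (non-adaptive) form; the quadratic test problem and gain constants below are our own illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

def loss(theta):
    # Noisy quadratic loss (hypothetical test problem): true minimum at [1, -2].
    target = np.array([1.0, -2.0])
    return np.sum((theta - target) ** 2) + rng.normal(0.0, 0.1)

theta = np.zeros(2)
for k in range(1, 2001):
    a_k = 0.1 / k ** 0.602        # gain sequences in the standard SPSA form
    c_k = 0.1 / k ** 0.101
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher perturbation
    # Two loss measurements per iteration, regardless of dimension,
    # yield a simultaneous-perturbation estimate of the full gradient.
    g_hat = (loss(theta + c_k * delta) - loss(theta - c_k * delta)) / (2.0 * c_k * delta)
    theta = theta - a_k * g_hat
print(np.round(theta, 1))  # near the minimizer [1, -2]
```

The key property is visible in the loop: two measurements suffice whatever the dimension of theta. The adaptive algorithm of the paper additionally builds a Hessian estimate from such perturbations.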
Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization
Journal of Discrete Event Dynamic Systems: Theory and Applications, 2000
Cited by 81 (23 self)
Ordinal Optimization has emerged as an efficient technique for simulation and optimization. Exponential convergence rates can be achieved in many cases. In this paper, we present a new approach that can further enhance the efficiency of ordinal optimization. Our approach determines a highly efficient number of simulation replications or samples and significantly reduces the total simulation cost. We also compare several different allocation procedures, including a popular two-stage procedure in the simulation literature. Numerical testing shows that our approach is much more efficient than all compared methods. The results further indicate that our approach can obtain a speedup factor of higher than 20 above and beyond the speedup achieved by the use of ordinal optimization for a 10-design example.
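The allocation rule this line of work is known for (OCBA) is commonly stated in closed form; below is a sketch under the usual assumptions (normal noise, minimization, asymptotic allocation), with function and variable names of our own.

```python
import numpy as np

def ocba_fractions(means, stds):
    """Asymptotic OCBA allocation fractions for selecting the minimum mean.

    Implements the commonly stated rule: for non-best designs i, j,
    N_i / N_j = (s_i / d_i)^2 / (s_j / d_j)^2, where d_i is the gap to the
    observed best b, and N_b = s_b * sqrt(sum_{i != b} N_i^2 / s_i^2).
    (Ties with the best are not handled in this sketch.)
    """
    means, stds = np.asarray(means, float), np.asarray(stds, float)
    b = int(np.argmin(means))                 # observed best design
    gaps = means - means[b]
    ratios = np.where(np.arange(len(means)) == b, 0.0,
                      (stds / np.where(gaps == 0.0, 1.0, gaps)) ** 2)
    n_b = stds[b] * np.sqrt(np.sum(ratios ** 2 / stds ** 2))
    alloc = ratios.copy()
    alloc[b] = n_b
    return alloc / alloc.sum()

frac = ocba_fractions([1.0, 1.5, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0])
print(np.round(frac, 2))  # most effort on the observed best and its closest competitor
```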
Stability of Stochastic Approximation Under Verifiable Conditions
SIAM J. Control and Optimization, 2005
Cited by 79 (10 self)
In this paper we address the problem of the stability and convergence of the stochastic approximation procedure θn+1 = θn + γn+1[h(θn) + ξn+1]. The stability of such sequences {θn} is known to heavily rely on the behaviour of the mean field h at the boundary of the parameter set and the magnitude of the stepsizes used. The conditions typically required to ensure convergence, and in particular the boundedness or stability of {θn}, are either too difficult to check in practice or not satisfied at all. This is the case even for very simple models. The most popular technique to circumvent the stability problem consists of constraining {θn} to a compact subset K in the parameter space. This is obviously not a satisfactory solution as the choice of K is a delicate one. In the present contribution we first prove a “deterministic” stability result which relies on simple conditions on the sequences {ξn} and {γn}. We then propose and analyze an algorithm based on projections on adaptive truncation sets which ensures that the aforementioned conditions required for stability are satisfied. We focus in particular on the case where {ξn} is a so-called Markov state-dependent noise. We establish both the stability and convergence w.p. 1 of the algorithm under a set of simple and verifiable assumptions. We illustrate our results with an example related to adaptive Markov chain Monte Carlo algorithms. Key words. Stochastic approximation, state-dependent noise, randomly varying truncation, Adaptive Markov Chain
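The adaptive-truncation idea described here can be sketched with the classic expanding-truncation scheme: whenever the iterate leaves the current truncation set, it is restarted from a fixed point and the set is enlarged. The mean field, bounds, and constants below are our own toy choices; the paper's actual truncation sets and conditions are more general.

```python
import numpy as np

rng = np.random.default_rng(2)

def h(theta):
    return -theta  # toy mean field with its root at 0

theta, theta0 = 5.0, 0.0
sigma = 0                       # index of the truncation set K_sigma = [-M(sigma), M(sigma)]
M = lambda s: 10.0 * (s + 1)    # expanding truncation bounds M(0) < M(1) < ...
k = 0
for _ in range(5000):
    k += 1
    gamma = 1.0 / k
    cand = theta + gamma * (h(theta) + rng.normal(0.0, 1.0))
    if abs(cand) > M(sigma):
        # Iterate escaped the current truncation set: restart from a fixed
        # point inside K_0, enlarge the set, and reset the gain schedule.
        theta, sigma, k = theta0, sigma + 1, 0
    else:
        theta = cand
print(round(theta, 1))  # near the root of h
```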
Theoretic aspects of the SOM algorithm
in: Proceedings of Workshop on Self-Organising Maps (WSOM’97), 1997
A comment on contrastive divergence
Proc. of NIPS, 2004
Cited by 42 (0 self)
This paper analyses the Contrastive Divergence algorithm for learning statistical parameters. We relate the algorithm to the stochastic approximation literature. This enables us to specify conditions under which the algorithm is guaranteed to converge to the optimal solution (with probability 1). This includes necessary and sufficient conditions for the solution to be unbiased.
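The connection this paper draws can be illustrated on a deliberately minimal one-variable model of our own construction (not the RBM setting of the original CD papers), where the CD-1 negative phase is an exact sample from the model and the update reduces to a Robbins-Monro scheme for the likelihood equation E_data[x] = E_model[x].

```python
import numpy as np

rng = np.random.default_rng(3)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# One-parameter model p(x) proportional to exp(theta * x), x in {0, 1},
# so E_model[x] = sigmoid(theta). For a single binary variable one Gibbs
# step samples the model exactly, hence CD-1 = stochastic ML here.
data = rng.random(1000) < 0.7        # toy dataset: about 70% ones
theta = 0.0
for k in range(1, 20001):
    eta = 5.0 / k                    # decreasing gains, as in stochastic approximation
    x_model = (rng.random(data.shape) < sigmoid(theta)) * 1.0
    theta += eta * (data.mean() - x_model.mean())
print(round(sigmoid(theta), 2))  # matches the empirical mean of the data
```

In the general CD setting the negative phase is only an approximate model sample, which is exactly why convergence conditions of the kind this paper derives are needed.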
Online learning and stochastic approximations
In Online Learning in Neural Networks, 1998
Cited by 35 (0 self)
The convergence of online learning algorithms is analyzed using the tools of stochastic approximation theory, and proved under very weak conditions. A general framework for online learning algorithms is first presented. This framework encompasses the most common online learning algorithms in use today, as illustrated by several examples. Stochastic approximation theory then provides general results describing the convergence of all these learning algorithms at once.
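The simplest instance of the framework described here is plain online SGD viewed as a stochastic approximation; a minimal sketch (the toy regression problem and gain sequence are our own choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Online linear least squares: one fresh example per step, SGD with
# Robbins-Monro gains (sum infinite, sum of squares finite).
w_true = np.array([2.0, -1.0])
w = np.zeros(2)
for k in range(1, 50001):
    x = rng.standard_normal(2)
    y = x @ w_true + rng.normal(0.0, 0.1)
    gamma = 1.0 / (100.0 + k)
    w += gamma * (y - x @ w) * x     # stochastic gradient of (y - w.x)^2 / 2
print(np.round(w, 1))  # close to w_true
```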
On The Convergence Of Markovian Stochastic Algorithms With Rapidly Decreasing Ergodicity Rates
Stochastics and Stochastics Models, 1999
Cited by 34 (1 self)
We analyse the convergence of stochastic algorithms with Markovian noise when the ergodicity of the Markov chain governing the noise rapidly decreases as the control parameter tends to infinity. In such a case, there may be a positive probability of divergence of the algorithm in the classic Robbins-Monro form. We provide modifications of the algorithm which ensure convergence. Moreover, we analyse the asymptotic behaviour of these algorithms and state a diffusion approximation theorem.