Convergence of Stochastic Iterative Dynamic Programming Algorithms
Neural Computation, 1994
Cited by 187 (8 self)
Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of the behavior of these methods has been missing. In this paper we relate DP-based learning algorithms to powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD(λ) and Q-learning belong.
A nonparametric approach to pricing and hedging derivative securities via learning networks
Journal of Finance, 1994
Cited by 84 (4 self)
Update rules for parameter estimation in Bayesian networks
1997
Cited by 47 (2 self)
This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [12]. We provide a unified framework for parameter estimation that encompasses both on-line learning, where the model is continuously adapted to new data cases as they arrive, and the more traditional batch learning, where a pre-accumulated set of samples is used in a one-time model selection process. In the batch case, our framework encompasses both the gradient projection algorithm [2, 3] and the EM algorithm [14] for Bayesian networks. The framework also leads to new on-line and batch parameter update schemes, including a parameterized version of EM. We provide both empirical and theoretical results indicating that parameterized EM allows faster convergence to the maximum likelihood parameters than does standard EM.
A New Parameter Estimation Method for Gaussian Mixtures
In Advances in Neural Information Processing Systems, 1998
Cited by 4 (0 self)
We describe a new iterative method for parameter estimation of Gaussian mixtures. The new method is based on a framework developed by Kivinen and Warmuth for supervised online learning. In contrast to gradient descent and EM, which estimate the mixture's covariance matrices, the proposed method estimates the inverses of the covariance matrices. Furthermore, the new parameter estimation procedure can be applied in both on-line and batch settings. We show experimentally that it is typically faster than EM, and usually requires about half as many iterations as EM. We also describe experiments with digit recognition that demonstrate the merits of the on-line version when the source generating the data is non-stationary.
Keywords: Mixture of Gaussians, On-line learning, EM, Convergence rate, Digit recognition
A proposed stochastic simulation framework for the Government of Canada’s debt strategy problem
Bank of Canada Working Paper 2003-10, 2003
Cited by 1 (1 self)
The views expressed in this paper are those of the author. No responsibility for them should be attributed to the Bank of Canada.

