Results 1 - 10 of 1,454,816
Stability of Stochastic Approximation Under Verifiable Conditions
 SIAM J. Control and Optimization
, 2005
"... In this paper we address the problem of the stability and convergence of the stochastic approximation procedure $\theta_{n+1} = \theta_n + \gamma_{n+1}[h(\theta_n) + \xi_{n+1}]$. The stability of such sequences $\{\theta_n\}$ is known to heavily rely on the behaviour of the mean field $h$ at the boundary of the parameter set and the magnitude of ..."
Cited by 78 (10 self)
and convergence w.p. 1 of the algorithm under a set of simple and verifiable assumptions. We illustrate our results with an example related to adaptive Markov chain Monte Carlo algorithms. Key words. Stochastic approximation, state-dependent noise, randomly varying truncation, Adaptive Markov Chain
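For orientation, the recursion quoted in this entry can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's algorithm: the mean field `mean_field`, the step size 1/(n+1), and the i.i.d. Gaussian noise are all assumptions chosen for the example, and the state-dependent noise and randomly varying truncations named in the abstract are not modelled.

```python
import random


def robbins_monro(h, theta0, n_steps, noise_std=1.0, seed=0):
    """Iterate theta_{n+1} = theta_n + gamma_{n+1} * (h(theta_n) + xi_{n+1})."""
    rng = random.Random(seed)
    theta = theta0
    for n in range(n_steps):
        gamma = 1.0 / (n + 1)           # assumed step size gamma_{n+1} = 1/(n+1)
        xi = rng.gauss(0.0, noise_std)  # assumed i.i.d. Gaussian noise xi_{n+1}
        theta = theta + gamma * (h(theta) + xi)
    return theta


def mean_field(theta):
    # Hypothetical mean field with a single root at theta = 2.
    return -(theta - 2.0)


estimate = robbins_monro(mean_field, theta0=10.0, n_steps=10_000)
print(estimate)
```

With these choices the iterate equals the root plus the running average of the noise, so it settles close to 2 as the number of steps grows.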
Insertion sequences
 Microbiol Mol. Biol. Rev
, 1998
Cited by 426 (3 self)
Learning from demonstration
 Advances in Neural Information Processing Systems 9
, 1997
Cited by 392 (32 self)
By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases as well as strategies how to approach a learning problem from instructions and/or demonstrations of other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speedup after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30 second long demonstration of the human instructor.
Tractable inference for complex stochastic processes
 In Proc. UAI
, 1998
"... The monitoring and control of any dynamic system depends crucially on the ability to reason about its current status and its future trajectory. In the case of a stochastic system, these tasks typically involve the use of a belief state—a probability distribution over the state of the process at a gi ..."
Cited by 298 (14 self)
is intractable. We investigate the idea of maintaining a compact approximation to the true belief state, and analyze the conditions under which the errors due to the approximations taken over the lifetime of the process do not accumulate to make our answers completely irrelevant. We show that the error in a
On Positive Harris Recurrence of Multiclass Queueing Networks: A Unified Approach Via Fluid Limit Models
 Annals of Applied Probability
, 1995
"... It is now known that the usual traffic condition (the nominal load being less than one at each station) is not sufficient for stability for a multiclass open queueing network. Although there has been some progress in establishing the stability conditions for a multiclass network, there is no unified ..."
Cited by 352 (28 self)
Understanding Fault-Tolerant Distributed Systems
 COMMUNICATIONS OF THE ACM
, 1993
Cited by 374 (23 self)
We propose a small number of basic concepts that can be used to explain the architecture of fault-tolerant distributed systems and we discuss a list of architectural issues that we find useful to consider when designing or examining such systems. For each issue we present known solutions and design alternatives, we discuss their relative merits and we give examples of systems which adopt one approach or the other. The aim is to introduce some order in the complex discipline of designing and understanding fault-tolerant distributed systems.
Asynchronous stochastic approximation and Q-learning
 Machine Learning
, 1994
"... We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establi ..."
Cited by 202 (4 self)
Stability, queue length and delay of deterministic and stochastic queueing networks
 IEEE Transactions on Automatic Control
, 1994
"... Motivated by recent development in high speed networks, in this paper we study two types of stability problems: (i) conditions for queueing networks that render bounded queue lengths and bounded delay for customers, and (ii) conditions for queueing networks in which the queue length distribution of ..."
Cited by 230 (21 self)
Weak Convergence And Optimal Scaling Of Random Walk Metropolis Algorithms
, 1994
"... This paper considers the problem of scaling the proposal distribution of a multidimensional random walk Metropolis algorithm, in order to maximize the efficiency of the algorithm. The main result is a weak convergence result as the dimension of a sequence of target densities, n, converges to infinit ..."
Cited by 278 (34 self)
to infinity. When the proposal variance is appropriately scaled according to n, the sequence of stochastic processes formed by the first component of each Markov chain, converge to the appropriate limiting Langevin diffusion process. The limiting diffusion approximation admits a straightforward efficiency
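As a rough illustration of the scaling idea in this entry, the sketch below runs a random walk Metropolis chain on a standard normal target and reports the acceptance rate. Everything specific here is an assumption made for the example rather than taken from the paper: the target, the function name `rwm_acceptance_rate`, the proposal variance `ell**2 / dim`, and the default `ell = 2.38`, which is a commonly used choice for Gaussian targets.

```python
import math
import random


def rwm_acceptance_rate(dim, ell=2.38, n_steps=5000, seed=0):
    """Random walk Metropolis on a standard normal target in `dim` dimensions.

    The proposal is x + sigma * z with z standard normal and
    sigma**2 = ell**2 / dim (assumed scaling). Returns the fraction of
    accepted proposals.
    """
    rng = random.Random(seed)
    sigma = ell / math.sqrt(dim)
    # Start from a draw of the target so there is no burn-in transient.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    log_p = -0.5 * sum(v * v for v in x)
    accepted = 0
    for _ in range(n_steps):
        y = [v + sigma * rng.gauss(0.0, 1.0) for v in x]
        log_q = -0.5 * sum(v * v for v in y)
        # Metropolis rule: accept with probability min(1, p(y) / p(x)).
        if math.log(1.0 - rng.random()) < log_q - log_p:
            x, log_p = y, log_q
            accepted += 1
    return accepted / n_steps


rate = rwm_acceptance_rate(dim=10)
print(rate)
```

Calling the function with different values of `dim` while holding `ell` fixed gives a quick check of how stable the acceptance rate is under this scaling.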
A Risk-Factor Model Foundation for Ratings-Based Bank Capital Rules
 Journal of Financial Intermediation
, 2003
Cited by 283 (1 self)
When economic capital is calculated using a portfolio model of credit value-at-risk, the marginal capital requirement for an instrument depends, in general, on the properties of the portfolio in which it is held. By contrast, ratings-based capital rules, including both the current Basel Accord and its proposed revision, assign a capital charge to an instrument based only on its own characteristics. I demonstrate that ratings-based capital rules can be reconciled with the general class of credit VaR models. Contributions to VaR are portfolio-invariant only if (a) there is only a single systematic risk factor driving correlations across obligors, and (b) no exposure in a portfolio accounts for more than an arbitrarily small share of total exposure. Analysis of rates of convergence to asymptotic VaR leads to a simple and accurate portfolio-level add-on charge for undiversified idiosyncratic risk. There is no similarly simple way to address violation of the single factor assumption.