Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems

by Satinder Singh, Dimitri Bertsekas
Citations: 124 - 5 self

Documents Related by Co-Citation

3760 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
224 TD-Gammon, a self-teaching backgammon program, achieves master-level play – G J Tesauro - 1994
1226 Learning to predict by the methods of temporal differences – Richard S. Sutton - 1988
612 Some studies in machine learning using the game of Checkers – Arthur L. Samuel - 1959
278 Improving Elevator Performance Using Reinforcement Learning – Robert Crites, Andrew Barto - 1996
112 A Reinforcement Learning Approach to Job-shop Scheduling – Wei Zhang, Thomas G. Dietterich - 1995
111 Reinforcement Learning with Soft State Aggregation – Satinder P. Singh, Tommi Jaakkola, Michael I. Jordan - 1995
1298 Reinforcement learning: a survey – Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - 1996
1309 Learning from Delayed Rewards – C Watkins - 1989
218 An analysis of temporal-difference learning with function approximation – John N. Tsitsiklis, Benjamin Van Roy - 1997
174 Actor-Critic Algorithms – Vijay R. Konda, John N. Tsitsiklis - 2001
471 Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems – A G Barto, R S Sutton, C W Anderson - 1983
319 Policy Gradient Methods for Reinforcement Learning with Function Approximation – Richard S. Sutton, David McAllester, Satinder Singh, Yishay Mansour - 1999
207 Convergence of Stochastic Iterative Dynamic Programming Algorithms – Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh - 1994
182 Linear least-squares algorithms for temporal difference learning – Steven J. Bradtke, Andrew G. Barto - 1996
15 Learning to play chess using temporal-differences – J Baxter, A Tridgell, L Weaver - 2000
127 Gradient Descent for General Reinforcement Learning – Leemon Baird, Andrew Moore - 1998
7 Algorithms for Sensitivity Analysis of Markov Chains Through Potentials and Perturbation Realization – X-R Cao, Y-W Wan - 1998
23 Reinforcement learning in POMDPs with function approximation – H Kimura, K Miyazaki, S Kobayashi - 1997