|
|
Residual Algorithms: Reinforcement Learning with Function Approximation
– Leemon C. Baird, III
- 1995
|
|
20
|
Incremental Dynamic Programming for On-Line Adaptive Optimal Control
– Steven J. Bradtke
- 1994
|
|
19
|
Reinforcement Learning Through Gradient Descent
– Leemon C. Baird, III, Scott Fahlman, Leslie Kaelbling
- 1999
|
|
43
|
Learning to Solve Markovian Decision Processes
– Satinder P. Singh
- 1994
|
|
19
|
Advantage Updating Applied to a Differential Game
– Mance E. Harmon, Leemon C. Baird, III, A. Harry Klopf
- 1995
|
|
9
|
A Tutorial Survey of Reinforcement Learning
– S. Sathiya Keerthi, B. Ravindran
|
|
2
|
Connectionist Adaptive Control
– Timothy Tristram Jervis
- 1993
|
|
224
|
Generalization in Reinforcement Learning: Safely Approximating the Value Function
– Justin A. Boyan, Andrew W. Moore
- 1995
|
|
23
|
Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models
– Satinder P. Singh
- 1992
|
|
472
|
Learning to Act Using Real-Time Dynamic Programming
– Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh
- 1993
|
|
1134
|
Reinforcement Learning: A Survey
– Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore
- 1996
|
|
|
Solution of Delayed Reinforcement Learning Problems Having Continuous Action Spaces
– B. Ravindran
- 1996
|
|
9
|
Modular On-line Function Approximation for Scaling up Reinforcement Learning
– Chen Khong Tham
- 1994
|
|
6
|
Approximate Discounted Dynamic Programming Is Unreliable
– Matthew A. F. McDonald, Philip Hingston
- 1994
|
|
1
|
Greedy Adaptive Critics for LQR Problems: Convergence Proofs
– Tomas Landelius, Hans Knutsson
|
|
42
|
Problem Solving With Reinforcement Learning
– Gavin Adrian Rummery
- 1995
|
|
40
|
Advantage Updating
– Leemon C. Baird, III
- 1993
|
|
31
|
Modular Neural Networks for Learning Context-Dependent Game Strategies
– Justin A. Boyan
- 1992
|