|
472
|
Learning to act using real-time dynamic programming
– Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh
- 1993
|
|
427
|
Dyna, an Integrated Architecture for Learning, Planning, and Reacting
– Richard S. Sutton
- 1991
|
|
39
|
Reinforcement learning is direct adaptive optimal control
– Richard S. Sutton, Andrew G. Barto, Ronald J. Williams
- 1991
|
|
|
Solution of Delayed Reinforcement Learning Problems Having Continuous Action Spaces
– B. Ravindran
- 1996
|
|
20
|
Incremental Dynamic Programming for On-Line Adaptive Optimal Control
– Steven J. Bradtke
- 1994
|
|
9
|
A Tutorial Survey of Reinforcement Learning
– S Sathiya Keerthi, B Ravindran
|
|
43
|
Learning to Solve Markovian Decision Processes
– Satinder P. Singh
- 1994
|
|
4
|
Reinforcement Learning in Non-Markov Environments
– Steven D. Whitehead, Long-Ji Lin
- 1992
|
|
139
|
Interaction and Intelligent Behavior
– Maja J Mataric
- 1994
|
|
7
|
Learning with Incomplete Selective Perception
– R. Andrew McCallum
- 1993
|
|
88
|
Operational Rationality through Compilation of Anytime Algorithms
– Shlomo Zilberstein
- 1993
|
|
|
Shlomo Zilberstein
– Technion Israel Institute, Shlomo Zilberstein
- 1993
|
|
|
Intelligence
– Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh
- 1993
|
|
|
A Tutorial Survey of Reinforcement Learning
– S Sathiya Keerthi, B Ravindran
|
|
6
|
Advances in reinforcement learning and their implications for intelligent control
– Steven D. Whitehead, Richard S. Sutton, Dana H. Ballard
- 1990
|
|
2
|
Learning Decision Strategies with Genetic Algorithms
– John J. Grefenstette
- 1992
|
|
1
|
A Tutorial on Reinforcement Learning Techniques
– Carlos Henrique Costa Ribeiro
|
|
|
Aprendizado por Reforço
– Carlos Henrique Costa Ribeiro
- 1999
|
|
3
|
The convergence of TD(λ) for general λ
– Peter Dayan
- 1992
|