Temporal differences-based policy iteration and applications in neuro-dynamic programming (1996)
by Dimitri P. Bertsekas, Sergey Ioffe
Citations: 30 (6 self)







