Temporal differences-based policy iteration and applications in neuro-dynamic programming (1996)

by Dimitri P. Bertsekas, Sergey Ioffe
Citations: 30 - 6 self