Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (2006)

by András Antos, Csaba Szepesvári, Rémi Munos
Venue: COLT-19
Citations: 74 (20 self)