Infinite-horizon policy-gradient estimation

by J Baxter, P Bartlett
Venue: JAIR