Batch mode reinforcement learning based on the synthesis of artificial trajectories. (2013)

by Raphael Fonteneau, Susan A Murphy, Louis Wehenkel, Damien Ernst
Venue: Annals of Operations Research