Reinforcement learning with high-dimensional continuous actions (1993)

by Leemon C. Baird, Harry Klopf