Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions (1993)

by Ronald Williams, Leemon C. Baird
Citations: 86 - 1 self