Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions (1993)

by Ronald Williams, Leemon C. Baird
Citations: 83 (1 self)