Modified policy iteration algorithms for discounted Markov decision problems (1978)

by M Puterman, M Shin
Venue: Management Science