Modified policy iteration algorithms for discounted Markov decision problems (1978)

by M L Puterman, M C Shin
Venue: Puterman, 1990