A Reinforcement Learning Method for Maximizing Undiscounted Rewards (1993)

by A. Schwartz
Venue: Proc. of the 10th Intl. Conf. on Machine Learning