A comparison of Q-learning and Classifier Systems (1994)
| Venue: | In Proceedings of From Animals to Animats, Third International Conference on Simulation of Adaptive Behavior |
| Citations: | 32 - 4 self |
BibTeX
@INPROCEEDINGS{Dorigo94acomparison,
author = {Marco Dorigo and Hugues Bersini},
title = {A comparison of Q-learning and Classifier Systems},
booktitle = {Proceedings of From Animals to Animats, Third International Conference on Simulation of Adaptive Behavior},
year = {1994},
pages = {248--255},
publisher = {MIT Press}
}
Abstract
Reinforcement Learning is a class of problems in which an autonomous agent acting in a given environment improves its behavior by progressively maximizing a function calculated just on the basis of a succession of scalar responses received from the environment. Q-learning and classifier systems (CS) are two methods among the most used to solve reinforcement learning problems. Notwithstanding their popularity and their shared goal, they have been in the past often considered as two different models. In this paper we first show that the classifier system, when restricted to a sharp simplification called discounted max very simple classifier system (DMAX-VSCS), boils down to tabular Q-learning. It follows that DMAX-VSCS converges to the optimal policy as proved by Watkins & Dayan (1992), and that it can draw profit from the results of experimental and theoretical works dedicated to improve Q-learning and to facilitate its use in concrete applications. In the second part of the paper...
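
For readers unfamiliar with the tabular Q-learning that the abstract refers to, the following is a minimal sketch of a single Watkins-style update step. It is not taken from the paper: the function name `q_update`, the state and action labels, and the values of the learning rate `alpha` and discount factor `gamma` are all illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning backup and return the new Q-value.

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values from the paper.
    """
    # Value of the best action available in the next state (the "max" term).
    best_next = max(q[(next_state, a)] for a in actions)
    # Discounted target: immediate reward plus discounted best next value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen entries of the table start at 0.0.
q = defaultdict(float)
actions = ["left", "right"]
q_update(q, "s0", "right", 1.0, "s1", actions)
```

With an all-zero table, the single call above moves the entry for `("s0", "right")` one tenth of the way toward the reward of 1.0.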







