Results 1 - 4 of 4
Making a Robot Learn to Play Soccer Using Reward and Punishment
Abstract

Cited by 4 (1 self)
Abstract In this paper, we show how reinforcement learning can be applied to real robots to achieve optimal robot behavior. As an example, we enable an autonomous soccer robot to learn to intercept a rolling ball. The main focus is on how to adapt the Q-learning algorithm to the needs of learning strategies for real robots and how to transfer strategies learned in simulation onto real robots.
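The Q-learning update the abstract builds on can be sketched in a minimal tabular form. This is a generic illustration, not the authors' robot-specific adaptation; the state and action names are placeholders:

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q is a dict mapping (state, action) -> value; missing entries default to 0.
    Applies Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    """
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    td_error = r + gamma * best_next - Q.get((s, a), 0.0)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * td_error
    return Q[(s, a)]
```

In a sim-to-real setting like the one described, Q would typically be trained in simulation and then used (or fine-tuned) on the physical robot.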
TD(0) converges provably faster than the residual gradient algorithm
Proceedings of the Twentieth International Conference on Machine Learning (ICML-03), 2003
Abstract

Cited by 3 (0 self)
In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges more slowly than the TD(0) algorithm. In this paper, we use the concept of asymptotic convergence rate to prove that under certain conditions the synchronous off-policy TD(0) algorithm converges faster than the synchronous off-policy residual gradient algorithm if the value function is represented in tabular form. This is the first theoretical result comparing the convergence behaviour of two RL algorithms. We also show that as soon as linear function approximation is involved, no general statement concerning the superiority of one of the algorithms can be made.
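The two updates being compared can be sketched in tabular form. This is a generic illustration of synchronous TD(0) and the residual-gradient update (in the sense of Baird, 1995), not a reproduction of the paper's analysis; the transition format is assumed:

```python
import numpy as np

def td0_sync(V, transitions, alpha, gamma):
    # Synchronous TD(0): all TD errors are computed from the old V,
    # then applied in one batch. transitions is a list of (s, r, s2).
    delta = np.zeros_like(V)
    for s, r, s2 in transitions:
        delta[s] += alpha * (r + gamma * V[s2] - V[s])
    return V + delta

def residual_gradient_sync(V, transitions, alpha, gamma):
    # Synchronous residual gradient: descends the squared Bellman
    # residual, so the successor value V[s2] is also adjusted,
    # scaled by -gamma. This extra term is what slows convergence
    # in the tabular case.
    delta = np.zeros_like(V)
    for s, r, s2 in transitions:
        e = r + gamma * V[s2] - V[s]
        delta[s] += alpha * e
        delta[s2] -= alpha * gamma * e
    return V + delta
```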
Convergence of Synchronous Reinforcement Learning with Linear Function Approximation
, 2004
Abstract

Cited by 1 (0 self)
Synchronous reinforcement learning (RL) algorithms with linear function approximation are representable as inhomogeneous matrix iterations of a special form (Schoknecht & Merke, 2003). In this paper we state conditions of convergence for general inhomogeneous matrix iterations and prove that they are both necessary and sufficient. This result extends the work presented in (Schoknecht & Merke, 2003), where only a sufficient condition of convergence was proved. As the condition of convergence is necessary and sufficient, the new result is suitable to prove convergence and divergence of RL algorithms with function approximation. We use the theorem to deduce a new concise proof of convergence for the synchronous residual gradient algorithm (Baird, 1995). Moreover, we derive a counterexample for which the uniform RL algorithm (Merke & Schoknecht, 2002) diverges. This yields a negative answer to the open question of whether the uniform RL algorithm converges for arbitrary multiple transitions.
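An inhomogeneous matrix iteration has the form x_{k+1} = A x_k + b. A minimal sketch, using the standard sufficient condition (spectral radius of A below 1) rather than the paper's sharper necessary-and-sufficient characterization:

```python
import numpy as np

def iterate_affine(A, b, x0, n_steps):
    # Run the inhomogeneous matrix iteration x_{k+1} = A x_k + b.
    x = x0.copy()
    for _ in range(n_steps):
        x = A @ x + b
    return x

def spectral_radius_below_one(A):
    # Classical sufficient condition for convergence from any start:
    # all eigenvalues of A lie strictly inside the unit circle.
    return np.max(np.abs(np.linalg.eigvals(A))) < 1.0
```

When the condition holds, the iteration converges to the fixed point x* = (I - A)^{-1} b.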
A Formal Framework for Reinforcement Learning with Function Approximation in Learning Classifier Systems
, 2006
Abstract

Cited by 1 (1 self)
To fully understand the properties of Accuracy-based Learning Classifier Systems, we need a formal framework that captures all components of classifier systems, that is, function approximation, reinforcement learning, and classifier replacement, and permits the modelling of them separately and in their interaction. In this paper we extend our previous work on function approximation [22] to reinforcement learning and to the interaction between reinforcement learning and function approximation. After giving an overview and derivations for common reinforcement learning methods from first principles, we show how they apply to Learning Classifier Systems. At the same time, we present a new algorithm that is expected to outperform all current methods, discuss the use of XCS with gradient descent and TD(λ), and give an in-depth discussion on how to study the convergence of Learning Classifier Systems with a time-invariant population.
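The TD(λ) method mentioned in the abstract can be sketched for generic linear function approximation, with accumulating eligibility traces. This is a textbook-style illustration, not the paper's XCS-specific formulation; the feature vectors phi_s, phi_s2 are assumed inputs:

```python
import numpy as np

def td_lambda_step(w, z, phi_s, phi_s2, r, alpha, gamma, lam):
    """One TD(lambda) step with linear value approximation V(s) = w . phi(s).

    w: weight vector, z: accumulating eligibility trace (same shape as w).
    Returns the updated (w, z).
    """
    delta = r + gamma * (w @ phi_s2) - (w @ phi_s)  # TD error
    z = gamma * lam * z + phi_s                     # decay trace, add features
    w = w + alpha * delta * z                       # credit recent features
    return w, z
```

In a classifier system, each rule's prediction would play the role of one feature, so the trace spreads the TD error over recently matching classifiers.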