Results 1 - 10 of 2,342
Emphatic Temporal-Difference Learning
"... Emphatic algorithms are temporal-difference learning algorithms that change their effective state distribution by selectively emphasizing and de-emphasizing their updates on different time steps. Recent works by Sutton, Mahmood and White (2015), and Yu (2015) show that by varying the emphasis in a ..."
Dual Temporal Difference Learning
"... Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning al ..."
An analysis of temporal-difference learning with function approximation
- IEEE Transactions on Automatic Control, 1997
"... We discuss the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze updates parameters of a linear function approximator on-line, during a single endless trajectory of an irreducible aperiodi ..."
Cited by 313 (8 self)
Average cost temporal-difference learning
- 1999
"... We propose a variant of temporal-difference learning that approximates average and differential costs of an irreducible aperiodic Markov chain. Approximations are comprised of linear combinations of fixed basis functions whose weights are incrementally updated during a single endless trajectory of t ..."
Cited by 27 (4 self)
Temporal Difference Learning
- In Proc. IEA/AIE Conf, 1998
"... Reinforcement learning, in general, has not been totally successful at solving complex real-world problems which can be described by nonlinear functions. However, temporal difference learning is a type of reinforcement learning algorithm that has been researched and applied to various prediction prob ..."
Cited by 1 (0 self)
problems with promising results. This paper discusses the application of temporal-difference learning in the training of a neural network to play a scaled-down version of the board game Chinese Chess. Preliminary results show that this technique is favorable for producing desired results. In test cases
Relational temporal difference learning
- In ICML, 2006
"... We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal difference reinforcement to learn a distributed value function represented over a conceptual hierarchy of relational pred ..."
Cited by 8 (2 self)
On the Convergence of Temporal-Difference Learning with Linear Function Approximation
"... Abstract. The asymptotic properties of temporal-difference learning algorithms with linear function approximation are analyzed in this paper. The analysis is carried out in the context of the approximation of a discounted cost-to-go function associated with an uncontrolled Markov chain with an unco ..."
Practical Issues in Temporal Difference Learning
- Machine Learning, 1992
"... This paper examines whether temporal difference methods for training connectionist networks, such as Sutton's TD(lambda) algorithm, can be successfully applied to complex real-world problems. A number of important practical issues are identified and discussed from a general theoretical perspect ..."
Cited by 415 (2 self)
Coevolutionary Temporal Difference Learning for Othello
"... Abstract — This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm ..."
Cited by 14 (8 self)