Results 1 - 10 of 2,342

Emphatic Temporal-Difference Learning

by A. Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton
"... Emphatic algorithms are temporal-difference learning algorithms that change their effective state distribution by selectively emphasizing and de-emphasizing their updates on different time steps. Recent works by Sutton, Mahmood and White (2015), and Yu (2015) show that by varying the emphasis in a ..."

Dual Temporal Difference Learning

by Min Yang, Yuxi Li, Dale Schuurmans
"... Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning al ..."

Temporal-difference learning

by unknown authors
"... Function approximation (e.g., linear) ..."

An analysis of temporal-difference learning with function approximation

by John N. Tsitsiklis, Benjamin Van Roy - IEEE Transactions on Automatic Control, 1997
"... We discuss the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze updates parameters of a linear function approximator on-line, during a single endless trajectory of an irreducible aperiodi ..."
Cited by 313 (8 self)

Average cost temporal-difference learning

by John N. Tsitsiklis, Benjamin Van Roy, 1999
"... We propose a variant of temporal-difference learning that approximates average and differential costs of an irreducible aperiodic Markov chain. Approximations are comprised of linear combinations of fixed basis functions whose weights are incrementally updated during a single endless trajectory of t ..."
Cited by 27 (4 self)

Temporal Difference Learning in Chinese Chess

by Thong B. Trinh, Anwer S. Bashi, Nikhil Deshp - In Proc. IEA/AIE Conf, 1998
"... Reinforcement learning, in general, has not been totally successful at solving complex real-world problems which can be described by nonlinear functions. However, temporal difference learning is a type of reinforcement learning algorithm that has been researched and applied to various prediction problems with promising results. This paper discusses the application of temporal-difference learning in the training of a neural network to play a scaled-down version of the board game Chinese Chess. Preliminary results show that this technique is favorable for producing desired results. In test cases ..."
Cited by 1 (0 self)

Relational temporal difference learning

by Nima Asgharbeygi, David Stracuzzi, Pat Langley - In ICML, 2006
"... We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal difference reinforcement to learn a distributed value function represented over a conceptual hierarchy of relational pred ..."
Cited by 8 (2 self)

On the Convergence of Temporal-Difference Learning with Linear Function Approximation

by unknown authors
"... The asymptotic properties of temporal-difference learning algorithms with linear function approximation are analyzed in this paper. The analysis is carried out in the context of the approximation of a discounted cost-to-go function associated with an uncontrolled Markov chain with an unco ..."

Practical Issues in Temporal Difference Learning

by Gerald Tesauro - Machine Learning, 1992
"... This paper examines whether temporal difference methods for training connectionist networks, such as Sutton's TD(lambda) algorithm, can be successfully applied to complex real-world problems. A number of important practical issues are identified and discussed from a general theoretical perspect ..."
Cited by 415 (2 self)

Coevolutionary Temporal Difference Learning for Othello

by Marcin Szubert, Krzysztof Krawiec
"... This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm ..."
Cited by 14 (8 self)
