Reinforcement Learning in Board Games. (2004)

by Imran Ghory May

Abstract:

This project investigates the application of the TD(λ) reinforcement learning algorithm and neural networks to the problem of producing an agent that can play board games. It surveys the progress made in this area over the last decade and extends it by suggesting new possibilities for improvement, based on theoretical and past empirical evidence. This includes the identification and formalization (for the first time) of key game properties that are important for TD-learning, and a discussion of different methods of generating training data. Also included is the development of a TD-learning game system (including a game-independent benchmarking engine) that is capable of learning any zero-sum two-player board game. The primary purpose of this system is to allow potential improvements to be tested and compared in a standardized fashion. Experiments have been conducted with this system using the games Tic-Tac-Toe and Connect 4 to examine a number of different potential improvements.
