Co-Evolution in the Successful Learning of Backgammon Strategy
- Machine Learning
, 1998
Cited by 119 (24 self)
Following Tesauro's work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hill-climbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a "meta-game" of self-learning. Keywords: coevolution, backgammon, reinforcement, temporal difference learning, self-learning
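The champion/challenger loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the mutation size, the number of games per comparison, or the network details, so `sigma`, `generations`, and the `play_match` callback are assumptions, and `toy_match` is a hypothetical stand-in for an actual backgammon match.

```python
import random


def hill_climb(play_match, n_weights=4000, generations=100, sigma=0.05, seed=0):
    """Relative-fitness hill-climbing with a champion and a mutated challenger.

    play_match(champion, challenger, rng) must return True if the challenger
    wins. sigma (mutation noise) and generations are assumed values.
    """
    rng = random.Random(seed)
    champion = [0.0] * n_weights  # initial champion: all-zero weights
    for _ in range(generations):
        # Slightly mutated copy of the current champion.
        challenger = [w + rng.gauss(0.0, sigma) for w in champion]
        # Change weights only if the challenger wins the comparison.
        if play_match(champion, challenger, rng):
            champion = challenger
    return champion


def toy_match(champion, challenger, rng):
    """Hypothetical stand-in for a backgammon match (NOT backgammon).

    Each player gets a noisy score for closeness to a hidden target of
    all ones; the higher score wins.
    """
    def score(weights):
        return -sum((w - 1.0) ** 2 for w in weights) + rng.gauss(0.0, 0.01)

    return score(challenger) > score(champion)


best = hill_climb(toy_match, n_weights=8, generations=500, sigma=0.1, seed=1)
```

The only fitness signal here is the outcome of champion versus challenger, which is what "relative fitness" refers to in the abstract; no gradient or temporal-difference update is involved.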
Coevolution of A Backgammon Player
- Proceedings Artificial Life V
Cited by 88 (11 self)
One of the persistent themes in Artificial Life research is the use of co-evolutionary arms races in the development of specific and complex behaviors. However, other than Sims’s work on artificial robots, most of the work has attacked very simple games such as the prisoner’s dilemma or predator and prey. Following Tesauro’s work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement
Co-evolving Intertwined Spirals
- in Proceedings of the Fifth Annual Conference on Evolutionary Programming
, 1996
Cited by 49 (15 self)
We recently solved the two spirals problem, a difficult neural network benchmark classification problem, using the genetic programming primitives set up by [Koza, 1992]. Instead of using absolute fitness, we use a relative fitness based on a competition for coverage of the data set. This is a form of co-evolutionary search because the fitness function changes with the population. Because niches are opened by proportionate reproduction, rather than crowded out, and because of the crossover operator, we find solutions which have a nice modular structure. Our experiments used our Massively Parallel Genetic Programming (MPGP) system running on a SIMD machine of 4096 processors, the Maspar MP-2.
Parallel Distributed Genetic Programming
- SCHOOL OF COMPUTER SCIENCE, UNIVERSITY OF BIRMINGHAM
, 1999
Cited by 36 (9 self)
This chapter describes Parallel Distributed Genetic Programming (PDGP), a form of Genetic Programming (GP) which is suitable for the development of programs with a high degree of parallelism and an efficient and effective reuse of partial results. Programs are represented in PDGP as graphs with nodes representing functions and terminals, and links representing the flow of control and results. In the simplest form of PDGP links are directed and unlabelled, in which case PDGP can be considered a generalisation of standard GP. However, more complex representations can be used, which allow the exploration of a large space of possible programs including standard tree-like programs, logic networks, neural networks, recurrent transition networks, finite state automata, etc.
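To make the idea of a graph-shaped program with reuse of partial results concrete, here is a small sketch. It is only an illustration under stated assumptions: the dictionary encoding, the `evaluate` helper, and the node names are hypothetical and are not the PDGP representation itself. It shows a directed, unlabelled graph of function and terminal nodes in which a shared node is computed once and reused.

```python
import operator


def evaluate(graph, output, env):
    """Evaluate a program stored as a directed acyclic graph.

    graph maps a node id to ("terminal", variable_name) or
    ("function", callable, [input node ids]). Each node's value is cached,
    so a node feeding several others is computed only once.
    """
    cache = {}

    def value(node_id):
        if node_id not in cache:
            node = graph[node_id]
            if node[0] == "terminal":
                cache[node_id] = env[node[1]]
            else:
                _, fn, children = node
                cache[node_id] = fn(*(value(c) for c in children))
        return cache[node_id]

    return value(output)


# Hypothetical example program: (x + y) * (x + y) - x.
# The node "s" (x + y) is shared by both inputs of "p".
graph = {
    "x": ("terminal", "x"),
    "y": ("terminal", "y"),
    "s": ("function", operator.add, ["x", "y"]),
    "p": ("function", operator.mul, ["s", "s"]),
    "out": ("function", operator.sub, ["p", "x"]),
}

result = evaluate(graph, "out", {"x": 2, "y": 3})
```

A standard GP tree would have to duplicate the `x + y` subtree; in the graph form a single node serves both uses, which is one sense in which a tree is a special case of the graph representation.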
Dynamics of co-evolutionary learning
- In Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior
, 1996
Cited by 31 (8 self)
Co-evolutionary learning, which involves the embedding of adaptive learning agents in a fitness environment which dynamically responds to their progress, is a potential solution for many technological chicken and egg problems, and is at the heart of several recent and surprising successes, such as Sims's artificial robot and Tesauro's backgammon player. We recently solved the two spirals problem, a difficult neural network benchmark classification problem, using the genetic programming primitives set up by [Koza, 1992]. Instead of using absolute fitness, we use a relative fitness [Angeline & Pollack, 1993] based on a competition for coverage of the data set. As the population reproduces, the fitness function driving the selection changes, and subproblem niches are opened, rather than crowded out. The solutions found by our method have a symbiotic structure which suggests that by holding niches open, crossover is better able to discover modular building blocks.
A genome compiler for high performance genetic programming
- Genetic Programming 1998
- University of Wisconsin
, 1998
Cited by 20 (1 self)
of computational resources is used by the evaluate step, which evaluates candidate solutions with respect to an objective function. Thus, one of the challenges in implementing a high-performance GP system is speeding up the evaluation step as much as possible. We were made acutely aware of the need for an efficient individual evaluation process when we attempted to apply GP to image compression (see Section 4.2). Initially, we implemented the application using lil-gp [10], a standard GP system used by numerous researchers, and found that it was prohibitively slow to study genetic programming-based image compression: each run took about 2 days on a 296MHz Sun UltraSparc 2. We therefore sought to significantly improve the speed of execution of the GP system. In standard GP, s-expressions are recursively evaluated,
Genetic Programming is very computationally expensive. For most applications, the vast majority of time is spent evaluating candidate solutions, so it is desirable to make individual evaluation as efficient as possible. We describe a genome compiler which compiles s-expressions to machine code, resulting in significant speedup of individual evaluations over standard GP systems. Based on performance results with symbolic regression, we show that the execution of the genome compiler system is comparable to the fastest alternative GP systems. We also demonstrate the utility of compilation on a real-world problem, lossless image compression. A somewhat surprising result is that in our test domains, the overhead of compilation is negligible.
Some Steps Towards a Form of Parallel Distributed Genetic Programming
, 1996
Cited by 20 (15 self)
This paper describes PDGP (Parallel Distributed Genetic Programming), a new form of genetic programming which is suitable for the development of fine-grained parallel programs. PDGP is based on a graph-like representation for parallel programs which is manipulated by crossover and mutation operators which guarantee the syntactic correctness of the offspring. The paper describes these operators and reports some preliminary results obtained with this paradigm.
Why did TD-Gammon Work?
- Advances in Neural Information Processing Systems 9
Cited by 18 (4 self)
Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or even other games. We were able to replicate some of the success of TD-Gammon, developing a competitive evaluation function on a 4000 parameter feed-forward neural network, without using back-propagation, reinforcement or temporal difference learning methods. Instead we apply simple hill-climbing in a relative fitness environment. These results and further analysis suggest that the surprising success of Tesauro’s program had more to do with the co-evolutionary structure of the learning task and the dynamics of the backgammon game itself.