Results 1 - 10 of 59
Cooperative Coevolution: An Architecture for Evolving Coadapted Subcomponents
Evolutionary Computation, 2000
Cited by 153 (4 self)
To successfully apply evolutionary algorithms to the solution of increasingly complex problems, we must develop effective techniques for evolving solutions in the form of interacting coadapted subcomponents. One of the major difficulties is finding computational extensions to our current evolutionary paradigms that will enable such subcomponents to “emerge” rather than being hand designed. In this paper, we describe an architecture for evolving such subcomponents as a collection of cooperating species. Given a simple string-matching task, we show that evolutionary pressure to increase the overall fitness of the ecosystem can provide the needed stimulus for the emergence of an appropriate number of interdependent subcomponents that cover multiple niches, evolve to an appropriate level of generality, and adapt as the number and roles of their fellow subcomponents change over time. We then explore these issues within the context of a more complicated domain through a case study involving the evolution of artificial neural networks.
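The idea of cooperating species can be illustrated with a small sketch. This is not the architecture from the paper: it assumes a fixed number of species, mutation-only reproduction, and a hypothetical fitness function (`ecosystem_fitness`) that scores a team by how well its best member matches each target string. All names and parameters here are illustrative.

```python
import random

def match(a, b):
    """Number of positions where two equal-length bit strings agree."""
    return sum(x == y for x, y in zip(a, b))

def ecosystem_fitness(members, targets):
    """Mean, over all targets, of the best match achieved by any team member."""
    return sum(max(match(m, t) for m in members) for t in targets) / len(targets)

def coevolve(targets, n_species=2, pop_size=20, generations=60, seed=0):
    rng = random.Random(seed)
    length = len(targets[0])
    # One separate population of random bit strings per species.
    pops = [[[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
            for _ in range(n_species)]
    reps = [pop[0] for pop in pops]  # current representative of each species
    history = [ecosystem_fitness(reps, targets)]
    for _ in range(generations):
        for s in range(n_species):
            def score(ind):
                # Credit an individual by the fitness of the team it forms
                # with the current representatives of the other species.
                team = reps[:s] + [ind] + reps[s + 1:]
                return ecosystem_fitness(team, targets)
            # Each parent produces one child by independent bit-flip mutation.
            children = [[bit ^ (rng.random() < 1.0 / length) for bit in parent]
                        for parent in pops[s]]
            # Parents (including the current representative) compete with
            # children, so the team fitness can never decrease.
            pool = pops[s] + children
            pool.sort(key=score, reverse=True)
            pops[s] = pool[:pop_size]
            reps[s] = pops[s][0]
        history.append(ecosystem_fitness(reps, targets))
    return reps, history

if __name__ == "__main__":
    reps, history = coevolve([[1] * 8, [0] * 8])
    print(reps, history[0], history[-1])
```

With two very different targets, a single string cannot match both well, so the pressure on team fitness tends to push the two species toward different niches.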
Spacetime Constraints Revisited
Cited by 89 (8 self)
The Spacetime Constraints (SC) paradigm, whereby the animator specifies what an animated figure should do but not how to do it, is a very appealing approach to animation. However, the algorithms available for realizing the SC approach are limited. Current techniques are local in nature: they all use some kind of perturbational analysis to refine an initial trajectory. We propose a global search algorithm that is capable of generating multiple novel trajectories for SC problems from scratch. The key elements of our search strategy are a method for encoding trajectories as behaviors, and a genetic search algorithm for choosing behavior parameters that is currently implemented on a massively parallel computer. We describe the algorithm and show computed solutions to SC problems for 2D articulated figures.
CR Categories: I.2.6 [Artificial Intelligence]: Learning---parameter learning. I.2.6 [Artificial Intelligence]: Problem Solving, Control Methods and Search---heuristic methods. I.3.7 [...
Robot Shaping: Experiment In Behavior Engineering
1997
Cited by 69 (4 self)
its performance. In fact, we use the expression robot shaping to denote the use of learning as a means to translate suggestions coming from an external trainer into an effective control strategy that allows a robot to achieve a goal. We borrowed the term shaping from experimental psychology (Skinner, 1938), because training an artificial robot somewhat resembles what experimental psychologists do in their laboratories, when they train an experimental subject to produce a predefined response. The important point, which differentiates our approach from most current research on learning autonomous agents, is that the trainer plays a fundamental role in the learning process: most of the book is aimed at showing how to use a trainer to develop control systems for simulated and real robots. We also use the term behavior engineering to characterize a new technological discipline, the objective of which is to provide techniques, methodologies and t
Beyond pleasure and pain
American Psychologist, 1997
Cited by 64 (4 self)
People approach pleasure and avoid pain. To discover the true nature of approach-avoidance motivation, psychologists need to move beyond this hedonic principle to the principles that underlie the different ways that it operates. One such principle is regulatory focus, which distinguishes self-regulation with a promotion focus (accomplishments and aspirations) from self-regulation with a prevention focus (safety and responsibilities). This principle is used to reconsider the fundamental nature of approach-avoidance, expectancy-value relations, and emotional and evaluative sensitivities. Both types of regulatory focus are applied to phenomena that have been treated in terms of either promotion (e.g., well-being) or prevention (e.g., cognitive dissonance). Then, regulatory focus is distinguished from regulatory anticipation and regulatory reference, 2 other principles underlying the different ways that people approach pleasure and avoid pain.
It seems that our entire psychical activity is bent upon procuring pleasure and avoiding pain, that it is automatically regulated by the PLEASURE-PRINCIPLE. (Freud, 1920/1952, p. 365)
People are motivated to approach pleasure and avoid pain. From the ancient Greeks, through 17th- and 18th-century British philosophers, to 20th-century psychologists, this hedonic or pleasure principle has dominated scholars' understanding of people's motivation. It is the basic motivational assumption of theories across all areas of psychology, including theories of emotion in psychobiology (e.g., Gray, 1982), conditioning in animal learning
Reinforcement Learning And Its Application To Control
1992
Cited by 49 (2 self)
Learning control involves modifying a controller's behavior to improve its performance as measured by some predefined index of performance (IP). If control actions that improve performance as measured by the IP are known, supervised learning methods, or methods for learning from examples, can be used to train the controller. But when such control actions are not known a priori, appropriate control behavior has to be inferred from observations of the IP. One can distinguish between two classes of methods for training controllers under such circumstances. Indirect methods involve constructing a model of the problem's IP and using the model to obtain training information for the controller. On the other hand, direct, or model-free,...
Robot Shaping: Developing Situated Agents through Learning
1993
Cited by 48 (1 self)
Learning plays a vital role in the development of situated agents. In this paper, we explore the use of reinforcement learning to "shape" a robot to perform a predefined target behavior. We connect both simulated and real robots to ALECSYS, a parallel implementation of a learning classifier system with an extended genetic algorithm. After classifying different kinds of Animat-like behaviors, we explore the effects on learning of different types of agent's architecture (monolithic, flat and hierarchical) and of training strategies. In particular, hierarchical architecture requires the agent to learn how to coordinate basic learned responses. We show that the best results are achieved when both the agent's architecture and the training strategy match the structure of the behavior pattern to be learned. We report the results of a number of experiments carried out both in simulated and in real environments, and show that the results of simulations carry smoothly to real robots. While most o...
Learning to Solve Markovian Decision Processes
1994
Cited by 43 (3 self)
This dissertation is about building learning control architectures for agents embedded in finite, stationary, and Markovian environments. Such architectures give embedded agents the ability to improve autonomously the efficiency with which they can achieve goals. Machine learning researchers have developed reinforcement learning (RL) algorithms based on dynamic programming (DP) that use the agent's experience in its environment to improve its decision policy incrementally. This is achieved by adapting an evaluation function in such a way that the decision policy that is "greedy" with respect to it improves with experience. This dissertation focuses on finite, stationary and Markovian environments for two reasons: it allows the develop...
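The relation between an evaluation function and the policy that is "greedy" with respect to it can be made concrete with a small dynamic-programming sketch. It is an illustration only, not one of the dissertation's algorithms: it assumes a hypothetical deterministic chain environment with a known model (actions 0 = left and 1 = right, reward 1 for entering the last state) and uses plain value iteration rather than learning from experience.

```python
def value_iteration(n_states=4, gamma=0.9, sweeps=50):
    goal = n_states - 1  # terminal state; its value stays 0

    def step(s, a):
        """Deterministic model: action 0 moves left, action 1 moves right."""
        nxt = max(0, s - 1) if a == 0 else s + 1
        reward = 1.0 if nxt == goal else 0.0
        return nxt, reward

    def backup(s, a, V):
        """One-step lookahead value of taking action a in state s."""
        nxt, reward = step(s, a)
        return reward + gamma * V[nxt]

    V = [0.0] * n_states
    for _ in range(sweeps):
        for s in range(goal):
            # Adapt the evaluation function toward the best one-step backup.
            V[s] = max(backup(s, a, V) for a in (0, 1))

    # The greedy policy picks, in each state, the action with the best backup.
    policy = [max((0, 1), key=lambda a: backup(s, a, V)) for s in range(goal)]
    return V, policy

if __name__ == "__main__":
    V, policy = value_iteration()
    print(V, policy)
```

As the values approach their fixed point (about 0.81, 0.9 and 1.0 for the three non-terminal states), the greedy policy settles on moving right everywhere.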
Learning to Drive a Bicycle using Reinforcement Learning and Shaping
1998
Cited by 42 (3 self)
We present and solve a real-world problem of learning to drive a bicycle. We solve the problem by online reinforcement learning using the Sarsa(λ)-algorithm. Then we solve the composite problem of learning to balance a bicycle and then drive to a goal. In our approach the reinforcement function is independent of the task the agent tries to learn to solve.
1 Introduction
Here we consider the problem of learning to balance on a bicycle. Having done this we want to drive the bicycle to a goal. The second problem is not as straightforward as it may seem. The learning agent has to solve two problems at the same time: Balancing on the bicycle and driving to a specific place. Recently, ideas from behavioural psychology have been adapted by reinforcement learning to solve this type of problem. We will return to this in section 3. In reinforcement learning an agent interacts with an environment or a system. At each time step the agent receives information on the state of the system and chooses ...
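For readers unfamiliar with Sarsa(λ), a minimal tabular sketch follows. It does not reproduce the bicycle experiment: the environment is a hypothetical five-state chain (actions 0 = left and 1 = right, reward 1 for reaching the last state), and the choice of accumulating eligibility traces, the epsilon-greedy policy, and all parameter values are assumptions made for illustration.

```python
import random

def sarsa_lambda(n_states=5, episodes=300, alpha=0.1, gamma=0.9, lam=0.8,
                 epsilon=0.1, seed=0):
    rng = random.Random(seed)
    goal = n_states - 1
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]

    def choose(s):
        """Epsilon-greedy action selection with random tie-breaking."""
        if rng.random() < epsilon:
            return rng.randrange(2)
        best = max(Q[s])
        return rng.choice([a for a in (0, 1) if Q[s][a] == best])

    for _ in range(episodes):
        E = [[0.0, 0.0] for _ in range(n_states)]  # traces reset per episode
        s, a = 0, choose(0)
        while s != goal:
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            if s2 == goal:
                a2, target = None, r  # no bootstrapping from a terminal state
            else:
                a2 = choose(s2)       # on-policy: the action actually taken next
                target = r + gamma * Q[s2][a2]
            delta = target - Q[s][a]
            E[s][a] += 1.0            # accumulating trace for the visited pair
            for i in range(n_states):
                for j in (0, 1):
                    # Every recently visited pair shares in the TD error.
                    Q[i][j] += alpha * delta * E[i][j]
                    E[i][j] *= gamma * lam
            s, a = s2, a2
    return Q

if __name__ == "__main__":
    for state, values in enumerate(sarsa_lambda()):
        print(state, values)
```

The traces let a single reward at the goal update the whole path that led to it, which is why delayed-reward tasks are a natural fit for this family of methods.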
Autonomous shaping: knowledge transfer in reinforcement learning
In Proceedings of the 23rd International Conference on Machine Learning, 2006
Cited by 36 (4 self)
We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks that are related but distinct. Such agents can be trained on a sequence of relatively easy tasks in order to develop a more informative measure of reward that can be transferred to improve performance on more difficult tasks without requiring a hand-coded shaping function. We use a rod positioning task to show that this significantly improves performance even after a very brief training period.

