Results 1 - 10 of 24
Creating Advice-Taking Reinforcement Learners
- Machine Learning, 1996
Cited by 84 (10 self)
Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present and evaluate a design that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer. In our approach, the advice-giver watches the learner and occasionally makes suggestions, expressed as instructions in a simple imperative programming language. Based on techniques from knowledge-based neural networks, we insert these programs directly into the agent's utility function. Subsequent reinforcement learning further integrates and refines the advice. We present empirical evidence that investigates several aspects of our approach and show that, given good advice, a learner can achieve statistically significant gains in expected reward. A second experiment shows that advice improves the expected reward regardless of the...
From Implicit Skills to Explicit Knowledge: A Bottom-Up Model of Skill Learning
- 1999
Cited by 84 (31 self)
This paper presents a skill learning model CLARION. Different from existing models of mostly high-level skill learning that use a top-down approach (that is, turning declarative knowledge into procedural knowledge through practice), we adopt a bottom-up approach toward low-level skill learning, where procedural knowledge develops first and declarative knowledge develops later. Our model is formed by integrating connectionist, reinforcement, and symbolic learning methods to perform on-line reactive learning. It adopts a two-level dual-representation framework (Sun, 1995), with a combination of localist and distributed representation. We compare the model with human data in a minefield navigation task, demonstrating some match between the model and human data in several respects.
Strategic Advice for Hierarchical Planners
Cited by 42 (10 self)
AI planning systems have traditionally operated as stand-alone black boxes, taking a description of a domain and a set of goals, and automatically synthesizing a plan for achieving those goals. Such designs severely restrict the influence that users can have on the resultant plans. This paper describes an Advisable Planner framework that marries an advice-taking interface to AI planning technology. The framework is designed to enable users to interact with planning systems at high levels of abstraction in order to influence the plan generation process in terms that are meaningful to them. Advice consists of task-specific constraints on both the desired solution and the refinement decisions that underlie the planning process. The paper emphasizes strategic advice, which expresses recommendations on how goals and actions are to be accomplished. The main contributions are a formal language and semantics for strategic advice, and a sound and complete HTN-style algorithm for generating plans ...
Incorporating Advice into Agents that Learn from Reinforcements
- In Proceedings of the Twelfth National Conference on Artificial Intelligence, 1994
Cited by 39 (5 self)
Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present an approach that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer. In our approach, the advice-giver watches the learner and occasionally makes suggestions, expressed as instructions in a simple programming language. Based on techniques from knowledge-based neural networks, these programs are inserted directly into the agent's utility function. Subsequent reinforcement learning further integrates and refines the advice. We present empirical evidence that shows our approach leads to statistically significant gains in expected reward. Importantly, the advice improves the expected reward regardless of the stage of training at which it is given. Introduction A successful and increasingly popular method for cr...
Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression
- In Proceedings of the 20th National Conference on Artificial Intelligence, 2005
Cited by 37 (13 self)
We present a novel formulation for providing advice to a reinforcement learner that employs support-vector regression as its function approximator. Our new method extends a recent advice-giving technique, called Knowledge-Based Kernel Regression (KBKR), that accepts advice concerning a single action of a reinforcement learner. In KBKR, users can say that in some set of states, an action’s value should be greater than some linear expression of the current state. In our new technique, which we call Preference KBKR (Pref-KBKR), the user can provide advice in a more natural manner by recommending that some action is preferred over another in the specified set of states. Specifying preferences essentially means that users are giving advice about policies rather than Q values, which is a more natural way for humans to present advice. We present the motivation for preference advice and a proof of the correctness of our extension to KBKR. In addition, we show empirical results that our method can make effective use of advice on a novel reinforcement-learning task, based on the RoboCup simulator, which we call Breakaway. Our work demonstrates the significant potential of advice-giving techniques for addressing complex reinforcement learning problems, while further demonstrating the use of support-vector regression for reinforcement learning.
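The idea of preference advice in the abstract above can be illustrated with a small sketch. This is not the Pref-KBKR formulation itself, which the abstract does not spell out; it only assumes that a piece of advice names a set of states, a preferred action, and a less preferred action, and that the advice is satisfied when the Q value of the preferred action exceeds the other by some margin. The names `PreferenceAdvice`, `violation`, the `margin` parameter, and the `shoot`/`pass` actions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

State = Sequence[float]
QFunction = Callable[[State, str], float]


@dataclass
class PreferenceAdvice:
    """Hypothetical container: 'in these states, prefer one action over another'."""
    applies: Callable[[State], bool]  # membership test for the advised set of states
    preferred: str                    # action the adviser recommends
    other: str                        # action the adviser ranks lower
    margin: float = 0.0               # assumed required gap between the two Q values


def violation(advice: PreferenceAdvice, q: QFunction, state: State) -> float:
    """Return how far Q(state, preferred) - Q(state, other) falls short of the margin.

    Returns 0.0 when the advice does not apply to the state or is already satisfied.
    """
    if not advice.applies(state):
        return 0.0
    gap = q(state, advice.preferred) - q(state, advice.other)
    return max(0.0, advice.margin - gap)


# Illustrative linear Q function with made-up weights per action.
WEIGHTS = {"shoot": [1.0, 0.0], "pass": [0.0, 1.0]}


def linear_q(state: State, action: str) -> float:
    return sum(w * x for w, x in zip(WEIGHTS[action], state))


advice = PreferenceAdvice(
    applies=lambda s: s[0] < 0.5,
    preferred="shoot",
    other="pass",
    margin=0.1,
)
```

A learner could add such violation terms as a penalty to its regression objective, so that the advice acts as a soft constraint that later experience can override. That design choice is an assumption of this sketch, not a claim about the cited method.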
Shaping Robot Behavior Using Principles from Instrumental Conditioning
- 1997
Cited by 36 (1 self)
Shaping by successive approximations is an important animal training technique in which behavior is gradually adjusted in response to strategically timed reinforcements. We describe a computational model of this shaping process and its implementation on a mobile robot. Innate behaviors in our model are sequences of actions and enabling conditions, and shaping is a behavior editing process realized by multiple editing mechanisms. The model replicates some fundamental phenomena associated with instrumental learning in animals, and allows an RWI B21 robot to learn several distinct tasks derived from the same innate behavior. 1. Introduction Service dogs trained to assist a disabled person will respond to over 60 verbal commands to, for example, turn on lights, open a refrigerator door, or retrieve a dropped object [9]. Chicks can be taught to play a toy piano (peck out a key sequence until a reinforcement is received at the end of the tune) [6], and rats have been conditioned to perform c...
Using advice to transfer knowledge acquired in one reinforcement learning task to another
- In Proceedings of the Sixteenth European Conference on Machine Learning, 2005
Cited by 34 (11 self)
We present a method for transferring knowledge learned in one task to a related task. Our problem solvers employ reinforcement learning to acquire a model for one task. We then transform that learned model into advice for a new task. A human teacher provides a mapping from the old task to the new task to guide this knowledge transfer. Advice is incorporated into our problem solver using a knowledge-based support vector regression method that we previously developed. This advice-taking approach allows the problem solver to refine or even discard the transferred knowledge based on its subsequent experiences. We empirically demonstrate the effectiveness of our approach with two games from the RoboCup soccer simulator: KeepAway and BreakAway. Our results demonstrate that a problem solver learning to play BreakAway using advice extracted from KeepAway outperforms a problem solver learning without the benefit of such advice.
Evaluation and Selection of Biases in Machine Learning
- ACM Computing Surveys, 1995
Cited by 31 (0 self)
In this introduction, we define the term bias as it is used in machine learning systems. We motivate the importance of automated methods for evaluating and selecting biases using a framework of bias selection as search in bias and meta-bias spaces. Recent research in the field of machine learning bias is summarized.
Socially embedded learning of the office-conversant mobile robot jijo-2
- In Proceedings of 15th International Joint Conference on Artificial Intelligence (IJCAI-97), 1997
Cited by 22 (1 self)
This paper explores a newly developing direction of machine learning called "socially embedded learning". In this research we have been building an office-conversant mobile robot which autonomously moves around in an office environment, actively gathers information through close interaction with this environment, including sensing multi-modal data and holding dialogs with people in the office, and acquires knowledge about the environment with which it ultimately becomes conversant. Our major concerns here are how the close interaction between the learning system and its social environment can help or accelerate the system's learning process, and what kinds of prepared mechanisms are necessary for the emergence of such interactions. The office-conversant robot is a platform on which we implement our ideas and test their feasibility in a real-world setting. An overview of the system is given and two examples of implemented ideas, i.e. dialog-based map acquisition and route acquisition by following, are described in detail.
Learning from an Automated Training Agent
- Adaptation and Learning in Multiagent Systems, 1996
Cited by 16 (1 self)
A learning agent employing reinforcement learning is hindered because it only receives the critic's sparse and weakly informative training information. We present an approach in which an automated training agent may also provide occasional instruction to the learner in the form of actions for the learner to perform. The learner has access to both the critic's feedback and the trainer's instruction. In the experiments, we vary the level of the trainer's interaction with the learner, from allowing the trainer to instruct the learner at almost every time step, to not allowing the trainer to respond at all. We also vary a parameter that controls how the learner incorporates the trainer's actions. The results show significant reductions in the average number of training trials necessary to learn to perform the task. 1 INTRODUCTION In reinforcement learning, an automated agent attempts to develop a policy that indicates the actions to choose in performing a multiple-step ...
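The interplay between the critic's feedback and the trainer's instruction described in the abstract above can be sketched roughly as follows. The abstract does not say which learning rule the learner uses or how exactly it incorporates the trainer's actions, so this sketch assumes a tabular one-step Q-learner that simply executes the trainer's action whenever one is offered. The function names and the `epsilon`, `alpha`, and `gamma` parameters are hypothetical.

```python
import random
from typing import Dict, Hashable, Optional, Sequence, Tuple

QTable = Dict[Tuple[Hashable, str], float]


def choose_action(
    q: QTable,
    state: Hashable,
    actions: Sequence[str],
    instruction: Optional[str] = None,
    epsilon: float = 0.1,
    rng: Optional[random.Random] = None,
) -> str:
    """Follow the trainer's instruction when one is given; otherwise act epsilon-greedily."""
    if instruction is not None:
        return instruction
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.choice(list(actions))
    # Greedy choice; unseen state-action pairs default to a value of 0.0.
    return max(actions, key=lambda a: q.get((state, a), 0.0))


def q_update(
    q: QTable,
    state: Hashable,
    action: str,
    reward: float,
    next_state: Hashable,
    actions: Sequence[str],
    alpha: float = 0.5,
    gamma: float = 0.9,
) -> None:
    """One-step Q-learning update driven by the critic's reward (assumed learning rule)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
```

In this sketch an instructed action is updated with the same rule as a self-chosen one, so instruction only changes which experiences the learner collects. Weighting instructed actions differently would be one way to model the incorporation parameter the abstract mentions, but that is speculation.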

