Results 1-10 of 21
Learning from demonstration
 Advances in Neural Information Processing Systems 9
, 1997
Cited by 311 (30 self)
By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases, as well as strategies for approaching a learning problem, from instructions and/or demonstrations of other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speed-up after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30-second demonstration by the human instructor.
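The model-based idea in the abstract above can be illustrated with a minimal sketch: fit a linear dynamics model to demonstration data, then derive an LQR controller from the fitted model. This is not the paper's implementation. The toy plant (`A_true`, `B_true`), the demonstrator gain `K_demo`, the cost matrices, and the function names are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_linear_model(states, actions, next_states):
    """Least-squares fit of x' ~ A x + B u from demonstration tuples."""
    X = np.hstack([states, actions])                      # shape (T, n + m)
    theta, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    n = states.shape[1]
    return theta[:n].T, theta[n:].T                       # A is (n, n), B is (n, m)

def lqr_gain(A, B, Q, R, iters=5000):
    """Discrete-time LQR gain by iterating the Riccati recursion."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Hypothetical toy plant: a linearized, discretized inverted pendulum.
dt = 0.01
A_true = np.array([[1.0, dt], [9.81 * dt, 1.0]])
B_true = np.array([[0.0], [dt]])
K_demo = np.array([[30.0, 8.0]])   # assumed stabilizing "instructor" policy

# Record a short demonstration with small action noise for excitation.
rng = np.random.default_rng(0)
x = np.array([0.1, 0.0])
states, actions, next_states = [], [], []
for _ in range(300):
    u = -K_demo @ x + 0.1 * rng.standard_normal(1)
    x_next = A_true @ x + B_true @ u
    states.append(x)
    actions.append(u)
    next_states.append(x_next)
    x = x_next

# Prime the model from the demonstration, then plan with LQR.
A_hat, B_hat = fit_linear_model(
    np.array(states), np.array(actions), np.array(next_states)
)
K = lqr_gain(A_hat, B_hat, Q=np.eye(2), R=np.array([[0.1]]))
spectral_radius = max(abs(np.linalg.eigvals(A_hat - B_hat @ K)))
```

A spectral radius below 1 indicates that the controller derived from the demonstrated data stabilizes the fitted model. The sketch uses noise-free dynamics; a real robot would add sensor noise and filtering that this example ignores.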
Learning in the Presence of Concept Drift and Hidden Contexts
 Machine Learning
, 1996
Cited by 191 (0 self)
Online learning in domains where the target concept depends on some hidden context poses serious problems. A changing context can induce changes in the target concepts, producing what is known as concept drift. We describe a family of learning algorithms that flexibly react to concept drift and can take advantage of situations where contexts reappear. The general approach underlying all these algorithms consists of (1) keeping only a window of currently trusted examples and hypotheses; (2) storing concept descriptions and reusing them when a previous context reappears; and (3) controlling both of these functions by a heuristic that constantly monitors the system's behavior. The paper reports on experiments that test the systems' performance under various conditions such as different levels of noise and different extents and rates of concept drift. Keywords: Incremental concept learning, online learning, context dependence, concept drift, forgetting 1. Introduction The work presen...
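A rough sketch of components (1) and (3) from the abstract above: a learner that trusts only a window of recent examples and shrinks that window when a monitoring heuristic detects a drop in accuracy. This is a simplified stand-in, not the algorithms described in the paper. The class name, the majority-vote predictor, and all thresholds are assumptions, and component (2), storing and reusing concept descriptions, is not shown.

```python
from collections import Counter, deque

class WindowedLearner:
    """Majority-vote learner over a window of recent (feature, label) pairs."""

    def __init__(self, max_window=50, min_window=10, acc_threshold=0.7, monitor=10):
        self.window = deque()
        self.max_window = max_window
        self.min_window = min_window
        self.acc_threshold = acc_threshold
        self.recent_hits = deque(maxlen=monitor)  # recent prediction outcomes

    def predict(self, x):
        votes = Counter(label for feat, label in self.window if feat == x)
        return votes.most_common(1)[0][0] if votes else None

    def update(self, x, y):
        # Monitor behavior: record whether the current hypothesis was right.
        self.recent_hits.append(self.predict(x) == y)
        self.window.append((x, y))
        monitor_full = len(self.recent_hits) == self.recent_hits.maxlen
        accuracy = sum(self.recent_hits) / len(self.recent_hits)
        if monitor_full and accuracy < self.acc_threshold:
            # Suspected drift: forget old examples, keep only the newest ones.
            while len(self.window) > self.min_window:
                self.window.popleft()
            self.recent_hits.clear()
        elif len(self.window) > self.max_window:
            self.window.popleft()

# Synthetic stream: the concept is y = x for 100 steps, then flips to y = 1 - x.
learner = WindowedLearner()
for i in range(200):
    x = i % 2
    y = x if i < 100 else 1 - x
    learner.update(x, y)
```

After the flip, the accuracy monitor triggers forgetting, so the window ends up holding only examples of the new concept.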
Constructive Incremental Learning from Only Local Information
, 1998
Cited by 161 (37 self)
... This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
Locally Weighted Learning for Control
, 1996
Cited by 159 (17 self)
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which we have applied locally weighted learning, a type of lazy learning, to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the impact that explicitly remembering all previous experiences has on the problem of learning to control.
Interaction and Intelligent Behavior
, 1994
Cited by 147 (20 self)
This thesis addresses situated, embodied agents interacting in complex domains. It focuses on two problems: 1) synthesis and analysis of intelligent group behavior, and 2) learning in complex group environments. Basic behaviors, control laws that cluster constraints to achieve particular goals and have the appropriate compositional properties, are proposed as effective primitives for control and learning. The thesis describes the process of selecting such basic behaviors, formally specifying them, algorithmically implementing them, and empirically evaluating them. All of the proposed ideas are validated with a group of up to 20 mobile robots using a basic behavior set consisting of: safe-wandering, following, aggregation, dispersion, and homing. The set of basic behaviors acts as a substrate for achieving more complex high-level goals and tasks. Two behavior combination operators are introduced, and verified by combining subsets of the above basic behavior set to implement collective flocking, foraging, and docking. A methodology is introduced for automatically constructing higher-level behaviors
Reinforcement Learning in the Multi-Robot Domain
 Autonomous Robots
, 1997
Cited by 136 (20 self)
This paper describes a formulation of reinforcement learning that enables learning in noisy, dynamic environments such as in the complex concurrent multi-robot learning domain. The methodology involves minimizing the learning space through the use of behaviors and conditions, and dealing with the credit assignment problem through shaped reinforcement in the form of heterogeneous reinforcement functions and progress estimators. We experimentally validate the approach on a group of four mobile robots learning a foraging task. 1 Introduction Developing effective methods for real-time learning has been an ongoing challenge in autonomous agent research and is being explored in the mobile robot domain. In the last decade, reinforcement learning (RL), a class of approaches in which the agent learns based on reward and punishment it receives from the environment, has become the methodology of choice for learning in a variety of domains, including robotics. In this paper we describe a formulat...
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
, 1993
Efficient Locally Weighted Polynomial Regression Predictions
 In Proceedings of the 1997 International Machine Learning Conference
Cited by 81 (11 self)
Locally weighted polynomial regression (LWPR) is a popular instance-based algorithm for learning continuous nonlinear mappings. For more than two or three inputs and for more than a few thousand datapoints the computational expense of predictions is daunting. We discuss drawbacks with previous approaches to dealing with this problem, and present a new algorithm based on a multiresolution search of a quickly constructible augmented kd-tree. Without needing to rebuild the tree, we can make fast predictions with arbitrary local weighting functions, arbitrary kernel widths and arbitrary queries. The paper begins with a new, faster algorithm for exact LWPR predictions. Next we introduce an approximation that achieves up to a two-orders-of-magnitude speed-up with negligible accuracy losses. Increasing a certain approximation parameter achieves greater speed-ups still, but with a correspondingly larger accuracy degradation. This is nevertheless useful during operations such as the early stages...
Statistical learning by imitation of competing constraints in joint space and . . .
, 2009
Robot learning by nonparametric regression
 In: (Ed.), Proceedings of the International Conference on Intelligent Robots and Systems (IROS'94)
, 1994
Cited by 13 (2 self)
We present an approach to robot learning grounded on a nonparametric regression technique, locally weighted regression. The model of the task to be performed is represented by infinitely many local linear models, i.e., the (hyper) tangent planes at every point in input space at which a prediction is to be made, i.e., at every query point. Such a model, however, is only generated at the time of prediction and is not retained. This is in contrast to other methods using a finite set of linear models to accomplish a piecewise linear model. Architectural parameters of our approach, such as distance metrics, are also a function of the current query point instead of being global. Statistical tests are presented for when a local model is good enough such that it can be reliably used to build a local controller. These statistical measures also direct the exploration of the robot. We explicitly deal with the case where prediction accuracy requirements exist during exploration: by gradually shifting a center of exploration and controlling the speed of the shift with local prediction accuracy, a goal-directed exploration of state space takes place along the fringes of the current data support until the task goal is achieved. We illustrate this approach by describing how it has been used to enable a robot to learn a challenging juggling task: within 40 to 100 trials the robot accomplished the task goal starting out with no initial experiences.
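The core prediction step described above, a local linear model built at query time and then discarded, might be sketched as follows. This is an illustrative sketch rather than the authors' method: the Gaussian kernel, the fixed `bandwidth` (the abstract describes query-dependent distance metrics, which are not modeled here), the function name, and the sine-wave data are all assumptions.

```python
import numpy as np

def lwr_predict(X, y, query, bandwidth=0.2):
    """Predict at `query` by fitting a weighted linear model around it."""
    # Gaussian weights: training points near the query count more.
    sq_dist = np.sum((X - query) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)

    # Append a constant column so the local model has an intercept.
    X_aug = np.hstack([X, np.ones((len(X), 1))])
    q_aug = np.append(query, 1.0)

    # Weighted least squares via row scaling by sqrt(weight).
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(X_aug * sw[:, None], y * sw, rcond=None)

    # The local model is used once and not retained.
    return q_aug @ beta

# Hypothetical data: noiseless samples of a sine wave.
X = np.linspace(0.0, 2.0 * np.pi, 200).reshape(-1, 1)
y = np.sin(X[:, 0])
prediction = lwr_predict(X, y, np.array([np.pi / 2.0]))
```

Because a new fit is solved for every query, prediction cost grows with the size of the stored data set, which is the price of not committing to a finite set of local models.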