Learning attractor landscapes for learning motor primitives
- in Advances in Neural Information Processing Systems, 2003
Cited by 86 (22 self)
Many control problems take place in continuous state-action spaces, e.g., as in manipulator robotics, where the control objective is often defined as finding a desired trajectory that reaches a particular goal state. While reinforcement learning offers a theoretical framework to learn such control policies from scratch, its applicability to higher dimensional continuous state-action spaces remains rather limited to date. Instead of learning from scratch, in this paper we suggest to learn a desired complex control policy by transforming an existing simple canonical control policy. For this purpose, we represent canonical policies in terms of differential equations with well-defined attractor properties. By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system. We demonstrate our techniques in the context of learning a set of movement skills for a humanoid robot from demonstrations of a human teacher. Policies are acquired rapidly, and, due to the properties of well formulated differential equations, can be re-used and modified on-line under dynamic changes of the environment. The linear parameterization of nonparametric regression moreover lends itself to recognize and classify previously learned movement skills. Evaluations in simulations and on an actual 30 degree-of-freedom humanoid robot exemplify the feasibility and robustness of our approach.
Movement Imitation with Nonlinear Dynamical Systems in Humanoid Robots
- In IEEE International Conference on Robotics and Automation (ICRA2002), 2002
Cited by 83 (14 self)
This article presents a new approach to movement planning, on-line trajectory modification, and imitation learning by representing movement plans based on a set of nonlinear differential equations with well-defined attractor dynamics. In contrast to non-autonomous movement representations like splines, the resultant movement plan remains an autonomous set of nonlinear differential equations that forms a control policy (CP) which is robust to strong external perturbations and that can be modified on-line by additional perceptual variables. The attractor landscape of the control policy can be learned rapidly with a locally weighted regression technique with guaranteed convergence of the learning algorithm and convergence to the movement target. This property makes the system suitable for movement imitation and also for classifying demonstrated movement according to the parameters of the learning system.
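The idea of a movement plan that is an autonomous differential equation converging to its target can be illustrated with a small sketch. The abstract does not give the paper's equations, so the code below assumes one commonly published form of such a system: a critically damped spring-damper pulled toward the goal, plus a forcing term gated by a decaying phase variable. The function name `rollout`, the gains `alpha` and `alpha_x`, and the basis centers, widths, and weights are all illustrative assumptions; in the paper the weights would be fitted by locally weighted regression rather than set by hand.

```python
import math

def rollout(y0, goal, weights, centers, widths, tau=1.0, dt=0.001,
            duration=3.0, alpha=25.0, alpha_x=8.0):
    """Euler-integrate a 1-D point-attractor policy with a phase-gated forcing term."""
    beta = alpha / 4.0          # assumed gain ratio that gives critical damping
    y, z, x = y0, 0.0, 1.0      # position, scaled velocity, phase variable
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        # Gaussian basis functions of the phase, normalized to sum to one.
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centers, widths)]
        total = sum(psi) or 1.0  # guard against underflow of all activations
        # The forcing term is multiplied by x, so it vanishes as the phase decays.
        forcing = x * (goal - y0) * sum(p * w for p, w in zip(psi, weights)) / total
        dz = (alpha * (beta * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z, y, x = z + dz * dt, y + dy * dt, x + dx * dt
        trajectory.append(y)
    return trajectory

# Hypothetical weights shape the transient; the end point is still the goal.
path = rollout(0.0, 1.0, [50.0, -30.0, 10.0], [1.0, 0.5, 0.1], [10.0, 10.0, 10.0])
```

Because the forcing term fades with the phase, the weights only change the shape of the path, which is one way to read the abstract's claim of convergence to the movement target.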
Reinforcement learning for humanoid robotics
- Autonomous Robot, 2003
Cited by 69 (19 self)
The complexity of the kinematic and dynamic structure of humanoid robots makes conventional analytical approaches to control increasingly unsuitable for such systems. Learning techniques offer a possible way to aid controller design if insufficient analytical knowledge is available, and learning approaches seem mandatory when humanoid systems are supposed to become completely autonomous. While recent research in neural networks and statistical learning has focused mostly on learning from finite data sets without stringent constraints on computational efficiency, learning for humanoid robots requires a different setting, characterized by the need for real-time learning performance from an essentially infinite stream of incrementally arriving data. This paper demonstrates how even high-dimensional learning problems of this kind can successfully be dealt with by techniques from nonparametric regression and locally weighted learning. As an example, we describe the application of one of the most advanced of such algorithms, Locally Weighted Projection Regression (LWPR), to the on-line learning of three problems in humanoid motor control: the learning of inverse dynamics models for model-based control, the learning of inverse kinematics of redundant manipulators, and the learning of oculomotor reflexes. All these examples demonstrate fast, i.e., within seconds or minutes, learning convergence with highly accurate final performance. We conclude that real-time learning for complex motor systems like humanoid robots is possible with appropriately tailored algorithms, such that increasingly autonomous robots with massive learning abilities should be achievable in the near future.
Incremental Online Learning in High Dimensions
- Neural Computation, 2005
Cited by 67 (12 self)
Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it i) learns rapidly with second order learning methods based on incremental training, ii) uses statistically sound stochastic leave-one-out cross validation for learning without the need to memorize training data, iii) adjusts its weighting kernels based only on local information in order to minimize the danger of negative interference of incremental learning, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number of (possibly redundant) inputs, as shown in various empirical evaluations with up to 90 dimensional data sets. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high dimensional spaces.
A Survey of Robot Learning from Demonstration
Cited by 63 (15 self)
We present a comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state to action mappings. We introduce the LfD design choices in terms of demonstrator, problem space, policy derivation and performance, and contribute the foundations for a structure in which to categorize LfD research. Specifically, we analyze and categorize the multiple ways in which examples are gathered, ranging from teleoperation to imitation, as well as the various techniques for policy derivation, including matching functions, dynamics models and plans. To conclude we discuss LfD limitations and related promising areas for future research.
Learning from demonstration and adaptation of biped locomotion
- Robotics and Autonomous Systems, 2004
Cited by 59 (6 self)
In this paper, we report on our research for learning biped locomotion from human demonstration. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a CPG of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated trajectories are learned through the movement primitives by locally weighted regression, and the frequency of the learned trajectories is adjusted automatically by a novel frequency adaptation algorithm based on phase resetting and entrainment of oscillators. Numerical simulations demonstrate the effectiveness of the proposed locomotion controller.
Learning inverse kinematics
- in Proc. IROS, 2001
Cited by 58 (11 self)
Real-time control of the end-effector of a humanoid robot in external coordinates requires computationally efficient solutions of the inverse kinematics problem. In this context, this paper investigates inverse kinematics learning for resolved motion rate control (RMRC) employing an optimization criterion to resolve kinematic redundancies. Our learning approach is based on the key observations that learning an inverse of a non uniquely invertible function can be accomplished by augmenting the input representation to the inverse model and by using a spatially localized learning approach. We apply this strategy to inverse kinematics learning and demonstrate how a recently developed statistical learning algorithm, Locally Weighted Projection Regression, allows efficient learning of inverse kinematic mappings in an incremental fashion even when input spaces become rather high dimensional. The resulting performance of the inverse kinematics is comparable to Liegeois' [9] analytical pseudo-inverse with optimization. Our results are illustrated with a 30 degree of freedom humanoid robot.
Locally Weighted Projection Regression: An O(n) Algorithm for Incremental Real Time Learning in High Dimensional Space
- in Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000)
Cited by 51 (16 self)
Locally weighted projection regression is a new algorithm that achieves nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it uses locally linear models, spanned by a small number of univariate regressions in selected directions in input space. This paper evaluates different methods of projection regression and derives a nonlinear function approximator based on them. This nonparametric local learning system i) learns rapidly with second order learning methods based on incremental training, ii) uses statistically sound stochastic cross validation to learn, iii) adjusts its weighting kernels based on local information only, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number of (possibly redundant) inputs, as shown in evaluations with up to 50 dimensional data sets. To our knowledge, this is the first truly incremental spatially localized l...
Trajectory Formation for Imitation with Nonlinear Dynamical Systems
- 2001
Cited by 47 (5 self)
This article explores a new approach to learning by imitation and trajectory formation by representing movements as mixtures of nonlinear differential equations with well-defined attractor dynamics. An observed movement is approximated by finding a best fit of the mixture model to its data by a recursive least squares regression technique. In contrast to non-autonomous movement representations like splines, the resultant movement plan remains an autonomous set of nonlinear differential equations that forms a control policy which is robust to strong external perturbations and that can be modified by additional perceptual variables. This movement policy remains the same for a given target, regardless of the initial conditions, and can easily be re-used for new targets. We evaluate the trajectory formation system (TFS) in the context of a humanoid robot simulation that is part of the Virtual Trainer (VT) project, which aims at supervising rehabilitation exercises in stroke patients. A typical rehabilitation exercise was collected with a Sarcos Sensuit, a device to record joint angular movement from human subjects, and approximated and reproduced with our imitation techniques. Our results demonstrate that multijoint human movements can be encoded successfully, and that this system allows robust modifications of the movement policy through external variables.
On-line EM Algorithm for the Normalized Gaussian Network
- 1999
Cited by 46 (6 self)
A Normalized Gaussian Network (NGnet) (Moody and Darken 1989) is a network of local linear regression units. The model softly partitions the input space by normalized Gaussian functions and each local unit linearly approximates the output within the partition. In this article, we propose a new on-line EM algorithm for the NGnet, which is derived from the batch EM algorithm (Xu, Jordan and Hinton 1995) by introducing a discount factor. We show that the on-line EM algorithm is equivalent to the batch EM algorithm if a specific scheduling of the discount factor is employed. In addition, we show that the on-line EM algorithm can be considered as a stochastic approximation method to find the maximum likelihood estimator. A new regularization method is proposed in order to deal with a singular input distribution. In order to manage dynamic environments, where the input-output distribution of data changes over time, unit manipulation mechanisms such as unit production, unit deletion...
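The NGnet prediction described in this abstract (a soft partition by normalized Gaussians, with one linear model per unit) can be sketched for a scalar input as follows. This is only an illustration of the forward pass, not the on-line EM update the article proposes; the function name `ngnet_predict` and all parameter values are assumptions made for the example.

```python
import math

def ngnet_predict(x, centers, widths, slopes, offsets):
    """Predict a scalar output for a scalar input x with a 1-D normalized Gaussian network."""
    # Unnormalized Gaussian activation of each local unit.
    activations = [math.exp(-0.5 * ((x - c) / s) ** 2)
                   for c, s in zip(centers, widths)]
    total = sum(activations)
    if total == 0.0:
        raise ValueError("input is too far from every unit (activations underflowed)")
    # Normalizing makes the responsibilities sum to one: the soft partition.
    responsibilities = [a / total for a in activations]
    # Each unit contributes its own linear prediction, weighted by its responsibility.
    return sum(r * (w * x + b)
               for r, w, b in zip(responsibilities, slopes, offsets))

# Two hypothetical units placed symmetrically around zero share the input x = 0 equally,
# so the prediction is the average of their two offsets.
y = ngnet_predict(0.0, centers=[-1.0, 1.0], widths=[1.0, 1.0],
                  slopes=[2.0, -2.0], offsets=[1.0, 3.0])
```

With a single unit the responsibility is exactly one everywhere, so the network reduces to an ordinary linear model.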

