## Imitation learning of Globally Stable Non-Linear Point-to-Point Robot Motions using Nonlinear Programming (2010)

Venue: Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Citations: 17 (8 self)

### BibTeX

@INPROCEEDINGS{Khansari-zadeh10imitationlearning,

author = {S. Mohammad Khansari-zadeh and Aude Billard},

title = {Imitation learning of Globally Stable Non-Linear Point-to-Point Robot Motions using Nonlinear Programming},

booktitle = {Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},

year = {2010}

}

### Abstract

This paper presents a methodology for learning arbitrary discrete motions from a set of demonstrations. We model a motion as a nonlinear autonomous (i.e. time-invariant) dynamical system, and define the sufficient conditions to make such a system globally asymptotically stable at the target. The convergence of all trajectories is ensured starting from any point in the operational space. We propose a learning method, called Stable Estimator of Dynamical Systems (SEDS), that estimates parameters of a Gaussian Mixture Model via an optimization problem under non-linear constraints. Being time-invariant and globally stable, the system is able to handle both temporal and spatial perturbations, while performing the motion as close to the demonstrations as possible. The method is evaluated through a set of robotic experiments.
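As a rough illustration of the kind of model the abstract describes, the sketch below evaluates a one-dimensional Gaussian-mixture velocity field (the posterior mean of velocity given position) and integrates it forward from two start points. It is not the SEDS learning procedure: nothing is estimated or optimized here. The names `COMPONENTS`, `gmr_velocity`, and `simulate`, and all parameter values, are hypothetical. The parameter choice (each local linear model has a negative slope and predicts zero velocity at the target) is a simple one-dimensional assumption that makes convergence easy to check, not the constraint set derived in the paper.

```python
import math

# Hypothetical 1-D mixture; the target (attractor) is assumed to be at x = 0.
# Each component describes a joint Gaussian over (position x, velocity v):
# prior, mean of x, mean of v, variance of x, covariance between v and x.
# Values are chosen so every local linear model has a negative slope and
# predicts zero velocity at the target (an illustrative assumption).
COMPONENTS = [
    {"prior": 0.6, "mu_x": 1.0, "mu_v": -2.0, "var_x": 0.5, "cov_vx": -1.0},
    {"prior": 0.4, "mu_x": -0.5, "mu_v": 0.25, "var_x": 0.2, "cov_vx": -0.1},
]


def gmr_velocity(x, components):
    """Posterior-mean velocity E[v | x] under the mixture at position x."""
    # Work with log-weights so positions far from every mean do not underflow.
    log_w = [
        math.log(c["prior"])
        - 0.5 * math.log(2.0 * math.pi * c["var_x"])
        - 0.5 * (x - c["mu_x"]) ** 2 / c["var_x"]
        for c in components
    ]
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)

    velocity = 0.0
    for wk, c in zip(w, components):
        slope = c["cov_vx"] / c["var_x"]  # local linear gain of this component
        velocity += (wk / total) * (c["mu_v"] + slope * (x - c["mu_x"]))
    return velocity


def simulate(x0, dt=0.01, steps=3000):
    """Forward-Euler integration of dx/dt = gmr_velocity(x); returns final x."""
    x = x0
    for _ in range(steps):
        x += dt * gmr_velocity(x, COMPONENTS)
    return x


if __name__ == "__main__":
    for start in (3.0, -5.0):
        print(start, "->", simulate(start))
```

Because the field depends only on the current position and not on time, a trajectory that is pushed off course simply continues from wherever it lands, which is the property the abstract refers to as handling temporal and spatial perturbations.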

### Citations

9033 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...GPR) [5], Locally Weighted Projection Regression (LWPR) [6], or Gaussian Mixture Regression (GMR) [7], where the parameters of the Gaussian Mixture are optimized through Expectation Maximization (EM) [8]. GMR and GPR find an optimal model of $\hat{f}$ by maximizing the likelihood that the complete model represents the data well, while LWPR minimizes the mean-square error between the estimates and the data ...

2759 | Estimating the Dimension of a Model
- Schwarz
- 1979
Citation Context ...milar to GMM, SEDS requires a user to predefine the number of Gaussian functions, i.e. K. While it still remains an open question, in practice we noticed that the Bayesian Information Criterion (BIC) [17] can be used as a relatively good estimate of the optimal value of K. Finally, an assumption made in this paper is that represented motions can be modeled with a first order time-invariant ODE. While...

866 | Nonlinear programming: theory and algorithms
- Bazaraa, Sherali, et al.
- 1993
Citation Context ... corresponding demonstrated trajectory $\xi^n$ by starting from the same initial points as were demonstrated, i.e. $\hat{\xi}^{0,n} = \xi^{0,n}, \forall n \in 1..N$. Eq. 12-13 correspond to a Non-linear Programming (NLP) problem [15] that can be solved using different optimization techniques such as Newton-like algorithms [15], Dynamic Programming [16], etc. In this paper we use a quasi-Newton method to solve the optimization pro...

741 | Applied Nonlinear Control
- Slotine, Li
- 1991
Citation Context ...ble behavior) even for relatively simple two-dimensional dynamics. These errors are due to the fact that there is yet no theoretical solution for ensuring stability of arbitrary nonlinear autonomous DS [10]. Figure 2 illustrates an example of unstable estimation of a non-linear DS using the above three methods for learning a two-dimensional motion. Figure 2(a) presents the stability analysis of the dyna...

561 | Active learning with statistical models
- Cohn, Ghahramani, et al.
- 1996
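Citation Context ... (3) $\forall n \in 1..N,\ t \in 0..T^n$ (4)

$$\mathcal{N}(\xi^{t,n}, \dot{\xi}^{t,n}; \theta^k) = \frac{1}{\sqrt{(2\pi)^{2d}\,|\Sigma^k|}}\; e^{-\frac{1}{2}\left([\xi^{t,n}, \dot{\xi}^{t,n}] - \mu^k\right)^T (\Sigma^k)^{-1} \left([\xi^{t,n}, \dot{\xi}^{t,n}] - \mu^k\right)} \quad (5)$$

Taking the posterior mean estimate of $P(\dot{\xi}|\xi)$ yields (see [14]):

$$\hat{\dot{\xi}} = \sum_{k=1}^{K} \frac{\pi^k \mathcal{N}(\xi; \theta^k)}{\sum_{i=1}^{K} \pi^i \mathcal{N}(\xi; \theta^i)} \left(\mu^k_{\dot{\xi}} + \Sigma^k_{\dot{\xi}\xi} (\Sigma^k_{\xi})^{-1} (\xi - \mu^k_{\xi})\right) \quad (6)$$

The resulting nonlinear function $\hat{f}(\xi; \theta)$ from Eq. 6 usually contains several spurious attracto...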

313 | Gaussian processes for machine learning
- Rasmussen, Williams
- 2005
Citation Context ...ssical control and planning approaches when the underlying model cannot be well estimated. Existing approaches to the statistical estimation of f in Eq. 1 use either Gaussian Process Regression (GPR) [5], Locally Weighted Projection Regression (LWPR) [6], or Gaussian Mixture Regression (GMR) [7], where the parameters of the Gaussian Mixture are optimized through Expectation Maximization (EM) [8]. GMR...

161 | Computational approaches to motor learning by imitation
- Schaal, Ijspeert, et al.
- 2004
Citation Context ...get which may never exist, or be very narrow. Besides, finding this region of attraction becomes computationally costly and a non-trivial task in higher dimensions. Among works done on DS, [2], [11], [12], [13] ensure the stability of $\hat{f}$ by modulating it with an inherently stable unidimensional linear dynamics. The modulation between $\hat{f}$ and the stable linear dynamics is controlled with a time depend...

155 | On Learning, Representing and Generalizing a Task in a Humanoid Robot
- Calinon, Guenter, et al.
- 2007
Citation Context ...isting approaches to the statistical estimation of f in Eq. 1 use either Gaussian Process Regression (GPR) [5], Locally Weighted Projection Regression (LWPR) [6], or Gaussian Mixture Regression (GMR) [7], where the parameters of the Gaussian Mixture are optimized through Expectation Maximization (EM) [8]. GMR and GPR find an optimal model of $\hat{f}$ by maximizing the likelihood that the complete model re...

143 | Movement Imitation with Nonlinear Dynamical Systems in Humanoid Robots
- Ijspeert, Nakanishi, et al.
- 2002
Citation Context ...xperiments. I. INTRODUCTION We use Programming by Demonstration (PbD) [1], also referred to as Learning by Imitation, to teach a robot how to move its limbs to perform a discrete, i.e. point-to-point, motion [2]. In PbD, an agent (e.g. human, robot, etc.) shows the robot a task a few times (usually 3 to 5 times, to make the task bearable for the trainer). To avoid addressing the correspondence problem [3]...

79 | Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space
- Vijayakumar, Schaal
Citation Context ...erlying model cannot be well estimated. Existing approaches to the statistical estimation of f in Eq. 1 use either Gaussian Process Regression (GPR) [5], Locally Weighted Projection Regression (LWPR) [6], or Gaussian Mixture Regression (GMR) [7], where the parameters of the Gaussian Mixture are optimized through Expectation Maximization (EM) [8]. GMR and GPR find an optimal model of $\hat{f}$ by maximizing...

47 | Learning and Generalization of Motor Skills by Learning from Demonstration
- Pastor, Hoffmann, et al.
- 2009
Citation Context ...ich may never exist, or be very narrow. Besides, finding this region of attraction becomes computationally costly and a non-trivial task in higher dimensions. Among works done on DS, [2], [11], [12], [13] ensure the stability of $\hat{f}$ by modulating it with an inherently stable unidimensional linear dynamics. The modulation between $\hat{f}$ and the stable linear dynamics is controlled with a time dependent ph...

43 | Dynamical system modulation for robot learning via kinesthetic demonstrations
- Hersch, Guenter, et al.
- 2008
Citation Context ...ed target which may never exist, or be very narrow. Besides, finding this region of attraction becomes computationally costly and a non-trivial task in higher dimensions. Among works done on DS, [2], [11], [12], [13] ensure the stability of $\hat{f}$ by modulating it with an inherently stable unidimensional linear dynamics. The modulation between $\hat{f}$ and the stable linear dynamics is controlled with a time ...

34 | Discovering optimal imitation strategies
- Billard, Epars, et al.
- 2004
Citation Context ...eleoperating it using motion sensors (see Figure 1). We hence focus on the “what to imitate” problem and derive a means to extract the generic characteristics of the dynamics of the motion. Following [4], we assume that the relevant features of the movement, i.e. those to imitate, are the features that appear most frequently, i.e. the invariants across the demonstration. As a result, demonstrations s...

33 | The agent-based perspective on imitation
- Dautenhahn, Nehaniv
- 2002
Citation Context ...[2]. In PbD, an agent (e.g. human, robot, etc.) shows the robot a task a few times (usually 3 to 5 times, to make the task bearable for the trainer). To avoid addressing the correspondence problem [3], motions are demonstrated from the robot’s point of view by the user guiding the robot passively through the task. In our experiments, this is done either by back-driving the robot or by teleoperati...

8 | Springer Handbook of Robotics
- Billard, Calinon, et al.
- 2008
Citation Context ...ations, while performing the motion as close to the demonstrations as possible. The method is evaluated through a set of robotic experiments. I. INTRODUCTION We use Programming by Demonstration (PbD) [1], also referred to as Learning by Imitation, to teach a robot how to move its limbs to perform a discrete, i.e. point-to-point, motion [2]. In PbD, an agent (e.g. human, robot, etc.) shows the robot a task a ...

7 | BM: An iterative algorithm to learn stable non-linear dynamical systems with Gaussian mixture models
- Khansari-Zadeh, Billard
- 2010
Citation Context ...hat the complete model represents the data well, while LWPR minimizes the mean-square error between the estimates and the data (for a more detailed comparison and discussion on these methods refer to [9]). Because none of the aforementioned methods optimizes under the constraint of making the system stable at the attractor, they are not guaranteed to result in a stable estimate of the motion. In ...

2 | Applied dynamic programming for optimization of dynamical systems
- Robinett
- 2005
Citation Context ...,n, ∀n ∈ 1..N. Eq. 12-13 correspond to a Non-linear Programming (NLP) problem [15] that can be solved using different optimization techniques such as Newton-like algorithms [15], Dynamic Programming [16], etc. In this paper we use a quasi-Newton method to solve the optimization problem [15]. The first and second terms inside the integral of Eq. 12 force the solver to optimize reproduction in both pos...
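
One of the citation contexts above notes that the Bayesian Information Criterion (BIC) can serve as a reasonable guide for choosing the number of Gaussians K. A minimal sketch of that selection step follows, using the standard definition BIC = -2 ln L + p ln N. The function names are hypothetical, and the parameter count assumes an unconstrained full-covariance mixture; a constrained model such as the one in the paper may have a different number of free parameters. The log-likelihoods are expected to come from models fitted elsewhere.

```python
import math


def gmm_param_count(num_components, dim):
    """Free parameters of an unconstrained full-covariance GMM (an assumption)."""
    mixing = num_components - 1                           # priors sum to one
    means = num_components * dim
    covariances = num_components * dim * (dim + 1) // 2   # symmetric matrices
    return mixing + means + covariances


def bic(log_likelihood, num_params, num_points):
    """Standard BIC: lower values indicate a better fit/complexity trade-off."""
    return -2.0 * log_likelihood + num_params * math.log(num_points)


def select_k(log_likelihoods, dim, num_points):
    """Pick the K with the lowest BIC.

    log_likelihoods maps each candidate K to the log-likelihood of a mixture
    with K components that was fitted elsewhere on the same data.
    """
    scores = {
        k: bic(ll, gmm_param_count(k, dim), num_points)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get)
```

For a motion with d-dimensional position, the mixture is defined over position and velocity jointly, so `dim` would be 2d in this sketch.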