## Dynamic imitation in a humanoid robot through nonparametric probabilistic inference (2006)

### Other Repositories/Bibliography

Venue: In Proceedings of Robotics: Science and Systems (RSS’06)

Citations: 35 (5 self)

### BibTeX

@INPROCEEDINGS{Grimes06dynamicimitation,
    author = {David B. Grimes and Rawichote Chalodhorn and Rajesh P. N. Rao},
    title = {Dynamic imitation in a humanoid robot through nonparametric probabilistic inference},
    booktitle = {Proceedings of Robotics: Science and Systems (RSS’06)},
    year = {2006},
    publisher = {MIT Press}
}

### Abstract

We tackle the problem of learning imitative whole-body motions in a humanoid robot using probabilistic inference in Bayesian networks. Our inference-based approach affords a straightforward method to exploit rich yet uncertain prior information obtained from human motion capture data. Dynamic imitation implies that the robot must interact with its environment and account for forces such as gravity and inertia during imitation. Rather than explicitly modeling these forces and the body of the humanoid as in traditional approaches, we show that stable imitative motion can be achieved by learning a sensor-based representation of dynamic balance. Bayesian networks provide a sound theoretical framework for combining prior kinematic information (from observing a human demonstrator) with prior dynamic information (based on previous experience) to model and subsequently infer motions which, with high probability, will be dynamically stable. By posing the problem as one of inference in a Bayesian network, we show that methods developed for approximate inference can be leveraged to efficiently perform inference of actions. Additionally, by using nonparametric inference and a nonparametric (Gaussian process) forward model, our approach does not make any strong assumptions about the physical environment or the mass and inertial properties of the humanoid robot. We propose an iterative, probabilistically constrained algorithm for exploring the space of motor commands and show that the algorithm can quickly discover dynamically stable actions for whole-body imitation of human motion. Experimental results based on simulation and subsequent execution by a HOAP-2 humanoid robot demonstrate that our algorithm is able to imitate a human performing actions such as squatting and a one-legged balance.
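The abstract's nonparametric (Gaussian process) forward model can be illustrated with a minimal regression sketch. This is not the authors' implementation: the squared-exponential kernel, the hyperparameter values, the function names `rbf_kernel` and `gp_predict`, and the toy one-dimensional dynamics are all assumptions chosen for illustration. It only shows how a Gaussian process learned from (state, action) → next-state samples yields both a prediction and an uncertainty estimate, which is what lets later inference favor actions whose outcome is well supported by previous experience.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(X_train, y_train, X_query, noise_var=1e-4):
    """Posterior mean and variance of a zero-mean GP at each query row."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y, computed through the Cholesky factor for stability.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(X_query, X_query)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


# Toy stand-in for exploration trials: each row is (state, action) and the
# target is the next state. The dynamics below are hypothetical.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=50)
actions = rng.uniform(-1.0, 1.0, size=50)
X_train = np.column_stack([states, actions])
y_train = 0.9 * states + 0.5 * np.sin(actions)

# Predictions at the training inputs and at a point far from all data.
mean_train, var_train = gp_predict(X_train, y_train, X_train)
_, var_far = gp_predict(X_train, y_train, np.array([[10.0, 10.0]]))
```

In this sketch the predictive variance is small near previously explored (state, action) pairs and returns to the prior variance far from them, so a planner built on such a model could treat unexplored motor commands as uncertain rather than as confidently stable.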

### Citations

7556 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context: ...cause we always want to enforce this condition for all t, we observe that $b_t = 1$. In this sense, the variable $b_t$ is simply a notational device and is akin to a dummy child of $d_t$ in belief propagation [19] used for indicating evidence. Note that although we want to highly constrain $d_t$, we also want to maintain a belief state over the dynamics configuration due to uncertainty in the forward model and th...

1309 | Factor graphs and the sum-product algorithm
- Kschischang, Frey, et al.
- 2001
Citation Context: ... in directed Bayesian networks. We note that this difference is only semantic, and adopted out of convenience as any Bayesian network can be represented as an MRF, or more generally a factor graph [26]. Belief propagation formulated for a Bayesian network is more convenient in our setting given the natural conditional semantics of the forward and observation models. Belief propagation (BP) computes...

1244 | Condensation: conditional density propagation for visual tracking
- Isard, Blake
- 1998
Citation Context: ...n view this message distribution as being parameterized, the result when many kernel functions are used is akin to a sample-based representation (as in particle filters or the condensation algorithm [27]). Belief propagation computes a belief distribution Bn (x) based on the product of two sets of messages πn (x) and λn (x), which represent the information coming from neighboring parent and children ...

257 | Learning by watching: Extracting reusable task knowledge from visual observation of human performance
- Kuniyoshi, Inaba, et al.
- 1994
Citation Context: ...f approaches, ranging from using nonlinear dynamical systems for imitation [11] to imitating arm motions using biologically motivated methods [12]. We refer the reader to these and related literature [13]–[16] for more details and alternate approaches to the imitation problem. III. PROBABILISTIC DYNAMIC BALANCE MODEL Our approach is based on the dynamic Bayesian network (DBN) shown in Figure 1. Imitat...

224 | Algorithms for inverse reinforcement learning
- Ng, Russell
- 2000
Citation Context: ... of whole-body motion. Finally, none of these models specify a probabilistic method for the incorporation of uncertain prior information from human kinematic estimates. Inverse reinforcement learning [9] and apprenticeship learning [10] have been proposed to learn controllers for complex systems based on observing an expert and learning their reward function. However, the role of this type of expert ...

224 | Nonparametric belief propagation
- Sudderth, Ihler, et al.
- 2003
Citation Context: ... models with discrete variables. Recent advances in machine learning have broadened the applicability to general graph structures [23] and to continuous variables in undirected graph structures [24], [25]. The inference approach we adopt is most similar to the NBP [25] method. While NBP is formulated for inference in a Markov random field (MRF) model, our approach uses Pearl’s notation for belief prop...

222 | PEGASUS: A policy search method for large MDPs and POMDPs
- Ng, Jordan
- 2000
Citation Context: ...rucial for achieving stability: a probabilistic sensor-based model of dynamic balance. Despite compelling advances in solving complex continuous partially observable Markov decision problems (POMDPs) [1], [2] we pose the problem as one of inference also for a pragmatic reason: to leverage and evaluate recent approximate inference approaches to efficiently solve problems that have previously been rega...

188 | Correctness of local probability propagation in graphical models with loops
- Weiss
Citation Context: ...ief propagation was originally restricted to tree structured graphical models with discrete variables. Recent advances in machine learning have broadened the applicability to general graph structures [23] and to continuous variables in undirected graph structures [24], [25]. The inference approach we adopt is most similar to the NBP [25] method. While NBP is formulated for inference in a Markov random...

178 | Style-based inverse kinematics
- Grochow, Martin, et al.
Citation Context: ...se linear principal components analysis (PCA) but other non-linear embedding techniques (such as the GPLVM [17]) may be worth exploring for representing wider classes of motion using fewer dimensions [18]. Using the estimated kinematic motion from several demonstrations of the motion or behavior, we form a matrix C from the d principal component vectors of the posture space. The matrix C represents a ...

147 | A robot controller using learning by imitation
- Hayes, Demiris
- 1994

145 | Gaussian process latent variable models for visualisation of high dimensional data
- Lawrence
- 2004
Citation Context: ...represent high-dimensional data in compact low-dimensional latent spaces. For simplicity, we use linear principal components analysis (PCA) but other non-linear embedding techniques (such as the GPLVM [17]) may be worth exploring for representing wider classes of motion using fewer dimensions [18]. Using the estimated kinematic motion from several demonstrations of the motion or behavior, we form a mat...

141 | Webots: Professional mobile robot simulation
- Michel
- 2004
Citation Context: ... allowed for a compromise between kinematically similar imitations and dynamic stability of the resulting motion. We tested an implementation of our method using the robotics simulator package Webots [30], which provides accurate dynamics simulation of the Fujitsu HOAP-2 robot. We used its sensor simulation capability to also model the necessary gyroscope and foot pressure sensor signals (to which we ...

139 | Fast sparse Gaussian process methods: The informative vector machine. Advances in Neural Information Processing Systems 15
- Lawrence, Seeger, et al.
- 2002
Citation Context: ...en the number of data points grows over about 300). Recently, several approaches have tackled the problem of large kernel matrices, either by applying heuristics to select a subset of the data points [21] or by low-rank approximations of the kernel matrix [22]. V. NONPARAMETRIC ACTION INFERENCE We now present an algorithm for action selection based on belief propagation [19] within the graphical model...

137 | Sparse Gaussian processes using pseudo-inputs
- Snelson, Ghahramani
- 2006
Citation Context: ...ntly, several approaches have tackled the problem of large kernel matrices, either by applying heuristics to select a subset of the data points [21] or by low-rank approximations of the kernel matrix [22]. V. NONPARAMETRIC ACTION INFERENCE We now present an algorithm for action selection based on belief propagation [19] within the graphical model shown in Figure 1. The result of performing belief prop...

99 | Autonomous helicopter control using reinforcement learning policy search methods
- Bagnell, Schneider
- 2001
Citation Context: ...l for achieving stability: a probabilistic sensor-based model of dynamic balance. Despite compelling advances in solving complex continuous partially observable Markov decision problems (POMDPs) [1], [2] we pose the problem as one of inference also for a pragmatic reason: to leverage and evaluate recent approximate inference approaches to efficiently solve problems that have previously been regarded ...

98 | Pampas: Real-Valued Graphical Models for Computer Vision
- Isard
- 2003
Citation Context: ...phical models with discrete variables. Recent advances in machine learning have broadened the applicability to general graph structures [23] and to continuous variables in undirected graph structures [24], [25]. The inference approach we adopt is most similar to the NBP [25] method. While NBP is formulated for inference in a Markov random field (MRF) model, our approach uses Pearl’s notation for belie...

93 | Gaussian Process Dynamical Models
- Wang, Fleet, et al.
- 2005
Citation Context: ...at) directly from empirical data collected from trials on the robot. Gaussian processes have been shown to be very powerful in learning stochastic nonlinear relationships directly from empirical data [20]. As no finite set of parameters can describe a Gaussian process, this method is called nonparametric. Empirical data gathered from exploration trials form a set of tuples D ⊂ S × A × S constructed vi...

82 | Learning human arm movements by imitation: Evaluation of a biologically inspired connectionist architecture
- Billard, Mataric
- 2001
Citation Context: ...body of other work on imitation learning using a variety of approaches, ranging from using nonlinear dynamical systems for imitation [11] to imitating arm motions using biologically motivated methods [12]. We refer the reader to these and related literature [13]–[16] for more details and alternate approaches to the imitation problem. III. PROBABILISTIC DYNAMIC BALANCE MODEL Our approach is based on th...

71 | Exploration and apprenticeship learning in reinforcement learning
- Abbeel, Ng
- 2005
Citation Context: ...none of these models specify a probabilistic method for the incorporation of uncertain prior information from human kinematic estimates. Inverse reinforcement learning [9] and apprenticeship learning [10] have been proposed to learn controllers for complex systems based on observing an expert and learning their reward function. However, the role of this type of expert and that of our human demonstrato...

64 | Trajectory formation for imitation with nonlinear dynamical systems
- Ijspeert, Nakanishi, et al.
- 2001
Citation Context: ...ust be accounted for in the learning process. There exists a large body of other work on imitation learning using a variety of approaches, ranging from using nonlinear dynamical systems for imitation [11] to imitating arm motions using biologically motivated methods [12]. We refer the reader to these and related literature [13]–[16] for more details and alternate approaches to the imitation problem. I...

64 | Accelerating reinforcement learning through implicit imitation
- Price, Boutilier
- 2003
Citation Context: ...roaches, ranging from using nonlinear dynamical systems for imitation [11] to imitating arm motions using biologically motivated methods [12]. We refer the reader to these and related literature [13]–[16] for more details and alternate approaches to the imitation problem. III. PROBABILISTIC DYNAMIC BALANCE MODEL Our approach is based on the dynamic Bayesian network (DBN) shown in Figure 1. Imitative m...

63 | Dynamics filter: concept and implementation of on-line motion generator for human figures
- Yamane, Nakamura
- 2003
Citation Context: ... dynamically balanced biped and humanoid motion has long been considered a difficult and important research problem. Our overall approach is similar in spirit to Yamane and Nakamura’s dynamics filter [3]. However, unlike their approach which requires a physics-based model of the robot, our approach is model-free in the sense of not requiring any knowledge of dynamic properties such as mass or moment ...

57 | Gaussian process priors with uncertain inputs – application to multiple-step ahead time series forecasting
- Girard, Rasmussen, et al.
- 2003
Citation Context: ...ficient. As future work, we intend to experiment with moment-matching methods which have been shown to be able to approximate the Gaussian process when the inputs are drawn from a normal distribution [29]. This has the potential to further reduce inference time. The computation of λn′ x (uj) as shown in Equation 13 is approached similarly. However, in this case, we have to integrate over an output var...

35 | Adaptive gait control of a biped robot based on realtime sensing of the ground
- Kajita, Tani
- 1996
Citation Context: ...of the robot, our approach is model-free in the sense of not requiring any knowledge of dynamic properties such as mass or moment of inertia. Other approaches based on the zero-moment point (ZMP) [4], [5] or inverted pendulum models [6] also require accurate knowledge of physical parameters to achieve stable motion. On the other hand, sensor-based or adaptive approaches are typically aimed at stabiliz...

27 | Development of a dynamic biped walking system for humanoid - development of a biped walking robot adapting to the human’s living floor
- Yamaguchi, Kinoshita, et al.
- 1996
Citation Context: ...del-free in the sense of not requiring any knowledge of dynamic properties such as mass or moment of inertia. Other approaches based on the zero-moment point (ZMP) [4], [5] or inverted pendulum models [6] also require accurate knowledge of physical parameters to achieve stable motion. On the other hand, sensor-based or adaptive approaches are typically aimed at stabilization within a particular gait m...

7 | Zero-moment point: thirty-five years of its life
- Vukobratovic, Borovac
- 2004
Citation Context: ...odel of the robot, our approach is model-free in the sense of not requiring any knowledge of dynamic properties such as mass or moment of inertia. Other approaches based on the zero-moment point (ZMP) [4], [5] or inverted pendulum models [6] also require accurate knowledge of physical parameters to achieve stable motion. On the other hand, sensor-based or adaptive approaches are typically aimed at sta...

6 | Reinforcement learning of humanoid rhythmic walking parameters based on visual information
- Ogino, Katoh, et al.
- 2004
Citation Context: ...ire accurate knowledge of physical parameters to achieve stable motion. On the other hand, sensor-based or adaptive approaches are typically aimed at stabilization within a particular gait model [7], [8] and do not easily generalize to other classes of whole-body motion. Finally, none of these models specify a probabilistic method for the incorporation of uncertain prior information from human kinema...

6 | Efficient Multiscale Sampling from
- Ihler, Sudderth, et al.
- 2004
Citation Context: ...ng the product grows exponentially when performed repeatedly. Thus, we approximate products of messages based on the technique of multiscale sampling and multiplication of pairs of mixture components [28]. Following [19], we treat observed and hidden variables in the graph identically by allowing a node x to send itself the message λ ⋆ (x). If the node is observed, we model this message as a Dirac del...

2 | Gaussian Process Priors With Uncertain Inputs: Multiple-Step Ahead Prediction
- Rasmussen
- 2002
Citation Context: ...ficient. As future work, we intend to experiment with moment-matching methods which have been shown to be able to approximate the Gaussian process when the inputs are drawn from a normal distribution [29]. This has the potential to further reduce inference time. The computation of λn′ x (uj) as shown in Equation 13 is approached similarly. However, in this case, we have to integrate over an output var...

1 | Adaptive dynamic control of a biped walking robot with radial basis function neural networks
- Hu, Pratt, et al.
- 1998
Citation Context: ... require accurate knowledge of physical parameters to achieve stable motion. On the other hand, sensor-based or adaptive approaches are typically aimed at stabilization within a particular gait model [7], [8] and do not easily generalize to other classes of whole-body motion. Finally, none of these models specify a probabilistic method for the incorporation of uncertain prior information from human k...

1 | Robot learning from demonstration
- Atkeson, Schaal