Gaussian process latent variable models for visualisation of high dimensional data
- Adv. in Neural Inf. Proc. Sys., 2004
Cited by 91 (1 self)
We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real-world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.
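As a rough sketch of the kind of bound this abstract describes, the generic variational (Jensen) step can be written as below. The notation is an assumption and does not come from the listing: Y denotes the observed data, X the latent inputs with prior p(X), and q(X) a variational distribution. The abstract does not say how the expectation term is made tractable, so that part is left open here.

```latex
\begin{align}
p(Y) &= \int p(Y \mid X)\, p(X)\, dX, \\
\log p(Y) &\ge \mathbb{E}_{q(X)}\!\left[\log p(Y \mid X)\right]
            - \mathrm{KL}\!\left(q(X)\,\|\,p(X)\right).
\end{align}
```

Maximizing the right-hand side over q(X) and the kernel hyperparameters is the sense in which training "maximizes a variational lower bound".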
Monocular 3D Pose Estimation and Tracking by Detection
Cited by 7 (0 self)
Automatic recovery of 3D human pose from monocular image sequences is a challenging and important research topic with numerous applications. Although current methods are able to recover 3D pose for a single person in controlled environments, they are severely challenged by real-world scenarios, such as crowded street scenes. To address this problem, we propose a three-stage process building on a number of recent advances. The first stage obtains an initial estimate of the 2D articulation and viewpoint of the person from single frames. The second stage allows early data association across frames based on tracking-by-detection. These two stages successfully accumulate the available 2D image evidence into robust estimates of 2D limb positions over short image sequences (tracklets). The third and final stage uses those tracklet-based estimates as robust image observations to reliably recover 3D pose. We demonstrate state-of-the-art performance on the HumanEva II benchmark, and also show the applicability of our approach to articulated 3D tracking in realistic street conditions.
Learning GP-BayesFilters via Gaussian process latent variable models
- In Proceedings of Robotics: Science and Systems (RSS), 2009
Cited by 7 (4 self)
GP-BayesFilters are a general framework for integrating Gaussian process prediction and observation models into Bayesian filtering techniques, including particle filters and extended and unscented Kalman filters. GP-BayesFilters learn nonparametric filter models from training data containing sequences of control inputs, observations, and ground truth states. The need for ground truth states limits the applicability of GP-BayesFilters to systems for which the ground truth can be estimated without prohibitive overhead. In this paper we introduce GPBF-LEARN, a framework for training GP-BayesFilters without any ground truth states. Our approach extends Gaussian Process Latent Variable Models to the setting of dynamical robotics systems. We show how weak labels for the ground truth states can be incorporated into the GPBF-LEARN framework. The approach is evaluated using a difficult tracking task, namely tracking a slotcar based on IMU measurements only.
Dynamical Binary Latent Variable Models for 3D Human Pose Tracking -- Supplementary Material
Latent Spaces for Dynamic Movement Primitives
Cited by 3 (2 self)
Dynamic movement primitives (DMPs) have been proposed as a powerful, robust and adaptive tool for planning robot trajectories based on demonstrated example movements. Adaptation of DMPs to new task requirements becomes difficult when demonstrated trajectories are only available in joint space, because their parameters do not in general correspond to variables meaningful for the task. This problem becomes more severe with increasing number of degrees of freedom and hence is particularly an issue for humanoid movements. It has been shown that DMP parameters can directly relate to task variables when DMPs are learned in latent spaces resulting from dimensionality reduction of demonstrated trajectories. As we show here, however, standard dimensionality reduction techniques do not in general provide adequate latent spaces, which need to be highly regular. In this work we concentrate on learning discrete (point-to-point) movements and propose a modification of a powerful nonlinear dimensionality reduction technique (Gaussian Process Latent Variable Model). Our modification makes the GPLVM more suitable for the use of DMPs by favouring latent spaces with highly regular structure. Even though in this case the user has to provide a structure hypothesis, we show that its precise choice is not important in order to achieve good results. Additionally, we can overcome one of the main disadvantages of the GPLVM with this modification: its dependence on the initialisation of the latent space. We motivate our approach on data from a 7-DoF robotic arm and demonstrate its feasibility on a high-dimensional human motion capture data set.
Transfering Nonlinear Representations using Gaussian Processes with a Shared Latent Space
- 2007
Cited by 1 (1 self)
When a series of problems are related, representations derived from learning earlier tasks may be useful in solving later problems. In this paper we propose a novel approach to transfer learning with low-dimensional, non-linear latent spaces. We show how such representations can be jointly learned across multiple tasks in a discriminative probabilistic regression framework. When transferred to new tasks with relatively few training examples, learning can be faster and/or more accurate. Experiments on a digit recognition task show significantly improved performance when compared to baseline performance with the original feature representation or with a representation derived from a semi-supervised learning approach.
Backing Off: Hierarchical Decomposition of Activity for 3D Novel Pose Recovery
- 2009
Cited by 1 (1 self)
For model-based 3D human pose estimation, even simple models of the human body lead to high-dimensional state spaces. Where the class of activity is known a priori, low-dimensional activity models learned from training data make possible a thorough and efficient search for the best pose. Conversely, searching for solutions in the full state space places no restriction on the class of motion to be recovered, but is both difficult and expensive. This paper explores a potential middle ground between these approaches, using the hierarchical Gaussian process latent variable model to learn activity at different hierarchical scales within the human skeleton. We show that by training on full-body activity data, then descending through the hierarchy in stages and exploring subtrees independently of one another, novel poses may be recovered. Experimental results on motion capture data and monocular video sequences demonstrate the utility of the approach, and comparisons are drawn with existing low-dimensional activity models.
Monocular 3D Human Motion Tracking Using Dynamic Probabilistic Latent Semantic Analysis
Cited by 1 (0 self)
We propose a new statistical approach to human motion modeling and tracking that utilizes probabilistic latent semantic analysis (PLSA) models to describe the mapping of image features to 3D human pose estimates. PLSA has been successfully used to model the co-occurrence of dyadic data on problems such as image annotation, where image features are mapped to word categories via latent variable semantics. We apply the PLSA approach to motion tracking by extending it to a sequential setting where the latent variables describe intrinsic motion semantics linking human figure appearance to 3D pose estimates. This approach is in contrast to many current methods that directly learn the often high-dimensional image-to-pose mappings and utilize subspace projections as a constraint on the pose space alone. As a consequence, such mappings may often exhibit increased computational complexity and insufficient generalization performance. We demonstrate the utility of the proposed model on a synthetic dataset and on the task of 3D human motion tracking in monocular image sequences with arbitrary camera views. Our experiments show that the dynamic PLSA approach can produce accurate pose estimates at a fraction of the computational cost of alternative subspace tracking methods.
Physics-Based Human Motion Modeling for People Tracking: A Short Tutorial
- 2009
Physics-based models have proved to be effective in modeling how people move and interact with the environment. Such dynamical models are prevalent in computer graphics and robotics, where they allow physically plausible animation and/or simulation of humanoid motion. Similar models have also proved useful in biomechanics, allowing clinically meaningful analysis of human motion in terms of muscle and ground reaction forces. In computer vision the use of such models (e.g., as priors for video-based human pose tracking) has been limited. Most prior models in vision, to date, take the form of kinematic priors that can effectively be learned from motion capture data, but are inherently unable to explicitly account for physical plausibility of recovered motions (e.g., consistency with gravity, ground interactions, inertia, etc.). As a result, many current methods suffer from visually unpleasant artifacts (e.g., out-of-plane rotations, foot skate, etc.), especially when one is limited to monocular observations. Recently, physics-based prior models have been shown to address some of these issues. We posit that physics-based prior models are among the next important steps in developing more robust methods to track human motion over time. That said, the models involved are conceptually challenging and carry a high overhead for those unfamiliar with Newtonian mechanics; furthermore, good references that address practical issues of importance (particularly as they apply to vision problems) are scarce. This document will cover the motivation for the use of physics-based models for tracking of articulated objects (e.g., people), as well as the formalism required for someone unfamiliar with these models to easily get started. This document is part of a larger set of materials (slides, notes, and Matlab code) that will allow a capable novice to proceed along this innovative research path.
Two Distributed-State Models For Generating High-Dimensional Time Series
In this paper we develop a class of nonlinear generative models for high-dimensional time series. We first propose a model based on the restricted Boltzmann machine (RBM) that uses an undirected model with binary latent variables and real-valued “visible” variables. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. This “conditional” RBM (CRBM) makes on-line inference efficient and allows us to use a simple approximate learning procedure. We demonstrate the power of our approach by synthesizing various sequences from a model trained on motion capture data and by performing on-line filling in of data lost during capture. We extend the CRBM in a way that preserves its most important computational properties and introduces multiplicative three-way interactions that allow the effective interaction weight between two variables to be modulated by the dynamic state of a third variable. We introduce a factoring of the implied three-way weight tensor to permit a more compact parameterization. The resulting model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve its ability to blend motion styles or to transition smoothly among them. Videos and source code can be found at
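To make the conditional RBM idea above concrete, here is a minimal NumPy sketch of how directed connections from recent visible frames can act as dynamic biases on the current visible and hidden units. It is an illustration under stated assumptions and not the authors' code: the function names, the parameter names (`W`, `A`, `B`, `b_vis`, `b_hid`), and the choice of unit-variance Gaussian visible units are all hypothetical, and the three-way factored extension is not covered.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


def crbm_hidden_probs(v_t, history, W, A, B, b_vis, b_hid):
    """Infer hidden-unit probabilities for one time step of a CRBM sketch.

    v_t     : (n_vis,)          current real-valued visible frame
    history : (n_vis * order,)  concatenated previous visible frames
    W       : (n_vis, n_hid)    undirected visible-hidden weights
    A       : (n_vis, n_vis * order)  directed past-to-visible weights (assumed name)
    B       : (n_hid, n_vis * order)  directed past-to-hidden weights (assumed name)
    """
    # The past frames only shift the biases, so inference given the
    # history stays as cheap as in an ordinary RBM.
    dyn_vis = b_vis + A @ history
    dyn_hid = b_hid + B @ history
    p_h = sigmoid(W.T @ v_t + dyn_hid)
    return p_h, dyn_vis, dyn_hid


def crbm_visible_mean(h, W, dyn_vis):
    """Mean of the visible units given hidden states (unit variance assumed)."""
    return W @ h + dyn_vis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_vis, n_hid, order = 4, 3, 2
    W = 0.1 * rng.standard_normal((n_vis, n_hid))
    A = 0.1 * rng.standard_normal((n_vis, n_vis * order))
    B = 0.1 * rng.standard_normal((n_hid, n_vis * order))
    history = rng.standard_normal(n_vis * order)
    v_t = rng.standard_normal(n_vis)
    p_h, dyn_vis, _ = crbm_hidden_probs(
        v_t, history, W, A, B, np.zeros(n_vis), np.zeros(n_hid)
    )
    print(p_h, crbm_visible_mean(p_h, W, dyn_vis))
```

Because the history enters only through the bias terms, the hidden units remain conditionally independent given the current frame and the past, which is one way to read the abstract's claim that on-line inference is efficient.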

