Results 1-10 of 98
Gaussian process latent variable models for visualisation of high dimensional data
 Adv. in Neural Inf. Proc. Sys., 2004
Cited by 145 (5 self)
We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real-world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.
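For context, the following is a minimal sketch of the objective used by a standard (non-Bayesian) GPLVM: the log-likelihood log p(Y | X) for fixed latent inputs X. It is not the variational bound described in the abstract, which additionally integrates X out. The RBF kernel, the function names, and the hyperparameter defaults are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel matrix for latent points X of shape (N, Q)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale ** 2)

def gplvm_log_likelihood(X, Y, noise=0.1, variance=1.0, lengthscale=1.0):
    """log p(Y | X) with each of the D output columns drawn from the same GP.

    X: (N, Q) latent points, Y: (N, D) observations (assumed centred).
    """
    N, D = Y.shape
    K = rbf_kernel(X, variance, lengthscale) + noise * np.eye(N)
    L = np.linalg.cholesky(K)                             # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))   # K^{-1} Y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log |K|
    return (-0.5 * D * log_det
            - 0.5 * np.sum(Y * alpha)                     # tr(Y^T K^{-1} Y)
            - 0.5 * N * D * np.log(2.0 * np.pi))
```

In a point-estimate GPLVM this quantity would be maximised with respect to X and the kernel hyperparameters; the Bayesian treatment replaces it with a lower bound on the marginal likelihood.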
People-tracking-by-detection and people-detection-by-tracking
 In CVPR’08
Cited by 100 (7 self)
Both detecting and tracking people are challenging problems, especially in complex real-world scenes that commonly involve multiple people, complicated occlusions, and cluttered or even moving backgrounds. People detectors have been shown to be able to locate pedestrians even in complex street scenes, but false positives have remained frequent. The identification of particular individuals has remained challenging as well. On the other hand, tracking methods are able to find a particular individual in image sequences, but are severely challenged by real-world scenarios such as crowded street scenes. In this paper, we combine the advantages of both detection and tracking in a single framework. The approximate articulation of each person is detected in every frame based on local features that model the appearance of individual body parts. Prior knowledge on possible articulations and temporal coherency within a walking cycle are modeled using a hierarchical Gaussian process latent variable model (hGPLVM). We show how the combination of these results improves hypotheses for position and articulation of each person in several subsequent frames. We present experimental results that demonstrate how this allows us to detect and track multiple people in cluttered scenes with reoccurring occlusions.
Non-Rigid Structure-From-Motion: Estimating Shape and Motion with Hierarchical Priors
2007
Cited by 56 (1 self)
This paper describes methods for recovering time-varying shape and motion of non-rigid 3D objects from uncalibrated 2D point tracks. For example, given a video recording of a talking person, we would like to estimate the 3D shape of the face at each instant, and learn a model of facial deformation. Time-varying shape is modeled as a rigid transformation combined with a non-rigid deformation. Reconstruction is ill-posed if arbitrary deformations are allowed, and thus additional assumptions about deformations are required. We first suggest restricting shapes to lie within a low-dimensional subspace, and describe estimation algorithms. However, this restriction alone is insufficient to constrain reconstruction. To address these problems, we propose a reconstruction method using a Probabilistic Principal Components Analysis (PPCA) shape model, and an estimation algorithm that simultaneously estimates 3D shape and motion for each instant, learns the PPCA model parameters, and robustly fills in missing data points. We then extend the model to capture temporal dynamics in object shape, allowing the algorithm to robustly handle severe cases of missing data.
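As a rough illustration of the PPCA building block mentioned above, the sketch below fits a PPCA model to fully observed data using the standard closed-form maximum-likelihood solution. It is not the paper's algorithm, which estimates shape and motion jointly and handles missing points; the function name and arguments are hypothetical.

```python
import numpy as np

def ppca_ml(Y, q):
    """Closed-form maximum-likelihood PPCA fit (fully observed data only).

    Y: (N, D) data matrix, q: latent dimension with q < D.
    Returns the mean, the (D, q) loading matrix W, and the noise variance.
    """
    N, D = Y.shape
    mu = Y.mean(axis=0)
    S = (Y - mu).T @ (Y - mu) / N                 # sample covariance
    evals, evecs = np.linalg.eigh(S)              # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]    # sort descending
    sigma2 = evals[q:].mean()                     # average discarded variance
    scale = np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    W = evecs[:, :q] * scale                      # scale each principal axis
    return mu, W, sigma2
```

The implied model covariance is W Wᵀ + σ²I, so the q leading directions keep their sample variance while the remaining directions share a single isotropic noise level.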
WiFi-SLAM Using Gaussian Process Latent Variable Models
 In Proceedings of IJCAI 2007
Cited by 51 (5 self)
WiFi localization, the task of determining the physical location of a mobile device from wireless signal strengths, has been shown to be an accurate method of indoor and outdoor localization and a powerful building block for location-aware applications. However, most localization techniques require a training set of signal strength readings labeled against a ground truth location map, which is prohibitive to collect and maintain as maps grow large. In this paper we propose a novel technique for solving the WiFi SLAM problem using the Gaussian Process Latent Variable Model (GPLVM) to determine the latent-space locations of unlabeled signal strength data. We show how GPLVM, in combination with an appropriate motion dynamics model, can be used to reconstruct a topological connectivity graph from a signal strength sequence which, in combination with the learned Gaussian Process signal strength model, can be used to perform efficient localization.
Twin Gaussian Processes for Structured Prediction
2010
Cited by 34 (4 self)
... generic structured prediction method that uses Gaussian process (GP) priors on both covariates and responses, both multivariate, and estimates outputs by minimizing the Kullback-Leibler divergence between two GPs modeled as normal distributions over finite index sets of training and testing examples, emphasizing the goal that similar inputs should produce similar percepts and this should hold, on average, between their marginal distributions. TGP captures not only the interdependencies between covariates, as in a typical GP, but also those between responses, so correlations among both inputs and outputs are accounted for. TGP is exemplified, with promising results, for the reconstruction of 3d human poses from monocular and multi-camera video sequences in the recently introduced HumanEva benchmark, where we achieve 5 cm error on average per 3d marker for models trained jointly, using data from multiple people and multiple activities. The method is fast and automatic: it requires no hand-crafting of the initial pose, camera calibration parameters, or the availability of a 3d body model associated with human subjects used for training or testing.
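The quantity at the heart of this description is a Kullback-Leibler divergence between two Gaussians defined over the same finite index set. The sketch below computes it for the zero-mean case; it shows only this building block, not the TGP prediction procedure itself, and the function name is hypothetical.

```python
import numpy as np

def gaussian_kl_zero_mean(A, B):
    """KL( N(0, A) || N(0, B) ) for symmetric positive-definite A and B."""
    n = A.shape[0]
    # Log-determinants via Cholesky factors for numerical stability.
    logdet_A = 2.0 * np.sum(np.log(np.diag(np.linalg.cholesky(A))))
    logdet_B = 2.0 * np.sum(np.log(np.diag(np.linalg.cholesky(B))))
    trace_term = np.trace(np.linalg.solve(B, A))   # tr(B^{-1} A)
    return 0.5 * (trace_term - n + logdet_B - logdet_A)
```

In a TGP-style setting, A and B would be kernel matrices built from inputs and from outputs respectively, and the divergence would be minimised with respect to the unknown test output.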
Dynamic imitation in a humanoid robot through nonparametric probabilistic inference
 In Proceedings of Robotics: Science and Systems (RSS’06), 2006
Cited by 34 (5 self)
We tackle the problem of learning imitative whole-body motions in a humanoid robot using probabilistic inference in Bayesian networks. Our inference-based approach affords a straightforward method to exploit rich yet uncertain prior information obtained from human motion capture data. Dynamic imitation implies that the robot must interact with its environment and account for forces such as gravity and inertia during imitation. Rather than explicitly modeling these forces and the body of the humanoid as in traditional approaches, we show that stable imitative motion can be achieved by learning a sensor-based representation of dynamic balance. Bayesian networks provide a sound theoretical framework for combining prior kinematic information (from observing a human demonstrator) with prior dynamic information (based on previous experience) to model and subsequently infer motions which, with high probability, will be dynamically stable. By posing the problem as one of inference in a Bayesian network, we show that methods developed for approximate inference can be leveraged to efficiently perform inference of actions. Additionally, by using nonparametric inference and a nonparametric (Gaussian process) forward model, our approach does not make any strong assumptions about the physical environment or the mass and inertial properties of the humanoid robot. We propose an iterative, probabilistically constrained algorithm for exploring the space of motor commands and show that the algorithm can quickly discover dynamically stable actions for whole-body imitation of human motion. Experimental results based on simulation and subsequent execution by a HOAP-2 humanoid robot demonstrate that our algorithm is able to imitate a human performing actions such as squatting and a one-legged balance.
Gaussian Process Latent Variable Models for Human Pose Estimation
Cited by 32 (6 self)
We describe a generative approach to recover 3D human pose from image silhouettes. Our method is based on learning a shared low-dimensional latent representation capable of generating both human pose and image observations through the GPLVM [1]. We learn a dynamical model over the latent space which allows us to disambiguate between ambiguous silhouettes by temporal consistency. The model has only two free parameters and requires no manual initialization.
S.: Simultaneous learning of nonlinear manifold and dynamical models for high-dimensional time series
 In: Proc. ICCV (2007)
Cited by 26 (2 self)
I am very grateful to my advisor, Prof. Stan Sclaroff, for supporting me over the years, and for giving me so much freedom to discover my own research interests and to study diverse topics. Prof. Margrit Betke has always been a very detailed and thorough reviewer, and has kept me honest about all the details. Prof. David Fleet has provided me with invaluable suggestions on using precise words and improving the technical presentation of this thesis with his deep insight into the problem. I would like to express my gratitude to Dr. Ming-Hsuan Yang for hiring me as an intern at Honda Research Institute in 2005, where the early framework of this thesis was formulated. Dr. Fatih Porikli gave me the opportunity to work at Mitsubishi Electronics Research Lab from June to December 2008. I had a good break from my thesis research and worked on medical image analysis problems. I really appreciate his great advice and suggestions on how to start my professional career. Looking back, this journey would not have been as enjoyable and fun without all the members of the image and video computing group. I would like to thank John Isidoro for being a great mentor and for tolerating the many silly questions I had during his busiest time. Joni
Multifactor Gaussian Process Models for Style-Content Separation
Cited by 26 (5 self)
We introduce models for density estimation with multiple, hidden, continuous factors. In particular, we propose a generalization of multilinear models using nonlinear basis functions. By marginalizing over the weights, we obtain a multifactor form of the Gaussian process latent variable model. In this model, each factor is kernelized independently, allowing nonlinear mappings from any particular factor to the data. We learn models for human locomotion data, in which each pose is generated by factors representing the person’s identity, gait, and the current state of motion. We demonstrate our approach using time-series prediction, and by synthesizing novel animation from the model.
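One way to read "each factor is kernelized independently" is as a covariance built from the element-wise product of one kernel per factor. The sketch below assumes that product form and an RBF kernel for every factor; both choices, along with the function names and lengthscales, are illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """RBF kernel between rows of a (N, Q) and b (M, Q)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def multifactor_kernel(factors_a, factors_b, lengthscales):
    """Element-wise product of per-factor kernels.

    factors_a, factors_b: lists of arrays, one per factor (for example
    identity, gait, state), each with one row per data point.
    """
    K = None
    for Xa, Xb, ell in zip(factors_a, factors_b, lengthscales):
        Kf = rbf(Xa, Xb, ell)
        K = Kf if K is None else K * Kf   # product of valid kernels is a kernel
    return K
```

With this construction two poses are strongly correlated only when they are close in every factor at once, which is what makes it possible to vary one factor (such as identity) while holding the others fixed.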
Topologically-Constrained Latent Variable Models
Cited by 25 (6 self)
In dimensionality reduction approaches, the data are typically embedded in a Euclidean latent space. However, for some data sets this is inappropriate. For example, in human motion data we expect latent spaces that are cylindrical or toroidal, which are poorly captured by a Euclidean space. In this paper, we present a range of approaches for embedding data in a non-Euclidean latent space. Our focus is the Gaussian Process latent variable model. In the context of human motion modeling this allows us to (a) learn models with interpretable latent directions enabling, for example, style/content separation, and (b) generalise beyond the data set enabling us to learn transitions between motion styles even though such transitions are not present in the data.