## The Role of Manifold Learning in Human Motion Analysis

Citations: 2 (0 self)

### BibTeX

@MISC{Elgammal_therole,
    author = {Ahmed Elgammal and Chan Su Lee},
    title = {The Role of Manifold Learning in Human Motion Analysis},
    year = {}
}

### Abstract

The human body is an articulated object with many degrees of freedom. Despite the high dimensionality of the configuration space, many human motion activities lie intrinsically on low-dimensional manifolds. Although the intrinsic body configuration manifolds might be very low in dimensionality, the resulting appearance manifolds are challenging to model given the various aspects that affect appearance, such as the shape and appearance of the person performing the motion, or variation in the viewpoint or illumination. Our objective is to learn representations for the shape and the appearance of moving (dynamic) objects that support tasks such as synthesis, pose recovery, reconstruction, and tracking. We studied various approaches for representing global deformation manifolds that preserve their geometric structure. Given such representations, we can learn generative models for dynamic shape and appearance. We also address the fundamental question of separating style and content on nonlinear manifolds representing dynamic objects. We learn factorized generative models that explicitly decompose the intrinsic body configuration (content), as a function of time, from the appearance/shape (style factors) of the person performing the action, as time-invariant parameters. We show results on pose recovery, body tracking, gait recognition, as well as facial expression tracking and recognition.

### Citations

3003 | Eigenfaces for Recognition
- Turk, Pentland
- 1991
Citation Context ... models? Linear models, such as PCA [31], have been widely used in appearance modeling to discover subspaces for variations. For example, PCA has been used extensively for face recognition such as in [52, 1, 15, 47] and to model the appearance and view manifolds for 3D object recognition as in [53]. Such subspace analysis can be further extended to decompose multiple orthogonal factors using bilinear models and ...

2269 | Principal Component Analysis
- Jolliffe
- 2002
Citation Context ...ation manifold, view manifold, shape manifold, illumination manifold, etc. Linear, Bilinear and Multi-linear Models: Can we decompose the configuration using linear models? Linear models, such as PCA [31], have been widely used in appearance modeling to discover subspaces for variations. For example, PCA has been used extensively for face recognition such as in [52, 1, 15, 47] and to model the appeara...

1718 | Nonlinear dimensionality reduction by locally linear embedding
- Roweis, Saul
Citation Context ...recover such manifold. Nonlinear Dimensionality Reduction and Decomposition of Orthogonal Factors: Recently some promising frameworks for nonlinear dimensionality reduction have been introduced, e.g. [75, 68, 2, 10, 38, 85, 50]. Such approaches can achieve embedding of nonlinear manifolds through changing the metric from the original space to the embedding space based on local structure of the manifold. While there are vari...

1636 | Eigenfaces vs. fisherfaces: recognition using class specific linear projection
- Belhumeur, Hespanha, et al.
- 1997
Citation Context ... models? Linear models, such as PCA [31], have been widely used in appearance modeling to discover subspaces for variations. For example, PCA has been used extensively for face recognition such as in [52, 1, 15, 47] and to model the appearance and view manifolds for 3D object recognition as in [53]. Such subspace analysis can be further extended to decompose multiple orthogonal factors using bilinear models and ...

1211 | Pfinder: real-time tracking of the human body
- Wren, Azarbayejani, et al.
- 1997
Citation Context ...e, or models for clothing, etc. Partial recovery of body configuration can also be achieved through intermediate view-based representations (models) that may or may not be tied to specific body parts [18, 12, 86, 33, 6, 27, 87, 22, 73, 24]. In such case constancy of the local appearance of individual body parts is exploited. Alternative paradigms are appearance-based and motion-based approaches where the focus is to track and recognize...

1096 | Active shape models: their training and application
- Cootes, Taylor, et al.
- 1995
Citation Context ... models? Linear models, such as PCA [31], have been widely used in appearance modeling to discover subspaces for variations. For example, PCA has been used extensively for face recognition such as in [52, 1, 15, 47] and to model the appearance and view manifolds for 3D object recognition as in [53]. Such subspace analysis can be further extended to decompose multiple orthogonal factors using bilinear models and ...

1004 | Visual learning and recognition of 3-D objects from appearance
- Murase, Nayar
- 1995
Citation Context ...ver subspaces for variations. For example, PCA has been used extensively for face recognition such as in [52, 1, 15, 47] and to model the appearance and view manifolds for 3D object recognition as in [53]. Such subspace analysis can be further extended to decompose multiple orthogonal factors using bilinear models and multi-linear tensor analysis [76, 80]. The pioneering work of Tenenbaum and Freeman ...

785 | Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation 15(6)
- Belkin, Niyogi
- 2003
Citation Context ...recover such manifold. Nonlinear Dimensionality Reduction and Decomposition of Orthogonal Factors: Recently some promising frameworks for nonlinear dimensionality reduction have been introduced, e.g. [75, 68, 2, 10, 38, 85, 50]. Such approaches can achieve embedding of nonlinear manifolds through changing the metric from the original space to the embedding space based on local structure of the manifold. While there are vari...

696 | Networks for approximation and learning
- Poggio, Girosi
- 1990
Citation Context ...a is a function. It is well known that learning a smooth mapping from examples is an ill-posed problem unless the mapping is constrained since the mapping will be undefined in other parts of the space [56]. We argue that explicit modeling of the visual manifold represents a way to constrain any mapping between the visual input and any other space. Nonlinear embedding of the manifold, as was discussed ...

648 | Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond
- Schölkopf, Smola
- 2002
Citation Context ...through solving an eigenvalue problem on such matrix. It was shown in [3, 26] that these approaches are all instances of kernel-based learning, in particular kernel principal component analysis (KPCA) [69]. In [4], an approach was proposed for embedding out-of-sample points to complement such approaches. Along the same line, our work [19, 21] introduced a general framework for mapping between input and embedding spa...

566 | EigenTracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation
- Black, Jepson
- 1998
Citation Context ...e, or models for clothing, etc. Partial recovery of body configuration can also be achieved through intermediate view-based representations (models) that may or may not be tied to specific body parts [18, 12, 86, 33, 6, 27, 87, 22, 73, 24]. In such case constancy of the local appearance of individual body parts is exploited. Alternative paradigms are appearance-based and motion-based approaches where the focus is to track and recognize...

503 | The Recognition of Human Movement Using Temporal Templates
- Bobick, Davis
- 2001
Citation Context ...al body parts is exploited. Alternative paradigms are appearance-based and motion-based approaches where the focus is to track and recognize human activities without full recovery of the 3D body pose [58, 54, 57, 59, 55, 74, 63, 7, 17]. Recently, there has been research on recovering body posture directly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51...

438 | Multidimensional Scaling
- Cox, Cox
- 2001
Citation Context ...h nonlinearity, PCA will not be able to discover the underlying manifold. Simply, linear models will not be able to interpolate intermediate poses. For the same reason, multidimensional scaling (MDS) [16] also fails to recover such manifold. Nonlinear Dimensionality Reduction and Decomposition of Orthogonal Factors: Recently some promising frameworks for nonlinear dimensionality reduction have been in...

344 | Matrix Differential Calculus with Applications in Statistics and Econometrics, revised edition
- Magnus, Neudecker
- 1999
Citation Context ...tors using bilinear models and multi-linear tensor analysis [76, 80]. The pioneering work of Tenenbaum and Freeman [76] formulated the separation of style and content using a bilinear model framework [48]. In that work, a bilinear model was used to decompose face appearance into two factors: head pose and different people as style and content interchangeably. They presented a computational framework f...

335 | Stochastic tracking of 3d human figures using 2d image motion
- Sidenbladh, Black, et al.
- 2000
Citation Context ...l constrained by the 3D body structure and the dynamics of the action being performed. Such constraints are explicitly exploited to recover the body configuration and motion in model-based approaches [32, 28, 13, 64, 62, 23, 34, 72] through explicitly specifying articulated models of the body parts, joint angles and their kinematics (or dynamics) as well as models for camera geometry and image formation. Recovering body configur...

329 | 3D model-based tracking of humans in action: A multi-view approach
- Gavrila, Davis
- 1996
Citation Context ...l constrained by the 3D body structure and the dynamics of the action being performed. Such constraints are explicitly exploited to recover the body configuration and motion in model-based approaches [32, 28, 13, 64, 62, 23, 34, 72] through explicitly specifying articulated models of the body parts, joint angles and their kinematics (or dynamics) as well as models for camera geometry and image formation. Recovering body configur...

258 | Voice puppetry
- Brand
- 1999
Citation Context ...ctly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51, 36, 70] or through learning regression models from input to output [29, 9, 66, 67, 65, 14, 60]. All these approaches pose the problem as a machine learning problem where the objective is to learn input-output mapping from input-output pairs of training data. Such approaches have great potentia...

241 | Model based vision: A program to see a walking person
- Hogg
- 1983
Citation Context ...l constrained by the 3D body structure and the dynamics of the action being performed. Such constraints are explicitly exploited to recover the body configuration and motion in model-based approaches [32, 28, 13, 64, 62, 23, 34, 72] through explicitly specifying articulated models of the body parts, joint angles and their kinematics (or dynamics) as well as models for camera geometry and image formation. Recovering body configur...

235 | Some mathematical notes on three-mode factor analysis
- Tucker
- 1966
Citation Context ...representation of image data was used in [71] for video compression and in [79, 84] for motion analysis and synthesis. N-mode analysis of higher-order tensors was originally proposed and developed in [78, 35, 48] and others. Another extension is algebraic solution for subspace clustering through generalized-PCA [83, 82]. Fig. 1. Twenty sample frames from a walking cycle from a side view. Each row represents ha...

215 | Model-based tracking of self-occluding articulated objects
- Rehg, Kanade
- 1995

209 | Cardboard People: A Parameterized Model of Articulated Motion
- Ju, Black, et al.
- 1996
Citation Context ...e, or models for clothing, etc. Partial recovery of body configuration can also be achieved through intermediate view-based representations (models) that may or may not be tied to specific body parts [18, 12, 86, 33, 6, 27, 87, 22, 73, 24]. In such case constancy of the local appearance of individual body parts is exploited. Alternative paradigms are appearance-based and motion-based approaches where the focus is to track and recognize...

199 | Fast pose estimation with parameter-sensitive hashing
- Shakhnarovich, Viola, et al.
- 2003
Citation Context ...17]. Recently, there has been research on recovering body posture directly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51, 36, 70] or through learning regression models from input to output [29, 9, 66, 67, 65, 14, 60]. All these approaches pose the problem as a machine learning problem where the objective is to learn input-outpu...

184 | Toward model-based recognition of human movements in image sequences
- Rohr
- 1994

181 | Implicit probabilistic models of human motion for synthesis and tracking
- Sidenbladh, Black, et al.

180 | Parameterized modeling and recognition of activities
- Yacoob, Black
- 1999

178 | Separating style and content with bilinear models
- Tenenbaum, Freeman
Citation Context ...and view manifolds for 3D object recognition as in [53]. Such subspace analysis can be further extended to decompose multiple orthogonal factors using bilinear models and multi-linear tensor analysis [76, 80]. The pioneering work of Tenenbaum and Freeman [76] formulated the separation of style and content using a bilinear model framework [48]. In that work, a bilinear model was used to decompose face appe...

176 | Unsupervised learning of image manifolds by semidefinite programming
- Weinberger, Saul
- 2005
Citation Context ...recover such manifold. Nonlinear Dimensionality Reduction and Decomposition of Orthogonal Factors: Recently some promising frameworks for nonlinear dimensionality reduction have been introduced, e.g. [75, 68, 2, 10, 38, 85, 50]. Such approaches can achieve embedding of nonlinear manifolds through changing the metric from the original space to the embedding space based on local structure of the manifold. While there are vari...

169 | Visual tracking of high dof articulated structures: An application to human hand tracking
- Rehg, Kanade
- 1994
Citation Context ...ese approaches involves searching high dimensional spaces (body configuration and geometric transformation) which is typically formulated deterministically as a nonlinear optimization problem, e.g. [61, 62], or probabilistically as a maximum likelihood problem, e.g. [72]. Such approaches achieve significant success when the search problem is constrained as in tracking context. However, initialization re...

167 | W4: who? when? where? what? a real time system for detecting and tracking people
- Haritaoglu, Harwood, et al.
- 1998

164 | Inferring 3D body pose from silhouettes using activity manifold learning
- Elgammal, Lee
Citation Context ... can simultaneously solve for the pose, view point, and reconstruct the input. A block diagram for recovering 3D pose and view point given learned manifold models is shown in Figure 4. The framework [20] is based on learning three components as shown in Figure 4-a: 1. Learning Manifold Representation: using nonlinear dimensionality reduction we achieve an embedding of the global deformation manifold ...

159 | Space-time gestures
- Darrell, Pentland
- 1993

155 | Estimating human body configurations using shape context matching
- Mori, Malik
- 2002
Citation Context ...17]. Recently, there has been research on recovering body posture directly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51, 36, 70] or through learning regression models from input to output [29, 9, 66, 67, 65, 14, 60]. All these approaches pose the problem as a machine learning problem where the objective is to learn input-outpu...

143 | Gaussian process latent variable models for visualization of high dimensional data
- Lawrence

138 | Probabilistic tracking in a metric space
- Toyama, Blake
- 2001
Citation Context ...n in an implicit way. Learning nonlinear deformation manifolds is typically performed in the visual input space or through intermediate representations. For example, exemplar-based approaches such as [77] implicitly model nonlinear manifolds through points (exemplars) along the manifold. Such exemplars are represented in the visual input space. ...

137 | Model-based image analysis of human motion using constraint propagation
- O’Rourke, Badler

136 | Bayesian reconstruction of 3D human motion from single-camera video
- Howe, Leventon, et al.
- 2000
Citation Context ...ctly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51, 36, 70] or through learning regression models from input to output [29, 9, 66, 67, 65, 14, 60]. All these approaches pose the problem as a machine learning problem where the objective is to learn input-output mapping from input-output pairs of training data. Such approaches have great potentia...

135 | Recognition of human body motion using phase space constraints
- Campbell, Bobick
- 1995

126 | Generalized principal component analysis (GPCA)
- Vidal, Ma, et al.
- 2003
Citation Context ...esis. N-mode analysis of higher-order tensors was originally proposed and developed in [78, 35, 48] and others. Another extension is algebraic solution for subspace clustering through generalized-PCA [83, 82]. Fig. 1. Twenty sample frames from a walking cycle from a side view. Each row represents half a cycle. Notice the similarity between the two half cycles. The right part shows the similarity matrix: ea...

120 | The CMU motion of body (Mobo) database
- Gross, Shi
- 2001
Citation Context ... In this section we show an example of learning the nonlinear manifold of gait as an example of a dynamic shape. We used CMU Mobo gait data set [25] which contains walking people from multiple synchronized views. For training we selected five people, five cycles each from four different views, i.e., total number of cycles for training is 100=5...

119 | Image representations for visual learning
- Beymer, Poggio
- 1996
Citation Context ...work to recover the pose. In order to learn such nonlinear mapping, we use Radial basis function (RBF) interpolation framework. The use of RBF for image synthesis and analysis has been pioneered by [56, 5] where RBF networks were used to learn nonlinear mappings between image space and a supervised parameter space. In our work we use RBF interpolation framework in a novel way to learn mapping from unsu...

119 | Model-based estimation of 3D human motion with occlusion based on active multi-viewpoint selection
- Kakadiaris, Metaxas

116 | A kernel view of the dimensionality reduction of manifolds
- Ham, Lee, et al.
- 2004
Citation Context ...nity matrix between data points using data dependent kernels, which reflect local manifold structure. Embedding is then achieved through solving an eigenvalue problem on such matrix. It was shown in [3, 26] that these approaches are all instances of kernel-based learning, in particular kernel principal component analysis (KPCA) [69]. In [4], an approach was proposed for embedding out-of-sample points to complement such...

111 | Inferring Body Pose without Tracking Body Parts
- Rosales, Sclaroff
Citation Context ...ctly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51, 36, 70] or through learning regression models from input to output [29, 9, 66, 67, 65, 14, 60]. All these approaches pose the problem as a machine learning problem where the objective is to learn input-output mapping from input-output pairs of training data. Such approaches have great potentia...

103 | Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering
- Bengio, Paiement, et al.
- 2003
Citation Context ...olving an eigenvalue problem on such matrix. It was shown in [3, 26] that these approaches are all instances of kernel-based learning, in particular kernel principal component analysis (KPCA) [69]. In [4], an approach was proposed for embedding out-of-sample points to complement such approaches. Along the same line, our work [19, 21] introduced a general framework for mapping between input and embedding spaces. All...

100 | A multilinear singular value decomposition
- Lathauwer, Moor, et al.
Citation Context ...es into orthogonal factors controlling the appearance of the face, including geometry (people), expressions, head pose, and illumination. They employed high order singular value decomposition (HOSVD) [37] to fit multi-linear models. Tensor representation of image data was used in [71] for video compression and in [79, 84] for motion analysis and synthesis. N-mode analysis of higher-order tensors was o...

100 | Detecting activities
- Polana, Nelson
- 1993
Citation Context ...al body parts is exploited. Alternative paradigms are appearance-based and motion-based approaches where the focus is to track and recognize human activities without full recovery of the 3D body pose [58, 54, 57, 59, 55, 74, 63, 7, 17]. Recently, there has been research on recovering body posture directly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51...

88 | Motion segmentation with missing data using powerfactorization and GPCA
- Vidal, Hartley
- 2004
Citation Context ...esis. N-mode analysis of higher-order tensors was originally proposed and developed in [78, 35, 48] and others. Another extension is algebraic solution for subspace clustering through generalized-PCA [83, 82]. Fig. 1. Twenty sample frames from a walking cycle from a side view. Each row represents half a cycle. Notice the similarity between the two half cycles. The right part shows the similarity matrix: ea...

87 | Inferring 3d structure with a statistical image-based shape model
- Grauman, Shakhnarovich, et al.
- 2003
Citation Context ...17]. Recently, there has been research on recovering body posture directly from the visual input by posing the problem as a learning problem through searching a prelabelled database of body posture [51, 36, 70] or through learning regression models from input to output [29, 9, 66, 67, 65, 14, 60]. All these approaches pose the problem as a machine learning problem where the objective is to learn input-outpu...

86 | Nonlinear manifold learning for visual speech recognition
- Bregler, Omohundro
- 1995
Citation Context ... exemplars are represented in the visual input space. HMM models provide a probabilistic piecewise linear approximation which can be used to learn nonlinear manifolds as in [11] and in [9]. Although the intrinsic body configuration manifolds might be very low in dimensionality, the resulting appearance manifolds are challenging to model given various aspects that affect the ...

84 | Multilinear subspace analysis for image ensembles
- Vasilescu, Terzopoulos
Citation Context ...ation of the body configuration to a kernel induced space and each $a_i$ is a vector representing a parameterization of orthogonal factor $i$, $C$ is a core tensor, $\times_i$ is mode-$i$ tensor product as defined in [37, 81]. For example for the gait case, a generative model for walking silhouettes for different people from different view points will be in the form $y_t = \gamma(x_t; v, s) = C \times v \times s \times \psi(x)$ (19) where $v$ is a pa...