Results 1 - 5 of 5
Towards View-Invariant Expression Analysis Using Analytic Shape Manifolds
Abstract
Facial expression analysis is an important component of effective human-computer interaction. However, to develop robust and generalizable models for expression analysis, one needs to break the dependence of the models on the choice of the camera's coordinate frame, i.e., expression models should generalize across facial poses. To do this systematically, one needs to understand the space of observed images subject to projective transformations. However, since the projective shape-space is cumbersome to work with, we address this problem by deriving models for expressions on the affine shape-space as an approximation to the projective shape-space, using a Riemannian interpretation of the deformations that facial expressions cause on different parts of the face. We use landmark configurations to represent facial deformations and exploit the fact that the affine shape-space can be studied using the Grassmann manifold. This representation enables us to perform various expression analysis and recognition algorithms without the need for normalization as a preprocessing step. We extend some of the available approaches for expression analysis to the Grassmann manifold and experimentally show promising results, paving the way for a more general theory of view-invariant expression analysis.
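The link between affine shape and the Grassmann manifold mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes 2D landmarks stored as an n x 2 NumPy array, and it uses the standard fact that an invertible affine map leaves the column space of the centered landmark matrix unchanged. The function names and the random test data are hypothetical.

```python
import numpy as np

def affine_shape_basis(landmarks):
    """Orthonormal basis (n x 2) spanning the column space of centered landmarks.

    Centering removes translation; taking the column space removes any
    invertible 2 x 2 linear map, so the result depends only on affine shape.
    """
    centered = landmarks - landmarks.mean(axis=0)
    q, _ = np.linalg.qr(centered)  # reduced QR: q has shape (n, 2)
    return q

def principal_angles(basis_a, basis_b):
    """Principal angles between two subspaces given by orthonormal bases."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def grassmann_distance(basis_a, basis_b):
    """Geodesic (arc-length) distance: the 2-norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(basis_a, basis_b)))

# Hypothetical data: 10 random landmarks, an affine-warped copy, and an unrelated shape.
rng = np.random.default_rng(0)
face = rng.normal(size=(10, 2))
linear_part = np.array([[1.5, 0.4], [-0.3, 0.8]])  # invertible (determinant 1.32)
shift = np.array([3.0, -2.0])
warped = face @ linear_part + shift
other = rng.normal(size=(10, 2))

d_same = grassmann_distance(affine_shape_basis(face), affine_shape_basis(warped))
d_other = grassmann_distance(affine_shape_basis(face), affine_shape_basis(other))
print(f"same shape, different view: {d_same:.2e}")
print(f"different shape:            {d_other:.2e}")
```

The first distance is zero up to floating-point error, which is why no pose normalization step is needed before comparing shapes in this representation. Real camera motion is projective, so the affine model is only an approximation, as the abstract notes.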
Statistical Computations on Grassmann and Stiefel Manifolds for Image and Video-Based Recognition
2010
Abstract
In this paper, we examine image and video based recognition applications where the underlying models have a special structure – the linear subspace structure. We discuss how commonly used parametric models for videos and image-sets can be described using the unified framework of Grassmann and Stiefel manifolds. We first show that the parameters of linear dynamic models are finite dimensional linear subspaces of appropriate dimensions. Unordered image-sets as samples from a finite-dimensional linear subspace naturally fall under this framework. We show that the study of inference over subspaces can be naturally cast as an inference problem on the Grassmann manifold. To perform recognition using subspace-based models, we need tools from the Riemannian geometry of the Grassmann manifold. This involves a study of the geometric properties of the space, appropriate definitions of Riemannian metrics, and definition of geodesics. Further, we derive statistical models of inter- and intra-class variations that respect the geometry of the space. We apply techniques such as intrinsic and extrinsic statistics to enable maximum-likelihood classification. We also provide algorithms for unsupervised clustering derived from the geometry of the manifold. Finally, we demonstrate the improved performance of these methods in a wide variety of vision applications such as activity ...
A preliminary version of this paper appeared in [1].
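As a rough illustration of the extrinsic statistics mentioned in this abstract, the sketch below fits a subspace to each image-set and averages the subspaces by averaging their projection matrices. This is one common extrinsic (projection-based) mean, not necessarily the estimator used in the paper. The helper names, dimensions, and synthetic data are all assumptions made for the example.

```python
import numpy as np

def subspace_basis(samples, dim):
    """Top-`dim` left singular vectors of a (features x samples) matrix."""
    u, _, _ = np.linalg.svd(samples, full_matrices=False)
    return u[:, :dim]

def extrinsic_mean(bases):
    """Average the projectors B B^T, then map back to the Grassmann manifold
    by keeping the eigenvectors with the largest eigenvalues."""
    dim = bases[0].shape[1]
    mean_projector = sum(b @ b.T for b in bases) / len(bases)
    _, eigvecs = np.linalg.eigh(mean_projector)  # eigenvalues in ascending order
    return eigvecs[:, -dim:]

def projection_distance(basis_a, basis_b):
    """Projection metric: Frobenius norm of the projector difference over sqrt(2)."""
    diff = basis_a @ basis_a.T - basis_b @ basis_b.T
    return float(np.linalg.norm(diff) / np.sqrt(2))

# Hypothetical data: five noisy "image-sets" drawn from one 3-dim subspace of R^20.
rng = np.random.default_rng(1)
true_basis, _ = np.linalg.qr(rng.normal(size=(20, 3)))
image_sets = [
    true_basis @ rng.normal(size=(3, 15)) + 0.001 * rng.normal(size=(20, 15))
    for _ in range(5)
]
bases = [subspace_basis(s, dim=3) for s in image_sets]
mean_basis = extrinsic_mean(bases)
print(f"distance from mean to true subspace: "
      f"{projection_distance(mean_basis, true_basis):.2e}")
```

Working with projectors sidesteps the fact that a subspace has many orthonormal bases: two bases of the same subspace give the same projector, so the mean does not depend on the basis chosen for each set.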
Generalized Time Warping for Alignment of Human Behavior
Abstract
Temporal alignment of human motion performing similar activities has been a topic of recent interest due to its many applications in animation, tele-rehabilitation or activity recognition. This paper presents generalized time warping (GTW), an extension of dynamic time warping (DTW) for temporally aligning multi-modal sequences from multiple subjects performing similar activities. GTW solves three major drawbacks of existing approaches based on DTW: (1) GTW provides a feature weighting layer to adapt different modalities (e.g., video and motion capture data), (2) GTW extends DTW by allowing a more flexible time warping as a combination of monotonic functions, (3) unlike DTW, which typically has a quadratic cost, GTW has linear complexity in terms of the length of the sequence. Experimental results demonstrate that GTW can efficiently solve the multi-modal temporal alignment problem, and outperforms state-of-the-art methods for temporal alignment of signals with the same modality.
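For context on the quadratic cost this abstract refers to, here is a minimal sketch of classic DTW on two scalar sequences. It is the baseline that GTW extends, not GTW itself. The function name, the absolute-difference local cost, and the toy sequences are assumptions chosen for illustration.

```python
def dtw(x, y):
    """Classic dynamic time warping between two sequences of numbers.

    Returns (total_cost, path), where path is a list of aligned index pairs.
    The table has (len(x) + 1) * (len(y) + 1) cells, hence the quadratic cost.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance x only
                cost[i][j - 1],      # advance y only
            )

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path

# Toy example: the second sequence is the first with some samples repeated.
slow = [0, 0, 1, 2, 2, 3]
fast = [0, 1, 2, 3]
total, alignment = dtw(fast, slow)
print(total, alignment)
```

The nested loops fill every cell of the cost table, so time and memory grow with the product of the two sequence lengths. That scaling is the drawback item (3) of the abstract addresses.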
Human Action Recognition using Multiple Views: A Comparative Perspective on Recent Developments
Abstract
This paper presents a review and comparative study of recent multi-view 2D and 3D approaches for human action recognition. The approaches are reviewed and categorized according to their nature. We report a comparison of the most promising methods using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) and the i3DPost Multi-View Human Action and Interaction Dataset. Additionally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D human action recognition.

