Results 1 - 10 of 79
Automatic Analysis of Facial Expressions: The State of the Art
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2000
Cited by 207 (11 self)
This paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is discussed, too. It is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer
Dynamic Texture Recognition
, 2001
Cited by 69 (6 self)
Dynamic textures are sequences of images that exhibit some form of temporal stationarity, such as waves, steam, and foliage. We pose the problem of recognizing and classifying dynamic textures in the space of dynamical systems where each dynamic texture is uniquely represented. Since the space is non-linear, a distance between models must be defined. We examine three different distances in the space of autoregressive models and assess their power.
Recognition of Human Gaits
, 2001
Cited by 47 (9 self)
We pose the problem of recognizing different types of human gait in the space of dynamical systems where each gait is represented. Established techniques are employed to track a kinematic model of a human body in motion, and the trajectories of the parameters are used to learn a representation of a dynamical system, which defines a gait. Various types of distance between models are then computed. These computations are nontrivial because, even for the case of linear systems, the space of canonical realizations is not linear.
Pairwise Markov chains
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2003
Cited by 37 (21 self)
Abstract—We propose a new model called a Pairwise Markov Chain (PMC), which generalizes the classical Hidden Markov Chain (HMC) model. The generalization, which allows one to model more complex situations, in particular implies that in PMC the hidden process is not necessarily a Markov process. However, PMC allows one to use the classical Bayesian restoration methods like Maximum A Posteriori (MAP), or Maximal Posterior Mode (MPM). So, akin to HMC, PMC allows one to restore hidden stochastic processes, with numerous applications to signal and image processing, such as speech recognition, image segmentation, and symbol detection or classification, among others. Furthermore, we propose an original method of parameter estimation, which generalizes the classical Iterative Conditional Estimation (ICE) valid for the classical hidden Markov chain model, and whose extension to possibly non-Gaussian and correlated noise is briefly treated. Some preliminary experiments validate the interest of the new model. Index Terms—Bayesian restoration, hidden data, image segmentation, iterative conditional estimation, hidden Markov chain, pairwise Markov chain, unsupervised classification.
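A minimal sketch of MPM restoration in a pairwise chain may help make the abstract concrete. It assumes a discrete PMC in which the pair (x_n, y_n) is a Markov chain with transition table p(x_{n+1}, y_{n+1} | x_n, y_n), and computes the posterior marginals of the hidden component with a forward-backward recursion. The function names, the two-state toy model, and its probabilities are illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def mpm_restore(y, states, init, trans):
    """MPM restoration for a discrete pairwise Markov chain (illustrative).

    init[(x, y1)]           = p(x_1 = x, y_1 = y1)
    trans[(x, yp, x2, y2)]  = p(x_{n+1} = x2, y_{n+1} = y2 | x_n = x, y_n = yp)
    """
    def normalize(d):
        total = sum(d.values())
        return {k: v / total for k, v in d.items()}

    n = len(y)
    # Forward pass: alpha[t][x] is proportional to p(x_t = x, y_1..y_t).
    alpha = [normalize({x: init[(x, y[0])] for x in states})]
    for t in range(1, n):
        alpha.append(normalize({
            x2: sum(alpha[t - 1][x] * trans[(x, y[t - 1], x2, y[t])] for x in states)
            for x2 in states
        }))
    # Backward pass: beta[t][x] is proportional to p(y_{t+1}..y_n | x_t = x, y_t).
    beta = [dict.fromkeys(states, 1.0) for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = normalize({
            x: sum(trans[(x, y[t], x2, y[t + 1])] * beta[t + 1][x2] for x2 in states)
            for x in states
        })
    # Posterior marginals p(x_t | y_1..y_n); MPM picks the mode at each t.
    posteriors = [normalize({x: alpha[t][x] * beta[t][x] for x in states})
                  for t in range(n)]
    estimate = [max(post, key=post.get) for post in posteriors]
    return estimate, posteriors


def toy_model():
    """Hypothetical two-state PMC: the hidden transition depends on the
    previous observation as well as the previous hidden state."""
    states = (0, 1)

    def emit(y, x):
        return 0.8 if y == x else 0.2

    init = {(x, y): 0.5 * emit(y, x) for x in states for y in states}
    trans = {}
    for x, y_prev, x2, y2 in product(states, repeat=4):
        stay = 0.9 if y_prev == x else 0.6
        move = stay if x2 == x else 1.0 - stay
        trans[(x, y_prev, x2, y2)] = move * emit(y2, x2)
    return states, init, trans


states, init, trans = toy_model()
estimate, posteriors = mpm_restore([0, 0, 0, 0], states, init, trans)
```

In the toy model the next hidden state depends on the previous observation, which a classical hidden Markov chain does not allow, yet the recursion has the same cost as the usual forward-backward pass.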
Multiple Camera Tracking of Interacting and Occluded Human Motion
- Proceedings of the IEEE
, 2001
Cited by 33 (3 self)
We propose a distributed, real-time computing platform for tracking multiple interacting persons in motion. To combat the negative effects of occlusion and articulated motion we use a multi-view implementation, where each view is first independently processed on a dedicated processor. This monocular processing uses a predictor-corrector filter to weigh re-projections of 3-D position estimates, obtained by the central processor, against observations of measurable image motion. The corrected state vectors from each view provide input observations to a Bayesian belief network, in the central processor, with a dynamic, multidimensional topology that varies as a function of scene content and feature confidence. The Bayesian net fuses independent observations from multiple cameras by iteratively resolving independency relationships and confidence levels within the graph, thereby producing the most likely vector of 3-D state estimates given the available data. To maintain temporal continuity we follow the network with a layer of Kalman filtering that updates the 3-D state estimates. We demonstrate the efficacy of the proposed system using a multi-view sequence of several people in motion. Our experiments suggest that, when compared with data fusion based on averaging, the proposed technique yields a noticeable improvement in tracking accuracy.
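The predictor-corrector idea mentioned in this abstract can be illustrated with a one-dimensional Kalman step. This is only a generic sketch under simple assumptions (a scalar state, a random-walk motion model, and made-up noise variances q and r); it is not the multi-view filter described in the paper.

```python
def kalman_step(x, p, z, q=0.01, r=0.25):
    """One predict/correct cycle of a scalar Kalman filter (illustrative).

    x, p : previous state estimate and its variance
    z    : new measurement
    q, r : assumed process and measurement noise variances
    """
    # Predict: a random-walk model keeps the state and grows the uncertainty.
    x_pred = x
    p_pred = p + q
    # Correct: blend prediction and measurement according to the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Example: filter a short sequence of noisy position readings.
x, p = 0.0, 1.0
for z in [1.1, 0.9, 1.05, 0.98]:
    x, p = kalman_step(x, p, z)
```

The gain weighs the measurement more heavily when the predicted variance is large relative to the measurement noise, which is the same weighing of prediction against observation that the abstract describes.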
Extraction of 2d motion trajectories and its application to hand gesture recognition
- PAMI
, 2002
Cited by 26 (1 self)
Abstract—We present an algorithm for extracting and classifying two-dimensional motion in an image sequence based on motion trajectories. First, a multiscale segmentation is performed to generate homogeneous regions in each frame. Regions between consecutive frames are then matched to obtain two-view correspondences. Affine transformations are computed from each pair of corresponding regions to define pixel matches. Pixel matches over consecutive image pairs are concatenated to obtain pixel-level motion trajectories across the image sequence. Motion patterns are learned from the extracted trajectories using a time-delay neural network. We apply the proposed method to recognize 40 hand gestures of American Sign Language. Experimental results show that motion patterns of hand gestures can be extracted and recognized accurately using motion trajectories. Index Terms—Motion segmentation, motion analysis, motion trajectory, American Sign Language, hand gesture recognition, time-delay neural network.
Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition
Cited by 23 (4 self)
Abstract—Interpretation of images and videos containing humans interacting with different objects is a daunting task. It involves understanding scene/event, analyzing human movements, recognizing manipulable objects, and observing the effect of the human movement on those objects. While each of these perceptual tasks can be conducted independently, recognition rate improves when interactions between them are considered. Motivated by psychological studies of human perception, we present a Bayesian approach which integrates various perceptual tasks involved in understanding human-object interactions. Previous approaches to object and action recognition rely on static shape/appearance feature matching and motion analysis, respectively. Our approach goes beyond these traditional approaches and applies spatial and functional constraints on each of the perceptual elements for coherent semantic interpretation. Such constraints allow us to recognize objects and actions when the appearances are not discriminative enough. We also demonstrate the use of such constraints in recognition of actions from static images without using any motion information. Index Terms—Action recognition, object recognition, functional recognition.
Learning Dynamics for Exemplar-based Gesture Recognition
- IN IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
, 2003
Cited by 19 (2 self)
This paper addresses the problem of capturing the dynamics for exemplar-based recognition systems. A traditional HMM provides a probabilistic tool for capturing system dynamics, and in the exemplar paradigm HMM states are typically coupled with the exemplars. Alternatively, we propose a non-parametric HMM approach that uses a discrete HMM with arbitrary states (decoupled from exemplars) to capture the dynamics over a large exemplar space, where a non-parametric estimation approach is used to model the exemplar distribution. This reduces the need for lengthy and non-optimal training of the HMM observation model. We used the proposed approach for view-based recognition of gestures. The approach is based on representing each gesture as a sequence of learned body poses (exemplars). The gestures are recognized through a probabilistic framework for matching these body poses and for imposing temporal constraints between different poses using the proposed non-parametric HMM.
Learning appearance and transparency manifolds of occluded objects in layers
- CVPR03, I:45–52
, 2003
Cited by 19 (5 self)
Videos and software available at www.psi.toronto.edu/layers.html. By mapping a set of input images to points in a low-dimensional manifold or subspace, it is possible to efficiently account for a small number of degrees of freedom. For example, images of a person walking can be mapped to a 1-dimensional manifold that measures the phase of the person’s gait. However, when the object is moving around the frame and being occluded by other objects, standard manifold modeling techniques (e.g., principal components analysis, factor analysis, locally linear embedding) try to account for global motion and occlusion. We show how factor analysis can be incorporated into a generative model of layered, 2.5-dimensional vision, to jointly locate objects, resolve occlusion ambiguities, and learn models of the appearance manifolds of objects. We demonstrate the algorithm on a video consisting of four occluding objects, two of which are people who are walking, and occlude each other for most of the duration of the video. Whereas standard manifold modeling techniques fail to extract information about the gaits, the layered model successfully extracts a periodic representation of the gait of each person.
Hierarchical Language-based Representation of Events in Video Streams
, 2003
Cited by 17 (0 self)
We aim to define an event ontology that allows natural representation of complex spatio-temporal events common in the physical world by a composition of simpler events. The events are abstracted into three hierarchies. Primitive events are defined directly from the mobile object properties. Single-thread composite events are a number of primitive events with temporal sequencing. Multi-thread composite events are a number of single-thread events with temporal/spatial/logical relationships. This hierarchical event representation naturally leads to a language description of the events. We define an Event Recognition Language (ERL) which allows the users to define the events of interest conveniently without interacting with the low level processing in the program. We will also briefly mention some approaches to compute the proposed representation.
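The three-level hierarchy in this abstract can be sketched as a small set of data structures. The sketch below is not ERL syntax, which the abstract does not show; the class names, the property keys "speed" and "dist", the thresholds, and the greedy in-order matching are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, float]      # properties of one mobile object in one frame
Interval = Tuple[int, int]    # (first frame, last frame), inclusive


@dataclass
class Primitive:
    """Primitive event: a test on the object's properties in a single frame."""
    name: str
    holds: Callable[[State], bool]


@dataclass
class SingleThread:
    """Single-thread composite: primitive events occurring in temporal order."""
    name: str
    steps: List[Primitive]

    def detect(self, track: List[State]) -> Optional[Interval]:
        start, i = None, 0
        for t, state in enumerate(track):
            if self.steps[i].holds(state):
                if start is None:
                    start = t
                i += 1
                if i == len(self.steps):
                    return (start, t)
        return None


@dataclass
class MultiThread:
    """Multi-thread composite: two single-thread events plus a relation."""
    name: str
    first: SingleThread
    second: SingleThread
    relation: Callable[[Interval, Interval], bool]

    def detect(self, track_a: List[State], track_b: List[State]) -> bool:
        a = self.first.detect(track_a)
        b = self.second.detect(track_b)
        return a is not None and b is not None and self.relation(a, b)


moving = Primitive("moving", lambda s: s["speed"] > 0.5)
near = Primitive("near", lambda s: s["dist"] < 1.0)
stopped = Primitive("stopped", lambda s: s["speed"] <= 0.5)
approach = SingleThread("approach_and_stop", [moving, near, stopped])
follow = MultiThread("a_arrives_before_b", approach, approach,
                     lambda a, b: a[1] < b[0])

track_a = [{"speed": 1.0, "dist": 5.0}, {"speed": 1.0, "dist": 0.5},
           {"speed": 0.0, "dist": 0.5}]
track_b = [{"speed": 0.0, "dist": 5.0}] * 3 + track_a
result = follow.detect(track_a, track_b)
```

Keeping the relation as a plain function over intervals lets the same multi-thread class express temporal, spatial, or logical constraints without touching the lower levels.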

