Results 1 - 10
of
16
Probabilistic Tracking in a Metric Space
- in ICCV
, 2001
"... A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent p ..."
Abstract
-
Cited by 111 (2 self)
- Add to MetaCart
A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent probabilistic mixture distributions of object configurations. Their use avoids tedious hand-construction of object models and problems with changes of topology. Using exemplars in place of a parameterized model poses several challenges, addressed here with what we call the "Metric Mixture" (M # ) approach. The M # model has several valuable properties. Principally, it provides alternatives to standard learning algorithms by allowing the use of metrics that are not embedded in a vector space. Secondly, it uses a noise model that is learned from training data. Lastly, it eliminates any need for an assumption of probabilistic pixelwise independence. Experiments demonstrate the effectiveness of the M # model in two domains: tracking walking people using chamfer distances on binary edge images and tracking mouth movements by means of a shuffle distance. 1
Transformation-invariant clustering using the EM algorithm
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2003
"... Abstract—Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in im ..."
Abstract
-
Cited by 47 (11 self)
- Add to MetaCart
Abstract—Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models, by approximating the nonlinear transformation manifold by a discrete set of points. We show how the expectation maximization algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
Probabilistic Tracking With Exemplars in a Metric Space
- INTERNATIONAL JOURNAL OF COMPUTER VISION
, 2002
"... A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent p ..."
Abstract
-
Cited by 40 (2 self)
- Add to MetaCart
A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent probabilistic mixture distributions of object configurations. Their use avoids tedious hand-construction of object models, and problems with changes of topology. Using exemplars
A comparison of algorithms for inference and learning in probabilistic graphical models
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2005
"... Computer vision is currently one of the most exciting areas of artificial intelligence re-search, largely because it has recently become possible to record, store and process large amounts of visual data. While impressive achievements have been made in pattern clas-sification problems such as handwr ..."
Abstract
-
Cited by 33 (2 self)
- Add to MetaCart
Computer vision is currently one of the most exciting areas of artificial intelligence re-search, largely because it has recently become possible to record, store and process large amounts of visual data. While impressive achievements have been made in pattern clas-sification problems such as handwritten character recognition and face detection, it is even more exciting that researchers may be on the verge of introducing computer vision systems that perform scene analysis, decomposing image input into its constituent objects, lighting conditions, motion patterns, and so on. Two of the main challenges in computer vision are finding efficient models of the physics of visual scenes and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms for computer vision and scene analysis. We review exact techniques and various approximate, computationally efficient techniques, including iterative conditional modes, the expectation maximization (EM) algorithm, the mean field method, variational techniques, structured variational techniques, Gibbs sampling, the sum-product algorithm and “loopy ” belief propagation. We describe how each technique can be applied in a model of multiple, occluding objects, and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
Action Recognition from Arbitrary Views using 3D Exemplars
"... In this paper, we address the problem of learning compact, view-independent, realistic 3D models of human actions recorded with multiple cameras, for the purpose of recognizing those same actions from a single or few cameras, without prior knowledge about the relative orientations between the camera ..."
Abstract
-
Cited by 29 (2 self)
- Add to MetaCart
In this paper, we address the problem of learning compact, view-independent, realistic 3D models of human actions recorded with multiple cameras, for the purpose of recognizing those same actions from a single or few cameras, without prior knowledge about the relative orientations between the cameras and the subjects. To this aim, we propose a new framework where we model actions using three dimensional occupancy grids, built from multiple viewpoints, in an exemplar-based HMM. The novelty is, that a 3D reconstruction is not required during the recognition phase, instead learned 3D exemplars are used to produce 2D image information that is compared to the observations. Parameters that describe image projections are added as latent variables in the recognition process. In addition, the temporal Markov dependency applied to view parameters allows them to evolve during recognition as with a smoothly moving camera. The effectiveness of the framework is demonstrated with experiments on real datasets and with challenging recognition scenarios. 1.
A Hidden Markov Model Based Framework For Recognition Of Humans From Gait Sequences
- PROC. ICIP
, 2003
"... In this paper we propose a generic framework based on Hidden Markov Models (HMMs) for recognition of individuals from their gait. The HMM framework is suitable, because the gait of an individual can be visualized as his adopting postures from a set, in a sequence which has an underlying structured p ..."
Abstract
-
Cited by 23 (4 self)
- Add to MetaCart
In this paper we propose a generic framework based on Hidden Markov Models (HMMs) for recognition of individuals from their gait. The HMM framework is suitable, because the gait of an individual can be visualized as his adopting postures from a set, in a sequence which has an underlying structured probabilistic nature. The postures that the individual adopts can be regarded as the states of the HMM and are typical to that individual and provide a means of discrimination. The framework assumes that, during a walk cycle, the individual transitions among discrete postures or states. An adaptive filter is used to automatically detect the cycle boundaries. Our method is not dependent on the particular feature vector used to represent the gait information contained in the postures. The statistical nature of the HMM lends robustness to the model. In this paper we use the binarized background-subtracted image as the feature vector and use different distance metrics, such as those based on the L1 and L2 norms of the vector difference, and the normalized inner product of the vectors, to measure the similarity between feature vectors. The results we obtain are better than the baseline recognition rates reported before.
Learning Dynamics for Exemplar-based Gesture Recognition
- IN IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
, 2003
"... This paper addresses the problem of capturing the dynamics for exemplar-based recognition systems. Traditional HMM provides a probabilistic tool to capture system dynamics and in exemplar paradigm, HMM states are typically coupled with the exemplars. Alternatively, we propose a non-parametric HMM ap ..."
Abstract
-
Cited by 19 (2 self)
- Add to MetaCart
This paper addresses the problem of capturing the dynamics for exemplar-based recognition systems. Traditional HMM provides a probabilistic tool to capture system dynamics and in exemplar paradigm, HMM states are typically coupled with the exemplars. Alternatively, we propose a non-parametric HMM approach that uses a discrete HMM with arbitrary states (decoupled from exemplars) to capture the dynamics over a large exemplar space where a nonparametric estimation approach is used to model the exemplar distribution. This reduces the need for lengthy and non-optimal training of the HMM observation model. We used the proposed approach for view-based recognition of gestures. The approach is based on representing each gesture as a sequence of learned body poses (exemplars). The gestures are recognized through a probabilistic framework for matching these body poses and for imposing temporal constraints between different poses using the proposed nonparametric HMM.
Dynamically constructed bayesian networks for sketch understanding
- MIT Project Oxygen Student Workshop Abstracts
, 2003
"... People sketch to express their early design ideas in many domains, but current computer tools offer few advantages to designers during this sketching phase. Our goal is to construct a general recognition architecture that can be applied ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
People sketch to express their early design ideas in many domains, but current computer tools offer few advantages to designers during this sketching phase. Our goal is to construct a general recognition architecture that can be applied
Statistical Part-Based Models: Theory and Applications in Image Similarity, Object Detection and Region Labeling
, 2005
"... The automatic analysis and indexing of visual content in unconstrained domain are impor-tant and challenging problems for a variety of multimedia applications. Much of the prior research work deals with the problems by modeling images and videos as feature vectors, such as global histogram or block- ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
The automatic analysis and indexing of visual content in unconstrained domain are impor-tant and challenging problems for a variety of multimedia applications. Much of the prior research work deals with the problems by modeling images and videos as feature vectors, such as global histogram or block-based representation. Despite substantial research efforts on analysis and indexing algorithms based on this representation, their performance remains unsatisfactory. This dissertation attempts to explore the problem from a different perspective through a part-based representation, where images and videos are represented as a collection of parts with their appearance and relational features. Such representation is partly motivated by the human vision research showing that the human vision system adopts similar mechanism to perceive images. Although part-based representation has been investigated for decades, most of the prior work has been focused on ad hoc or deterministic approaches, which require manual designs of the models and often have poor performance for real-world images or videos due to their inability to model uncertainty and noise. The main focus of this thesis instead is on incorporating statistical modeling and machine learning techniques into the
Multi-cue exemplar-based nonparametric model for gesture recognition
- In ICVGIP
, 2004
"... This paper presents an approach for a multi-cue, viewbased recognition of gestures. We describe an exemplarbased technique that combines two different forms of exemplars- shape exemplars and motion exemplars- in a unified probabilistic framework. Each gesture is represented as a sequence of learned ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
This paper presents an approach for a multi-cue, viewbased recognition of gestures. We describe an exemplarbased technique that combines two different forms of exemplars- shape exemplars and motion exemplars- in a unified probabilistic framework. Each gesture is represented as a sequence of learned body poses as well as a sequence of learned motion parameters. The shape exemplars are comprised of pose contours, and the motion exemplars are represented as affine motion parameters extracted using a robust estimation approach. The probabilistic framework learns by employing a nonparametric estimation technique to model the exemplar distributions. It imposes temporal constraints between different exemplars through a learned Hidden Markov Model (HMM) for each gesture. We use the proposed multi-cue approach to recognize a set of fourteen gestures and contrast it against a shape only, singlecue based system. 1.

