Mean shift: A robust approach toward feature space analysis
- In PAMI, 2002
Cited by 936 (33 self)
A general nonparametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure, the mean shift. We prove for discrete data the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and thus its utility in detecting the modes of the density. The equivalence of the mean shift procedure to the Nadaraya–Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms the only user-set parameter is the resolution of the analysis, and either gray level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
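As a rough illustration of the mean shift procedure this abstract refers to, the sketch below repeatedly moves an estimate to the kernel-weighted mean of the samples until it stops changing, which brings it to a nearby mode of the estimated density. It is a one-dimensional toy, not the paper's algorithm: the Gaussian kernel, the function name `mean_shift_mode`, the `bandwidth`, `tol`, and `max_iter` parameters, and the sample values are all assumptions made for the example.

```python
import math


def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=500):
    """Climb from `start` to a nearby mode of a Gaussian kernel density estimate.

    Hypothetical helper for illustration; 1-D data and a fixed bandwidth are assumed.
    """
    x = float(start)
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current estimate.
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        total = sum(weights)
        if total == 0.0:
            # Every weight underflowed: no sample is close enough to pull on x.
            break
        # The mean shift step replaces x with the weighted mean of the samples.
        new_x = sum(w * p for w, p in zip(weights, points)) / total
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Two well-separated clusters (made-up values), centred near 1.0 and 5.0.
samples = [0.9, 1.0, 1.1, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]
low_mode = mean_shift_mode(samples, start=0.0, bandwidth=0.5)
high_mode = mean_shift_mode(samples, start=6.0, bandwidth=0.5)
```

Different starting points end at different modes, which is what makes the procedure usable for clustering: samples whose trajectories converge to the same mode can be grouped together. The bandwidth here plays the role of the analysis resolution mentioned in the abstract.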
Layered Representation of Motion Video using Robust Maximum-Likelihood Estimation of Mixture Models and MDL Encoding
1995
Cited by 176 (4 self)
Representing and modeling the motion and spatial support of multiple objects and surfaces from motion video sequences is an important intermediate step towards dynamic image understanding. One such representation, called layered representation, has recently been proposed. Although a number of algorithms have been developed for computing these representations, there has not been a consolidated effort into developing a precise mathematical formulation of the problem. This paper presents such a formulation based on maximum likelihood estimation of mixture models and the minimum description length (MDL) encoding principle. The three major issues in layered motion representation are: (i) how many motion models adequately describe image motion, (ii) what are the motion model parameters, and (iii) what is the spatial support layer for each motion model. In order to allow multiple models in the description of image motion, the likelihood function for change in intensity of a pixel is modeled a...
Compact Representations Of Videos Through Dominant And Multiple Motion Estimation
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996
Cited by 135 (0 self)
An explosion of on-line image and video data in digital form is already well underway. With the exponential rise in interactive information exploration and dissemination through the World Wide Web (WWW), the major inhibitors of rapid access to on-line video data are costs and management of capture and storage, lack of real-time delivery, and non-availability of content-based intelligent search and indexing techniques. The solutions for capture, storage and delivery may be on the horizon or a little beyond. However, even with rapid delivery, the lack of efficient authoring and querying tools for visual content-based indexing may still inhibit as widespread a use of video information as that of text and traditional tabular data is currently. In order to be able to non-linearly browse and index into videos through visual content, it is necessary to develop authoring tools that can automatically separate moving objects and significant components of the scene, and represent these in a compact ...
Estimating Optical Flow in Segmented Images using Variable-order Parametric Models with Local Deformations
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996
Cited by 82 (4 self)
This paper presents a new model for estimating optical flow based on the motion of planar regions plus local deformations. The approach exploits brightness information to organize and constrain the interpretation of the motion by using segmented regions of piecewise smooth brightness to hypothesize planar regions in the scene. Parametric flow models are estimated in these regions in a two-step process that first computes a coarse fit and estimates the appropriate parameterization of the motion of the region (two, six, or eight parameters). The initial fit is refined using a generalization of the standard area-based regression approaches. Since the assumption of planarity is likely to be violated, we allow local deformations from the planar assumption in the same spirit as physically-based approaches which model shape using coarse parametric models plus local deformations. This parametric+deformation model exploits the strong constraints of parametric approaches while retaining the ada...
Cooperative Robust Estimation Using Layers of Support
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1991
Cited by 76 (5 self)
We present an approach to the problem of representing images that contain multiple objects or surfaces. Rather than use an edge-based approach to represent the segmentation of a scene, we propose a multi-layer estimation framework which uses support maps to represent the segmentation of the image into homogeneous chunks. This support-based approach can represent objects that are split into disjoint regions, or have surfaces that are transparently interleaved. Our framework is based on an extension of robust estimation methods which provide a theoretical basis for support-based estimation. The Minimum Description Length principle is used to decide how many support maps to use in describing a particular image. We show results applying this framework to heterogeneous interpolation and segmentation tasks on range and motion imagery. 1 Introduction Real-world perceptual systems must deal with complicated and cluttered environments. To succeed in such environments, a system must be able to r...
A Framework for Robust Subspace Learning
- International Journal of Computer Vision, 2003
Cited by 61 (5 self)
Many computer vision, signal processing and statistical problems can be posed as problems of learning low-dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications.
Model-based 2D&3D Dominant Motion Estimation for Mosaicing and Video Representation
1995
Cited by 51 (3 self)
It is fairly common in video sequences that a mostly fixed background (scene) is imaged with or without independently moving objects. The dominant background changes in the image plane mostly due to camera operations and motion (zoom, pan, tilt, track etc.). In this paper we address the problem of computation of the dominant image transformation over time and demonstrate how this can be effectively used for efficient video representation through video mosaicing and image registration. We formulate the problem of dominant component estimation as that of model-based robust estimation using M-estimators with direct, multi-resolution methods. In addition to the 2D affine and plane projective models that have been used in the past for describing image motion using direct methods, we also employ a true 3D model of motion and scene structure imaged with uncalibrated cameras. This model parameterizes the image motion as that due to a planar component and a parallax component. For rigid 3D scenes...
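To make the idea of dominant component estimation with M-estimators concrete, the following sketch estimates a single dominant value from measurements contaminated by outliers, using iteratively reweighted averaging with Tukey's biweight. It is a deliberately simplified stand-in: the paper estimates parametric image motion with direct, multi-resolution methods, whereas this example works on made-up scalar displacements. The function name `tukey_location`, the tuning constant `c`, the median/MAD initialization, and the data are assumptions chosen for illustration.

```python
import statistics


def tukey_location(values, c=4.685, tol=1e-9, max_iter=100):
    """Robust location estimate via reweighted averaging with Tukey's biweight.

    Hypothetical helper for illustration; not the estimator used in the paper.
    """
    values = list(values)
    mu = statistics.median(values)  # robust starting point
    mad = statistics.median([abs(v - mu) for v in values])
    if mad == 0.0:
        # At least half the values coincide with the median; nothing to refine.
        return mu
    scale = 1.4826 * mad  # MAD rescaled to match a Gaussian standard deviation
    for _ in range(max_iter):
        weights = []
        for v in values:
            u = (v - mu) / (c * scale)
            # Biweight: smooth down-weighting, exactly zero beyond the cutoff.
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        total = sum(weights)
        if total == 0.0:
            break
        new_mu = sum(w * v for w, v in zip(weights, values)) / total
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Made-up horizontal displacements: most follow the background (about 2.0),
# while three come from independently moving objects.
shifts = [2.0, 2.1, 1.9, 2.05, 1.95, 2.0, 9.0, 10.0, -7.0]
dominant = tukey_location(shifts)
plain_mean = statistics.mean(shifts)
```

Because the biweight assigns zero weight to residuals beyond the cutoff, the three outlying displacements stop influencing the estimate, while the ordinary mean is pulled away from the background value.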
Robust Parameterized Component Analysis: Theory and Applications to 2D Facial Modeling
- Computer Vision and Image Understanding, 91:53–71, 2002
Cited by 33 (6 self)
Principal Component Analysis (PCA) has been successfully applied to construct linear models of shape, graylevel, and motion. In particular, PCA has been widely used to model the variation in the appearance of people's faces. We extend previous work on facial modeling for tracking faces in video sequences as they undergo significant changes due to facial expressions. Here we develop person-specific facial appearance models (PSFAM), which use modular PCA to model complex intra-person appearance changes. Such models require aligned visual training data; in previous work, this has involved a time-consuming and error-prone hand alignment and cropping process. Instead, we introduce parameterized component analysis to learn a subspace that is invariant to affine (or higher order) geometric transformations. The automatic learning of a PSFAM given a training image sequence is posed as a continuous optimization problem and is solved with a mixture of stochastic and deterministic techniques, achieving sub-pixel accuracy.
Robust Computer Vision through Kernel Density Estimation
- In 7th European Conf. on Computer Vision, 2002
Cited by 23 (2 self)
Two new techniques based on nonparametric estimation of probability densities are introduced which improve on the performance of equivalent robust methods currently employed in computer vision. The first technique draws from the projection pursuit paradigm in statistics, and carries out regression M-estimation with a weak dependence on the accuracy of the scale estimate. The second technique exploits the properties of the multivariate adaptive mean shift, and accomplishes the fusion of uncertain measurements arising from an unknown number of sources. As an example, the two techniques are extensively used in an algorithm for the recovery of multiple structures from heavily corrupted data.
Dynamic Coupled Component Analysis
2001
Cited by 18 (5 self)
We present a method for simultaneously learning linear models of multiple high-dimensional data sets and the dependencies between them. For example, we learn asymmetrically coupled linear models for the faces of two different people and show how these models can be used to animate one face given a video sequence of the other. We pose the problem as a form of Asymmetric Coupled Component Analysis (ACCA) in which we simultaneously learn the subspaces for reducing the dimensionality of each dataset while coupling the parameters of the low-dimensional representations.

