Unsupervised Segmentation of Color-Texture Regions in Images and Video
, 2001
Cited by 206 (1 self)
A new method for unsupervised segmentation of color-texture regions in images and video is presented. This method, which we refer to as JSEG, consists of two independent steps: color quantization and spatial segmentation. In the first step, colors in the image are quantized to several representative classes that can be used to differentiate regions in the image. The image pixels are then replaced by their corresponding color class labels, thus forming a class-map of the image. The focus of this work is on spatial segmentation, where a criterion for "good" segmentation using the class-map is proposed. Applying the criterion to local windows in the class-map results in the "J-image," in which high and low values correspond to possible boundaries and interiors of color-texture regions. A region growing method is then used to segment the image based on the multiscale J-images. A similar approach is applied to video sequences. An additional region tracking scheme is embedded into the region growing process to achieve consistent segmentation and tracking results, even for scenes with nonrigid object motion. Experiments show the robustness of the JSEG algorithm on real images and video.
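The abstract does not spell out the segmentation criterion. As a minimal sketch, assume it is the ratio of between-class to within-class spatial scatter of pixel positions, J = (S_T - S_W) / S_W, which is how the criterion is usually summarized. The function name `j_value` and the nested-list class-map layout are illustrative choices, not taken from the source.

```python
def j_value(class_map):
    """Return J = (S_T - S_W) / S_W for a 2D grid of class labels.

    S_T is the total scatter of pixel positions around their global mean;
    S_W is the sum of the scatters of each class around its own mean.
    """
    # Group pixel positions by class label.
    positions = {}
    for r, row in enumerate(class_map):
        for c, label in enumerate(row):
            positions.setdefault(label, []).append((r, c))

    def scatter(pts):
        # Sum of squared distances from the points to their centroid.
        n = len(pts)
        mean_r = sum(p[0] for p in pts) / n
        mean_c = sum(p[1] for p in pts) / n
        return sum((p[0] - mean_r) ** 2 + (p[1] - mean_c) ** 2 for p in pts)

    all_points = [p for pts in positions.values() for p in pts]
    s_total = scatter(all_points)
    s_within = sum(scatter(pts) for pts in positions.values())
    if s_within == 0:
        # Every class occupies a single pixel; J is unbounded.
        return float("inf")
    return (s_total - s_within) / s_within


# Two spatially separated classes versus two fully interleaved classes.
split_map = [[0, 0, 1, 1] for _ in range(4)]
checker_map = [[(r + c) % 2 for c in range(4)] for r in range(4)]
```

Under this assumed definition, a window whose classes are spatially separated (`split_map`) yields a larger J than one whose classes are uniformly mixed (`checker_map`), which matches the abstract's statement that high values mark possible region boundaries.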
Spectral grouping using the Nyström method
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2004
Cited by 189 (1 self)
Spectral graph theoretic methods have recently shown great promise for the problem of image segmentation. However, due to the computational demands of these approaches, applications to large problems such as spatiotemporal data and high-resolution imagery have been slow to appear. The contribution of this paper is a method that substantially reduces the computational requirements of grouping algorithms based on spectral partitioning, making it feasible to apply them to very large grouping problems. Our approach is based on a technique for the numerical solution of eigenfunction problems known as the Nyström method. This method allows one to extrapolate the complete grouping solution using only a small number of "typical" samples. In doing so, we leverage the fact that there are far fewer coherent groups in a scene than pixels.
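The extrapolation idea can be illustrated with the standard Nyström block completion: if the affinity matrix is partitioned as W = [[A, B], [B^T, C]], where A holds sample-to-sample affinities and B holds sample-to-rest affinities, the unsampled block is approximated as C ≈ B^T A^{-1} B. The sketch below shows only this step; the helper names (`invert`, `matmul`, `nystrom_fill`) are hypothetical, and the linear (dot-product) affinity in the usage example is an assumption chosen so that the approximation is exact. It does not reproduce the paper's normalization or eigenvector orthogonalization.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("sample block A is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def nystrom_fill(a, b):
    """Approximate the unsampled block C of W = [[A, B], [B^T, C]] as B^T A^-1 B."""
    b_t = [list(col) for col in zip(*b)]
    return matmul(matmul(b_t, invert(a)), b)


# Usage: a rank-2 affinity (dot products of 2D points) with 2 samples.
def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

samples = [(2.0, 1.0), (1.0, 2.0)]
rest = [(1.0, 1.0), (3.0, 0.0), (0.0, 2.0)]
A = [[dot(p, q) for q in samples] for p in samples]
B = [[dot(p, q) for q in rest] for p in samples]
C_true = [[dot(p, q) for q in rest] for p in rest]
C_approx = nystrom_fill(A, B)
```

Because the dot-product affinity has rank 2 and the two samples span the plane, `C_approx` matches `C_true` up to rounding. With the affinities typically used for grouping, the completion is only approximate, and its quality depends on how representative the samples are.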
Multiway cut for stereo and motion with slanted surfaces
 In International Conference on Computer Vision
, 1999
Cited by 115 (2 self)
Slanted surfaces pose a problem for correspondence algorithms utilizing search because of the greatly increased number of possibilities, when compared with fronto-parallel surfaces. In this paper we propose an algorithm to compute correspondence between stereo images or between frames of a motion sequence by minimizing an energy functional that accounts for slanted surfaces. The energy is minimized in a greedy strategy that alternates between segmenting the image into a number of nonoverlapping regions (using the multiway-cut algorithm of Boykov, Veksler, and Zabih) and finding the affine parameters describing the displacement function of each region. A follow-up step enables the algorithm to escape local minima due to oversegmentation. Experiments on real images show the algorithm’s ability to find an accurate segmentation and displacement map, as well as discontinuities and creases, from a wide variety of stereo and motion imagery.
Motion Layer Extraction in the Presence of Occlusion Using Graph Cut
 Proc. IEEE Conf. Computer Vision and Pattern Recognition
, 2004
Cited by 83 (9 self)
Extracting layers from video is very important for video representation, analysis, compression, and synthesis. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, detect the occlusion pixels over multiple consecutive frames, and segment the scene into several motion layers. First, after determining a number of seed regions using correspondences in two frames, we expand the seed regions and reject the outliers employing the graph cuts method integrated with level set representation. Next, these initial regions are merged into several initial layers according to the motion similarity. Third, an occlusion order constraint on multiple frames is explored, which enforces that the occlusion area increases with the temporal order in a short period and effectively maintains segmentation consistency over multiple consecutive frames. Then, the correct layer segmentation is obtained by using a graph cuts algorithm and the occlusions between the overlapping layers are explicitly determined. Several experimental results are demonstrated to show that our approach is effective and robust. Index Terms: Layer-based motion segmentation, video analysis, graph cuts, level set representation, occlusion order constraint.
Spatio-Temporal Segmentation of Video by Hierarchical Mean Shift Analysis
 Center for Automat. Res., U. of Md, College Park
, 2002
Cited by 62 (4 self)
We describe a simple new technique for spatiotemporal segmentation of video sequences. Each pixel of a 3D space-time video stack is mapped to a 7D feature point whose coordinates include three color components, two motion angle components and two motion position components. The clustering of these feature points provides color segmentation and motion segmentation, as well as a consistent labeling of regions over time which amounts to region tracking. For this task we have adopted a hierarchical clustering method which operates by repeatedly applying mean shift analysis over increasingly large ranges, using at each pass the cluster centers of the previous pass, with weights equal to the counts of the points that contributed to the clusters. This technique has lower complexity for large mean shift radii than regular mean shift analysis because it can use binary tree structures more efficiently during range search. In addition, it provides a hierarchical segmentation of the data. Applications include video compression and compact descriptions of video sequences for video indexing and retrieval applications.
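The hierarchical scheme described above (repeated mean shift passes over growing radii, each pass consuming the previous pass's cluster centers weighted by their point counts) can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes a flat kernel, uses a brute-force neighbor search instead of tree-based range search, merges modes closer than half the radius, and works on feature points of any dimension rather than the 7D features specifically. The function names and the `iters` parameter are hypothetical.

```python
import math


def mean_shift_pass(points, weights, radius, iters=100):
    """One flat-kernel mean shift pass over weighted points.

    Returns (centers, counts), where counts are the summed input weights
    of the points that converged to each center.
    """
    modes = []
    for start in points:
        x = list(start)
        for _ in range(iters):
            total = 0.0
            acc = [0.0] * len(x)
            for q, w in zip(points, weights):
                if math.dist(x, q) <= radius:
                    total += w
                    for k, v in enumerate(q):
                        acc[k] += w * v
            if total == 0.0:
                break  # no neighbors in range; keep the current position
            shifted = [v / total for v in acc]
            converged = math.dist(shifted, x) < 1e-9
            x = shifted
            if converged:
                break
        modes.append(x)

    # Merge modes that landed close together; the first one found is kept
    # as the representative center (an assumption of this sketch).
    centers, counts = [], []
    for mode, w in zip(modes, weights):
        for i, center in enumerate(centers):
            if math.dist(mode, center) <= radius / 2:
                counts[i] += w
                break
        else:
            centers.append(mode)
            counts.append(w)
    return centers, counts


def hierarchical_mean_shift(points, radii):
    """Apply mean shift passes over increasing radii, reusing weighted centers."""
    centers = [list(p) for p in points]
    counts = [1.0] * len(points)
    levels = []
    for radius in radii:
        centers, counts = mean_shift_pass(centers, counts, radius)
        levels.append((centers, counts))
    return levels


levels = hierarchical_mean_shift(
    [(0.0,), (0.2,), (0.4,), (10.0,), (10.2,)], radii=[1.0, 20.0]
)
```

In the usage example, the first pass finds two clusters (with counts 3 and 2), and the second, wider pass merges them into a single center located at their count-weighted mean. Each later pass operates on far fewer points than the original data, which is the source of the efficiency the abstract refers to.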
A Robust Subspace Approach to Layer Extraction
, 2002
Cited by 57 (6 self)
Representing images with layers has many important applications, such as video compression, motion analysis, and 3D scene analysis. This paper presents a robust subspace approach to reliably extracting layers from images by taking advantage of the fact that homographies induced by planar patches in the scene form a low-dimensional linear subspace. Such a subspace provides not only a feature space where layers in the image domain are mapped onto denser and better-defined clusters, but also a constraint for detecting outliers in the local measurements, thus making the algorithm robust to outliers. By enforcing the subspace constraint, spatial and temporal redundancy from multiple frames are simultaneously utilized, and noise can be effectively reduced. Good layer descriptions are shown to be extracted in the experimental results.
Optimizing the performance of sparse matrix-vector multiplication
, 2000
Object segmentation by long term analysis of point trajectories
 In Proc. European Conference on Computer Vision
, 2010
Cited by 55 (4 self)
Unsupervised learning requires a grouping step that defines which data belong together. A natural way of grouping in images is the segmentation of objects or parts of objects. While pure bottom-up segmentation from static cues is well known to be ambiguous at the object level, the story changes as soon as objects move. In this paper, we present a method that uses long-term point trajectories based on dense optical flow. Defining pairwise distances between these trajectories allows them to be clustered, which results in temporally consistent segmentations of moving objects in a video shot. In contrast to multibody factorization, points and even whole objects may appear or disappear during the shot. We provide a benchmark dataset and an evaluation method for this so far uncovered setting.
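The abstract does not give the form of the pairwise trajectory distance. As a loose illustration only, the sketch below assumes one plausible choice: the maximum squared difference in frame-to-frame motion over the frames that two trajectories share. The function name, the dictionary layout (frame index mapped to an (x, y) position), and the one-frame motion step are all assumptions of this sketch. It is meant to show how trajectories that start and end at different times can still be compared, not to reproduce the paper's definition.

```python
def trajectory_distance(track_a, track_b):
    """Max squared motion difference over frames shared by both tracks.

    Each track maps a frame index to an (x, y) position. Only frames where
    both tracks also have a position in the next frame are compared.
    Returns None when the tracks never overlap in time.
    """
    common = [t for t in track_a
              if t in track_b and t + 1 in track_a and t + 1 in track_b]
    if not common:
        return None
    worst = 0.0
    for t in common:
        motion_a = (track_a[t + 1][0] - track_a[t][0],
                    track_a[t + 1][1] - track_a[t][1])
        motion_b = (track_b[t + 1][0] - track_b[t][0],
                    track_b[t + 1][1] - track_b[t][1])
        diff = ((motion_a[0] - motion_b[0]) ** 2
                + (motion_a[1] - motion_b[1]) ** 2)
        worst = max(worst, diff)
    return worst


# Two points moving together, one static point, and one late-appearing point.
moving_1 = {t: (float(t), 0.0) for t in range(5)}
moving_2 = {t: (float(t), 3.0) for t in range(2, 8)}
static = {t: (4.0, 4.0) for t in range(5)}
late = {t: (0.0, 0.0) for t in range(20, 25)}
```

Taking the maximum rather than the average means that a single frame of clearly different motion is enough to separate two trajectories. Pairs with no temporal overlap get no distance at all, so a clustering step would have to treat them as unconnected.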
A Unifying Theorem for Spectral Embedding and Clustering
, 2003
Cited by 55 (0 self)
Spectral methods use selected eigenvectors of a data affinity matrix to obtain a data representation that can be trivially clustered or embedded in a low-dimensional space. We present a theorem that explains, for broad classes of affinity matrices and eigenbases, why this works: For successively smaller eigenbases (i.e., using fewer and fewer of the affinity matrix's dominant eigenvalues and eigenvectors), the angles between "similar" vectors in the new representation shrink while the angles between "dissimilar" vectors grow. Specifically, the sum of the squared cosines of the angles is strictly increasing as the dimensionality of the representation decreases. Thus spectral methods work because the truncated eigenbasis amplifies structure in the data so that any heuristic postprocessing is more likely to succeed. We use this result to construct a nonlinear dimensionality reduction (NLDR) algorithm for data sampled from manifolds whose intrinsic coordinate system has linear and cyclic axes, and a novel clustering-by-projections algorithm that requires no postprocessing and gives superior performance on "challenge problems" from the recent literature.
Motion competition: a variational approach to piecewise parametric motion segmentation
 Int. J. Comput. Vision
, 2005
Cited by 54 (8 self)
We present a novel variational approach for segmenting the image plane into a set of regions of parametric motion on the basis of two consecutive frames from an image sequence. Our model is based on a conditional probability for the spatiotemporal image gradient, given a particular velocity model, and on a geometric prior on the estimated motion field favoring motion boundaries of minimal length. Exploiting the Bayesian framework, we derive a cost functional which depends on parametric motion models for each of a set of regions and on the boundary separating these regions. The resulting functional can be interpreted as an extension of the Mumford-Shah functional from intensity segmentation to motion segmentation. In contrast to most alternative approaches, the problems of segmentation and motion estimation are jointly solved by continuous minimization of a single functional. Minimizing this functional with respect to its dynamic variables results in an eigenvalue problem for the motion parameters and in a gradient descent evolution for the motion discontinuity set. We propose two different representations of this motion boundary: an explicit spline-based implementation which can be applied to the motion-based tracking of a single moving object, and an implicit multiphase level set implementation which allows for the segmentation of an arbitrary number of multiply connected moving objects. Numerical results both for simulated ground truth experiments and for real-world sequences demonstrate the capacity of our approach to segment objects based exclusively on their relative motion.