Mean shift: A robust approach toward feature space analysis
 In PAMI
, 2002
Cited by 2395 (37 self)
A general nonparametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure, the mean shift. We prove for discrete data the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and thus its utility in detecting the modes of the density. The equivalence of the mean shift procedure to the Nadaraya–Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms the only user-set parameter is the resolution of the analysis, and either gray level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
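The recursive mean shift procedure named in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel and a single fixed bandwidth; the function name `mean_shift_mode` and its parameters are illustrative and are not taken from the paper.

```python
import math

def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=500):
    """Follow mean shift iterations from `start` toward a nearby density mode.

    Assumption: Gaussian kernel with one global bandwidth.
    `points` is a list of equal-length tuples; `start` is one such tuple.
    """
    x = list(start)
    for _ in range(max_iter):
        # Kernel weight of every sample relative to the current estimate.
        weights = []
        for p in points:
            d2 = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
            weights.append(math.exp(-0.5 * d2 / bandwidth ** 2))
        total = sum(weights)
        if total == 0.0:
            break  # every weight underflowed; the estimate cannot move
        # The new estimate is the kernel-weighted mean of the samples.
        new_x = [
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(len(x))
        ]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_x, x)))
        x = new_x
        if shift < tol:
            break
    return x

# Two well-separated 1-D clusters; each start drifts to its own cluster.
samples = [(0.0,), (0.2,), (-0.2,), (10.0,), (10.2,), (9.8,)]
mode_a = mean_shift_mode(samples, (1.0,), bandwidth=1.0)
mode_b = mean_shift_mode(samples, (9.0,), bandwidth=1.0)
```

Grouping the samples by the mode they converge to gives a clustering, which is one way to read the abstract's claim about delineating clusters; the bandwidth plays the role of the resolution parameter.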
Efficient graph-based image segmentation
 International Journal of Computer Vision
, 2004
Cited by 940 (1 self)
This paper addresses the problem of segmenting an image into regions. We define a predicate for measuring the evidence for a boundary between two regions using a graph-based representation of the image. We then develop an efficient segmentation algorithm based on this predicate, and show that although this algorithm makes greedy decisions, it produces segmentations that satisfy global properties. We apply the algorithm to image segmentation using two different kinds of local neighborhoods in constructing the graph, and illustrate the results with both real and synthetic images. The algorithm runs in time nearly linear in the number of graph edges and is also fast in practice. An important characteristic of the method is its ability to preserve detail in low-variability image regions while ignoring detail in high-variability regions.
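The greedy, predicate-based merging described in this abstract can be sketched as follows. The abstract does not state the predicate, so the merge rule below (an edge merges two components when its weight is at most the smaller of each component's largest internal edge weight plus `k / size`) is written from the commonly cited form of the method and should be read as an assumption. The function name `segment_graph` and the parameter `k` are illustrative.

```python
def segment_graph(num_nodes, edges, k):
    """Greedy graph segmentation sketch using a union-find forest.

    `edges` is a list of (weight, u, v) tuples; `k` is an assumed scale
    parameter: larger values favour larger components.
    Returns a component label for each node.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest merged edge weight per component root

    def find(a):
        # Path halving keeps the trees shallow.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Edges are visited in nondecreasing weight order (the greedy step).
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        threshold = min(internal[ru] + k / size[ru],
                        internal[rv] + k / size[rv])
        if w <= threshold:
            parent[rv] = ru
            size[ru] += size[rv]
            # w is the largest edge merged so far, because edges are sorted.
            internal[ru] = w
    return [find(i) for i in range(num_nodes)]

# A chain of six nodes with one strong boundary between nodes 2 and 3.
chain = [(1, 0, 1), (1, 1, 2), (50, 2, 3), (1, 3, 4), (1, 4, 5)]
labels = segment_graph(6, chain, k=2)
```

Sorting dominates the cost, which is consistent with the near-linear running time in the number of edges mentioned in the abstract.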
Real-Time Tracking of Non-Rigid Objects using Mean Shift
 IEEE CVPR 2000
, 2000
Cited by 815 (19 self)
A new method for real-time tracking of non-rigid objects seen from a moving camera is proposed. The central computational module is based on the mean shift iterations and finds the most probable target position in the current frame. The dissimilarity between the target model (its color distribution) and the target candidates is expressed by a metric derived from the Bhattacharyya coefficient. The theoretical analysis of the approach shows that it relates to the Bayesian framework while providing a practical, fast and efficient solution. The capability of the tracker to handle in real-time partial occlusions, significant clutter, and target scale variations is demonstrated for several image sequences.
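The dissimilarity measure mentioned in this abstract can be illustrated with a small sketch. It assumes both color distributions are given as normalized histograms over the same bins, and it uses the form d = sqrt(1 - rho) for the metric derived from the Bhattacharyya coefficient rho; the abstract does not spell out the formula, so that form and the function names are assumptions made for illustration.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Similarity between two normalized histograms (1.0 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def bhattacharyya_distance(p, q):
    """Dissimilarity derived from the coefficient, assumed as sqrt(1 - rho)."""
    rho = bhattacharyya_coefficient(p, q)
    # Clamp guards against tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - rho))

# Hypothetical 3-bin color histograms for a target model and two candidates.
target = [0.2, 0.3, 0.5]
similar_candidate = [0.25, 0.3, 0.45]
different_candidate = [0.8, 0.1, 0.1]
d_similar = bhattacharyya_distance(target, similar_candidate)
d_different = bhattacharyya_distance(target, different_candidate)
```

A tracker in this style would prefer the candidate position with the smallest distance, which here is the similar candidate.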
Object Tracking: A Survey
, 2006
Cited by 701 (7 self)
The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, non-rigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.
Image Segmentation by Data Driven Markov Chain Monte Carlo
, 2001
Cited by 277 (32 self)
This paper presents a computational paradigm called Data Driven Markov Chain Monte Carlo (DDMCMC) for image segmentation in the Bayesian statistical framework. The paper contributes to image segmentation in three aspects. Firstly, it designs effective and well-balanced Markov chain dynamics to explore the solution space and makes the split and merge process reversible at a middle-level vision formulation. Thus it achieves a globally optimal solution independent of initial segmentations. Secondly, instead of computing a single maximum a posteriori solution, it proposes a mathematical principle for computing multiple distinct solutions to incorporate intrinsic ambiguities in image segmentation. A K-adventurers algorithm is proposed for extracting distinct multiple solutions from the Markov chain sequence. Thirdly, it utilizes data-driven (bottom-up) techniques, such as clustering and edge detection, to compute importance proposal probabilities, which effectively drive the Markov chain dynamics and achieve tremendous speedup in comparison to the traditional jump-diffusion method [4]. Thus the DDMCMC paradigm provides a unifying framework where the roles of existing segmentation algorithms, such as edge detection, clustering, region growing, split-merge, SNAKEs, and region competition, are revealed as either realizing Markov chain dynamics or computing importance proposal probabilities. We report some results on color and grey level image segmentation in this paper and refer to a detailed report and a web site for extensive discussion.
On the Removal of Shadows from Images
, 2006
Cited by 236 (18 self)
This paper is concerned with the derivation of a progression of shadow-free image representations. First, we show that adopting certain assumptions about lights and cameras leads to a 1D, grayscale image representation which is illuminant invariant at each image pixel. We show that as a consequence, images represented in this form are shadow-free. We then extend this 1D representation to an equivalent 2D, chromaticity representation. We show that in this 2D representation, it is possible to relight all the image pixels in the same way, effectively deriving a 2D image representation which is additionally shadow-free. Finally, we show how to recover a 3D, full color shadow-free image representation by first (with the help of the 2D representation) identifying shadow edges. We then remove shadow edges from the edge map of the original image by edge inpainting and we propose a method to reintegrate this thresholded edge map, thus deriving the sought-after 3D shadow-free image.
Image Parsing: Unifying Segmentation, Detection, and Recognition
, 2005
Cited by 233 (22 self)
In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation in a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and reconfigures it dynamically using a set of reversible Markov chain jumps. This computational framework integrates two popular inference approaches: generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters.
A. Blake. Cosegmentation of image pairs by histogram matching - incorporating a global constraint into MRFs
 In CVPR
, 2006
Cited by 176 (3 self)
We introduce the term cosegmentation, which denotes the task of simultaneously segmenting the common parts of an image pair. A generative model for cosegmentation is presented. Inference in the model leads to minimizing an energy with an MRF term encoding spatial coherency and a global constraint which attempts to match the appearance histograms of the common parts. This energy has not been proposed previously and its optimization is challenging and NP-hard. For this problem a novel optimization scheme, which we call trust region graph cuts, is presented. We demonstrate that this framework has the potential to improve a wide range of research: object-driven image retrieval, video tracking and segmentation, and interactive image editing. The power of the framework lies in its generality: the common part can be a rigid/non-rigid object (or scene), observed from different viewpoints, or even similar objects of the same class.
Synergism in low level vision
 In
, 2002
Cited by 141 (4 self)
Guiding image segmentation with edge information is an often-employed strategy in low-level computer vision. To improve the tradeoff between the sensitivity of homogeneous region delineation and the oversegmentation of the image, we have incorporated a recently proposed edge magnitude/confidence map into a color image segmenter based on the mean shift procedure. The new method can recover regions with weak but sharp boundaries and thus can provide a more accurate input for high-level interpretation modules. The Edge Detection and Image SegmentatiON (EDISON) system, available for download, implements the proposed technique and provides a complete toolbox for discontinuity-preserving filtering, segmentation and edge detection.
The variable bandwidth mean shift and data-driven scale selection
 In ICCV
, 2001
Cited by 130 (9 self)
We present two solutions for the scale selection problem in computer vision. The first one is completely nonparametric and is based on the adaptive estimation of the normalized density gradient. Employing the sample point estimator, we define the Variable Bandwidth Mean Shift, prove its convergence, and show its superiority over the fixed bandwidth procedure. The second technique has a semiparametric nature and imposes a local structure on the data to extract reliable scale information. The local scale of the underlying density is taken as the bandwidth which maximizes the magnitude of the normalized mean shift vector. Both estimators provide practical tools for autonomous image and quasi real-time video analysis and several examples are shown to illustrate their effectiveness.
Motivation for Variable Bandwidth
The efficacy of Mean Shift analysis has been demonstrated in computer vision problems such as tracking and segmentation in [5, 6]. However, one of the limitations of the mean shift procedure as defined in these papers is that it involves the specification of a scale parameter. While the results obtained appear satisfactory, when the local characteristics of the feature space differ significantly across the data, it is difficult to find an optimal global bandwidth for the mean shift procedure. In this paper we address the issue of locally adapting the bandwidth. We also study an alternative approach for data-driven scale selection which imposes a local structure on the data. The proposed solutions are tested in the framework of quasi real-time video analysis. We first review the intrinsic limitations of the fixed bandwidth density estimation methods. Then, two of the most popular variable bandwidth estimators, the balloon and the sample point, are introduced and their advantages discussed. We conclude the section by showing that, with some precautions, the performance of the sample point estimator is superior to both fixed bandwidth and balloon estimators.
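To make the idea of locally adapted bandwidths concrete, here is a minimal sketch of one mean shift update in which every sample carries its own bandwidth, in the spirit of the sample point estimator. It assumes a Gaussian kernel and a per-sample weight of h_i^-(d+2) times the kernel value; this text does not give the exact formula, so the weighting, the function name `variable_bandwidth_mean_shift_step`, and how the bandwidths are chosen are all assumptions made for illustration.

```python
import math

def variable_bandwidth_mean_shift_step(points, bandwidths, x):
    """One mean shift update where sample i has its own bandwidth h_i.

    Assumptions: Gaussian kernel, weight h_i**-(d+2) * exp(-0.5 * dist^2 / h_i^2).
    Returns the new location as a list of floats.
    """
    d = len(x)
    numerator = [0.0] * d
    denominator = 0.0
    for p, h in zip(points, bandwidths):
        d2 = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
        w = h ** -(d + 2) * math.exp(-0.5 * d2 / h ** 2)
        denominator += w
        for k in range(d):
            numerator[k] += w * p[k]
    if denominator == 0.0:
        return list(x)  # all weights underflowed; stay in place
    return [n / denominator for n in numerator]

# A narrow-bandwidth sample at 0 and a wide-bandwidth sample at 4:
# from x = 2 the update is pulled toward the sample whose kernel still reaches x.
new_x = variable_bandwidth_mean_shift_step([(0.0,), (4.0,)], [0.5, 2.0], (2.0,))
```

With all bandwidths equal, the common factor cancels and the update reduces to the fixed bandwidth procedure that the text contrasts it with.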