Results 1-10 of 82
Shape Matching and Object Recognition Using Shape Contexts
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 1243 (19 self)
We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by (1) solving for correspondences between points on the two shapes, and (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits and the COIL dataset.
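As a rough illustration of the descriptor described in this abstract, the sketch below builds a log-polar histogram of the remaining points around one reference point and compares two histograms. The abstract does not give bin counts, radius limits, or the histogram cost, so `n_radial`, `n_angular`, `r_min`, `r_max`, the clamping of out-of-range points, and the chi-square style cost are all assumptions made for this example, and the function names are hypothetical.

```python
import math


def shape_context(points, index, n_radial=5, n_angular=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram of all other points relative to points[index].

    Assumes at least two distinct points. Bin counts and radius limits are
    illustrative defaults, not values taken from the abstract.
    """
    # Normalise distances by the mean pairwise distance so the histogram
    # does not depend on the overall scale of the shape.
    pair_dists = [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    mean_dist = sum(pair_dists) / len(pair_dists)

    log_min, log_max = math.log(r_min), math.log(r_max)
    hist = [[0] * n_angular for _ in range(n_radial)]
    px, py = points[index]
    for j, (qx, qy) in enumerate(points):
        if j == index:
            continue
        r = math.hypot(qx - px, qy - py) / mean_dist
        if r == 0:
            continue  # coincident point: no direction, skip it
        # Radial bin on a log scale; points outside [r_min, r_max] are
        # clamped to the nearest bin (an assumption of this sketch).
        t = (math.log(r) - log_min) / (log_max - log_min)
        r_bin = min(max(int(t * n_radial), 0), n_radial - 1)
        # Angular bin over [0, 2*pi).
        theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
        a_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[r_bin][a_bin] += 1
    return hist


def chi2_cost(h1, h2):
    """Chi-square style distance between two normalised histograms."""
    flat1 = [c for row in h1 for c in row]
    flat2 = [c for row in h2 for c in row]
    s1, s2 = sum(flat1), sum(flat2)
    cost = 0.0
    for a, b in zip(flat1, flat2):
        a, b = a / s1, b / s2
        if a + b > 0:
            cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost
```

Filling a matrix with `chi2_cost` for every pair of points from two shapes would give the input to the optimal assignment step mentioned in the abstract; that step and the thin-plate spline alignment are not sketched here.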
Pictorial Structures for Object Recognition
IJCV, 2003
Cited by 520 (14 self)
In this paper we present a statistical framework for modeling the appearance of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to model an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We use these models to address the problem of detecting an object in an image as well as the problem of learning an object model from training examples, and present efficient algorithms for both these problems. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.
Recognition of Shapes by Editing Their Shock Graphs
Proc. Int’l Conf. Computer Vision, 2001
Cited by 149 (7 self)
This paper presents a novel framework for the recognition of objects based on their silhouettes. The main idea is to measure the distance between two shapes as the minimum extent of deformation necessary for one shape to match the other. Since the space of deformations is very high-dimensional, three steps are taken to make the search practical: 1) define an equivalence class for shapes based on shock-graph topology, 2) define an equivalence class for deformation paths based on shock-graph transitions, and 3) avoid complexity-increasing deformation paths by moving toward shock-graph degeneracy. Despite these steps, which tremendously reduce the search requirement, there still remain numerous deformation paths to consider. To that end, we employ an edit-distance algorithm for shock graphs that finds the optimal deformation path in polynomial time. The proposed approach gives intuitive correspondences for a variety of shapes and is robust in the presence of a wide range of visual transformations. The recognition rates on two distinct databases of 99 and 216 shapes, respectively, indicate highly successful within-category matches (100 percent in top three matches), which render the framework potentially usable in a range of shape-based recognition applications. Index Terms: Shape deformation, shock graphs, graph matching, edit distance, shape matching, object recognition, dynamic programming.
Image Categorization by Learning and Reasoning with Regions
Journal of Machine Learning Research, 2004
Cited by 127 (7 self)
Designing computer programs to automatically categorize images using low-level features is a challenging research topic in computer vision. In this paper, we present a new learning technique, which extends Multiple-Instance Learning (MIL), and its application to the problem of region-based image categorization. Images are viewed as bags, each of which contains a number of instances corresponding to regions obtained from image segmentation. The standard MIL problem assumes that a bag is labeled positive if at least one of its instances is positive; otherwise, the bag is negative.
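The standard MIL labeling rule stated at the end of this abstract can be written in a few lines. This is only a sketch of that standard assumption, not of the extension the paper proposes; the per-instance scores, the `threshold` parameter, and the function name `bag_label` are hypothetical.

```python
def bag_label(instance_scores, threshold=0.0):
    """Standard MIL assumption: a bag is positive if any instance is positive.

    instance_scores holds one hypothetical classifier score per region
    (instance) of an image (bag); a score above threshold counts as positive.
    """
    return any(score > threshold for score in instance_scores)


# An image with one positive region is a positive bag; an image whose
# regions are all negative is a negative bag.
image_with_one_positive_region = [-1.2, 0.4, -0.3]
image_with_no_positive_region = [-1.2, -0.8]
```

Under this rule a single region decides the label of the whole image, which is the behavior the abstract describes as the standard assumption that the new technique extends.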
Shape classification using the inner-distance
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007
Cited by 96 (6 self)
Part structure and articulation are of fundamental importance in computer and human vision. We propose using the inner-distance to build shape descriptors that are robust to articulation and capture part structure. The inner-distance is defined as the length of the shortest path between landmark points within the shape silhouette. We show that it is articulation insensitive and more effective at capturing part structures than the Euclidean distance. This suggests that the inner-distance can be used as a replacement for the Euclidean distance to build more accurate descriptors for complex shapes, especially for those with articulated parts. In addition, texture information along the shortest path can be used to further improve shape classification. With this idea, we propose three approaches to using the inner-distance. The first method combines the inner-distance and multidimensional scaling (MDS) to build articulation-invariant signatures for articulated shapes. The second method uses the inner-distance to build a new shape descriptor based on shape contexts. The third one extends the second one by considering the texture information along shortest paths. The proposed approaches have been tested on a variety of shape databases, including an articulated shape data set, MPEG7 CE-Shape-1, Kimia silhouettes, the ETH-80 data set, two leaf data sets, and a human motion silhouette data set. In all the experiments, our methods demonstrate effective performance compared with other algorithms.
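To make the contrast with the Euclidean distance concrete, the sketch below measures a shortest path that stays inside a silhouette. It is a simplified stand-in: it runs breadth-first search over a binary pixel grid with 4-connected unit steps, which is an assumption of this example and not the construction used in the paper. The mask and the function name are hypothetical.

```python
from collections import deque


def grid_inner_distance(mask, start, goal):
    """Length of the shortest 4-connected path from start to goal inside mask.

    mask is a list of rows of 0/1 values (1 = inside the silhouette);
    start and goal are (row, col) cells assumed to lie inside the shape.
    Returns None when the goal cannot be reached without leaving the shape.
    """
    rows, cols = len(mask), len(mask[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]
            if inside and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


# A U-shaped silhouette: the two tips are close in the plane but far apart
# when the path has to stay inside the shape.
u_shape = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
tip_to_tip = grid_inner_distance(u_shape, (0, 0), (0, 2))
```

Here the two tips are 2 cells apart in a straight line, while the path inside the U takes 6 steps. Bending the arms of the U would change the straight-line distance but not the path through the shape, which is the intuition behind articulation insensitivity.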
Recognition of Shapes by Editing Shock Graphs
IEEE International Conference on Computer Vision, 2001
Cited by 93 (6 self)
This paper presents a novel recognition framework which is based on matching shock graphs of 2D shape outlines, where the distance between two shapes is defined to be the cost of the least action path deforming one shape to another. Three key ideas render the implementation of this framework practical. First, the shape space is partitioned by defining an equivalence class on shapes, where two shapes with the same shock graph topology are considered to be equivalent. Second, the space of deformations is discretized by defining all deformations with the same sequence of shock graph transitions as equivalent. Shock transitions are points along the deformation where the shock graph topology changes. Third, we employ a graph edit distance algorithm that searches in the space of all possible transition sequences and finds the globally optimal sequence in polynomial time. The effectiveness of the proposed technique in the presence of a variety of visual transformations including occlusion, articulation and deformation of parts, shadow and highlights, viewpoint variation, and boundary perturbations is demonstrated. Indexing into two separate databases of roughly 100 shapes results in 100% accuracy for top three matches and 99.5% for the next three matches.
On aligning curves
IEEE TPAMI, 2003
Cited by 92 (3 self)
We present a novel approach to finding a correspondence (alignment) between two curves. The correspondence is based on a notion of an alignment curve which treats both curves symmetrically. We then define a similarity metric based on the alignment curve using two intrinsic properties of the curve, namely, length and curvature. The optimal correspondence is found by an efficient dynamic-programming method both for aligning pairs of curve segments and pairs of closed curves, and is effective in the presence of a variety of transformations of the curve. Finally, the correspondence is shown in application to handwritten character recognition, prototype formation, and object recognition, and is potentially useful in other applications such as registration and tracking.
Kernel Density Estimation and Intrinsic Alignment for Knowledge-driven Segmentation: Teaching Level Sets to Walk
International Journal of Computer Vision, 2004
Cited by 83 (16 self)
We address the problem of image segmentation with statistical shape priors in the context of the level set framework. Our paper makes two contributions: Firstly, we propose to generate invariance of the shape prior to certain transformations by intrinsic registration of the evolving level set function. In contrast to existing approaches to invariance in the level set framework, this closed-form solution removes the need to iteratively optimize explicit pose parameters. Moreover, we will argue that the resulting shape gradient is more accurate in that it takes into account the effect of boundary variation on the object's pose.
Self Organization in Vision: Stochastic Clustering for Image Segmentation, Perceptual Grouping, and Image Database Organization
2001
Cited by 76 (4 self)
We present a stochastic clustering algorithm which uses pairwise similarity of elements, and show how it can be used to address various problems in computer vision, including low-level image segmentation, mid-level perceptual grouping, and high-level image database organization. The clustering problem is viewed as a graph partitioning problem, where nodes represent data elements and the weights of the edges represent pairwise similarities. We generate samples of cuts in this graph, by using Karger's contraction algorithm, and compute an "average" cut which provides the basis for our solution to the clustering problem. The stochastic nature of our method makes it robust against noise, including accidental edges and small spurious clusters. The complexity of our algorithm is very low: O(E log² N) for N objects, E similarity relations and a fixed accuracy level. In addition, and without additional computational cost, our algorithm provides a hierarchy of nested partitions. We demonstrate the superiority of our method for image segmentation on a few synthetic and real images, B&W and color. Our other examples include the concatenation of edges in a cluttered scene (perceptual grouping), and the organization of an image database for the purpose of multi-view 3D object recognition.
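The cut-sampling step named in this abstract can be sketched as follows. The code contracts randomly chosen edges, picked with probability proportional to their similarity weight, until `k` groups remain, and then records how often two nodes end up in the same group over many samples. The co-membership frequency is one simple way to summarise sampled cuts and is an assumption of this example; it is not necessarily the paper's "average" cut. The function names and the toy graph are hypothetical, and this naive version does not attempt to reach the complexity quoted above.

```python
import random


def contract_to_k(n_nodes, edges, k, rng):
    """Contract random edges until k groups remain; return a group label per node.

    edges is a list of (u, v, similarity) tuples with positive similarity.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    groups = n_nodes
    remaining = list(edges)
    while groups > k:
        # Discard edges whose endpoints were already merged (self-loops).
        remaining = [e for e in remaining if find(e[0]) != find(e[1])]
        if not remaining:
            break  # disconnected graph: no further merges are possible
        weights = [w for _, _, w in remaining]
        u, v, _ = rng.choices(remaining, weights=weights)[0]
        parent[find(u)] = find(v)
        groups -= 1
    return [find(x) for x in range(n_nodes)]


def co_membership(n_nodes, edges, k, n_samples, seed=0):
    """Fraction of sampled contractions in which each node pair shares a group."""
    rng = random.Random(seed)
    counts = [[0] * n_nodes for _ in range(n_nodes)]
    for _ in range(n_samples):
        labels = contract_to_k(n_nodes, edges, k, rng)
        for i in range(n_nodes):
            for j in range(n_nodes):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_samples for c in row] for row in counts]


# Toy graph: two tightly connected triangles joined by one weak edge.
toy_edges = [
    (0, 1, 10.0), (1, 2, 10.0), (0, 2, 10.0),
    (3, 4, 10.0), (4, 5, 10.0), (3, 5, 10.0),
    (2, 3, 0.1),
]
```

Calling `co_membership(6, toy_edges, k=2, n_samples=200)` on this toy graph should give high values for pairs inside a triangle and low values for pairs across the weak edge, because high-similarity edges are contracted far more often than the weak link.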
Classification with Non-Metric Distances: Image Retrieval and Class Representation
2000
Cited by 71 (0 self)
One of the key problems in appearance-based vision is understanding how to use a set of labeled images to classify new images. Classification systems that can model human performance, or that use robust image matching methods, often make use of similarity judgments that are non-metric; but when the triangle inequality is not obeyed, most existing pattern recognition techniques are not applicable. We note that exemplar-based (or nearest-neighbor) methods can be applied naturally when using a wide class of non-metric similarity functions. The key issue, however, is to find methods for choosing good representatives of a class that accurately characterize it. We show that existing condensing techniques for finding class representatives are ill-suited to deal with non-metric dataspaces. We then focus on developing techniques for solving this problem, emphasizing two points: First, we show that the distance between two images is not a good measure of how well one image can represent ...