Results 1–10 of 69
Randomized trees for real-time keypoint recognition
 In Proc. Int. Conf. on Computer Vision and Pattern Recognition (CVPR 2005)
, 2005
Abstract

Cited by 167 (5 self)
In earlier work, we proposed treating wide baseline matching of feature points as a classification problem, in which each class corresponds to the set of all possible views of such a point. We used a K-means plus Nearest Neighbor classifier to validate our approach, mostly because it was simple to implement. It has proved effective but still too slow for real-time use. In this paper, we advocate instead the use of randomized trees as the classification technique. It is both fast enough for real-time performance and more robust. It also gives us a principled way not only to match keypoints but to select during a training phase those that are the most recognizable ones. This results in a real-time system able to detect and position in 3D planar, non-planar, and even deformable objects. It is robust to illumination changes, scale changes, and occlusions.
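The classification step the abstract describes can be sketched in a few lines. This is an illustrative reimplementation, not the authors' code: the patch size, tree depth, number of trees, and the two-pixel intensity comparison used as each node's binary test are assumptions in the spirit of randomized-tree keypoint classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)

class RandTree:
    """One randomized tree: each internal node compares the intensities of two
    randomly chosen pixel positions inside a flattened keypoint patch; leaves
    store class histograms over the training views that reach them."""

    def __init__(self, depth, patch, n_classes):
        self.depth = depth
        n_internal = 2 ** depth - 1           # heap-indexed internal nodes
        self.tests = rng.integers(0, patch * patch, size=(n_internal, 2))
        self.hist = np.zeros((2 ** depth, n_classes))

    def _leaf(self, x):
        node = 0
        for _ in range(self.depth):
            p1, p2 = self.tests[node]
            node = 2 * node + (1 if x[p1] < x[p2] else 2)
        return node - (2 ** self.depth - 1)   # index within the leaf level

    def fit(self, X, y):
        for x, c in zip(X, y):
            self.hist[self._leaf(x), c] += 1
        # turn counts into per-leaf posteriors (add-one smoothing)
        self.hist = (self.hist + 1) / (self.hist + 1).sum(axis=1, keepdims=True)

    def posterior(self, x):
        return self.hist[self._leaf(x)]

def classify(forest, x):
    """Average the leaf posteriors over all trees and take the best class."""
    return int(np.argmax(np.mean([t.posterior(x) for t in forest], axis=0)))
```

Because each node only compares two pixel intensities, a patch descends a tree with a handful of comparisons, which is what makes the approach fast enough for real-time matching.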
Towards a coherent statistical framework for dense deformable template estimation
 J. R. Statist. Soc. B
, 2006
Abstract

Cited by 81 (9 self)
Abstract. The problem of estimating probabilistic deformable template models in the field of computer vision, or of probabilistic atlases in the field of computational anatomy, has not yet received a coherent statistical formulation and remains a challenge. In this paper, we provide a careful definition and analysis of a well-defined statistical model based on dense deformable templates for gray-level images of deformable objects. We propose a rigorous Bayesian framework from which we derive an iterative algorithm for the effective estimation of the geometric and photometric parameters of the model in a small-sample setting, together with an asymptotic consistency proof. The model is extended to mixtures of finite numbers of such components, leading to a fine description of the photometric and geometric variations. We illustrate some of the ideas with images of handwritten digits, and apply the estimated models to classification through maximum likelihood.
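A toy version of the alternating geometric/photometric estimation the abstract alludes to (not the paper's actual algorithm): 1-D signals are modeled as circular shifts of an unknown template plus noise, and estimation alternates between a per-signal shift (the geometric parameters) and a re-averaged template (the photometric parameters). The circular-shift deformation model and the squared-error criterion are simplifying assumptions.

```python
import numpy as np

def estimate_template(signals, n_iter=10):
    """Toy alternating scheme: each row of `signals` is assumed to be a
    circular shift of one unknown template plus noise. Alternate between
    estimating the shift of each signal (geometric step) and re-averaging
    the aligned signals (photometric step)."""
    template = signals.mean(axis=0)           # crude initial template
    n = signals.shape[1]
    shifts = [0] * len(signals)
    for _ in range(n_iter):
        # geometric step: best circular shift per signal (least squares)
        shifts = [int(np.argmin([np.sum((np.roll(s, -k) - template) ** 2)
                                 for k in range(n)]))
                  for s in signals]
        # photometric step: average the aligned signals
        template = np.mean([np.roll(s, -k) for s, k in zip(signals, shifts)],
                           axis=0)
    return template, shifts
```

Note the usual identifiability caveat: the template is only recovered up to a common circular shift, which is why a rigorous formulation such as the paper's must pin down the model carefully.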
A Bayesian, exemplar-based approach to hierarchical shape matching
 IEEE Trans. Pattern Anal. Mach. Intell
Abstract

Cited by 74 (8 self)
Abstract—This paper presents a novel probabilistic approach to hierarchical, exemplar-based shape matching. No feature correspondence is needed among exemplars, just a suitable pairwise similarity measure. The approach uses a template tree to efficiently represent and match the variety of shape exemplars. The tree is generated offline by a bottom-up clustering approach using stochastic optimization. Online matching involves a simultaneous coarse-to-fine approach over the template tree and over the transformation parameters. The main contribution of this paper is a Bayesian model to estimate the a posteriori probability of the object class, after a certain match at a node of the tree. This model takes into account object scale and saliency and allows for a principled setting of the matching thresholds such that unpromising paths in the tree traversal process are eliminated early on. The proposed approach was tested in a variety of application domains. Here, results are presented on one of the more challenging domains: real-time pedestrian detection from a moving vehicle. A significant speedup is obtained when comparing the proposed probabilistic matching approach with a manually tuned non-probabilistic variant, both utilizing the same template tree structure. Index Terms—Hierarchical shape matching, chamfer distance, Bayesian models.
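The early pruning the abstract describes can be sketched with a simpler bound: if each node stores the maximal distance ("radius") from its prototype to any exemplar below it, the triangle inequality rules out the whole subtree whenever the query is farther from the prototype than radius plus the acceptance threshold. This fixed-threshold cutoff is a hypothetical stand-in for the paper's Bayesian, saliency-aware thresholds.

```python
import numpy as np

class TemplateNode:
    """Node of a template tree: a prototype shape (here a feature vector), the
    maximal distance ('radius') from the prototype to any exemplar in its
    subtree, child nodes, and a label on leaves (one exemplar per leaf)."""

    def __init__(self, prototype, radius, children=(), label=None):
        self.prototype = prototype
        self.radius = radius
        self.children = list(children)
        self.label = label

def match(node, query, accept, dist):
    """Coarse-to-fine search over the tree. If dist(query, prototype) exceeds
    radius + accept, the triangle inequality guarantees no exemplar in the
    subtree is within `accept` of the query, so the whole path is pruned
    early (the paper derives its cutoff from a Bayesian posterior instead of
    a fixed acceptance threshold)."""
    d = dist(query, node.prototype)
    if d > node.radius + accept:
        return []                              # prune unpromising subtree
    if not node.children:
        return [node.label] if d <= accept else []
    hits = []
    for child in node.children:
        hits += match(child, query, accept, dist)
    return hits
```

With many exemplars, most queries are rejected near the root, which is where the speedup over flat matching comes from.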
Learning and using taxonomies for fast visual categorization
 In CVPR
Abstract

Cited by 58 (3 self)
The computational complexity of current visual categorization algorithms scales linearly at best with the number of categories. The goal of classifying N_cat = 10^4–10^5 visual categories simultaneously requires sublinear classification costs. We explore algorithms for automatically building classification trees which have, in principle, log(N_cat) complexity. We find that a greedy algorithm that recursively splits the set of categories into the two minimally confused subsets achieves 5–20 fold speedups at a small cost in classification performance. Our approach is independent of the specific classification algorithm used. A welcome byproduct of our algorithm is a very reasonable taxonomy of the Caltech-256 dataset.
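A minimal sketch of the greedy split, assuming a symmetric confusion matrix and seeding the two groups with the pair of mutually least-confused categories (the paper's actual split criterion may differ):

```python
import numpy as np

def split_categories(conf, cats):
    """Greedy bipartition of a category set: seed the two groups with the pair
    of mutually least-confused categories, then assign every other category
    to the side it is confused with most, keeping cross-group confusion low.
    (A simple stand-in for whatever split criterion the paper optimizes.)"""
    C = (conf + conf.T) / 2.0                  # symmetrized confusion matrix
    i0, j0 = min(((i, j) for i in cats for j in cats if i < j),
                 key=lambda p: C[p[0], p[1]])
    left, right = [i0], [j0]
    for c in cats:
        if c in (i0, j0):
            continue
        (left if C[c, left].mean() >= C[c, right].mean() else right).append(c)
    return left, right

def build_tree(conf, cats):
    """Recurse until single categories remain; with balanced splits the
    resulting taxonomy has depth ~ log2(N_cat)."""
    if len(cats) == 1:
        return cats[0]
    left, right = split_categories(conf, cats)
    return (build_tree(conf, left), build_tree(conf, right))
```

At test time a query only descends one root-to-leaf path, running one binary decision per level instead of all N_cat classifiers, which is where the claimed sublinear cost comes from.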
POP: Patchwork of Parts Models for Object Recognition
 International Journal of Computer Vision
, 2004
Abstract

Cited by 57 (3 self)
We formulate a deformable template model for objects with a clearly defined mechanism for parameter estimation. A separate model is estimated for each class, and classification is likelihood based; no discrimination boundaries are learned. Nonetheless, high classification rates are achieved with small training samples. The data models are defined on binary oriented edge features that are highly robust to photometric variation and small local deformations. The deformation of an object is defined in terms of the locations of a moderate number of reference points. Each reference point is associated with a part: a probability map assigning a probability for each edge type at each pixel in a window. The likelihood of the edge data on the entire image conditional on the deformation is described as a patchwork of parts (POP) model: the edges are assumed conditionally independent, and the marginal at each pixel is obtained by a patchwork operation: averaging the marginal probabilities contributed by each part covering the pixel. Object classes are modeled as mixtures of POP models that are discovered sequentially as more class data is observed. Experiments are presented on the MNIST database, hundreds of deformed LaTeX shapes, reading zip codes, and face detection.
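The patchwork operation itself is easy to sketch. The following toy version handles a single edge type; the background probability and the window placement convention are assumptions, not the paper's parameters.

```python
import numpy as np

def patchwork_marginals(parts, shape, background=0.1):
    """Patchwork operation for one edge type: each part is a window of
    per-pixel edge probabilities placed at its (deformed) reference point;
    where windows overlap, the contributed marginals are averaged, and
    uncovered pixels fall back to a background probability."""
    acc = np.zeros(shape)
    cover = np.zeros(shape)
    for probs, (top, left) in parts:           # parts: (window, position)
        ph, pw = probs.shape
        acc[top:top + ph, left:left + pw] += probs
        cover[top:top + ph, left:left + pw] += 1
    out = np.full(shape, float(background))
    mask = cover > 0
    out[mask] = acc[mask] / cover[mask]        # average over covering parts
    return out

def log_likelihood(edges, marginals):
    """Edges are assumed conditionally independent Bernoulli given the
    deformation, so the log-likelihood is a sum over pixels."""
    p = np.clip(marginals, 1e-6, 1 - 1e-6)
    return float(np.sum(edges * np.log(p) + (1 - edges) * np.log(1 - p)))
```

Averaging where parts overlap keeps the image-wide marginals well-defined no matter how the deformation moves the windows around, which is the point of the patchwork construction.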
Hierarchical testing designs for pattern recognition
, 2003
Abstract

Cited by 48 (8 self)
We explore the theoretical foundations of a “twenty questions” approach to pattern recognition. The object of the analysis is the computational process itself rather than probability distributions (Bayesian inference) or decision boundaries (statistical learning). Our formulation is motivated by applications to scene interpretation in which there are a great many possible explanations for the data, one (“background”) is statistically dominant, and it is imperative to restrict intensive computation to genuinely ambiguous regions. The focus here is then on pattern filtering: given a large set Y of possible patterns or explanations, narrow down the true one Y to a small (random) subset Ŷ ⊂ Y of “detected” patterns to be subjected to further, more intense, processing. To this end, we consider a family of hypothesis tests for Y ∈ A versus the nonspecific alternatives Y ∈ A^c. Each test has null type I error, and the candidate sets A ⊂ Y are arranged in a hierarchy of nested partitions. These tests are then
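The pattern-filtering loop over a hierarchy of nested cells can be sketched generically; the dyadic-interval hierarchy in the usage test below is a hypothetical instance, not one from the paper.

```python
def filter_patterns(cell, test, is_leaf, children):
    """Coarse-to-fine pattern filtering over a hierarchy of nested cells: run
    the test for 'Y in A' at cell A and refine only the cells that pass, so
    intensive computation concentrates on genuinely ambiguous regions. With
    null type-I error the true pattern is never pruned."""
    if not test(cell):
        return []                              # reject the whole cell at once
    if is_leaf(cell):
        return [cell]
    detected = []
    for child in children(cell):
        detected += filter_patterns(child, test, is_leaf, children)
    return detected
```

One cheap test rejecting a coarse cell eliminates all the patterns inside it at once; only ambiguous cells pay for refinement, which is the "twenty questions" economy the abstract describes.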
Near-optimal detection of geometric objects by fast multiscale methods
 IEEE Trans. Inform. Theory
, 2005
Abstract

Cited by 43 (10 self)
Abstract—We construct detectors for “geometric” objects in noisy data. Examples include a detector for the presence of a line segment of unknown length, position, and orientation in two-dimensional image data with additive white Gaussian noise. We focus on the following two issues. i) The optimal detection threshold, i.e., the signal strength below which no method of detection can be successful for large dataset size. ii) The optimal computational complexity of a near-optimal detector, i.e., the complexity required to detect signals slightly exceeding the detection threshold. We describe a general approach to such problems which covers several classes of geometrically defined signals; for example, with one-dimensional data, signals having elevated mean on an interval, and, in d-dimensional data, signals with elevated mean on a rectangle, a ball, or an ellipsoid. In all these problems, we show that a naive or straightforward approach leads to detector thresholds and algorithms which are asymptotically far from optimal. At the same time, a multiscale geometric analysis of these classes of objects allows us to derive asymptotically optimal detection thresholds and fast algorithms for near-optimal detectors. Index Terms—Beamlets, detecting hot spots, detecting line segments, Hough transform, image processing, maxima of Gaussian processes, multiscale geometric analysis, Radon transform.
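A minimal 1-D instance of the multiscale idea, assuming unit-variance Gaussian noise and restricting the scan to non-overlapping dyadic intervals; the threshold choice is a rough heuristic, and the paper's detectors are considerably more refined.

```python
import numpy as np

def detect_interval(x, thresh=None):
    """Scan non-overlapping dyadic intervals of a 1-D signal for an elevated
    mean: the statistic sum(x[a:b]) / sqrt(b - a) is ~N(0,1) per interval
    under pure unit-variance noise, so a threshold slightly above
    sqrt(2 log n) controls false detections while only O(n) intervals
    (instead of O(n^2) arbitrary ones) are examined."""
    n = len(x)
    if thresh is None:
        thresh = np.sqrt(2 * np.log(n)) + 1.0  # conservative noise level
    best, best_iv = -np.inf, None
    length = 1
    while length <= n:
        for a in range(0, n - length + 1, length):
            stat = x[a:a + length].sum() / np.sqrt(length)
            if stat > best:
                best, best_iv = stat, (a, a + length)
        length *= 2
    return (best_iv, best) if best > thresh else (None, best)
```

Any elevated-mean segment overlaps some dyadic interval by a constant fraction of its length, which is why the restricted scan loses only a constant factor in signal strength while saving an order of magnitude in work.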
Fast pedestrian detection using a cascade of boosted covariance features
 IEEE Transactions on Circuits and Systems for Video Technology
, 2008
Abstract

Cited by 29 (12 self)
Abstract—Efficiently and accurately detecting pedestrians plays a very important role in many computer vision applications such as video surveillance and smart cars. In order to find the right feature for this task, we first present a comprehensive experimental study on pedestrian detection using state-of-the-art locally extracted features (e.g., local receptive fields, histograms of oriented gradients, and region covariance). Building upon the findings of our experiments, we propose a new, simpler pedestrian detector using the covariance features. Unlike the work in [1], where the feature selection and weak classifier training are performed on the Riemannian manifold, we select features and train weak classifiers in Euclidean space for faster computation. To this end, AdaBoost with weighted Fisher linear discriminant analysis-based weak classifiers is designed. A cascaded classifier structure is constructed for efficiency in the detection phase. Experiments on different datasets prove that the new pedestrian detector is not only comparable to the state-of-the-art pedestrian detectors but also performs at a faster speed. To further accelerate the detection, we adopt a faster strategy: multiple-layer boosting with heterogeneous features, to exploit the efficiency of the Haar feature and the discriminative power of the covariance feature. Experiments show that, by combining the Haar and covariance features, we speed up the original covariance feature detector [1] by up to an order of magnitude in detection time with a slight drop in detection performance. Index Terms—AdaBoost, boosting with heterogeneous features, local features, pedestrian detection/classification, support vector machine.
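The cascade structure reduces to a few lines. The stage functions and thresholds in the usage test are hypothetical; the point is only that windows rejected by a cheap early stage (e.g. Haar-based) never pay for the expensive later ones (e.g. covariance-based).

```python
def cascade_detect(window, stages):
    """Evaluate a cascade: `stages` is a list of (score_fn, threshold) pairs
    ordered cheap-to-expensive; a window is rejected as soon as any stage
    score falls below its threshold, so the many easy negatives only ever
    pay for the first cheap stages."""
    for score, threshold in stages:
        if score(window) < threshold:
            return False                       # early rejection
    return True
```

Since the overwhelming majority of scanned windows in a real image are easy negatives, the average per-window cost is dominated by the first stage, which is what makes cascades fast.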
Part and Appearance Sharing: Recursive Compositional Models for Multi-View Multi-Object Detection
, 2010
Abstract

Cited by 24 (5 self)
We propose Recursive Compositional Models (RCMs) for simultaneous multi-view multi-object detection and parsing (e.g., view estimation and determining the positions of the object subparts). We represent the set of objects by a family of RCMs, where each RCM is a probability distribution defined over a hierarchical graph which corresponds to a specific object and viewpoint. An RCM is constructed from a hierarchy of subparts/subgraphs which are learnt from training data. Part-sharing is used so that different RCMs are encouraged to share subparts/subgraphs, which yields a compact representation for the set of objects and enables efficient inference and learning from a limited number of training samples. In addition, we use appearance-sharing so that RCMs for the same object, but different viewpoints, share similar appearance cues, which also helps efficient learning. RCMs lead to a multi-view multi-object detection system. We illustrate RCMs on four public datasets and achieve state-of-the-art performance.
Figure 1. This figure is best viewed in color. Each object template is represented by a graph, associated with a shape mask which labels the pixels as background (white), object body (grey), or parts (colors). The leaf nodes correspond to the localizable parts (10 oriented boundary segments in this example) on the object boundary. The root models the body appearance. The subgraphs may share the same structure (see the two dotted boxes). We show two examples including input images, probabilistic car body maps, and boundary maps which are calculated by the body and boundary appearance potentials respectively.
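The computational benefit of part-sharing can be sketched with a score cache; the model names, part names, and additive scoring rule below are hypothetical, standing in for the shared subgraphs of the RCM family.

```python
def score_models(models, part_score):
    """Part-sharing at inference time: the bottom-up score of each shared
    subpart is computed once and cached, then reused by every model (one RCM
    per object/viewpoint) that contains it. Toy scoring: a model's score is
    the sum of its part scores."""
    cache = {}

    def score(part):
        if part not in cache:
            cache[part] = part_score(part)
        return cache[part]

    return {name: sum(score(p) for p in parts)
            for name, parts in models.items()}
```

The more objects and viewpoints share a subpart, the closer the total cost gets to that of evaluating the part dictionary once, rather than once per model.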