Sharing Features: Efficient Boosting Procedures for Multiclass Object Detection
In CVPR, 2004
Cited by 186 (14 self)
We consider the problem of detecting a large number of different object classes in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, which can be slow and require much training data. We present a multi-class boosting procedure (joint boosting) that reduces both the computational and sample complexity, by finding common features that can be shared across the classes. The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required is observed to scale approximately logarithmically with the number of classes. In addition, we find that the features selected by independently trained classifiers are often specific to the class, whereas the features selected by the jointly trained classifiers are more generic features, such as lines and edges.
The PASCAL Visual Object Classes (VOC) challenge
2009
Cited by 63 (2 self)
... is a benchmark in visual object category recognition and detection, providing the vision and machine learning communities with a standard dataset of images and annotation, and standard evaluation procedures. Organised annually from 2005 to present, the challenge and its associated dataset have become accepted as the benchmark for object detection. This paper describes the dataset and evaluation procedure. We review the state-of-the-art in evaluated methods for both classification and detection, analyse whether the methods are statistically different, what they are learning from the images (e.g. the object or its context), and what the methods find easy or confuse. The paper concludes with lessons learnt in the three-year history of the challenge, and proposes directions for future improvement and extension.
Learning to detect unseen object classes by between-class attribute transfer
In CVPR, 2009
Cited by 58 (2 self)
We study the problem of object classification when training and test classes are disjoint, i.e. no training examples of the target classes are available. This setup has hardly been studied in computer vision research, but it is the rule rather than the exception, because the world contains tens of thousands of different object classes and for only a very few of them image collections have been formed and annotated with suitable class labels. In this paper, we tackle the problem by introducing attribute-based classification. It performs object detection based on a human-specified high-level description of the target objects instead of training images. The description consists of arbitrary semantic attributes, like shape, color or even geographic information. Because such properties transcend the specific learning task at hand, they can be pre-learned, e.g. from image datasets unrelated to the current task. Afterwards, new classes can be detected based on their attribute representation, without the need for a new training phase. In order to evaluate our method and to facilitate research in this area, we have assembled a new large-scale dataset, “Animals with Attributes”, of over 30,000 animal images that match the 50 classes in Osherson’s classic table of how strongly humans associate 85 semantic attributes with animal classes. Our experiments show that by using an attribute layer it is indeed possible to build a learning object detection system that does not require any training images of the target classes.
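As a minimal sketch of the attribute-based idea in this abstract, the Python below scores unseen classes by how well predicted attribute probabilities match each class's human-specified binary attribute signature. The function name, the toy attribute table, and the log-probability scoring rule are illustrative assumptions, not the paper's exact formulation.

```python
import math


def predict_unseen_class(attr_probs, class_signatures):
    """Return the class whose binary attribute signature best matches attr_probs.

    attr_probs: list of P(attribute present | image), one per attribute,
        assumed to come from pre-learned attribute classifiers.
    class_signatures: dict mapping class name -> list of 0/1 attribute flags
        (the human-specified description; hypothetical values below).
    """
    eps = 1e-9  # guard against log(0)
    best_class, best_score = None, -math.inf
    for name, signature in class_signatures.items():
        score = 0.0
        for p, present in zip(attr_probs, signature):
            # Probability that this attribute agrees with the class description.
            q = p if present else 1.0 - p
            score += math.log(max(q, eps))
        if score > best_score:
            best_class, best_score = name, score
    return best_class


# Hypothetical attributes: [striped, aquatic, has_hooves]
signatures = {
    "zebra": [1, 0, 1],
    "dolphin": [0, 1, 0],
}
prediction = predict_unseen_class([0.9, 0.1, 0.8], signatures)
print(prediction)
```

No training image of either class is used at prediction time; only the attribute classifiers and the class descriptions are needed.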
TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context
2007
Cited by 44 (5 self)
This paper details a new approach for learning a discriminative model of object classes, incorporating texture, layout, and context information efficiently. The learned model is used for automatic visual understanding and semantic segmentation of photographs. Our discriminative model exploits texture-layout filters, novel features based on textons, which jointly model patterns of texture and their spatial layout. Unary classification and feature selection is achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating the unary classifier in a conditional random field, which (i) captures the spatial interactions between class labels of neighboring pixels, and (ii) improves the segmentation of specific object instances. Efficient training of the model on large datasets is achieved by exploiting both random feature selection and piecewise training methods. High classification and segmentation accuracy is
Uncovering shared structures in multiclass classification
In Proceedings of the Twenty-fourth International Conference on Machine Learning, 2007
Cited by 40 (0 self)
This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics. We cast this as a convex optimization problem, using trace-norm regularization, and study gradient-based optimization both for the linear case and the kernelized setting.
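For orientation, a trace-norm regularized multiclass objective is commonly written as below. The symbols W (a weight matrix with one column per class), ℓ (a convex multiclass loss), λ, F, and G are notation assumed here for illustration; they are not taken from the abstract.

```latex
\min_{W \in \mathbb{R}^{d \times k}} \; \sum_{i=1}^{n} \ell\left(W; x_i, y_i\right) + \lambda \, \|W\|_{\Sigma},
\qquad
\|W\|_{\Sigma} = \sum_{j} \sigma_j(W) = \min_{W = FG} \tfrac{1}{2}\left(\|F\|_F^2 + \|G\|_F^2\right)
```

Here σ_j(W) are the singular values of W. Under the factorization W = FG, one reading of the abstract is that F^T x extracts the shared characteristics and the columns of G are the per-class predictors defined on them.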
Combining generative models and Fisher kernels for object class recognition
In ICCV, 2005
Cited by 34 (3 self)
Learning models for detecting and classifying object categories is a challenging problem in machine vision. While discriminative approaches to learning and classification have, in principle, superior performance, generative approaches provide many useful features, one of which is the ability to naturally establish explicit correspondence between model components and scene features; this, in turn, allows for the handling of missing data and unsupervised learning in clutter. We explore a hybrid generative/discriminative approach using ‘Fisher kernels’ [1], which retains most of the desirable properties of generative methods, while increasing the classification performance through a discriminative setting. Furthermore, we demonstrate how this kernel framework can be used to combine different types of features and models into a single classifier. Our experiments, conducted on a number of popular benchmarks, show strong performance improvements over the corresponding generative approach and are competitive with the best results reported in the literature.
Transfer Learning for Image Classification with Sparse Prototype Representations
Cited by 29 (5 self)
To learn a new visual category from few examples, prior knowledge from unlabeled data as well as previous related categories may be useful. We develop a new method for transfer learning which exploits available unlabeled data and an arbitrary kernel function; we form a representation based on kernel distances to a large set of unlabeled data points. To transfer knowledge from previous related problems we observe that a category might be learnable using only a small subset of reference prototypes. Related problems may share a significant number of relevant prototypes; we find such a concise representation by performing a joint loss minimization over the training sets of related problems with a shared regularization penalty that minimizes the total number of prototypes involved in the approximation. This optimization problem can be formulated as a linear program that can be solved efficiently. We conduct experiments on a news-topic prediction task where the goal is to predict whether an image belongs to a particular news topic. Our results show that when only few examples are available for training a target topic, leveraging knowledge learnt from other topics can significantly improve performance.
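The representation step in this abstract can be sketched as follows: each example is mapped to a vector of kernel values against a set of unlabeled reference points. The RBF kernel, the `gamma` value, and the function names are assumptions chosen for illustration (the abstract allows an arbitrary kernel), and the joint sparse selection of prototypes via a linear program is not shown.

```python
import math


def rbf_kernel(x, u, gamma=0.5):
    """RBF kernel value between two equal-length vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, u))
    return math.exp(-gamma * sq_dist)


def prototype_representation(x, prototypes, kernel=rbf_kernel):
    """Map x to its kernel values against every unlabeled prototype."""
    return [kernel(x, u) for u in prototypes]


# Hypothetical unlabeled reference points in a 2-D feature space.
unlabeled = [[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]]
features = prototype_representation([0.0, 0.0], unlabeled)
print(features)
```

A sparse classifier trained on these features would then rely on only a few prototype coordinates, which is the property the shared regularization penalty is described as encouraging across related problems.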
Real-time vision on a mobile robot platform
In IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005
Cited by 26 (12 self)
Computer vision is a broad and significant ongoing research challenge, even when performed on an individual image or on streaming video from a high-quality stationary camera with abundant computational resources. When faced with streaming video from a lower-quality, rapidly moving camera and limited computational resources, the challenge increases. We present our implementation of a vision system on a mobile robot platform that uses a camera image as the primary sensory input. Having to perform all processing, including segmentation and object detection, in real time on board the robot eliminates the possibility of using some state-of-the-art methods that otherwise might apply. We describe the methods that we developed to achieve a practical vision system within these constraints. Our approach is fully implemented and tested on a team of Sony AIBO robots. Index Terms: Vision and Recognition, Legged Robots.
POP: Patchwork of parts models for object recognition
International Journal of Computer Vision, 2004
Cited by 22 (2 self)
We formulate a deformable template model for objects with a clearly defined mechanism for parameter estimation. A separate model is estimated for each class, and classification is likelihood based; no discrimination boundaries are learned. Nonetheless, high classification rates are achieved with small training samples. The data models are defined on binary oriented edge features that are highly robust to photometric variation and small local deformations. The deformation of an object is defined in terms of locations of a moderate number of reference points. Each reference point is associated with a part: a probability map assigning a probability for each edge type at each pixel in a window. The likelihood of the edge data on the entire image conditional on the deformation is described as a patchwork of parts (POP) model: the edges are assumed conditionally independent, and the marginal at each pixel is obtained by a patchwork operation: averaging the marginal probabilities contributed by each part covering the pixel. Object classes are modeled as mixtures of POP models that are discovered sequentially as more class data is observed. Experiments are presented on the MNIST database, hundreds of deformed LaTeX shapes, reading zipcodes, and face detection.
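The patchwork operation described in this abstract (averaging the marginals of all parts that cover a pixel) can be sketched for a single edge type as below. The function name, the `(top, left, prob_map)` part encoding, and the `background` probability used for pixels no part covers are assumptions for illustration; the abstract does not specify them.

```python
def patchwork_marginals(height, width, parts, background=0.05):
    """Average per-pixel edge probabilities over all parts covering each pixel.

    parts: list of (top, left, prob_map), where prob_map is a 2-D list of
        edge probabilities for one edge type inside the part's window, and
        (top, left) is the window position implied by the deformation.
    background: assumed probability for pixels not covered by any part.
    """
    sums = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for top, left, prob_map in parts:
        for dr, row in enumerate(prob_map):
            for dc, p in enumerate(row):
                r, c = top + dr, left + dc
                if 0 <= r < height and 0 <= c < width:
                    sums[r][c] += p
                    counts[r][c] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else background
         for c in range(width)]
        for r in range(height)
    ]


# Two hypothetical 2x2 parts that overlap in the middle column of a 2x3 image.
parts = [
    (0, 0, [[0.8, 0.6], [0.4, 0.2]]),
    (0, 1, [[0.2, 0.2], [0.6, 0.6]]),
]
marginals = patchwork_marginals(2, 3, parts)
print(marginals)
```

Pixels in the overlapping column receive the mean of the two parts' probabilities, while the outer columns keep the value of the single part that covers them.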
Learning 3D mesh segmentation and labeling
ACM Trans. on Graphics, 2010
Cited by 22 (3 self)
[Figure 1: Labeling and segmentation results from applying our algorithm to one mesh each from every category in the Princeton Segmentation Benchmark [Chen et al. 2009]. For each result, the algorithm was trained on the other meshes in the same class, e.g., the human was labeled after training on the other meshes in the human class.]
This paper presents a data-driven approach to simultaneous segmentation and labeling of parts in 3D meshes. An objective function is formulated as a Conditional Random Field model, with terms assessing the consistency of faces with labels, and terms between labels of neighboring faces. The objective function is learned from a collection of labeled training meshes. The algorithm uses hundreds of geometric and contextual label features and learns different types of segmentations for different tasks, without requiring manual parameter tuning. Our algorithm achieves a significant improvement in results over the state-of-the-art when evaluated on the Princeton Segmentation Benchmark, often producing segmentations and labelings comparable to those produced by humans.

