Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
- International Journal of Computer Vision
, 2001
Cited by 1313 (81 self)
In this paper, we propose a computational model of the recognition of real world scenes that bypasses the segmentation and the processing of individual objects or regions. The procedure is based on a very low dimensional representation of the scene, that we term the Spatial Envelope. We propose a set of perceptual dimensions (naturalness, openness, roughness, expansion, ruggedness) that represent the dominant spatial structure of a scene. Then, we show that these dimensions may be reliably estimated using spectral and coarsely localized information. The model generates a multidimensional space in which scenes sharing membership in semantic categories (e.g., streets, highways, coasts) are projected close together. The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics
- in Proc. 8th Int’l Conf. Computer Vision
, 2001
Cited by 954 (14 self)
This paper presents a database containing ‘ground truth’ segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.
SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
Cited by 551 (35 self)
The need for efficient content-based image retrieval has increased tremendously in many application areas such as biomedicine, military, commerce, education, and Web image classification and searching. We present here SIMPLIcity (Semantics-sensitive Integrated Matching for Picture LIbraries), an image retrieval system, which uses semantics classification methods, a wavelet-based approach for feature extraction, and integrated region matching based upon image segmentation. As in other region-based retrieval systems, an image is represented by a set of regions, roughly corresponding to objects, which are characterized by color, texture, shape, and location. The system classifies images into semantic categories, such as textured-nontextured, graph-photograph. Potentially, the categorization enhances retrieval by permitting semantically-adaptive searching methods and narrowing down the searching range in a database. A measure for the overall similarity between images is developed using a region-matching scheme that integrates properties of all the regions in the images. Compared with retrieval based on individual regions, the overall similarity approach 1) reduces the adverse effect of inaccurate segmentation, 2) helps to clarify the semantics of a particular region, and 3) enables a simple querying interface for region-based image retrieval systems. The application of SIMPLIcity to several databases, including a database of about 200,000 general-purpose images, has demonstrated that our system performs significantly better and faster than existing ones. The system is fairly robust to image alterations.
Blobworld: Image segmentation using Expectation-Maximization and its application to image querying
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1999
Cited by 438 (10 self)
Retrieving images from large and varied collections using image content as a key is a challenging and important problem. We present a new image representation which provides a transformation from the raw pixel data to a small set of image regions which are coherent in color and texture. This "Blobworld" representation is created by clustering pixels in a joint color-texture-position feature space. The segmentation algorithm is fully automatic and has been run on a collection of 10,000 natural images. We describe a system that uses the Blobworld representation to retrieve images from this collection. An important aspect of the system is that the user is allowed to view the internal representation of the submitted image and the query results. Similar systems do not offer the user this view into the workings of the system; consequently, query results from these systems can be inexplicable, despite the availability of knobs for adjusting the similarity metrics. By finding image regions whi...
Automatic Image Annotation and Retrieval using Cross-Media Relevance Models
, 2003
Cited by 431 (14 self)
Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labor-intensive procedure and hence there has been great interest in coming up with automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as a model based on word-blob co-occurrence and twice as good as a state-of-the-art model derived from machine translation. Our approach shows the usefulness of using formal information retrieval models for the task of image annotation and retrieval.
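The idea of predicting a word from the blobs in an image can be sketched roughly as follows. This is an illustration only: the function name `annotate`, the smoothing weights `alpha` and `beta`, the uniform prior over training images, and the linear smoothing against collection-wide counts are all assumptions, since the abstract does not state the exact estimator.

```python
from collections import Counter


def annotate(query_blobs, training, alpha=0.1, beta=0.1):
    """Estimate P(word | query blobs) from annotated training images.

    `training` is a list of (words, blobs) pairs. Each training image J
    contributes P(J) * P(word | J) * product over query blobs of P(blob | J).
    The smoothing scheme and its weights are assumptions for this sketch.
    """
    all_words = Counter(w for words, _ in training for w in words)
    all_blobs = Counter(b for _, blobs in training for b in blobs)
    n_words = sum(all_words.values())
    n_blobs = sum(all_blobs.values())
    prior = 1.0 / len(training)  # uniform P(J), an assumption

    scores = Counter()
    for words, blobs in training:
        word_counts, blob_counts = Counter(words), Counter(blobs)
        # Likelihood of the query blobs under this training image.
        blob_likelihood = 1.0
        for b in query_blobs:
            blob_likelihood *= ((1 - beta) * blob_counts[b] / len(blobs)
                                + beta * all_blobs[b] / n_blobs)
        # Add this image's smoothed vote for every vocabulary word.
        for w in all_words:
            p_word = ((1 - alpha) * word_counts[w] / len(words)
                      + alpha * all_words[w] / n_words)
            scores[w] += prior * p_word * blob_likelihood

    total = sum(scores.values())
    if total == 0:  # query contains a blob never seen in training
        return {w: 0.0 for w in scores}
    return {w: s / total for w, s in scores.items()}


# Hypothetical toy data: blob ids are arbitrary cluster labels.
training = [(["tiger", "grass"], [1, 2]), (["sky", "plane"], [3, 4])]
probs = annotate([1, 2], training)
```

With this toy data, the words attached to the training image that shares the query's blobs receive most of the probability mass, so ranking `probs` by value gives a simple automatic annotation.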
Robust object recognition with cortex-like mechanisms
- IEEE Trans. Pattern Analysis and Machine Intelligence
, 2007
Cited by 389 (47 self)
We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks: From invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based as well as texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: It has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.
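One stage of the alternation between template matching and maximum pooling can be sketched as below. This is a toy illustration, not the paper's system: the dot-product matching, the non-overlapping pooling windows, and the names `template_match` and `max_pool` are assumptions chosen to keep the example short.

```python
def template_match(image, template):
    """Slide `template` over `image` (valid positions only) and return the
    map of dot products between the template and each image patch."""
    th, tw = len(template), len(template[0])
    out_h = len(image) - th + 1
    out_w = len(image[0]) - tw + 1
    return [
        [
            sum(image[r + i][c + j] * template[i][j]
                for i in range(th) for j in range(tw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(response, size):
    """Take the maximum over non-overlapping size x size windows, so small
    shifts of the matched pattern leave the output unchanged."""
    return [
        [
            max(response[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(response[0]) - size + 1, size)
        ]
        for r in range(0, len(response) - size + 1, size)
    ]


# Hypothetical inputs: a vertical-edge template and two 5x5 images that
# contain the same vertical bar, shifted by one pixel.
edge = [[-1, 1], [-1, 1]]
bar_at_2 = [[0, 0, 1, 0, 0] for _ in range(5)]
bar_at_1 = [[0, 1, 0, 0, 0] for _ in range(5)]
pooled_a = max_pool(template_match(bar_at_2, edge), 2)
pooled_b = max_pool(template_match(bar_at_1, edge), 2)
```

The two matching maps differ because the bar moved, but after pooling both images yield the same output, which is the kind of invariance the abstract attributes to the pooling step. A deeper hierarchy would feed the pooled maps into another matching stage.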
Faceted metadata for image search and browsing
- In ACM Conference on Computer-Human Interaction
, 2003
Cited by 375 (4 self)
There are currently two dominant interface types for searching and browsing large image collections: keyword-based search, and searching by overall similarity to sample images. This paper presents an alternative in which users are able to navigate explicitly along conceptual dimensions that describe the images. The interface makes use of hierarchical faceted metadata and dynamically generated query previews. A usability study, conducted with 32 art history students exploring a collection of 35,000 fine arts images, compares this approach to a standard image search interface. Despite the unfamiliarity and power of the interface (attributes which often lead to rejection of new search interfaces), the study results show that 90% of the participants preferred the metadata approach overall, 97% said that it helped them learn more about the collection, 75% found it more flexible and 72% found it easier to use than a standard baseline system. These results indicate that a category-based approach is a successful way to provide access to image collections.
Support vector machines for multiple-instance learning
- Advances in Neural Information Processing Systems 15
, 2003
Cited by 314 (2 self)
This paper presents two new formulations of multiple-instance learning as a maximum margin problem. The proposed extensions of the Support Vector Machine (SVM) learning approach lead to mixed integer quadratic programs that can be solved heuristically. Our generalization of SVMs makes a state-of-the-art classification technique, including non-linear classification via kernels, available to an area that up to now has been largely dominated by special purpose methods. We present experimental results on a pharmaceutical data set and on applications in automated image indexing and document categorization.
Integral histogram: A fast way to extract histograms in Cartesian spaces
- in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)
, 2005
Cited by 224 (16 self)
We present a novel method, which we refer to as an integral histogram, to ...