Image classification for content-based indexing
IEEE Transactions on Image Processing, 2001
Cited by 118 (2 self)
Grouping images into (semantically) meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Using binary Bayesian classifiers, we attempt to capture high-level concepts from low-level image features under the constraint that the test image does belong to one of the classes. Specifically, we consider the hierarchical classification of vacation images; at the highest level, images are classified as indoor or outdoor; outdoor images are further classified as city or landscape; finally, a subset of landscape images is classified into sunset, forest, and mountain classes. We demonstrate that a small vector quantizer (whose optimal size is selected using a modified MDL criterion) can be used to model the class-conditional densities of the features, required by the Bayesian methodology. The classifiers have been designed and evaluated on a database of 6931 vacation photographs. Our system achieved a classification accuracy of 90.5% for indoor/outdoor, 95.3% for city/landscape, 96.6% for sunset/forest & mountain, and 96% for forest/mountain classification problems. We further develop a learning method to incrementally train the classifiers as additional data become available. We also show preliminary results for feature reduction using clustering techniques. Our goal is to combine multiple two-class classifiers into a single hierarchical classifier.
Index Terms: Bayesian methods, content-based retrieval, digital libraries, image content analysis, minimum description length, semantic
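The hierarchy described in this abstract (indoor/outdoor, then city/landscape, and so on) can be pictured as a cascade of two-class Bayesian decisions. The following is a minimal sketch, not the authors' implementation: it assumes each class-conditional density is an equal-weight mixture of spherical Gaussians centred on that class's codebook vectors, and the function names, the `sigma` parameter, the tree layout, and the toy codebooks are all hypothetical.

```python
import math

def log_density(x, codebook, sigma=1.0):
    """Log of an equal-weight mixture of spherical Gaussians centred on the codebook vectors (assumed form)."""
    d = len(x)
    log_norm = -0.5 * d * math.log(2 * math.pi * sigma ** 2)
    terms = [
        log_norm - sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * sigma ** 2)
        for c in codebook
    ]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms)) - math.log(len(codebook))

def bayes_binary(x, model):
    """Pick the label with the larger log prior + log class-conditional density."""
    scores = {
        label: math.log(prior) + log_density(x, codebook)
        for label, (prior, codebook) in model.items()
    }
    return max(scores, key=scores.get)

def classify_hierarchy(features, tree, node="root"):
    """Descend the tree of binary classifiers until a label with no child classifier is reached."""
    while node in tree:
        feature_name, model = tree[node]
        node = bayes_binary(features[feature_name], model)
    return node

# Hypothetical two-level tree with made-up 2-D codebooks, for illustration only.
TOY_TREE = {
    "root": ("f", {"indoor": (0.5, [(0.0, 0.0)]),
                   "outdoor": (0.5, [(4.0, 4.0), (6.0, 4.0)])}),
    "outdoor": ("f", {"city": (0.5, [(4.0, 4.0)]),
                      "landscape": (0.5, [(6.0, 4.0)])}),
}
```

Calling `classify_hierarchy({"f": (5.8, 4.1)}, TOY_TREE)` first decides between indoor and outdoor and, only if the answer is outdoor, runs the city/landscape classifier. Each node could use a different feature by changing the feature name stored with it.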
ImageRover: A Content-Based Image Browser for the World Wide Web
In Proc. IEEE Workshop on Content-based Access of Image and Video Libraries, 1997
Cited by 117 (3 self)
ImageRover is a search-by-image-content navigation tool for the World Wide Web. To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, computing the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search performance is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that selects the distance metrics appropriate for a particular query.
Keywords: image databases, query by image content, content-based retrieval, World Wide Web search engines.
1 Introduction
For a while now there have been software "robots" roving the World Wide Web (WWW) collecting index information about th...
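The abstract mentions an approximate k-d tree search over feature vectors but gives no details. As a generic illustration only (not ImageRover's algorithm), the sketch below builds a k-d tree by median splits and runs a nearest-neighbour query. The `eps` parameter is an assumption: with `eps=0` the search is exact, while larger values prune the far branch more aggressively and so may return a slightly worse neighbour.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, eps=0.0, best=None):
    """Return (distance, point); eps > 0 trades accuracy for fewer visited nodes."""
    if node is None:
        return best
    dist = math.dist(node["point"], query)
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, eps, best)
    # Visit the far side only if the splitting plane is close enough to beat the current best.
    if abs(diff) * (1.0 + eps) < best[0]:
        best = nearest(far, query, eps, best)
    return best
```

For example, after `tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])`, the call `nearest(tree, (9, 2))` returns the distance to and coordinates of the closest stored vector.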
Periodicity, directionality, and randomness: Wold features for image modeling and retrieval
IEEE Trans. Pattern Analysis and Machine Intelligence, 1996
Cited by 103 (5 self)
One of the fundamental challenges in pattern recognition is choosing a set of features appropriate to a class of problems. In applications such as database retrieval, it is important that image features used in pattern comparison provide good measures of image perceptual similarities. In this paper, we present an image model with a new set of features that address the challenge of perceptual similarity. The model is based on the 2-D Wold decomposition of homogeneous random fields. The three resulting mutually orthogonal subfields have perceptual properties which can be described as "periodicity", "directionality", and "randomness", approximating what are indicated to be the three most important dimensions of human texture perception. The method presented here improves upon earlier Wold-based models in its tolerance to a variety of local inhomogeneities which arise in natural textures and its invariance under image transformations such as rotation. An image retrieval algorithm based on ...
Vision Texture for Annotation
1995
Cited by 95 (7 self)
This paper demonstrates a new application of computer vision to digital libraries -- the use of texture for annotation, the description of content. Vision-based annotation assists the user in attaching descriptions to large sets of images and video. If a user labels a piece of an image as "water," a texture model can be used to propagate this label to other "visually similar" regions. However, a serious problem is that no single model has been found to be good enough to reliably match human perception of similarity in pictures. Rather than using one model, the system described here knows several texture models, and is equipped with the ability to choose the one which "best explains" the regions selected by the user for annotating. If none of these models suffices, then it creates new explanations by combining models. Examples are given of annotations propagated by the system on natural scenes. The system provides an average gain of four to one in label prediction over a set of 98 image...
Temporal Texture Modeling
In IEEE International Conference on Image Processing, 1996
Cited by 93 (1 self)
Temporal textures are textures with motion. Examples include wavy water, rising steam and fire. We model image sequences of temporal textures using the spatio-temporal autoregressive model (STAR). This model expresses each pixel as a linear combination of surrounding pixels lagged both in space and in time. The model provides a base for both recognition and synthesis. We show how the least squares method can accurately estimate model parameters for large, causal neighborhoods with more than 1000 parameters. Synthesis results show that the model can adequately capture the spatial and temporal characteristics of many temporal textures. A 95% recognition rate is achieved for a 135-element database with 15 texture classes.
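The "linear combination of surrounding pixels lagged both in space and in time" can be written out as follows. This is a sketch in assumed notation, since the abstract gives no symbols: s(x, y, t) is the pixel intensity, (Δx_i, Δy_i, Δt_i) are the offsets of the p neighbours in the causal neighborhood, φ_i are the model coefficients, and a(x, y, t) is a noise term. The second line states the least-squares estimate of the coefficients in the same notation.

```latex
s(x, y, t) = \sum_{i=1}^{p} \phi_i \, s(x + \Delta x_i,\; y + \Delta y_i,\; t + \Delta t_i) + a(x, y, t)

\hat{\boldsymbol{\phi}} = \arg\min_{\boldsymbol{\phi}} \sum_{(x, y, t)} \Big( s(x, y, t) - \sum_{i=1}^{p} \phi_i \, s(x + \Delta x_i,\; y + \Delta y_i,\; t + \Delta t_i) \Big)^2
```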
Unsupervised Learning from Dyadic Data
1998
Cited by 89 (9 self)
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event co-occurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework for unsupervised learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures and unifies probabilistic modeling and structure discovery. Mixture models provide both a parsimonious yet flexible parameterization of probability distributions with good generalization performance on sparse data and structural information about data-inherent grouping structure. We propose an annealed version of the standard Expectation-Maximization algorithm for model fitting, which is empirically evaluated on a variety of data sets from different domains.
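As an illustration of what a flat latent-class mixture over dyads and an annealed E-step might look like, the block below gives one possible form. The abstract does not spell out its models, so this is an assumption: x and y denote elements of the two object sets, α indexes the latent classes, and β is an inverse-temperature parameter. Setting β = 1 gives the ordinary posterior used in the standard EM E-step, while β < 1 flattens it.

```latex
P(x, y) = \sum_{\alpha} P(\alpha)\, P(x \mid \alpha)\, P(y \mid \alpha)

P_{\beta}(\alpha \mid x, y) = \frac{P(\alpha)\, \big[ P(x \mid \alpha)\, P(y \mid \alpha) \big]^{\beta}}{\sum_{\alpha'} P(\alpha')\, \big[ P(x \mid \alpha')\, P(y \mid \alpha') \big]^{\beta}}
```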
Unsupervised Texture Segmentation in a Deterministic Annealing Framework
1998
Cited by 82 (9 self)
We present a novel optimization framework for unsupervised texture segmentation that relies on statistical tests as a measure of homogeneity. Texture segmentation is formulated as a data clustering problem based on sparse proximity data. Dissimilarities of pairs of textured regions are computed from a multi-scale Gabor filter image representation. We discuss and compare a class of clustering objective functions which is systematically derived from invariance principles. As a general optimization framework we propose deterministic annealing based on a mean-field approximation. The canonical way to derive clustering algorithms within this framework as well as an efficient implementation of mean-field annealing and the closely related Gibbs sampler are presented. We apply both annealing variants to Brodatz-like micro-texture mixtures and real-world images.
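The core idea of deterministic annealing can be sketched with soft assignments that harden as a temperature is lowered. The code below is a simplified illustration, not the paper's mean-field algorithm: it assumes the per-cluster costs of a single object are fixed, whereas in a full method they would be recomputed from the other objects' assignments at every step. The function names and the geometric cooling schedule are hypothetical.

```python
import math

def gibbs_assignment(costs, temperature):
    """Soft assignment probabilities p_k proportional to exp(-cost_k / T)."""
    scaled = [-c / temperature for c in costs]
    m = max(scaled)  # subtract the maximum to avoid overflow
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def anneal(costs, t_start=10.0, t_end=0.01, rate=0.5):
    """Yield (T, probabilities) while the temperature is lowered geometrically."""
    t = t_start
    while t >= t_end:
        yield t, gibbs_assignment(costs, t)
        t *= rate
```

At high temperature the probabilities are close to uniform; as the temperature approaches zero, almost all of the mass moves to the cluster with the lowest cost.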
An Image Database Browser that Learns From User Interaction
1996
Cited by 66 (2 self)
Digital libraries of images and video are rapidly growing in size and availability. To avoid the expense and limitations of text, there is considerable interest in navigation by perceptual and other automatically extractable attributes. Unfortunately, the relevance of an attribute for a query is not always obvious. Queries which go beyond explicit color, shape, and positional cues must incorporate multiple features in complex ways. This dissertation uses machine learning to automatically select and combine features to satisfy a query, based on positive and negative examples from the user. The learning algorithm does not just learn during the course of one session: it learns continuously, across sessions. The learner improves its learning ability by dynamically modifying its inductive bias, based on experience over multiple sessions. Experiments demonstrate the ability to assist image classification, segmentation, and annotation (labeling of image regions). The common theme of this work...
A sparse texture representation using local affine regions
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
Cited by 60 (11 self)
This article introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. At the feature extraction stage, a sparse set of affine Harris and Laplacian regions is found in the image. Each of these regions can be thought of as a texture element having a characteristic elliptic shape and a distinctive appearance pattern. This pattern is captured in an affine-invariant fashion via a process of shape normalization followed by the computation of two novel descriptors, the spin image and the RIFT descriptor. When affine invariance is not required, the original elliptical shape serves as an additional discriminative feature for texture recognition. The proposed approach is evaluated in retrieval and classification tasks using the entire Brodatz database and a publicly available collection of 1000 photographs of textured surfaces taken from different viewpoints.
A Sparse Texture Representation Using Affine-Invariant Regions
In Proc. CVPR, 2003
Cited by 57 (9 self)
This paper introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and nonrigid deformations. At the feature extraction stage, a sparse set of affine-invariant local patches is extracted from the image. This spatial selection process permits the computation of characteristic scale and neighborhood shape for every texture element. The proposed texture representation is evaluated in retrieval and classification tasks using the entire Brodatz database and a collection of photographs of textured surfaces taken from different viewpoints.

