Learning to detect unseen object classes by between-class attribute transfer
- In CVPR, 2009
Cited by 58 (2 self)
We study the problem of object classification when training and test classes are disjoint, i.e. no training examples of the target classes are available. This setup has hardly been studied in computer vision research, but it is the rule rather than the exception, because the world contains tens of thousands of different object classes, and for only a very few of them have image collections been formed and annotated with suitable class labels. In this paper, we tackle the problem by introducing attribute-based classification. It performs object detection based on a human-specified high-level description of the target objects instead of training images. The description consists of arbitrary semantic attributes, like shape, color or even geographic information. Because such properties transcend the specific learning task at hand, they can be pre-learned, e.g. from image datasets unrelated to the current task. Afterwards, new classes can be detected based on their attribute representation, without the need for a new training phase. In order to evaluate our method and to facilitate research in this area, we have assembled a new large-scale dataset, “Animals with Attributes”, of over 30,000 animal images that match the 50 classes in Osherson’s classic table of how strongly humans associate 85 semantic attributes with animal classes. Our experiments show that by using an attribute layer it is indeed possible to build a learning object detection system that does not require any training images of the target classes.
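The idea of detecting a new class from its attribute description can be illustrated with a minimal sketch. It assumes that pre-learned attribute classifiers already output a probability per attribute for a test image, and that each unseen class is described by a binary attribute signature. The function name `predict_unseen_class`, the three example attributes, and the independence assumption between attributes are illustrative choices, not details taken from the paper.

```python
import math

def predict_unseen_class(attr_probs, class_signatures):
    """Pick the class whose binary attribute signature best explains the
    predicted attribute probabilities (attributes treated as independent)."""
    best_class, best_score = None, -math.inf
    for name, signature in class_signatures.items():
        score = 0.0
        for p, present in zip(attr_probs, signature):
            # Clamp to avoid log(0) when a classifier is fully confident.
            p = min(max(p, 1e-9), 1.0 - 1e-9)
            score += math.log(p if present else 1.0 - p)
        if score > best_score:
            best_class, best_score = name, score
    return best_class

# Hypothetical signatures over (striped, four-legged, lives in water).
signatures = {"zebra": [1, 1, 0], "whale": [0, 0, 1]}

# Hypothetical attribute-classifier outputs for two test images.
print(predict_unseen_class([0.9, 0.8, 0.1], signatures))
print(predict_unseen_class([0.2, 0.1, 0.95], signatures))
```

No image of a zebra or whale is needed at training time in this sketch; only the attribute classifiers and the class signatures are.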
Automatic Attribute Discovery and Characterization from Noisy Web Data
Cited by 14 (0 self)
It is common to use domain-specific terminology – attributes – to describe the visual appearance of objects. In order to scale the use of these describable visual attributes to a large number of categories, especially those not well studied by psychologists or linguists, it will be necessary to find alternative techniques for identifying attribute vocabularies and for learning to recognize attributes without hand-labeled training data. We demonstrate that it is possible to accomplish both these tasks automatically by mining text and image data sampled from the Internet. The proposed approach also characterizes attributes according to their visual representation: global or local, and type: color, texture, or shape. This work focuses on discovering attributes and their visual appearance, and is as agnostic as possible about the textual description.
Scalable search-based image annotation of personal images
- In MIR, 2006
Cited by 10 (6 self)
With the prevalence of digital cameras, more and more people have considerable numbers of digital images on their personal devices. As a result, there is an increasing need to search these personal images effectively. Automatic image annotation may serve the goal, for the annotated keywords could facilitate the search processes. Although many image annotation methods have been proposed in recent years, their effectiveness on arbitrary personal images is constrained by their limited scalability, i.e. the limited lexicon of a small-scale training set. To be scalable, we propose a search-based image annotation (SBIA) algorithm that is analogous to Web page search. First, content-based image retrieval (CBIR) technology is used to retrieve a set of visually similar images from a large-scale Web image set. Then, a text-based keyword search (TBKS) technique is used to obtain a ranked list of candidate annotations for each retrieved image. Finally, a fusion algorithm is used to combine the ranked lists into the final annotation list. The application of both efficient search technologies and a Web-scale image set guarantees the scalability of the proposed algorithm. Experimental results on the U. Washington dataset show not only the effectiveness and efficiency of the proposed algorithm but also the advantage of image retrieval using annotation results over that using visual features.
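The final fusion step can be sketched as follows. The abstract does not say which fusion algorithm SBIA uses, so this example swaps in reciprocal-rank scoring as a stand-in; the function name `fuse_ranked_lists` and the sample keywords are hypothetical.

```python
from collections import defaultdict

def fuse_ranked_lists(ranked_lists, top_k=3):
    """Combine per-image ranked keyword lists into one annotation list.

    Each keyword earns 1/rank from every list it appears in; this
    reciprocal-rank scoring is an assumption, not the paper's method.
    """
    scores = defaultdict(float)
    for keywords in ranked_lists:
        for rank, keyword in enumerate(keywords, start=1):
            scores[keyword] += 1.0 / rank
    # Sort by descending score, breaking ties alphabetically.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ordered[:top_k]]

# Hypothetical candidate annotations for three visually similar images.
candidate_lists = [
    ["beach", "sea", "sky"],
    ["sea", "beach", "sand"],
    ["sea", "sky"],
]
print(fuse_ranked_lists(candidate_lists))
```

Keywords that rank highly for many of the retrieved images rise to the top, which is the general behaviour a fusion step is meant to provide.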
Learning Color Names from Real-World Images
Cited by 9 (2 self)
Within a computer vision context, color naming is the action of assigning linguistic color labels to image pixels. In general, research on color naming applies the following paradigm: a collection of color chips is labelled with color names within a well-defined experimental setup by multiple test subjects. The collected data set is subsequently used to label RGB values in real-world images with a color name. Apart from the fact that this collection process is time consuming, it is unclear to what extent color naming within a controlled setup is representative of color naming in real-world images. Therefore we propose to learn color names from real-world images. Furthermore, we avoid test subjects by using Google Image to collect a data set. Due to limitations of Google Image this data set contains a substantial quantity of wrongly labelled data. The color names are learned using a PLSA model adapted to this task. Experimental results show that color names learned from real-world images significantly outperform color names learned from labelled color chips on retrieval and classification.
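For readers unfamiliar with PLSA, the sketch below shows the standard, unadapted model fitted with EM; it is not the task-specific variant the paper describes. Here each "document" is assumed to be an image, each "word" a quantized pixel color, and each "topic" a color name. The function name `plsa` and the toy count matrix are hypothetical.

```python
import random

def plsa(counts, n_topics, n_iters=50, seed=0):
    """Fit vanilla PLSA by EM on a document-by-word count matrix.

    Returns (p_w_z, p_z_d): word distributions per topic and topic
    distributions per document. Each returned row sums to 1.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [v / total for v in row]

    # Random positive initialisation of both distributions.
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iters):
        # Accumulators start slightly above zero so no row sums to 0.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior over topics for this (doc, word) pair.
                post = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(post)
                # M-step accumulation: expected counts per topic.
                for z in range(n_topics):
                    resp = n * post[z] / total
                    new_w_z[z][w] += resp
                    new_z_d[d][z] += resp
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d

# Toy counts: 4 images (rows) by 4 quantized colors (columns).
toy_counts = [[8, 2, 0, 0], [7, 3, 0, 0], [0, 0, 5, 5], [0, 1, 4, 6]]
p_w_z, p_z_d = plsa(toy_counts, n_topics=2)
```

In a color-naming setting, `p_w_z` would be read as the distribution over pixel colors for each color name; how the paper constrains or adapts the model to noisy labels is not covered by this sketch.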
Attribute learning in large-scale datasets
Cited by 6 (0 self)
We consider the task of learning visual connections between object categories using the ImageNet dataset, which is a large-scale dataset ontology containing more than 15 thousand object classes. We want to discover visual relationships between the classes that are currently missing (such as similar colors or shapes or textures). In this work we learn 20 visual attributes and use them in a zero-shot transfer learning experiment as well as to make visual connections between semantically unrelated object categories.
Names and faces
Cited by 3 (2 self)
We show that a large and realistic face dataset can be built from news photographs and their associated captions. Our dataset consists of 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured “in the wild” in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Faces are extracted from the images and names from the associated caption. Our system uses a clustering procedure to find the correspondence between faces and associated names in news picture-caption pairs. The context in which a name appears in a caption provides powerful cues as to whether it is depicted in the associated image. By incorporating simple natural language techniques, we are able to improve our name assignment significantly. Once the procedure is complete, we have an accurately labeled set of faces, an appearance model for each individual depicted, and a natural language model that can produce accurate results on captions in isolation.
Learning Color Names for Real-World Applications
Cited by 3 (1 self)
Color names are required in real-world applications such as image retrieval and image annotation. Traditionally, they are learned from a collection of labelled color chips. These color chips are labelled with color names within a well-defined experimental setup by human test subjects. However, naming colors in real-world images differs significantly from this experimental setting. In this paper, we investigate how color names learned from color chips compare to color names learned from real-world images. To avoid hand labelling real-world images with color names, we use Google Image to collect a data set. Due to limitations of Google Image this data set contains a substantial quantity of wrongly labelled data. We propose several variants of the PLSA model to learn color names from this noisy data. Experimental results show that color names learned from real-world images significantly outperform color names learned from labelled color chips for both image retrieval and image annotation.
Evaluation strategies for image understanding and retrieval
- 2005
Cited by 1 (0 self)
We address evaluation of image understanding and retrieval on large scale image data in the context of three evaluation projects. The first project is a comprehensive strategy for evaluating image retrieval algorithms and provides an open reference data set for doing so. The second project develops word prediction as a semantically relevant evaluation strategy, and applies it to the evaluation of image processing methods for semantic image analysis. The third project evaluates words for suitability of their visual properties for use in an image annotation framework.
“Cloudiness” OR
Human-nameable visual attributes offer many advantages when used as mid-level features for object recognition, but existing techniques to gather relevant attributes can be inefficient (costing substantial effort or expertise) and/or insufficient (descriptive properties need not be discriminative). We introduce an approach to define a vocabulary of attributes that is both human understandable and discriminative. The system takes object/scene-labeled images as input, and returns as output a set of attributes elicited from human annotators that distinguish the categories of interest. To ensure a compact vocabulary and efficient use of annotators’ effort, we 1) show how to actively augment the vocabulary such that new attributes resolve inter-class confusions, and 2) propose a novel “nameability” manifold that prioritizes candidate attributes by their likelihood of being associated with a nameable property. We demonstrate the approach with multiple datasets, and show its clear advantages over baselines that lack a nameability model or rely on a list of expert-provided attributes.
An Optimum Modified Bit Plane Splicing LSB Algorithm for Secret Data Hiding
In this paper, we propose an algorithm called Optimum Intensity Based Distributed Hiding (OIBDH) for secret data hiding inside cover images. The algorithm is a modified version of the Bit Plane Splicing LSB technique with better hidden capacity and an improved embedding process. The proposed algorithm outperforms the Bit Plane Splicing LSB technique, as more data can be hidden without degrading the quality of the cover image. Furthermore, both algorithms are tested using entropy curves, and results show that OIBDH has a lower absolute entropy difference compared to the Bit Plane Splicing LSB technique in all the tested images.
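As background for the LSB family of techniques mentioned above, here is a minimal sketch of plain least-significant-bit embedding. It is neither OIBDH nor Bit Plane Splicing, whose details the abstract does not give; the cover image is assumed to be a flat list of 8-bit intensities, and the function names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(pixels, message):
    """Hide the bits of `message` (bytes) in the lowest bit of each pixel."""
    # Unpack every byte into 8 bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego

def extract_lsb(pixels, n_bytes):
    """Recover `n_bytes` bytes from the lowest bits of the first pixels."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)

# Hypothetical 32-pixel grayscale cover and a 3-byte secret.
cover = list(range(100, 132))
stego = embed_lsb(cover, b"Hi!")
print(extract_lsb(stego, 3))
```

Each pixel changes by at most one intensity level, which is why LSB schemes leave the cover image visually almost unchanged; capacity here is one bit per pixel.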

