Automated image annotation using global features and robust nonparametric density estimation (2005)

by A Yavlinsky, E Schofield, S Rüger
Venue: CIVR

Results 1 - 10 of 37

TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation

by Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid - In ICCV, 2009
Cited by 23 (8 self)
Image auto-annotation is an important open problem in computer vision. For this task we propose TagProp, a discriminatively trained nearest neighbor model. Tags of test images are predicted using a weighted nearest-neighbor model to exploit labeled training images. Neighbor weights are based on neighbor rank or distance. TagProp allows the integration of metric learning by directly maximizing the log-likelihood of the tag predictions in the training set. In this manner, we can optimally combine a collection of image similarity metrics that cover different aspects of image content, such as local shape descriptors or global color histograms. We also introduce a word-specific sigmoidal modulation of the weighted neighbor tag predictions to boost the recall of rare words. We investigate the performance of different variants of our model and compare to existing work. We present experimental results for three challenging data sets. On all three, TagProp makes a marked improvement compared to the current state of the art.
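
The weighted nearest-neighbor prediction described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `predict_tag_scores`, the exponential distance weighting, and the toy data are all assumptions, and the learned metric and sigmoidal modulation are left out.

```python
import math

def predict_tag_scores(distances, neighbor_tags, vocabulary):
    """Score each tag as a weighted vote over labeled neighbors.

    distances:     distance from the test image to each neighbor
    neighbor_tags: set of tags attached to each neighbor (same order)
    """
    # Assumed weighting: exp(-distance), normalized so the weights sum to 1.
    raw = [math.exp(-d) for d in distances]
    total = sum(raw)
    weights = [r / total for r in raw]
    # A tag's score is the total weight of the neighbors that carry it.
    return {
        tag: sum(w for w, tags in zip(weights, neighbor_tags) if tag in tags)
        for tag in vocabulary
    }

# Hypothetical toy data: three labeled neighbors of one test image.
distances = [0.2, 0.5, 1.5]
neighbor_tags = [{"sky", "sea"}, {"sky"}, {"car"}]
scores = predict_tag_scores(distances, neighbor_tags, ["sky", "sea", "car"])
```

Closer neighbors contribute more weight, so in this toy case "sky" (carried by the two nearest neighbors) scores highest and "car" lowest.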

A New Baseline for Image Annotation

by Ameesh Makadia, Vladimir Pavlovic, Sanjiv Kumar
Cited by 23 (0 self)
Automatically assigning keywords to images is of great interest as it allows one to index, retrieve, and understand large collections of image data. Many techniques have been proposed for image annotation in the last decade that give reasonable performance on standard datasets. However, most of these works fail to compare their methods with simple baseline techniques to justify the need for complex models and subsequent training. In this work, we introduce a new baseline technique for image annotation that treats annotation as a retrieval problem. The proposed technique utilizes low-level image features and a simple combination of basic distances to find nearest neighbors of a given image. The keywords are then assigned using a greedy label transfer mechanism. The proposed baseline outperforms the current state-of-the-art methods on two standard datasets and one large Web dataset. We believe that such a baseline measure will provide a strong platform to compare and better understand future annotation techniques.
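
One plausible reading of a greedy label transfer step is sketched below. It is a simplified variant and not necessarily the paper's exact procedure: the function `greedy_label_transfer` and its tie-breaking rule are assumptions made for illustration. It takes keywords from the single nearest neighbor first, then fills any remaining slots with the keywords that occur most often among the other neighbors.

```python
from collections import Counter

def greedy_label_transfer(neighbors, n_keywords):
    """neighbors: list of keyword lists, ordered nearest first."""
    if not neighbors or n_keywords <= 0:
        return []
    # Step 1: copy keywords from the nearest neighbor, keeping their order.
    chosen = list(dict.fromkeys(neighbors[0]))[:n_keywords]
    # Step 2: count keywords of the remaining neighbors not yet chosen.
    counts = Counter(
        k for tags in neighbors[1:] for k in tags if k not in chosen
    )
    # Fill the remaining slots by descending count; ties are broken
    # alphabetically (an assumption, only to keep the result deterministic).
    for keyword, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if len(chosen) >= n_keywords:
            break
        chosen.append(keyword)
    return chosen

# Hypothetical neighbors of a query image, nearest first.
neighbors = [["beach", "sea"], ["sea", "sky", "sand"], ["sky", "palm"]]
labels = greedy_label_transfer(neighbors, 4)
```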

Information-theoretic semantic multimedia indexing

by João Magalhães, Stefan Rüger - In ACM Conference on Image and Video Retrieval, 2007
Cited by 20 (10 self)
To solve the problem of indexing collections with diverse text documents, image documents, or documents with both text and images, one needs to develop a model that supports heterogeneous types of documents. In this paper, we show how information theory supplies us with the tools necessary to develop a unique model for text, image, and text/image retrieval. In our approach, for each possible query keyword we estimate a maximum entropy model based on exclusively continuous features that were preprocessed. The unique continuous feature-space of text and visual data is constructed by using a minimum description length criterion to find the optimal feature-space representation (optimal from an information theory point of view). We evaluate our approach in three experiments: only text retrieval, only image retrieval, and text combined with image retrieval.

The NN^k technique for image searching and browsing

by Daniel Heesch, 2005
Cited by 9 (4 self)
Retrieval of images from large image archives based solely on their visual similarity to a query image provides an exciting alternative to conventional text-based search. For content-based retrieval, images are represented in terms of visual features. The question of how to combine these for similarity computation is typically addressed by eliciting relevance feedback from the user on the retrieved images. We argue in this thesis that the prevailing approach to relevance feedback suffers from three significant shortcomings: firstly, it leaves unsolved the question of how to combine features for the first retrieval; secondly, the advantage of automated content-extraction over manual annotation is greatest for large collections, but if the query image is not constrained to come from the indexed collection, content-based retrieval entails imagewise comparisons leading to prohibitive response times; thirdly, users may only have vaguely defined information needs or may change their needs in the course of the interaction. The large majority of relevance feedback techniques are ill-suited for such undirected exploration. We propose a new framework of user interaction that addresses these limitations. It is centred on what we call the NN^k idea. The NN^k of an image are all those images that are most similar to it under some combination of features. They can be viewed as representatives of the possible

Logistic regression of generic codebooks for semantic image retrieval

by João Magalhães, Stefan Rüger - Int'l Conf. on Image and Video Retrieval, 2006
Cited by 5 (5 self)
This paper is about automatically annotating images with keywords in order to be able to retrieve images with text searches. Our approach is to model keywords such as 'mountain' and 'city' in terms of visual features that were extracted from images. In contrast to other algorithms, each specific keyword model considers not only its own training data but also the whole training set by utilizing correlations of visual features to refine its own model. Initially, the algorithm clusters all visual features extracted from the full image set, captures its salient structure (e.g. mixture of clusters or patterns) and represents this as a generic codebook. Then keywords that were associated with images in the training set are encoded as a linear combination of patterns from the generic codebook. We evaluate the validity of our approach in an image retrieval scenario with two distinct large datasets of real-world photos and corresponding manual annotations.

A simple Bayesian framework for content-based image retrieval

by Katherine A. Heller, Zoubin Ghahramani - In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006
Cited by 5 (2 self)
We present a Bayesian framework for content-based image retrieval which models the distribution of color and texture features within sets of related images. Given a user-specified text query (e.g. “penguins”), the system first extracts a set of images, from a labelled corpus, corresponding to that query. The distribution over features of these images is used to compute a Bayesian score for each image in a large unlabelled corpus. Unlabelled images are then ranked using this score and the top images are returned. Although the Bayesian score is based on computing marginal likelihoods, which integrate over model parameters, in the case of sparse binary data the score reduces to a single matrix-vector multiplication and is therefore extremely efficient to compute. We show that our method works surprisingly well despite its simplicity and the fact that no relevance feedback is used. We compare different choices of features, and evaluate our results using human subjects.
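
To illustrate how such a score can collapse into a matrix-vector product for binary features, here is a minimal sketch. It assumes independent Bernoulli features with Beta priors, which the abstract does not spell out; the function name `bayesian_set_scores`, the hyperparameters `alpha` and `beta`, and the toy data are hypothetical. Under that assumption the log score of an image `x` is `c + q · x`, so scoring a whole corpus is one matrix-vector multiplication plus a constant.

```python
import math

def bayesian_set_scores(query_rows, corpus_rows, alpha, beta):
    """Log score of each corpus row given a query set of binary rows."""
    n = len(query_rows)
    dims = len(alpha)
    # Per-feature counts of ones in the query set.
    ones = [sum(row[j] for row in query_rows) for j in range(dims)]
    # Beta posterior parameters after observing the query set.
    a_post = [alpha[j] + ones[j] for j in range(dims)]
    b_post = [beta[j] + n - ones[j] for j in range(dims)]
    # Constant term shared by every scored image.
    c = sum(
        math.log(alpha[j] + beta[j]) - math.log(alpha[j] + beta[j] + n)
        + math.log(b_post[j]) - math.log(beta[j])
        for j in range(dims)
    )
    # Weight vector: one weight per binary feature.
    q = [
        math.log(a_post[j]) - math.log(alpha[j])
        - math.log(b_post[j]) + math.log(beta[j])
        for j in range(dims)
    ]
    # One dot product per corpus image (a matrix-vector product overall).
    return [c + sum(q[j] * row[j] for j in range(dims)) for row in corpus_rows]

# Hypothetical binary features for a query set and two unlabelled images.
query_rows = [[1, 1, 0], [1, 0, 0]]
corpus_rows = [[1, 1, 0], [0, 0, 1]]
scores = bayesian_set_scores(query_rows, corpus_rows, [1, 1, 1], [1, 1, 1])
```

The first corpus image shares features with the query set and receives the higher score, so it would be ranked first.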

A large scale system for searching and browsing images from the world wide web

by Alexei Yavlinsky, Daniel Heesch, Stefan Rüger - In Proceedings International Conference Image and Video Retrieval, 2006
Cited by 4 (1 self)
This paper outlines the technical details of a prototype system for searching and browsing over a million images from the World Wide Web using their visual contents. The system relies on two modalities for accessing images: automated image annotation and NN^k image network browsing. The user supplies the initial query in the form of one or more keywords and is then able to locate the desired images more precisely using a browsing interface.

A semantic vector space for query by image example

by João Magalhães, Simon Overell, Stefan Rüger - In ACM SIGIR Conf. on Research and Development in Information Retrieval, Multimedia Information Retrieval Workshop, 2007
Cited by 4 (1 self)
Content-based image retrieval enables the user to search a database for visually similar images. In these scenarios, the user submits an example that is compared to the images in the database by their low-level characteristics such as colour, texture and shape. While visual similarity is essential for a vast number of applications, there are cases where a user needs to search for semantically similar images. For example, the user might want to find all images depicting bears on a river. This might be quite difficult using only low-level features, but using concept detectors for “bear” and “river” will produce results that are semantically closer to what the user requested. Following this idea, this paper studies a novel paradigm: query by semantic multimedia example. In this setting the user’s query is processed at a semantic level: a vector of concept probabilities is inferred for each image, and a similarity metric computes the distance between the concept vector of the query and the concept vectors of the images in the database. The system is evaluated with a COREL Stock Photo collection.
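
Ranking by distance between concept-probability vectors, as this abstract describes, might look like the sketch below. The abstract does not say which similarity metric is used, so cosine distance is an assumption here, and the concept order, image names, and probabilities are made up for illustration.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0

def rank_by_concepts(query_vec, database):
    """Return image names ordered from semantically closest to farthest."""
    return sorted(
        database, key=lambda name: cosine_distance(query_vec, database[name])
    )

# Hypothetical concept probabilities in the order (bear, river, city).
query_vec = [0.9, 0.8, 0.1]
database = {
    "img_a": [0.8, 0.7, 0.0],
    "img_b": [0.1, 0.9, 0.2],
    "img_c": [0.0, 0.1, 0.9],
}
ranking = rank_by_concepts(query_vec, database)
```

In a real system the concept vectors would come from trained concept detectors; here they are fixed by hand so the ranking step can be read in isolation.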

Efficient re-indexing of automatically annotated image collections using keyword combination

by Alexei Yavlinsky, Stefan Rüger - In Proceedings of SPIE Volume 6506. Multimedia Content Access: Algorithms and Systems, 2007
Cited by 3 (1 self)
This report presents a framework for improving the image index obtained by automated image annotation. Within this framework, the technique of keyword combination is used for fast image re-indexing based on initial automated annotations. It aims to tackle the challenges of limited vocabulary size and low annotation accuracies resulting from differences between training and test collections. It is useful for situations when these two problems are not anticipated at the time of annotation. We show that, based on example images from the automatically annotated collection, it is often possible to find multiple-keyword queries that can retrieve new image concepts which are not present in the training vocabulary, and improve retrieval results of those that are already present. We demonstrate that this can be done at a very small computational cost and at an acceptable performance tradeoff, compared to traditional annotation models. We present a simple, robust, and computationally efficient approach for finding an appropriate set of keywords for a given target concept. We report results on TRECVID 2005, Getty Image Archive, and Web image datasets, the last two of which were specifically constructed to support realistic retrieval scenarios.

A linear-algebraic technique with an application in semantic image retrieval

by Jonathon S. Hare, Paul H. Lewis, Peter G. B. Enser, Christine J. S - In Proceedings of the International Conference on Image and Video Retrieval, 2006
Cited by 3 (2 self)
This paper presents a novel technique for learning the underlying structure that links visual observations with semantics. The technique, inspired by a text-retrieval technique known as cross-language latent semantic indexing, uses linear algebra to learn the semantic structure linking image features and keywords from a training set of annotated images. This structure can then be applied to unannotated images, thus providing the ability to search the unannotated images based on keyword. This factorisation approach is shown to perform well, even when using only simple global image features.