Image retrieval: ideas, influences, and trends of the new age
- ACM Computing Surveys, 2008
Cited by 157 (3 self)
We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid foundation to such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger association of weakly related fields. In this article, we survey almost 300 key theoretical and empirical contributions in the current decade related to image retrieval and automatic image annotation, and in the process discuss the spawning of related subfields. We also discuss significant challenges involved in the adaptation of existing image retrieval techniques to build systems that can be useful in the real world. In retrospect of what has been achieved so far, we also conjecture what the future may hold for image retrieval research.
On Image Auto-Annotation with Latent Space Models
- MM'03, 2003
Cited by 60 (8 self)
Image auto-annotation, i.e., the association of words to whole images, has attracted considerable attention. In particular, unsupervised, probabilistic latent variable models of text and image features have shown encouraging results, but their performance with respect to other approaches remains unknown. In this paper, we apply and compare two simple latent space models commonly used in text analysis, namely Latent Semantic Analysis (LSA) and Probabilistic LSA (PLSA). Annotation strategies for each model are discussed. Remarkably, we found that, on an 8000-image dataset, a classic LSA model defined on keywords and a very basic image representation performed as well as much more complex, state-of-the-art methods. Furthermore, nonprobabilistic methods (LSA and direct image matching) outperformed PLSA on the same dataset.
Learning an image manifold for retrieval
- In Proc. ACM Multimedia, 2004
Cited by 34 (3 self)
We consider the problem of learning a mapping function from low-level feature space to high-level semantic space. Under the assumption that the data lie on a submanifold embedded in a high-dimensional Euclidean space, we propose a relevance feedback scheme which is naturally conducted only on the image manifold in question rather than the total ambient space. While images are typically represented by feature vectors in R^n, the natural distance is often different from the distance induced by the ambient space R^n. The geodesic distances on the manifold are used to measure the similarities between images. However, when the number of data points is small, it is hard to discover the intrinsic manifold structure. Based on user interactions in a relevance-feedback-driven query-by-example system, the intrinsic similarities between images can be accurately estimated. We then develop an algorithmic framework to approximate the optimal mapping function by a Radial Basis Function (RBF) neural network. The semantics of a new image can be inferred by the RBF neural network. Experimental results show that our approach is effective in improving the performance of content-based image retrieval systems.
Content-based image retrieval: approaches and trends of the new age
- In Proceedings ACM International Workshop on Multimedia Information Retrieval, 2005
Cited by 33 (2 self)
The last decade has witnessed great interest in research on content-based image retrieval. This has paved the way for a large number of new techniques and systems, and a growing interest in associated fields to support such systems. Likewise, digital imagery has expanded its horizon in many directions, resulting in an explosion in the volume of image data required to be organized. In this paper, we discuss some of the key contributions in the current decade related to image retrieval and automated image annotation, spanning 120 references. We also discuss some of the key challenges involved in the adaptation of existing image retrieval techniques to build useful systems that can handle real-world data. We conclude with a study on the trends in volume and impact of publications in the field with respect to venues/journals and sub-topics.
Effective automatic image annotation via a coherent language model and active learning
- In Proceedings of the 12th Annual ACM International Conference on Multimedia (MM’04), 2004
Cited by 25 (0 self)
Image annotations allow users to access a large image database with textual queries. There have been several studies on automatic image annotation utilizing machine learning techniques, which automatically learn statistical models from annotated images and apply them to generate annotations for unseen images. One common problem shared by most previous learning approaches for automatic image annotation is that each annotated word is predicted for an image independently from other annotated words. In this paper, we propose a coherent language model for automatic image annotation that takes into account the word-to-word correlation by estimating a coherent language model for an image. This new approach has two important advantages: 1) it is able to automatically determine the annotation length to improve the accuracy of retrieval results, and 2) it can be used with active learning to significantly reduce the required number of annotated image examples. Empirical studies with the Corel dataset are presented to show the effectiveness of the coherent language model for automatic image annotation.
Statistical learning for effective visual information retrieval
- In IEEE International Conference on Image Processing, 2003
Cited by 22 (1 self)
For effective retrieval of visual information, statistical learning plays a pivotal role. Statistical learning in such a context faces at least two major mathematical challenges: scarcity of training data, and imbalance of training classes. We present these challenges and outline our methods for addressing them: active learning, recursive subspace co-training, adaptive dimensionality reduction, class-boundary alignment, and quasi-bagging.
Modeling Semantic Aspects for Cross-Media Image Indexing
- 2007
Cited by 20 (5 self)
To go beyond the query-by-example paradigm in image retrieval, there is a need for semantic indexing of large image collections for intuitive text-based image search. Different models have been proposed to learn the dependencies between the visual content of an image set and the associated text captions, which then allows the automatic creation of semantic indices for unannotated images. The task, however, remains unsolved. In this paper, we present three alternatives to learn a Probabilistic Latent Semantic Analysis (PLSA) model for annotated images, and evaluate their respective performance for automatic image indexing. Under the PLSA assumptions, an image is modeled as a mixture of latent aspects that generates both image features and text captions, and we investigate three ways to learn the mixture of aspects. We also propose a more discriminative image representation than the traditional Blob histogram, concatenating quantized local color information and quantized local texture descriptors. The first learning procedure of a PLSA model for annotated images is a standard EM algorithm, which implicitly assumes that the visual and the textual modalities can be treated equivalently. The other two models are based on asymmetric PLSA learning, which allows the definition of the latent space to be constrained to either the visual or the textual modality. We demonstrate that the textual modality is more appropriate to learn a semantically meaningful latent space, which translates into improved annotation performance. A comparison of our learning algorithms with respect to recent methods
Incremental Semi-Supervised Subspace Learning for Image Retrieval
- MM'04, 2004
Cited by 19 (6 self)
Subspace learning techniques are widespread in pattern recognition research. They include Principal Component Analysis (PCA), Locality Preserving Projection (LPP), etc. These techniques are generally unsupervised, which allows them to model data in the absence of labels or categories. In a relevance-feedback-driven image retrieval system, the user-provided information can be used to better describe the intrinsic semantic relationships between images. In this paper, we propose a semi-supervised subspace learning algorithm which incrementally learns an adaptive subspace by preserving the semantic structure of the image space, based on user interactions in a relevance-feedback-driven query-by-example system. Our algorithm is capable of accumulating knowledge from users, which could result in new feature representations for images in the database so that the system’s future retrieval performance can be enhanced. Experiments on a large collection of images have shown the effectiveness and efficiency of our proposed algorithm.
Adaptive feature-space conformal transformation for imbalanced-data learning
- In The Twentieth International Conference on Machine Learning (ICML-2003), Washington DC, 2003
Cited by 16 (7 self)
When the training instances of the target class are heavily outnumbered by non-target training instances, SVMs can be ineffective in determining the class boundary. To remedy this problem, we propose an adaptive conformal transformation (ACT) algorithm. ACT considers feature-space distance and the class-imbalance ratio when it performs conformal transformation on a kernel function. Experimental results on UCI and real-world datasets show ACT to be effective in improving class prediction accuracy.
Correlated label propagation with application to multi-label learning
- In CVPR ’06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006
Cited by 14 (0 self)
Many computer vision applications, such as scene analysis and medical image interpretation, are ill-suited for traditional classification where each image can only be associated with a single class. This has stimulated recent work in multi-label learning where a given image can be tagged with multiple class labels. A serious problem with existing approaches is that they are unable to exploit correlations between class labels. This paper presents a novel framework for multi-label learning termed Correlated Label Propagation (CLP) that explicitly models interactions between labels in an efficient manner. As in standard label propagation, labels attached to training data points are propagated to test data points; however, unlike standard algorithms that treat each label independently, CLP simultaneously co-propagates multiple labels. Existing work eschews such an approach since naive algorithms for label co-propagation are intractable. We present an algorithm based on properties of submodular functions that efficiently finds an optimal solution. Our experiments demonstrate that CLP leads to significant gains in precision/recall against standard techniques on two real-world computer vision tasks involving several hundred labels.

