Learning distance metrics with contextual constraints for image retrieval
- Proc. Computer Vision and Pattern Recognition, 2006
Cited by 14 (5 self)
Relevant Component Analysis (RCA) has been proposed for learning distance metrics with contextual constraints for image retrieval. However, RCA has two important disadvantages: it does not exploit negative constraints, which can also be informative, and it cannot capture complex nonlinear relationships between data instances with the contextual information. In this paper, we propose two algorithms to overcome these disadvantages: Discriminative Component Analysis (DCA) and Kernel DCA. Compared with other, more complicated methods for distance metric learning, our algorithms are simple to understand and easy to solve. We evaluate our algorithms on image retrieval; the experimental results show that they are effective and promising for learning good-quality distance metrics for image retrieval.
Learning a Maximum Margin Subspace for Image Retrieval
Cited by 6 (3 self)
One of the fundamental problems in Content-Based Image Retrieval (CBIR) has been the gap between low-level visual features and high-level semantic concepts. To narrow this gap, relevance feedback is introduced into image retrieval. With the user-provided information, a classifier can be learned to distinguish between positive and negative examples. However, in real-world applications, the number of feedback examples provided by the user is usually too small compared to the dimensionality of the image space. To cope with the high dimensionality, we propose a novel semisupervised method for dimensionality reduction called Maximum Margin Projection (MMP). MMP aims at maximizing the margin between positive and negative examples in each local neighborhood. Unlike traditional dimensionality reduction algorithms such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which effectively see only the global Euclidean structure, MMP is designed to discover the local manifold structure. Therefore, MMP is likely to be more suitable for image retrieval, where nearest neighbor search is usually involved. After the images are projected into a lower-dimensional subspace, the relevant images get closer to the query image; thus, the retrieval performance can be enhanced. The experimental results on the Corel image database demonstrate the effectiveness of our proposed algorithm. Index Terms: Multimedia information systems, image retrieval, relevance feedback, dimensionality reduction.
Semi-Supervised SVM Batch Mode Active Learning for Image Retrieval
Cited by 6 (1 self)
Active learning has been shown to be a key technique for improving content-based image retrieval (CBIR) performance. Among various methods, support vector machine (SVM) active learning is popular for its application to relevance feedback in CBIR. However, regular SVM active learning has two main drawbacks when used for relevance feedback. First, SVM often performs poorly when trained on a small number of labeled examples, which is the case in relevance feedback. Second, SVM active learning usually does not take into account the redundancy among examples, and therefore could select multiple examples in relevance feedback that are similar (or even identical) to each other. In this paper, we propose a novel scheme that exploits both semi-supervised kernel learning and batch mode active learning for relevance feedback in CBIR. In particular, a kernel function is first learned from a mixture of labeled and unlabeled examples. The kernel is then used to identify informative and diverse examples for active learning via a min-max framework. An empirical study with relevance feedback of CBIR showed that the proposed scheme is significantly more effective than other state-of-the-art approaches.
Laplacian Optimal Design for Image Retrieval
Cited by 4 (2 self)
Relevance feedback is a powerful technique to enhance Content-Based Image Retrieval (CBIR) performance. It solicits the user’s relevance judgments on the images returned by the CBIR system. The user’s labeling is then used to learn a classifier to distinguish between relevant and irrelevant images. However, the top returned images may not be the most informative ones. The challenge is thus to determine which unlabeled images would be the most informative (i.e., improve the classifier the most) if they were labeled and used as training samples. In this paper, we propose a novel active learning algorithm, called Laplacian Optimal Design (LOD), for relevance feedback image retrieval. Our algorithm is based on a regression model which minimizes the least square error on the measured (i.e., labeled) images and simultaneously preserves the local geometrical structure of the image space. Specifically, we assume that if two images are sufficiently close to each other, then their measurements (i.e., labels) are close as well. By constructing a nearest neighbor graph, the geometrical structure of the image space can be described by the graph Laplacian. We discuss how results from the field of optimal experimental design may be used to guide our selection of a subset of images that gives us the most information. Experimental results on the Corel database suggest that the proposed approach achieves higher precision in relevance feedback image retrieval. Categories and Subject Descriptors: H.3.3 [Information storage and retrieval]: Information search and retrieval - Relevance feedback; G.3 [Mathematics
Overview of the ImageCLEF 2007 object retrieval task
- In Working Notes of the 2007 CLEF Workshop, 2007
Cited by 3 (3 self)
We describe the object retrieval task of ImageCLEF 2007, give an overview of the methods of the participating groups, and present and discuss the results. The task used the widely used PASCAL object recognition data to train object recognition methods, and the IAPR TC-12 benchmark dataset as the collection from which images of objects from ten classes (bicycles, buses, cars, motorbikes, cats, cows, dogs, horses, sheep, and persons) had to be retrieved. Seven international groups participated using a wide variety of methods. The results of the evaluation show that the task was very challenging and that different methods for relevance assessment can have a strong influence on the results of an evaluation.
An empirical study on large-scale content-based image retrieval
- In IEEE International Conference on Multimedia & Expo (ICME2007), 2007
Cited by 3 (0 self)
One key challenge in content-based image retrieval (CBIR) is to develop a fast solution for indexing high-dimensional image contents, which is crucial to building large-scale CBIR systems. In this paper, we propose a scalable content-based image retrieval scheme using locality-sensitive hashing (LSH), and conduct extensive evaluations on a large image testbed of a half million images. To the best of our knowledge, few comprehensive studies have evaluated CBIR at the scale of a half million images. Our empirical results show that our proposed solution is able to scale to hundreds of thousands of images, which is promising for building web-scale CBIR systems.
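As a rough illustration of the kind of hashing-based indexing this abstract describes, the sketch below buckets feature vectors with random-hyperplane LSH. The abstract does not say which LSH family or parameters the authors use, so the hash family and every name here (`make_hyperplanes`, `hash_vector`, `build_index`, `candidates`) are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """Return one bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def build_index(vectors, hyperplanes):
    """Group vector ids by hash key so that similar vectors tend to share a bucket."""
    index = defaultdict(list)
    for vec_id, vec in enumerate(vectors):
        index[hash_vector(vec, hyperplanes)].append(vec_id)
    return index


def candidates(index, hyperplanes, query_vec):
    """Return ids in the query's bucket; only these need an exact distance check."""
    return index.get(hash_vector(query_vec, hyperplanes), [])
```

In practice several independent hash tables are usually combined to raise recall; a single table is shown here to keep the sketch short.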
A Multimodal and Multilevel Ranking Scheme for Large-Scale Video Retrieval
Cited by 2 (0 self)
A critical issue in large-scale multimedia retrieval is how to develop an effective framework for ranking the search results. This problem is particularly challenging for content-based video retrieval due to issues such as short text queries, insufficient sample learning, fusion of multimodal contents, and large-scale learning with huge media data. In this paper, we propose a novel multimodal and multilevel (MMML) ranking framework to address the ranking problem of content-based video retrieval. We represent the video retrieval task by graphs and suggest a graph-based semi-supervised ranking (SSR) scheme, which can learn effectively from small samples and integrate multimodal resources for ranking smoothly. To make the semi-supervised ranking solution practical for large-scale retrieval tasks, we propose a multilevel ranking framework that unifies several different ranking approaches in a cascade fashion. We have conducted empirical evaluations of our proposed solution for automatic search tasks on the benchmark testbed of TRECVID2005. The promising empirical results show that our ranking solutions are effective and very competitive with the state-of-the-art solutions in the TRECVID evaluations. Index Terms: Content-based video retrieval, graph representation, multilevel ranking, multimodal fusion, multimedia retrieval, semi-supervised ranking, support vector machines.
iScope: personalized multi-modality image search for mobile devices
- In Proceedings of Mobisys ’09, 2009
Cited by 2 (0 self)
Mobile devices are becoming a primary medium for personal information gathering, management, and sharing. Managing personal image data on mobile platforms is a difficult problem due to large data set size, content diversity, heterogeneous individual usage patterns, and resource constraints. This article presents a user-centric system, called iScope, for personal image management and sharing on mobile devices. iScope uses multi-modality clustering of both content and context information for efficient image management and search, and online learning techniques for predicting images of interest. It also supports distributed content-based search among networked devices while maintaining the same intuitive interface, enabling efficient information sharing among people. We have implemented iScope and conducted in-field experiments using networked Nokia N810 portable Internet tablets. Energy efficiency was a primary focus during the design and implementation of the iScope search algorithms. Experimental results indicate that iScope improves search time and search energy by 4.1× and 3.8× on average, relative to browsing.
Cross-Language and Cross-Media Image Retrieval: An Empirical Study at ImageCLEF 2007
- In Working Notes of the 2007 CLEF Workshop, 2007
Cited by 1 (0 self)
This paper summarizes our empirical study of cross-language and cross-media image retrieval at the CLEF image retrieval track (ImageCLEF2007). This year, we participated in the ImageCLEF photo retrieval task, in which the goal is to search natural photos using queries that contain both textual and visual information. We evaluate our solutions for the image retrieval tasks in three aspects. First, we study the application of language models and smoothing strategies for text-based image retrieval, particularly addressing the short text query issue. Second, we study the cross-media image retrieval problem using a simple combination strategy. Third, we study the cross-language image retrieval problem between English and Chinese. Finally, we summarize our empirical experiences and indicate some future directions.
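To make the "language models and smoothing strategies" step more concrete, here is a minimal query-likelihood sketch with Jelinek-Mercer smoothing. The abstract does not state which smoothing strategies were evaluated, so the choice of Jelinek-Mercer, the mixing weight `lam`, and the function name `jm_query_log_likelihood` are all assumptions for illustration.

```python
import math
from collections import Counter


def jm_query_log_likelihood(query_terms, doc_terms, collection_counts,
                            collection_len, lam=0.5):
    """Log-probability of the query under a smoothed unigram model of the document.

    p(w | d) = (1 - lam) * tf(w, d) / |d| + lam * cf(w) / |C|
    The collection term keeps the score finite when a query word is missing
    from a short caption, which matters when queries are only a few words long.
    collection_counts is expected to be a Counter over the whole collection.
    """
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Term unseen in the entire collection: the query cannot be generated.
            return float("-inf")
        score += math.log(p)
    return score
```

Documents (for example, image captions) would then be ranked by this score in descending order for each query.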
Laplacian Regularized D-Optimal Design for Active Learning and Its Application to Image Retrieval
Cited by 1 (0 self)
In many cases of interest in computer vision and pattern recognition, one is confronted with very large data sets. Usually, labels are expensive, and the challenge is thus to determine which unlabeled samples would be the most informative (i.e., improve the classifier the most) if they were labeled and used as training samples. In particular, we consider the problem of active learning of a regression model in the context of experimental design. Classical optimal experimental design approaches are based on least square errors over the measured samples only; they fail to take into account the unmeasured samples. In this paper, we propose a novel active learning algorithm which operates over graphs. Our algorithm is based on a graph Laplacian regularized regression model which simultaneously minimizes the least square error on the measured samples and preserves the local geometrical structure of the data space. By constructing a nearest neighbor graph, the geometrical structure of the data space can be described by the graph Laplacian. We discuss how results from the field of optimal experimental design may be used to guide our selection of a subset of data points that gives us the most information. Experiments demonstrate its superior performance in comparison with conventional algorithms. Index Terms: Active learning, experimental design, image retrieval, regularization.
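The nearest neighbor graph and its Laplacian, as mentioned in the abstract above, can be sketched as follows. This is only an illustration of the standard construction L = D - W; the 0/1 edge weights, the symmetrization rule, and the names `knn_graph_laplacian` and `smoothness_penalty` are assumptions, since the abstract does not give these details.

```python
import math


def knn_graph_laplacian(points, k):
    """Build the unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    Assumption: an edge has weight 1 if either endpoint lists the other
    among its k nearest neighbors (Euclidean distance), and 0 otherwise.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        by_distance = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in by_distance[:k]:
            weights[i][j] = 1.0
            weights[j][i] = 1.0
    laplacian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                laplacian[i][j] = sum(weights[i])  # degree of node i
            else:
                laplacian[i][j] = -weights[i][j]
    return laplacian


def smoothness_penalty(laplacian, values):
    """Compute f^T L f, which equals 0.5 * sum_ij W_ij * (f_i - f_j)^2."""
    n = len(values)
    return sum(
        values[i] * laplacian[i][j] * values[j]
        for i in range(n) for j in range(n)
    )
```

A small penalty means that neighboring points receive similar values, which is the "preserve the local geometrical structure" idea expressed as a regularization term.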

