Results 1 - 10 of 96
Descriptive visual words and visual phrases for image applications
- Proc. ACM Multimedia
, 2009
Cited by 37 (10 self)
The Bag-of-visual Words (BoW) image representation has been applied to various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to the words in texts. However, extensive experiments show that the commonly used visual words are not as expressive as text words, which hinders their effectiveness in various applications. In this paper, Descriptive Visual Words (DVWs) and Descriptive Visual Phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to the frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a novel descriptive visual element set can be composed by the visual words and their combinations which
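The notion of visual phrases as frequently co-occurring visual word pairs can be illustrated with a minimal sketch. It assumes each image is already quantized into a list of integer visual word IDs and counts co-occurrence at the image level; the function name `frequent_word_pairs`, the `min_count` threshold, and the toy data are hypothetical, and the selection procedure in the paper itself may differ.

```python
from collections import Counter
from itertools import combinations


def frequent_word_pairs(images, min_count=2):
    """Return unordered visual word pairs found together in >= min_count images."""
    pair_counts = Counter()
    for words in images:
        # Use the distinct word IDs so each pair is counted once per image.
        for pair in combinations(sorted(set(words)), 2):
            pair_counts[pair] += 1
    # Keep only the pairs that co-occur often enough (assumed threshold).
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Toy data: each image is a list of hypothetical visual word IDs.
toy_images = [[3, 7, 7, 12], [3, 7, 20], [7, 12, 20], [3, 12]]
phrases = frequent_word_pairs(toy_images, min_count=2)
print(phrases)
```

On this toy input the pairs (3, 7), (3, 12), (7, 12) and (7, 20) each appear in two images and are kept, while (3, 20) and (12, 20) appear only once and are dropped.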
Unsupervised joint object discovery and segmentation in internet images
- In CVPR
, 2013
Cited by 33 (2 self)
Learning to Re-Rank: Query-Dependent Image Re-Ranking Using Click Data
Cited by 24 (2 self)
Our objective is to improve the performance of keyword-based image search engines by re-ranking their baseline results. To this end, we address three limitations of existing search engines in this paper. First, there is no straightforward, fully automated way of going from textual queries to visual features. Image search engines are therefore forced to rely on static and textual features alone for ranking. Visual features are used only for secondary tasks such as finding similar images. Second, image rankers are trained on query-image pairs labeled with relevance judgments determined by human experts. Such labels are well known to be noisy due to various factors including ambiguous queries, unknown user intent and subjectivity in human judgments. This leads to learning a sub-optimal ranker. Finally, a static ranker is typically built to handle disparate user queries. The ranker is therefore unable to adapt its parameters to suit the query at hand, which again leads to sub-optimal results. We demonstrate that all of these problems can be mitigated by employing a re-ranking algorithm that leverages aggregate user click data. We hypothesize that images clicked in response to a query are mostly relevant to the query. We therefore re-rank the original search results so as to promote images that are likely to be clicked to the top of the ranked list. Our re-ranking algorithm employs Gaussian Process regression to predict the normalized click count for each image, and combines it with the original ranking score. Our approach is shown to significantly boost the performance of the Bing image search engine on a wide range of queries, especially the tail queries.
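The final combination step described above can be sketched as follows. This is only an illustration: the Gaussian Process regression is not reproduced, the predicted normalized click counts are assumed to be given, and the linear blend with a `weight` parameter is an assumption, since the abstract does not say how the two scores are combined. The name `rerank` and the toy values are hypothetical.

```python
def rerank(results, predicted_clicks, weight=0.5):
    """Order image IDs by a blend of original score and predicted click count."""

    def blended(item):
        image_id, score = item
        # Images without a click prediction fall back to 0.0.
        clicks = predicted_clicks.get(image_id, 0.0)
        # Assumed linear combination; not taken from the paper.
        return (1.0 - weight) * score + weight * clicks

    ranked = sorted(results, key=blended, reverse=True)
    return [image_id for image_id, _ in ranked]


# Toy baseline ranking: (image ID, original ranking score).
baseline = [("img_a", 0.9), ("img_b", 0.8), ("img_c", 0.4)]
# Toy predicted normalized click counts, assumed to come from a regressor.
clicks = {"img_a": 0.1, "img_b": 0.6, "img_c": 0.9}

reranked = rerank(baseline, clicks, weight=0.5)
print(reranked)
```

With `weight=0.0` the function reproduces the baseline order, which makes it easy to compare against the original ranker.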
The Bicycle Model
- in the Northeast Asian Economic Cooperation”, An Unpublished Paper, Korea Development Institute
, 2000
Cited by 23 (5 self)
In this paper, we investigate a time-sensitive image retrieval problem, in which given a query keyword, a query time point, and optionally user information, we retrieve the most relevant and temporally suitable images from the database. Inspired by recently emerging interest in query dynamics in information retrieval research, our time-sensitive image retrieval algorithm can infer users' implicit search intent better and provide more engaging and diverse search results according to temporal trends of Web user photos. We model observed image streams as instances of multivariate point processes represented by several different descriptors, and develop a regularized multi-task regression framework that automatically selects and learns stochastic parametric models to solve the relations between image occurrence probabilities and various temporal factors that influence them. Using Flickr datasets of more than seven million images of 30 topics, our experimental results show that the proposed algorithm is more successful in time-sensitive image retrieval than other candidate methods, including ranking SVM, a PageRank-based image ranking, and a generative temporal topic model.
Query Specific Fusion for Image Retrieval
Cited by 16 (2 self)
Recent image retrieval algorithms based on local features indexed by a vocabulary tree and holistic features indexed by compact hashing codes both demonstrate excellent scalability. However, their retrieval precision may vary dramatically among queries. This motivates us to investigate how to fuse the ordered retrieval sets given by multiple retrieval methods, to further enhance the retrieval precision. Thus, we propose a graph-based query specific fusion approach where multiple retrieval sets are merged and reranked by conducting a link analysis on a fused graph. The retrieval quality of an individual method is measured by the consistency of the top candidates' nearest neighborhoods. Hence, the proposed method is capable of adaptively integrating the strengths of the retrieval methods using local or holistic features for different queries without any supervision. Extensive experiments demonstrate competitive performance on 4 public datasets, i.e., the UKbench, Corel-5K, Holidays and San Francisco Landmarks datasets.
Brain State Decoding for Rapid Image Retrieval
- ACM Multimedia
, 2009
Cited by 15 (7 self)
Human visual perception is able to recognize a wide range of targets under challenging conditions, but has limited throughput. Machine vision and automatic content analytics can process images at a high speed, but suffer from inadequate recognition accuracy for general target classes. In this paper, we propose a new paradigm to explore and combine the strengths of both systems. A single-trial EEG-based brain machine interface (BCI) subsystem is used to detect objects of interest of arbitrary classes from an initial subset of images. The EEG detection outcomes are used as input to a graph-based pattern mining subsystem to identify, refine, and propagate the labels to retrieve relevant images from a much larger pool. The combined strategy is unique in its generality, robustness, and high throughput. It has great potential for advancing the state of the art in media retrieval applications. We have evaluated and demonstrated significant performance gains of the proposed system with multiple and diverse image classes over several data sets, including those from the Internet (Caltech 101) and remote sensing images. In this paper, we also present insights learned from the experiments and discuss future research directions.
Unsupervised Multi-Feature Tag Relevance Learning for Social Image Retrieval
Cited by 13 (5 self)
Interpreting the relevance of a user-contributed tag with respect to the visual content of an image is an emerging problem in social image retrieval. In the literature this problem is tackled by analyzing the correlation between tags and images represented by specific visual features. Unfortunately, no single feature represents the visual content completely, e.g., global features are suitable for capturing the gist of scenes, while local features are better for depicting objects. To solve the problem of learning tag relevance given multiple features, we introduce in this paper two simple and effective methods: one is based on the classical Borda Count and the other is a method we name UniformTagger. Both methods combine the output of many tag relevance learners driven by diverse features in an unsupervised, rather than supervised, manner. Experiments on 3.5 million social-tagged images and two test sets verify our proposal. Using learned tag relevance as updated tag frequency for social image retrieval, both Borda Count and UniformTagger outperform retrieval without tag relevance learning and retrieval with single-feature tag relevance learning. Moreover, the two unsupervised methods are comparable to a state-of-the-art supervised alternative, but without the need for any training data.
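The classical Borda Count mentioned above can be sketched in a few lines. The sketch assumes each feature-specific tag relevance learner outputs a complete ranking of the same tags for one image, best first; the function name `borda_count`, the alphabetical tie-break, and the toy tags are hypothetical, and UniformTagger is not covered.

```python
from collections import defaultdict


def borda_count(rankings):
    """Fuse ranked tag lists: position p in an n-item list earns n - 1 - p points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, tag in enumerate(ranking):
            scores[tag] += n - 1 - position
    # Highest total first; ties are broken alphabetically (an assumption).
    return sorted(scores, key=lambda tag: (-scores[tag], tag))


# Toy output of three hypothetical learners, each driven by a different feature.
rankings = [
    ["beach", "sea", "dog"],
    ["sea", "beach", "dog"],
    ["sea", "dog", "beach"],
]
fused = borda_count(rankings)
print(fused)
```

Here "sea" collects 5 points, "beach" 3 and "dog" 1, so the fused order is sea, beach, dog. No training data is involved, which matches the unsupervised setting described in the abstract.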
Robust and Scalable Graph-Based Semisupervised Learning
, 2012
Cited by 12 (7 self)
Graph-based semisupervised learning (GSSL) provides a promising paradigm for modeling the manifold structures that may exist in massive data sources in high-dimensional spaces. It has been shown to be effective in propagating a limited amount of initial labels to a large amount of unlabeled data, matching the needs of many emerging applications such as image annotation and information retrieval. In this paper, we provide reviews of several classical GSSL methods and a few promising methods in handling challenging issues often encountered in web-scale applications. First, to successfully incorporate the contaminated noisy labels associated with web data, label diagnosis and tuning techniques applied to GSSL are surveyed. Second, to support scalability to the gigantic scale (millions or billions of samples), recent solutions based on anchor graphs are reviewed. To help researchers pursue new ideas in this area, we also summarize a few popular data sets and software tools publicly available. Important open issues are discussed at the end to stimulate future research.
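To make the idea of propagating a few initial labels over a graph concrete, here is a minimal sketch of one classical scheme: unlabeled nodes repeatedly take the weighted average of their neighbors' scores while the seed labels stay fixed. It is not the anchor graph or label diagnosis techniques that the survey covers; the function name `propagate_labels`, the fixed iteration count, and the toy graph are assumptions for illustration.

```python
def propagate_labels(weights, seed_labels, iterations=50):
    """Average neighbor scores on a weighted graph, keeping seed labels clamped."""
    scores = {node: float(seed_labels.get(node, 0.0)) for node in weights}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in weights.items():
            if node in seed_labels:
                # Known labels are never overwritten.
                updated[node] = float(seed_labels[node])
                continue
            total = sum(neighbors.values())
            if total > 0:
                weighted = sum(w * scores[other] for other, w in neighbors.items())
                updated[node] = weighted / total
            else:
                # Isolated nodes receive no information.
                updated[node] = 0.0
        scores = updated
    return scores


# Toy chain graph a - b - c - d with unit edge weights.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
# Seed labels: +1.0 for node a, -1.0 for node d.
scores = propagate_labels(graph, {"a": 1.0, "d": -1.0})
print(scores)
```

On this chain the scores of the unlabeled nodes settle near +1/3 for b and -1/3 for c, so each one takes the sign of the closer seed.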
Annotation Propagation in Large Image Databases via Dense Image Correspondence
Label Diagnosis through Self Tuning for Web Image Search
- IEEE Computer Vision and Pattern Recognition (CVPR)
, 2009
Cited by 11 (5 self)
Semi-supervised learning (SSL) relies on partial supervision information for prediction, where only a small set of samples are associated with labels. Performance of SSL is significantly degraded if the given labels are not reliable. Such problems arise in realistic applications such as web image search using noisy textual tags. This paper proposes a novel and efficient graph-based SSL method with the unique capacity of pruning contradictory labels and inferring new labels through a bidirectional and alternating optimization process. The objective is to automatically identify the most suitable samples for manipulation, labeling or unlabeling, and meanwhile estimate a smooth classification function over a weighted graph. Different from other graph-based SSL approaches, the proposed method employs a bivariate objective function and iteratively modifies label variables on both labeled and unlabeled samples. Starting from such an SSL setting, we present a relearning framework to improve the performance of the base learner, particularly for the application of web image search. Besides the toy demonstration on artificial data, we evaluated the proposed method on Flickr image search with unreliable textual labels. Experimental results confirm the significant improvements of the method over the baseline text-based search engine and the state-of-the-art SSL methods.