Results 1 - 7 of 7
Supervised learning of semantic classes for image annotation and retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007
Cited by 74 (10 self)
Abstract: A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, minimum probability of error annotation and retrieval are feasible with algorithms that 1) are conceptually simple, 2) are computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density is estimated for each image, and the mixtures associated with all images annotated with a common semantic label are pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning. Index Terms: Content-based image retrieval, semantic image annotation and retrieval, weakly supervised learning, multiple instance learning, Gaussian mixtures, expectation-maximization, image segmentation, object recognition.
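The pooling step described in this abstract can be pictured with a small sketch. It is not the authors' algorithm: it only shows the naive pool, in which the per-image mixtures of one semantic class are concatenated and their weights rescaled so that the result is again a valid mixture. The hierarchical expectation-maximization step that the abstract credits for efficiency is not shown. The function name, the 1-D Gaussian components, and the toy numbers are all assumptions made for illustration.

```python
def pool_image_mixtures(image_mixtures):
    """Naively pool per-image mixtures into one class-level mixture.

    Each image mixture is a list of (weight, mean, variance) tuples whose
    weights sum to 1. The pooled mixture is the average of the image
    densities: every component is kept and its weight is divided by the
    number of images. (Hypothetical helper, 1-D components for brevity.)
    """
    n_images = len(image_mixtures)
    pooled = []
    for mixture in image_mixtures:
        for weight, mean, variance in mixture:
            pooled.append((weight / n_images, mean, variance))
    return pooled


# Two toy images that share the same semantic label.
image_a = [(0.5, 0.2, 0.01), (0.5, 0.8, 0.02)]
image_b = [(0.25, 0.3, 0.01), (0.75, 0.7, 0.03)]

class_mixture = pool_image_mixtures([image_a, image_b])
total_weight = sum(weight for weight, _, _ in class_mixture)
```

The pooled mixture grows by one component per image component, which is presumably why a reduction step is needed in practice; this sketch makes no attempt at that reduction.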
Formulating semantic image annotation as a supervised learning problem
IEEE CVPR, 2005
Cited by 42 (5 self)
We introduce a new method to automatically annotate and retrieve images using a vocabulary of image semantics. The novel contributions include a discriminant formulation of the problem, a multiple instance learning solution that enables the estimation of concept probability distributions without prior image segmentation, and a hierarchical description of the density of each image class that enables very efficient training. Compared to current methods of image annotation and retrieval, the proposed method has significantly smaller time complexity and better recognition performance. Specifically, its recognition complexity is O(C × R), where C is the number of classes (or image annotations) and R is the number of image regions, while the best results in the literature have complexity O(T × R), where T is the number of training images. Since the number of classes grows substantially more slowly than the number of training images, the proposed method scales better during training and processes test images faster. This is illustrated through comparisons in terms of complexity, time, and recognition performance with current state-of-the-art methods.
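The O(C × R) recognition cost quoted above can be made concrete with a minimal sketch. It assumes one 1-D Gaussian per class, which is a simplification chosen here for brevity (the abstract describes densities per class without giving their form), and every name and number below is hypothetical. The point is only that each of the R region features is scored once against each of the C class models.

```python
import math


def gaussian_log_pdf(x, mean, variance):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * variance) + (x - mean) ** 2 / variance)


def annotate(regions, class_models, top_k=2):
    """Rank class labels by the total log-likelihood of the region features.

    Returns the top_k labels and the number of density evaluations,
    which equals (number of classes) * (number of regions).
    """
    evaluations = 0
    scores = {}
    for label, (mean, variance) in class_models.items():
        total = 0.0
        for x in regions:
            total += gaussian_log_pdf(x, mean, variance)
            evaluations += 1
        scores[label] = total
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], evaluations


# Hypothetical class models: label -> (mean, variance) of a 1-D feature.
class_models = {"sky": (0.9, 0.01), "grass": (0.4, 0.01), "sand": (0.6, 0.01)}
regions = [0.88, 0.91, 0.42, 0.93]

labels, evaluations = annotate(regions, class_models)
```

A method that compares the test image against every training image would run the inner loop T times instead of C times, which is the O(T × R) cost the abstract contrasts against.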
A semi-supervised learning approach to object recognition with spatial integration of local features and segmentation cues
In Toward Category-Level Object Recognition, 2006
Cited by 4 (0 self)
Abstract. This chapter presents a principled way of formulating models for automatic local feature selection in object class recognition when there is little supervised data. Moreover, it discusses how one could formulate sensible spatial image context models using a conditional random field for integrating local features and segmentation cues (superpixels). By adopting sparse kernel methods and Bayesian model selection and data association, the proposed model identifies the most relevant sets of local features for recognizing object classes, achieves performance comparable to the fully supervised setting, and consistently outperforms existing methods for image classification.
Bayesian formulations of multiple instance learning with applications to general object recognition
2004
Cited by 2 (0 self)
In presenting this thesis in partial fulfilment of the requirements for an advanced
Learning Visual Contexts for Image Annotation From Flickr Groups
Cited by 2 (1 self)
Abstract: We present an extension of automatic image annotation that takes the context of a picture into account. Our core assumption is that users do not only provide individual images to be tagged, but group their pictures into batches (e.g., all snapshots taken over the same holiday trip), and that the images within a batch are likely to have a common style. These batches are matched with categories learned from Flickr groups, and an accurate context-specific annotation is performed. In quantitative experiments, we demonstrate that Flickr groups, with their user-driven categorization and their rich group space, provide an excellent basis for learning context categories. Our approach, which can be integrated with virtually any annotation model, is demonstrated to give significant improvements of more than 100% compared to standard annotations of individual images. Index Terms: Content-based image retrieval, context, image annotation.
Semi-Supervised Learning of Object Categories from Paired Local Features
This paper presents a semi-supervised learning (SSL) approach to find similarities of images using statistics of local matches. SSL algorithms are well known for leveraging a large amount of unlabeled data as well as a small amount of labeled data to boost classification performance. Our approach formulates the problem of matching two images as an SSL-based classification problem of image pairs with a minimal amount of labeled pairs. We apply a Gaussian random field model that represents each image pair as a vertex in a weighted graph, and the optimal configuration of the field is obtained by harmonic energy minimization. A symmetrical feature selection criterion is first introduced to select robust matches of local keypoints between two images. The Mallows distance is then adopted to combine multiple cues from statistics of local matches. Our experiments confirm that our SSL-based approach not only boosts classification performance but also improves the robustness of the learned category model using only simple local keypoint features.
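Harmonic energy minimization on a Gaussian random field has a simple fixed-point characterization: at the minimum, the value at each unlabeled vertex is the weighted average of its neighbors' values, while labeled vertices stay clamped. The sketch below finds that solution by repeated averaging on a toy graph. It is an illustration of the general technique only; the graph, the weights, and the function name are assumptions, and nothing here reproduces the paper's image-pair features or Mallows distance.

```python
def harmonic_labels(weights, labeled, iterations=200):
    """Approximate the harmonic solution on a weighted graph.

    weights: symmetric matrix (list of lists) with a zero diagonal.
    labeled: dict mapping vertex index -> clamped value in [0, 1].
    Unlabeled vertices are repeatedly replaced by the weighted average
    of their neighbors until the values settle.
    """
    n = len(weights)
    values = [labeled.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        for i in range(n):
            if i in labeled:
                continue  # labeled vertices keep their clamped value
            degree = sum(weights[i])
            if degree > 0:
                values[i] = sum(w * values[j] for j, w in enumerate(weights[i])) / degree
    return values


# Toy chain 0 - 1 - 2 - 3 with unit edge weights.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Vertex 0 is a labeled positive, vertex 3 a labeled negative.
values = harmonic_labels(weights, {0: 1.0, 3: 0.0})
```

On this chain the two unlabeled vertices settle at 2/3 and 1/3, interpolating linearly between the labeled ends. Thresholding the values at 0.5 would then give a class decision for each unlabeled vertex.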
Multiple Instance Learning from Weakly Labeled
Abstract. Automatic video tagging systems are targeted at assigning semantic concepts (“tags”) to videos by linking textual descriptions with the audio-visual video content. To train such systems, we investigate online video from portals such as YouTube as a large-scale, freely available knowledge source. Tags provided by video owners serve as weak annotations indicating that a target concept appears in a video, but not when it appears. This situation resembles the multiple instance learning (MIL) scenario, in which classifiers are trained on labeled bags (videos) of unlabeled samples (the frames of a video). We study MIL in quantitative experiments on real-world online videos. Our key findings are: (1) Conventional MIL tends to neglect valuable information in the training data and thus performs poorly. (2) By relaxing the MIL assumption, a tagging system can be built that performs comparably to or better than its supervised counterpart. (3) Improvements by MIL are minor compared to a kernel-based model we proposed recently [13].
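The bag-and-instance setup in this abstract can be sketched as follows. Under the conventional MIL assumption a bag (video) is positive if at least one instance (frame) is positive, so a bag score is the maximum frame score. The abstract does not say how the authors relax that assumption; the top-fraction average below is just one possible relaxation chosen for illustration, and both function names and the frame scores are hypothetical.

```python
def bag_score_strict(frame_scores):
    """Conventional MIL: a single confident frame decides the bag."""
    return max(frame_scores)


def bag_score_relaxed(frame_scores, fraction=0.5):
    """One possible relaxation (an assumption, not the paper's method):
    average the top `fraction` of frame scores, so that several frames
    must support the concept before the bag scores highly."""
    k = max(1, int(len(frame_scores) * fraction))
    top = sorted(frame_scores, reverse=True)[:k]
    return sum(top) / k


# Hypothetical per-frame classifier outputs for one tagged video.
video = [0.1, 0.2, 0.95, 0.15]

strict = bag_score_strict(video)
relaxed = bag_score_relaxed(video)
```

With a single outlier frame, the strict score stays high while the relaxed score drops, which shows how the two assumptions treat sparse evidence differently.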

