Results 1 to 10 of 20
Rank-Loss Support Instance Machines for MIML Instance Annotation
"... Multiinstance multilabel learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior wor ..."
Abstract

Cited by 23 (10 self)
Multi-instance multi-label learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior work on MIML has focused on predicting label sets for previously unseen bags. We instead consider the problem of predicting instance labels while learning from data labeled only at the bag level. We propose Rank-Loss Support Instance Machines, which optimize a regularized rank-loss objective and can be instantiated with different aggregation models connecting instance-level predictions with bag-level predictions. The aggregation models that we consider are equivalent to defining a “support instance” for each bag, which allows efficient optimization of the rank-loss objective using primal subgradient descent. Experiments on artificial and real-world datasets show that the proposed methods achieve higher accuracy than other loss functions used in prior work, e.g., Hamming loss, and recent work in ambiguous label classification.
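The training loop this abstract describes can be sketched in a few lines: pick a support instance per class via a max-aggregation model, then take a subgradient step on the pairwise hinge rank loss. This is a simplified illustration, not the authors' implementation; the learning rate, L2 regularizer strength, and the choice of `max` as the aggregation model are assumptions.

```python
import numpy as np

def support_instance(w_c, bag):
    # "max" aggregation: the support instance for class c is the
    # instance in the bag scoring highest under w_c
    return bag[np.argmax(bag @ w_c)]

def rank_loss_subgrad_step(W, bag, pos, neg, lr=0.1, lam=0.01):
    """One subgradient step on the hinge rank loss for a single bag.

    W    : (C, d) weight matrix, one row per class
    bag  : (n, d) instances
    pos  : indices of classes in the bag's label set
    neg  : indices of classes not in the label set
    """
    grad = lam * W  # gradient of the L2 regularizer
    for p in pos:
        xp = support_instance(W[p], bag)
        for q in neg:
            xq = support_instance(W[q], bag)
            # hinge: want score(p) to exceed score(q) by margin 1
            if W[p] @ xp - W[q] @ xq < 1.0:
                grad[p] -= xp / (len(pos) * len(neg))
                grad[q] += xq / (len(pos) * len(neg))
    return W - lr * grad
```

Because each aggregation model reduces a bag to one support instance per class, the inner step is an ordinary linear-model subgradient update, which is what makes primal subgradient descent efficient here.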
Hybrid Generative/Discriminative Learning for Automatic Image Annotation
"... Automatic image annotation (AIA) raises tremendous challenges to machine learning as it requires modeling of data that are both ambiguous in input and output, e.g., images containing multiple objects and labeled with multiple semantic tags. Even more challenging is that the number of candidate tags ..."
Abstract

Cited by 13 (1 self)
Automatic image annotation (AIA) raises tremendous challenges to machine learning as it requires modeling of data that are ambiguous in both input and output, e.g., images containing multiple objects and labeled with multiple semantic tags. Even more challenging is that the number of candidate tags is usually huge (as large as the vocabulary size), yet each image is only related to a few of them. This paper presents a hybrid generative-discriminative classifier to simultaneously address the extreme data-ambiguity and overfitting-vulnerability issues in tasks such as AIA. In particular: (1) an Exponential-Multinomial Mixture (EMM) model is established to capture both the input and output ambiguity while encouraging prediction sparsity; and (2) the prediction ability of the EMM model is explicitly maximized through discriminative learning that integrates variational inference of graphical models and the pairwise formulation of ordinal regression. Experiments show that our approach achieves both superior annotation performance and better tag scalability.
Towards Discovering What Patterns Trigger What Labels
"... In many real applications, especially those involving data objects with complicated semantics, it is generally desirable to discover the relation between patterns in the input space and labels corresponding to different semantics in the output space. This task becomes feasible with MIML (MultiInst ..."
Abstract

Cited by 13 (8 self)
In many real applications, especially those involving data objects with complicated semantics, it is generally desirable to discover the relation between patterns in the input space and labels corresponding to different semantics in the output space. This task becomes feasible with MIML (Multi-Instance Multi-Label learning), a recently developed learning framework, where each data object is represented by multiple instances and is allowed to be associated with multiple labels simultaneously. In this paper, we propose KISAR, an MIML algorithm that is able to discover what instances trigger what labels. By considering the fact that highly relevant labels usually share some patterns, we develop a convex optimization formulation and provide an alternating optimization solution. Experiments show that KISAR is able to discover reasonable relations between input patterns and output labels, and achieves performance that is highly competitive with many state-of-the-art MIML algorithms.
Multi-instance multi-label learning with weak label
 in Proceedings of the 23rd International Joint Conference on Artificial Intelligence
"... MultiInstance MultiLabel learning (MIML) deals with data objects that are represented by a bag of instances and associated with a set of class labels simultaneously. Previous studies typically assume that for every training example, all positive labels are tagged whereas the untagged labels are al ..."
Abstract

Cited by 6 (3 self)
Multi-Instance Multi-Label learning (MIML) deals with data objects that are represented by a bag of instances and associated with a set of class labels simultaneously. Previous studies typically assume that for every training example, all positive labels are tagged whereas the untagged labels are all negative. In many real applications such as image annotation, however, the learning problem often suffers from weak labels; that is, users usually tag only a part of the positive labels, and the untagged labels are not necessarily negative. In this paper, we propose the MIMLwel approach, which works by assuming that highly relevant labels share some common instances and that the underlying class means of bags for each label are separated by a large margin. Experiments validate the effectiveness of MIMLwel in handling the weak label problem.
Multi-Modal Image Annotation with Multi-Instance Multi-Label LDA
"... This paper studies the problem of image annotation in a multimodal setting where both visual and textual information are available. We propose Multimodal Multiinstance Multilabel Latent Dirichlet Allocation (M3LDA), where the model consists of ..."
Abstract

Cited by 5 (3 self)
This paper studies the problem of image annotation in a multi-modal setting where both visual and textual information are available. We propose Multi-modal Multi-instance Multi-label Latent Dirichlet Allocation (M3LDA), where the model consists of ...
Multi-Instance Mixture Models and Semi-Supervised Learning
"... Multiinstance (MI) learning is a variant of supervised learning where labeled examples consist of bags (i.e. multisets) of feature vectors instead of just a single feature vector. Under standard assumptions, MI learning can be understood as a type of semisupervised learning (SSL). The difference b ..."
Abstract

Cited by 4 (0 self)
Multi-instance (MI) learning is a variant of supervised learning where labeled examples consist of bags (i.e., multisets) of feature vectors instead of just a single feature vector. Under standard assumptions, MI learning can be understood as a type of semi-supervised learning (SSL). The difference between MI learning and SSL is that positive bag labels provide weak label information for the instances that they contain. MI learning tasks can be approximated as SSL tasks by disregarding this weak label information, allowing the direct application of existing SSL techniques. To give insight into this connection we first introduce multi-instance mixture models (MIMMs), an adaptation of mixture model classifiers for multi-instance data. We show how to learn such models using an Expectation-Maximization algorithm in the case where the instance-level class distributions are members of an exponential family. The cost of the semi-supervised approximation to multi-instance learning is explored, both theoretically and empirically, by analyzing the properties of MIMMs relative to semi-supervised mixture models.
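The reduction this abstract describes can be made concrete: instances from negative bags are genuinely negative, while instances from positive bags carry only weak label information, which the SSL approximation discards by treating them as unlabeled. A minimal sketch (the function name and the `None`-means-unlabeled convention are illustrative choices, not from the paper):

```python
def mi_to_ssl(bags, bag_labels):
    """Approximate a binary MI problem as semi-supervised learning.

    Instances from negative bags inherit the negative label (0);
    instances from positive bags are only weakly labeled, so that
    information is discarded and they are marked unlabeled (None).
    """
    X, y = [], []
    for bag, label in zip(bags, bag_labels):
        for inst in bag:
            X.append(inst)
            y.append(0 if label == 0 else None)
    return X, y
```

The resulting `(X, y)` pair can be fed directly to any SSL method that accepts partially labeled data, which is exactly the "direct application of existing SSL techniques" the abstract refers to.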
Instance annotation for multi-instance multi-label learning, Transactions on Knowledge Discovery from Data (TKDD)
, 2012
"... Multiinstance multilabel learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior wor ..."
Abstract

Cited by 3 (3 self)
Multi-instance multi-label learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior work on MIML has focused on predicting label sets for previously unseen bags. We instead consider the problem of predicting instance labels while learning from data labeled only at the bag level. We propose a regularized rank-loss objective designed for instance annotation, which can be instantiated with different aggregation models connecting instance-level labels with bag-level label sets. The aggregation models that we consider can be factored as a linear function of a “support instance” for each class, which is a single feature vector representing a whole bag. Hence we name our proposed methods rank-loss Support Instance Machines (SIM). We propose two optimization methods for the rank-loss objective, which is non-convex. One is a heuristic method that alternates between updating support instances and solving a convex problem in which the support instances are treated as constant. The other is to apply the constrained concave-convex procedure (CCCP), which can also be interpreted as iteratively updating support instances and solving a convex problem. To solve the convex problem, we employ the Pegasos framework of primal subgradient descent, and prove that it finds an ε-suboptimal solution in runtime that is linear in the number of bags, instances, and 1/ε. Additionally, we ...
EFFICIENT INSTANCE ANNOTATION IN MULTI-INSTANCE LEARNING
"... The cost associated with manually labeling every individual instance in large datasets is prohibitive. Significant labeling efforts can be saved by assigning a collective label to a group of instances (a bag). This setup prompts the need for algorithms that allow labeling individual instances (inst ..."
Abstract

Cited by 3 (3 self)
The cost associated with manually labeling every individual instance in large datasets is prohibitive. Significant labeling effort can be saved by assigning a collective label to a group of instances (a bag). This setup prompts the need for algorithms that allow labeling individual instances (instance annotation) based on bag-level labels. Probabilistic models in which instance-level labels are latent variables can be used for instance annotation. Brute-force computation of instance-level label probabilities is exponential in the number of instances per bag due to marginalization over all possible combinations. Existing solutions for addressing this issue include approximate methods such as sampling or variational inference. This paper proposes a discriminative probability model and an expectation-maximization procedure for inference to address the instance annotation problem. A key contribution is a dynamic programming solution for exact computation of instance probabilities in quadratic time. Experiments on bird song, image annotation, and two synthetic datasets show a significant accuracy improvement of 4% to 14% over a recent state-of-the-art rank-loss SIM method. Index Terms: multi-instance learning, discriminative model, expectation maximization, logistic regression, dynamic programming.
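To see how dynamic programming replaces exponential marginalization over instance labelings, consider a much-simplified analogue of the setup: independent binary per-instance probabilities with a bag labeled positive iff some instance is positive. A classic quadratic-time DP recovers the exact distribution over the number of positive instances, from which exact posteriors follow. This is only an illustration of the idea; the paper's model and recursion are different, and the "OR" bag rule and independence assumption here are mine.

```python
def count_distribution(ps):
    """Quadratic-time DP: returns dp where dp[k] = P(exactly k
    positive instances), given independent per-instance
    probabilities ps. Avoids summing over all 2^n labelings."""
    dp = [1.0]
    for p in ps:
        nxt = [0.0] * (len(dp) + 1)
        for k, v in enumerate(dp):
            nxt[k] += v * (1.0 - p)   # this instance is negative
            nxt[k + 1] += v * p       # this instance is positive
        dp = nxt
    return dp

def posterior_positive(ps, i):
    """P(instance i is positive | bag is positive), under the
    simplified 'OR' rule: the bag is positive iff some instance is."""
    p_bag_pos = 1.0 - count_distribution(ps)[0]  # 1 - P(zero positives)
    return ps[i] / p_bag_pos
```

The DP runs in O(n²) for a bag of n instances, the same flavor of complexity reduction the abstract claims for its exact inference.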
Fast multi-instance multi-label learning, 2013
"... In multiinstance multilabel learning (MIML), one object is represented by multiple instances and simultaneously associated with multiple labels. Existing MIML approaches have been found useful in many applications; however, most of them can only handle moderatesized data. To efficiently handle ..."
Abstract

Cited by 3 (1 self)
In multi-instance multi-label learning (MIML), one object is represented by multiple instances and simultaneously associated with multiple labels. Existing MIML approaches have been found useful in many applications; however, most of them can only handle moderate-sized data. To efficiently handle large data sets, we propose the MIMLfast approach, which first constructs a low-dimensional subspace shared by all labels, and then trains label-specific linear models to optimize an approximated ranking loss via stochastic gradient descent. Although the MIML problem is complicated, MIMLfast is able to achieve excellent performance by exploiting label relations within the shared space and discovering sub-concepts for complicated labels. Experiments show that the performance of MIMLfast is highly competitive with state-of-the-art techniques, whereas its time cost is much less; in particular, on a data set with 30K bags and 270K instances, where none of the existing approaches could return results in 24 hours, MIMLfast takes only 12 minutes. Moreover, our approach is able to identify the most representative instance for each label, thus providing a chance to understand the relation between input patterns and output semantics.
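The two-stage scoring outlined above — a shared low-dimensional projection followed by a label-specific linear model, with the maximizing instance serving as the most representative one for that label — can be sketched as follows. The matrix shapes and the use of max aggregation are assumptions based on the description, not the paper's exact formulation.

```python
import numpy as np

def bag_label_score(W, u_l, bag):
    """Score one bag for one label, in the style described above.

    W   : (m, d) shared projection into a low-dimensional subspace
    u_l : (m,)   label-specific linear model for label l
    bag : (n, d) instances
    Returns the bag-level score (max over instances) and the index
    of the most representative instance for this label.
    """
    Z = bag @ W.T        # (n, m): instances mapped into the shared subspace
    scores = Z @ u_l     # (n,): per-instance scores for label l
    i = int(np.argmax(scores))
    return float(scores[i]), i
```

Because the subspace W is shared across all labels, only the small vectors u_l are label-specific, which is one way such a design keeps per-label training cheap enough for stochastic gradient descent at scale.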
Dimensionality reduction and topic modelling: from latent semantic indexing to Latent Dirichlet Allocation and beyond
in Mining Text Data, C. Aggarwal and …, 2012
"... ..."
(Show Context)