Active learning by querying informative and representative examples. In: Advances in Neural Information Processing Systems (NIPS'10), 2010.
"... Most active learning approaches select either informative or representative unla-beled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usu-ally ad hoc in finding unlabeled instances that are bot ..."
Abstract
-
Cited by 34 (4 self)
- Add to MetaCart
(Show Context)
Abstract: Most active learning approaches select either informative or representative unlabeled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usually ad hoc in finding unlabeled instances that are both informative and representative. We address this challenge by a principled approach, termed QUIRE, based on the min-max view of active learning. The proposed approach provides a systematic way for measuring and combining the informativeness and representativeness of an instance. Extensive experimental results show that the proposed QUIRE approach outperforms several state-of-the-art active learning approaches.
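To make the two criteria concrete, here is a minimal sketch that scores each unlabeled instance by prediction uncertainty (informativeness) and by average similarity to the pool (representativeness), then queries the best blend. This ad hoc mix is precisely the kind of heuristic QUIRE replaces with a principled min-max objective; all names and the `beta` trade-off below are illustrative assumptions.

```python
# Minimal sketch of combining informativeness and representativeness for
# query selection. This is NOT the QUIRE min-max formulation; it only
# illustrates the two criteria the paper unifies.
import numpy as np

def select_query(X_unlabeled, proba, X_pool, beta=0.5):
    """Pick the unlabeled index maximizing a blend of two scores.

    proba: model-predicted class probabilities for X_unlabeled, shape (n, k).
    """
    eps = 1e-12
    # Informativeness: entropy of the predicted class posterior.
    informativeness = -np.sum(proba * np.log(proba + eps), axis=1)

    # Representativeness: average cosine similarity to the whole pool,
    # so instances in dense regions score higher.
    Xn = X_unlabeled / (np.linalg.norm(X_unlabeled, axis=1, keepdims=True) + eps)
    Pn = X_pool / (np.linalg.norm(X_pool, axis=1, keepdims=True) + eps)
    representativeness = (Xn @ Pn.T).mean(axis=1)

    # beta trades off the two criteria; QUIRE derives both terms from a
    # single regularized objective instead of blending them by hand.
    return int(np.argmax(beta * informativeness + (1 - beta) * representativeness))
```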
Multi-label learning by exploiting label correlations locally. In: AAAI, 2012.
"... It is well known that exploiting label correlations is important for multi-label learning. Existing approaches typically exploit label correlations globally, by assum-ing that the label correlations are shared by all the in-stances. In real-world tasks, however, different instances may share differe ..."
Abstract
-
Cited by 14 (2 self)
- Add to MetaCart
(Show Context)
Abstract: It is well known that exploiting label correlations is important for multi-label learning. Existing approaches typically exploit label correlations globally, by assuming that the label correlations are shared by all the instances. In real-world tasks, however, different instances may share different label correlations, and few correlations are globally applicable. In this paper, we propose the ML-LOC approach which allows label correlations to be exploited locally. To encode the local influence of label correlations, we derive a LOC code to enhance the feature representation of each instance. The global discrimination fitting and local correlation sensitivity are incorporated into a unified framework, and an alternating solution is developed for the optimization. Experimental results on a number of image, text and gene data sets validate the effectiveness of our approach.
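As a rough picture of the LOC-code idea, the sketch below clusters training label vectors into local groups and appends each instance's soft group membership to its features. This is only a static approximation: the paper learns the codes jointly with the classifier via alternating optimization, whereas `n_groups` and `tau` here are illustrative parameters.

```python
# Rough sketch of a local-correlation (LOC-style) code: cluster label
# vectors into groups of locally shared correlation patterns, then append
# each instance's soft membership to its feature vector.
import numpy as np
from sklearn.cluster import KMeans

def append_loc_code(X, Y, n_groups=8, tau=1.0):
    """Return X augmented with a soft local-correlation code.

    Y: binary label matrix, shape (n, q). Each cluster center stands for
    one locally shared label-correlation pattern.
    """
    km = KMeans(n_clusters=n_groups, n_init=10).fit(Y.astype(float))
    # Soft membership: normalized kernel of distances to the pattern centers.
    d = np.linalg.norm(Y[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
    code = np.exp(-d / tau)
    code /= code.sum(axis=1, keepdims=True)
    return np.hstack([X, code]), km
```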
Multi-label hypothesis reuse. In: KDD.
"... Multi-label learning arises in many real-world tasks where an object is naturally associated with multiple concepts. It is well-accepted that, in order to achieve a good performance, the relationship among labels should be exploited. Most existing approaches require the label relationship as prior k ..."
Abstract
-
Cited by 10 (4 self)
- Add to MetaCart
(Show Context)
Abstract: Multi-label learning arises in many real-world tasks where an object is naturally associated with multiple concepts. It is well accepted that, in order to achieve good performance, the relationship among labels should be exploited. Most existing approaches require the label relationship as prior knowledge, or exploit it by counting label co-occurrence. In this paper, we propose the MAHR approach, which is able to automatically discover and exploit label relationship. Our basic idea is that, if two labels are related, the hypothesis generated for one label can be helpful for the other label. MAHR implements the idea as a boosting approach with a hypothesis reuse mechanism. In each boosting round, the base learner for a label is generated not only by learning on its own task but also by reusing the hypotheses from other labels, and the amount of reuse across labels provides an estimate of the label relationship. Extensive experimental results validate that MAHR is able to achieve superior performance and discover reasonable label relationship. Moreover, we disclose that the label relationship is usually asymmetric.
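As a toy illustration of hypothesis reuse (not MAHR's actual boosting procedure), the sketch below fits one decision stump per label and measures how well each label's stump predicts every other label. Large off-diagonal entries suggest related labels, and the resulting matrix need not be symmetric, which matches the paper's observation about asymmetric label relationships.

```python
# Toy hypothesis-reuse round: one weak hypothesis per label, then a
# cross-label accuracy matrix as a crude relatedness estimate. The real
# MAHR embeds reuse inside a boosting loop with learned weights.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def reuse_round(X, Y):
    """X: feature matrix (n, d); Y: binary label matrix (n, q)."""
    n, q = Y.shape
    stumps = [DecisionTreeClassifier(max_depth=1).fit(X, Y[:, j]) for j in range(q)]
    H = np.column_stack([s.predict(X) for s in stumps])  # (n, q) hypotheses
    # reuse[j, k]: accuracy of label k's hypothesis on label j's task.
    # Note reuse[j, k] != reuse[k, j] in general (asymmetry).
    reuse = np.array([[(H[:, k] == Y[:, j]).mean() for k in range(q)]
                      for j in range(q)])
    return stumps, reuse
```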
Optimizing the F-Measure in Multi-Label Classification: Plug-in Rule Approach versus Structured Loss Minimization
"... We compare the plug-in rule approach for optimizing the Fβ-measure in multi-label classification with an approach based on structured loss minimization, such as the structured support vector machine (SSVM). Whereas the former derives an optimal prediction from a probabilistic model in a separate inf ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
Abstract: We compare the plug-in rule approach for optimizing the Fβ-measure in multi-label classification with an approach based on structured loss minimization, such as the structured support vector machine (SSVM). Whereas the former derives an optimal prediction from a probabilistic model in a separate inference step, the latter seeks to optimize the Fβ-measure directly during the training phase. We introduce a novel plug-in rule algorithm that estimates all parameters required for a Bayes-optimal prediction via a set of multinomial regression models, and we compare this algorithm with SSVMs in terms of computational complexity and statistical consistency. As a main theoretical result, we show that our plug-in rule algorithm is consistent, whereas the SSVM approaches are not. Finally, we present results of a large experimental study showing the benefits of the introduced algorithm.
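A useful fact behind plug-in F-measure optimization is that, given marginal probability estimates, an optimal prediction selects some top-k prefix of labels sorted by probability (this holds exactly under a label-independence assumption). The sketch below shows the plug-in pattern of "estimate probabilities, then run a separate inference step", using a crude independence-based approximation of expected F1 rather than the paper's exact multinomial-regression estimator.

```python
# Hedged sketch of plug-in F1 inference from estimated marginals.
# The expected-F1 term is a rough independence approximation, not the
# paper's Bayes-optimal estimator.
import numpy as np

def plugin_f1_predict(p):
    """p: estimated marginal probabilities P(y_j = 1), shape (q,)."""
    order = np.argsort(-p)
    best_k, best_val = 0, 0.0
    for k in range(1, len(p) + 1):
        top = order[:k]
        # E[F1] ~ 2*E[TP] / (|prediction| + E[#relevant]) under independence.
        approx_f1 = 2.0 * p[top].sum() / (k + p.sum())
        if approx_f1 > best_val:
            best_k, best_val = k, approx_f1
    y = np.zeros(len(p), dtype=int)
    y[order[:best_k]] = 1
    return y
```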
Image Labeling on a Network: Using Social-Network Metadata for Image Classification
Probabilistic Multi-label Classification with Sparse Feature Learning
"... Multi-label classification is a critical problem in many areas of data analysis such as image labeling and text categorization. In this paper we propose a probabilistic multi-label classification model based on novel sparse feature learning. By employing an individual sparsity inducing ℓ1-norm and a ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Abstract: Multi-label classification is a critical problem in many areas of data analysis such as image labeling and text categorization. In this paper we propose a probabilistic multi-label classification model based on novel sparse feature learning. By employing an individual sparsity-inducing ℓ1-norm and a group sparsity-inducing ℓ2,1-norm, the proposed model has the capacity of capturing both label interdependencies and common predictive model structures. We formulate this sparse-norm-regularized learning problem as a non-smooth convex optimization problem, and develop a fast proximal gradient algorithm to solve it for an optimal solution. Our empirical study demonstrates the efficacy of the proposed method on a set of multi-label tasks given a limited number of labeled training instances.
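A fast proximal gradient method applies here because both regularizers have closed-form proximal operators. The sketch below shows the standard forms (soft-thresholding for ℓ1, row-wise group shrinkage for ℓ2,1) for a weight matrix W of shape (features, labels); this is generic sparse-group machinery, not the paper's exact algorithm.

```python
# Standard proximal operators for the l1 and l2,1 penalties, applied to a
# weight matrix W. Generic machinery, not the paper's specific method.
import numpy as np

def prox_l1(W, t):
    """prox of t*||W||_1: elementwise soft-thresholding."""
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)

def prox_l21(W, t):
    """prox of t*||W||_{2,1}: shrink each row toward zero as a group,
    zeroing rows whose norm falls below t (a feature-selection effect)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return W * scale
```

For this particular pair of penalties, applying prox_l1 followed by prox_l21 is known to yield the exact proximal operator of the combined regularizer, as in the sparse group lasso, so each proximal gradient step stays cheap.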
HC-Search for multi-label prediction: An empirical study. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2014.
"... Abstract Multi-label learning concerns learning multiple, overlapping, and correlated classes. In this paper, we adapt a recent structured prediction framework called HCSearch for multi-label prediction problems. One of the main advantages of this framework is that its training is sensitive to the ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
(Show Context)
Abstract: Multi-label learning concerns learning multiple, overlapping, and correlated classes. In this paper, we adapt a recent structured prediction framework called HC-Search for multi-label prediction problems. One of the main advantages of this framework is that its training is sensitive to the loss function, unlike other multi-label approaches that either assume a specific loss function or require a manual adaptation to each loss function. We empirically evaluate our instantiation of the HC-Search framework along with many existing multi-label learning algorithms on a variety of benchmarks by employing diverse task loss functions. Our results demonstrate that the performance of existing algorithms tends to be very similar in most cases, and that the HC-Search approach is comparable to and often better than all the other algorithms across different loss functions.
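Loss-sensitivity means the same trained pipeline can target whatever task loss the application cares about. For concreteness, here are three common multi-label task losses one might plug in; the HC-Search procedure itself is not shown, and `y_true`/`y_pred` are binary label vectors.

```python
# Three interchangeable multi-label task losses; a loss-sensitive learner
# such as HC-Search can be trained against any of them.
import numpy as np

def hamming_loss(y_true, y_pred):
    """Fraction of individually misclassified labels."""
    return float(np.mean(y_true != y_pred))

def f1_loss(y_true, y_pred):
    """1 - instance-wise F1 (F1 of two empty label sets counts as 1)."""
    tp = float(np.sum(y_true * y_pred))
    denom = float(np.sum(y_true) + np.sum(y_pred))
    return 1.0 - (2.0 * tp / denom if denom > 0 else 1.0)

def subset_zero_one_loss(y_true, y_pred):
    """1 unless the whole label vector is predicted exactly."""
    return float(not np.array_equal(y_true, y_pred))
```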
Hierarchical Multi-Label Classification of Social Text Streams
"... Hierarchical multi-label classification assigns a document to mul-tiple hierarchical classes. In this paper we focus on hierarchical multi-label classification of social text streams. Concept drift, com-plicated relations among classes, and the limited length of docu-ments in social text streams mak ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Hierarchical multi-label classification assigns a document to multiple hierarchical classes. In this paper we focus on hierarchical multi-label classification of social text streams. Concept drift, complicated relations among classes, and the limited length of documents in social text streams make this a challenging problem. Our approach includes three core ingredients: short document expansion, time-aware topic tracking, and chunk-based structural learning. We extend each short document in social text streams to a more comprehensive representation via state-of-the-art entity linking and sentence ranking strategies. From documents extended in this manner, we infer dynamic probabilistic distributions over topics by dividing topics into dynamic “global” topics and “local” topics. For the third and final phase we propose a chunk-based structural optimization strategy to classify each document into multiple classes. Extensive experiments conducted on a large real-world dataset show the effectiveness of our proposed method for hierarchical multi-label classification of social text streams.
Multi-label Classification with Output Kernels
"... Abstract. Although multi-label classification has become an increasingly important problem in machine learning, current approaches remain restricted to learning in the original label space (or in a simple linear projection of the original label space). Instead, we propose to use kernels on output la ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
Abstract: Although multi-label classification has become an increasingly important problem in machine learning, current approaches remain restricted to learning in the original label space (or in a simple linear projection of the original label space). Instead, we propose to use kernels on output label vectors to significantly expand the forms of label dependence that can be captured. The main challenge is to reformulate standard multi-label losses to handle kernels between output vectors. We first demonstrate how a state-of-the-art large-margin loss for multi-label classification can be reformulated, exactly, to handle output kernels as well as input kernels. Importantly, the pre-image problem for multi-label classification can be easily solved at test time, while the training procedure can still be simply expressed as a quadratic program in a dual parameter space. We then develop a projected gradient descent training procedure for this new formulation. Our empirical results demonstrate the efficacy of the proposed approach on complex image labeling tasks.
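Two ingredients here benefit from a concrete picture: a kernel defined directly on label vectors, and the test-time pre-image step that maps a learned score back to a discrete label vector. The sketch below uses an RBF output kernel and brute-force pre-image search over all 2^q candidates, which is only viable for small q; the score function is a hypothetical stand-in for the paper's trained large-margin model, which solves this step more directly.

```python
# Sketch: a kernel on output label vectors, plus a brute-force pre-image
# step. The score function is a placeholder for a trained model.
import itertools
import numpy as np

def rbf_output_kernel(y1, y2, gamma=0.5):
    """Nonlinear similarity between two binary label vectors."""
    diff = np.asarray(y1) - np.asarray(y2)
    return np.exp(-gamma * np.sum(diff ** 2))

def preimage(score_fn, q):
    """Return the binary label vector maximizing a learned score.
    Enumerating 2^q candidates is only sensible for small q."""
    candidates = itertools.product([0, 1], repeat=q)
    return max(candidates, key=lambda y: score_fn(np.array(y)))
```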
Semi-supervised multi-label classification: a simultaneous large-margin, subspace learning approach. In: ECML/PKDD.
"... Abstract. Labeled data is often sparse in common learning scenarios, either because it is too time consuming or too expensive to obtain, while unlabeled data is almost always plentiful. This asymmetry is exacerbated in multi-label learning, where the labeling process is more complex than in the sing ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Labeled data is often sparse in common learning scenarios, either because it is too time-consuming or too expensive to obtain, while unlabeled data is almost always plentiful. This asymmetry is exacerbated in multi-label learning, where the labeling process is more complex than in the single-label case. Although it is important to consider semi-supervised methods for multi-label learning, as it is in other learning scenarios, surprisingly few proposals have been investigated for this particular problem. In this paper, we present a new semi-supervised multi-label learning method that combines large-margin multi-label classification with unsupervised subspace learning. We propose an algorithm that learns a subspace representation of the labeled and unlabeled inputs, while simultaneously training a supervised large-margin multi-label classifier on the labeled portion. Although joint training of these two interacting components might appear intractable, we exploit recent developments in induced matrix norm optimization to show that these two problems can be solved jointly, globally and efficiently. In particular, we develop an efficient training procedure based on subgradient search and a simple coordinate descent strategy. An experimental evaluation demonstrates that semi-supervised subspace learning can improve the performance of corresponding supervised multi-label learning methods.
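Schematically, the joint training described here pairs a subspace-reconstruction term over all inputs with a large-margin loss on the labeled portion. The objective below is only an assumed sketch of that structure, not the paper's formulation (whose induced-matrix-norm reformulation is what makes joint optimization tractable); U, Z, W, and the trade-off weights are illustrative symbols.

```latex
% Schematic joint objective (illustrative symbols, not the paper's exact form):
% X_l, X_u : labeled / unlabeled inputs      U : subspace basis
% Z : latent codes (Z_l = labeled part)      W : multi-label classifier
\min_{U,\,Z,\,W}\;
  \underbrace{\bigl\| [X_l,\,X_u] - U Z \bigr\|_F^2}_{\text{subspace fit on all data}}
  \;+\; \lambda\, \underbrace{L_{\mathrm{margin}}\!\left(W;\, Z_l,\, Y_l\right)}_{\text{large-margin loss (labeled only)}}
  \;+\; \gamma\, \Omega(W)
```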