Results 1–10 of 16
Large-scale Multi-label Learning with Missing Labels
Abstract

Cited by 16 (4 self)
The multi-label classification problem has generated significant interest in recent years. However, existing approaches do not adequately address two key challenges: (a) scaling up to problems with a large number (say millions) of labels, and (b) handling data with missing labels. In this paper, we directly address both these problems by studying the multi-label problem in a generic empirical risk minimization (ERM) framework. Our framework, despite being simple, is surprisingly able to encompass several recent label-compression based methods, which can be derived as special cases of our method. To optimize the ERM problem, we develop techniques that exploit the structure of specific loss functions, such as the squared loss function, to obtain efficient algorithms. We further show that our learning framework admits excess risk bounds even in the presence of missing labels. Our bounds are tight and demonstrate better generalization performance for low-rank-promoting trace-norm regularization when compared to (rank-insensitive) Frobenius norm regularization. Finally, we present extensive empirical results on a variety of benchmark datasets and show that our methods perform significantly better than existing label-compression based methods and can scale up to very large datasets such as a Wikipedia dataset that has more than 200,000 labels.
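The paper's optimized solvers are not reproduced in this listing, but the general shape of a trace-norm-regularized ERM for multi-label learning with missing labels can be sketched as follows. This is a minimal proximal-gradient sketch under assumed dimensions; the mask zeroes out the loss on missing entries, and the singular-value-thresholding step is the proximal operator of the trace norm:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the prox of tau * trace norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def trace_norm_erm(X, Y, mask, lam=1.0, iters=200):
    """Minimize ||mask * (X W - Y)||_F^2 + lam * ||W||_* by proximal gradient.

    mask is 1 where a label is observed and 0 where it is missing, so
    missing entries contribute nothing to the loss (the missing-label setting).
    """
    d, L = X.shape[1], Y.shape[1]
    # step size from the Lipschitz constant of the smooth part
    lr = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    W = np.zeros((d, L))
    for _ in range(iters):
        grad = 2.0 * X.T @ (mask * (X @ W - Y))
        W = svt(W - lr * grad, lr * lam)
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
W_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))  # rank-2 ground truth
Y = np.sign(X @ W_true)
mask = (rng.random(Y.shape) < 0.7).astype(float)   # ~30% of labels missing
W = trace_norm_erm(X, Y, mask, lam=5.0)
rank = np.linalg.matrix_rank(W, tol=1e-6)          # shrinkage typically lowers the rank
```

The trace-norm penalty is what drives the learned weight matrix toward low rank, which is the property the excess-risk bounds in the abstract favor over Frobenius-norm regularization.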
Robust Bloom filters for large multi-label classification tasks
Advances in Neural Information Processing Systems 26, 2013
Abstract

Cited by 8 (1 self)
This paper presents an approach to multi-label classification (MLC) with a large number of labels. Our approach is a reduction to binary classification in which label sets are represented by low-dimensional binary vectors. This representation follows the principle of Bloom filters, a space-efficient data structure originally designed for approximate membership testing. We show that a naive application of Bloom filters in MLC is not robust to individual binary classifiers' errors. We then present an approach that exploits a specific feature of real-world datasets when the number of labels is large: many labels (almost) never appear together. Our approach is provably robust, has sublinear training and inference complexity with respect to the number of labels, and compares favorably to state-of-the-art algorithms on two large-scale multi-label datasets.
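The paper's robust construction is not reproduced here, but the naive Bloom-filter reduction it starts from can be sketched in a few lines. The hash scheme and sizes below are illustrative, not the paper's: each label sets k bits of an m-bit vector, and decoding is a membership test, which has no false negatives but can produce false positives:

```python
import hashlib

def hashes(label, m, k):
    """k bucket indices in [0, m) for a label, via salted SHA-1 (illustrative)."""
    return [int(hashlib.sha1(f"{i}:{label}".encode()).hexdigest(), 16) % m
            for i in range(k)]

def encode(label_set, m=64, k=3):
    """Represent a label set as an m-bit Bloom vector (the reduction's target):
    one binary classifier is then trained per bit instead of per label."""
    bits = [0] * m
    for label in label_set:
        for h in hashes(label, m, k):
            bits[h] = 1
    return bits

def decode(bits, n_labels, m=64, k=3):
    """Membership test: predict a label iff all of its k bits are set.
    False positives (and any bit-level classifier error) corrupt the output,
    which is why the naive scheme is not robust."""
    return {l for l in range(n_labels)
            if all(bits[h] for h in hashes(l, m, k))}

bits = encode({3, 7, 42})
recovered = decode(bits, n_labels=1000)  # always contains {3, 7, 42}
```

The paper's contribution is precisely to replace this naive scheme with an encoding that stays decodable when some of the per-bit classifiers err, exploiting the fact that many label pairs never co-occur.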
Efficient Multi-label Classification with Many Labels
Abstract

Cited by 6 (0 self)
In multi-label classification, each sample can be associated with a set of class labels. When the number of labels grows to the hundreds or even thousands, existing multi-label classification methods often become computationally inefficient. In recent years, a number of remedies have been proposed. However, they are based either on simple dimension-reduction techniques or involve expensive optimization problems. In this paper, we address this problem by selecting a small subset of class labels that can approximately span the original label space. This is performed by an efficient randomized sampling procedure in which the sampling probability of each class label reflects its importance among all the labels. Experiments on a number of real-world multi-label data sets with many labels demonstrate the appealing performance and efficiency of the proposed algorithm.
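The paper's importance scores are not reproduced in this listing; the sketch below uses plain label frequency as a stand-in score to illustrate the overall shape of the procedure: sample a small label subset with probability proportional to importance, then recover the remaining labels by a linear reconstruction from the sampled ones:

```python
import numpy as np

def sample_label_subset(Y, k, rng):
    """Sample k distinct labels with probability proportional to an importance
    score. Here the score is simply label frequency; the paper's scores differ."""
    freq = Y.sum(axis=0).astype(float)
    p = freq / freq.sum()
    return rng.choice(Y.shape[1], size=k, replace=False, p=p)

def fit_reconstruction(Y, subset):
    """Least-squares map from the sampled label columns back to the full label
    space, so classifiers only need to be trained for the sampled labels."""
    A, _, _, _ = np.linalg.lstsq(Y[:, subset], Y, rcond=None)
    return A

rng = np.random.default_rng(0)
Y = (rng.random((200, 50)) < 0.1).astype(float)   # sparse label matrix
subset = sample_label_subset(Y, k=15, rng=rng)
A = fit_reconstruction(Y, subset)
Y_hat = Y[:, subset] @ A   # approximate reconstruction of all 50 labels
```

At test time, only the 15 sampled labels need to be predicted directly; the linear map extends those predictions to the full label set, which is the source of the efficiency gain.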
FastXML: A fast, accurate and stable tree classifier for extreme multi-label learning.
In KDD, 2014
Abstract

Cited by 4 (0 self)
The objective in extreme multi-label classification is to learn a classifier that can automatically tag a data point with the most relevant subset of labels from a large label set. Extreme multi-label classification is an important research problem since not only does it enable the tackling of applications with many labels, but it also allows the reformulation of ranking problems with certain advantages over existing formulations. Our objective, in this paper, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [2] and the Label Partitioning for Sublinear Ranking (LPSR) algorithm.
Multi-label Classification via Feature-aware Implicit Label Space Encoding
Abstract

Cited by 3 (0 self)
To tackle a multi-label classification problem with many classes, label space dimension reduction (LSDR) has recently been proposed. It encodes the original label space into a low-dimensional latent space and uses a decoding process for recovery. In this paper, we propose a novel method, termed FaIE, to perform LSDR via Feature-aware Implicit label space Encoding. Unlike most previous work, the proposed FaIE makes no assumptions about the encoding process and directly learns a code matrix, i.e. the encoding result of some implicit encoding function, and a linear decoding matrix. To learn both matrices, FaIE jointly maximizes the recoverability of the original label space from the latent space and the predictability of the latent space from the feature space, thus making itself feature-aware. FaIE can also be specialized to learn an explicit encoding function, and extended with kernel tricks to handle nonlinear correlations between the feature space and the latent space. Extensive experiments conducted on benchmark datasets demonstrate its effectiveness.
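FaIE's own joint optimization is not reproduced here; as a point of reference, the generic LSDR pipeline that the abstract contrasts itself with can be sketched as follows (a baseline, not FaIE): PCA-encode the label matrix into a few latent dimensions, ridge-regress the features onto the latent codes, and decode predictions linearly. FaIE's departure is to learn the code matrix jointly with a feature-predictability term instead of fixing the encoding in advance:

```python
import numpy as np

def lsdr_fit(X, Y, c, ridge=1e-3):
    """Generic LSDR baseline (not FaIE): PCA-encode labels into c dimensions,
    then learn a ridge regressor from features to the latent codes."""
    y_mean = Y.mean(axis=0)
    Yc = Y - y_mean
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    V = Vt[:c].T                       # orthonormal encoding/decoding directions
    Z = Yc @ V                         # latent codes for the training labels
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Z)
    return W, V, y_mean

def lsdr_predict(X, W, V, y_mean):
    """Predict latent codes from features, then decode to full label scores."""
    return X @ W @ V.T + y_mean

rng = np.random.default_rng(0)
X = rng.standard_normal((150, 10))
Y = (X @ rng.standard_normal((10, 40)) > 0.5).astype(float)
W, V, mu = lsdr_fit(X, Y, c=5)
scores = lsdr_predict(X, W, V, mu)     # threshold the scores to predict labels
```

The fixed PCA encoding here maximizes label recoverability alone; FaIE's objective adds the predictability of the latent space from the features, which is what "feature-aware" refers to in the abstract.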
Active Learning for Sparse Bayesian Multi-label Classification
Abstract

Cited by 3 (0 self)
We study the problem of active learning for multi-label classification. We focus on the real-world scenario where the average number of positive (relevant) labels per data point is small, leading to positive label sparsity. Carrying out mutual-information-based near-optimal active learning in this setting is a challenging task since the computational complexity involved is exponential in the total number of labels. We propose a novel inference algorithm for the sparse Bayesian multi-label model of [17]. The benefit of this alternate inference scheme is that it enables a natural approximation of the mutual information objective. We prove that the approximation leads to a solution identical to that of the exact optimization problem, but at a fraction of the optimization cost. This allows us to carry out efficient, non-myopic, and near-optimal active learning for sparse multi-label classification. Extensive experiments reveal the effectiveness of the method.
Multi-label Classification with Output Kernels
Abstract

Cited by 3 (2 self)
Although multi-label classification has become an increasingly important problem in machine learning, current approaches remain restricted to learning in the original label space (or in a simple linear projection of the original label space). Instead, we propose to use kernels on output label vectors to significantly expand the forms of label dependence that can be captured. The main challenge is to reformulate standard multi-label losses to handle kernels between output vectors. We first demonstrate how a state-of-the-art large-margin loss for multi-label classification can be reformulated, exactly, to handle output kernels as well as input kernels. Importantly, the pre-image problem for multi-label classification can be easily solved at test time, while the training procedure can still be simply expressed as a quadratic program in a dual parameter space. We then develop a projected gradient descent training procedure for this new formulation. Our empirical results demonstrate the efficacy of the proposed approach on complex image labeling tasks.
Conditional Restricted Boltzmann Machines for Multi-label Learning with Incomplete Labels
Abstract

Cited by 2 (0 self)
Standard multi-label learning methods assume fully labeled training data. This assumption, however, is impractical in many application domains where labels are difficult to collect and missing labels are prevalent. In this paper, we develop a novel conditional restricted Boltzmann machine model to address multi-label learning with incomplete labels. It uses a restricted Boltzmann machine to capture the high-order label dependence relationships in the output space, aiming to enhance the capacity of recovering missing labels and learning high-quality multi-label prediction models. Moreover, it also incorporates label co-occurrence information retrieved from auxiliary resources as prior knowledge. We perform model training by maximizing the regularized marginal conditional likelihood of the label vectors given the input features, and develop a Viterbi-style EM algorithm to solve the induced optimization problem. The proposed approach is evaluated on four real-world multi-label data sets by comparing it to a number of state-of-the-art methods. The experimental results show that it outperforms all the other comparison methods across the applied data sets.
Multi-label Classification with Label Correlations and Missing Labels
Abstract

Cited by 2 (0 self)
Many real-world applications involve multi-label classification, in which the labels can have strong interdependencies and some of them may even be missing. Existing multi-label algorithms are unable to handle both issues simultaneously. In this paper, we propose a probabilistic model that can automatically learn and exploit multi-label correlations. By integrating out the missing information, it also provides a disciplined approach to the handling of missing labels. The inference procedure is simple, and the optimization subproblems are convex. Experiments on a number of real-world data sets with both complete and missing labels demonstrate that the proposed algorithm can consistently outperform state-of-the-art multi-label classification algorithms.
Beijing Key Lab of Traffic Data Analysis and Mining
Abstract
Multi-label problems arise in various domains, including automatic multimedia data categorization, and have generated significant interest in the computer vision and machine learning communities. However, existing methods do not adequately address two key challenges: exploiting correlations between labels and making up for the lack of labeled data, or even missing labels. In this paper, we propose a semi-supervised low-rank mapping (SLRM) model to handle these two challenges. The SLRM model takes advantage of nuclear-norm regularization on the mapping to effectively capture the label correlations. Meanwhile, it introduces a manifold regularizer on the mapping to capture the intrinsic structure among the data, which provides a good way to reduce the amount of required labeled data while improving classification performance. Furthermore, we design an efficient algorithm to solve the SLRM model based on the alternating direction method of multipliers (ADMM), so it can efficiently deal with large-scale datasets. Experiments on four real-world multimedia datasets demonstrate that the proposed method can exploit the label correlations and obtain better label prediction results than state-of-the-art methods.