Results 1 - 10 of 28
Label Ranking by Learning Pairwise Preferences
Abstract
Cited by 89 (20 self)
Preference learning is an emerging topic that appears in different guises in the recent literature. This work focuses on a particular learning scenario called label ranking, where the problem is to learn a mapping from instances to rankings over a finite number of labels. Our approach for learning such a mapping, called ranking by pairwise comparison (RPC), first induces a binary preference relation from suitable training data using a natural extension of pairwise classification. A ranking is then derived from the preference relation thus obtained by means of a ranking procedure, whereby different ranking methods can be used for minimizing different loss functions. In particular, we show that a simple (weighted) voting strategy minimizes risk with respect to the well-known Spearman rank correlation. We compare RPC to existing label ranking methods, which are based on scoring individual labels instead of comparing pairs of labels. Both empirically and theoretically, it is shown that RPC is superior in terms of computational efficiency, and at least competitive in terms of accuracy.
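The weighted-voting step described in this abstract can be sketched as a toy; this is an illustrative reconstruction, not the paper's implementation, and the scores `pref[(a, b)]` stand in for the outputs of a trained pairwise model:

```python
# Hypothetical sketch of RPC's weighted-voting aggregation step,
# assuming a learned pairwise model has already produced scores
# pref[(a, b)] = estimated probability that label a is preferred to b.
# All names and numbers below are illustrative.

def rank_by_weighted_voting(labels, pref):
    """Score each label by the sum of its pairwise preference weights
    and return the labels sorted from most to least preferred."""
    scores = {a: sum(pref[(a, b)] for b in labels if b != a) for a in labels}
    return sorted(labels, key=lambda a: scores[a], reverse=True)

# Toy preference matrix over three labels: 'x' beats 'y' and 'z'.
labels = ["x", "y", "z"]
pref = {
    ("x", "y"): 0.9, ("y", "x"): 0.1,
    ("x", "z"): 0.8, ("z", "x"): 0.2,
    ("y", "z"): 0.6, ("z", "y"): 0.4,
}
ranking = rank_by_weighted_voting(labels, pref)  # ["x", "y", "z"]
```

Summing the preference weights per label is exactly the kind of simple voting strategy the abstract says minimizes risk with respect to Spearman rank correlation.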
Multilabel classification via calibrated label ranking
 MACH LEARN
, 2008
Abstract
Cited by 69 (10 self)
Label ranking studies the problem of learning a mapping from instances to rankings over a predefined set of labels. Hitherto existing approaches to label ranking implicitly operate on an underlying (utility) scale which is not calibrated in the sense that it lacks a natural zero point. We propose a suitable extension of label ranking that incorporates the calibrated scenario and substantially extends the expressive power of these approaches. In particular, our extension suggests a conceptually novel technique for extending the common learning by pairwise comparison approach to the multilabel scenario, a setting previously not amenable to the pairwise decomposition technique. The key idea of the approach is to introduce an artificial calibration label that, in each example, separates the relevant from the irrelevant labels. We show that this technique can be viewed as a combination of pairwise preference learning and the conventional relevance classification technique, where a separate classifier is trained to predict whether a label is relevant or not. Empirical results in the areas of text categorization, image classification and gene analysis underscore the merits of the calibrated model in comparison to state-of-the-art multilabel learning methods.
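The calibration idea can be illustrated with a small sketch; the label name `_cal` and the helper function are assumptions for illustration, not the paper's notation:

```python
# Illustrative sketch of calibrated label ranking: a predicted ranking
# over the real labels plus an artificial calibration label "_cal";
# every label ranked above "_cal" is predicted relevant. Names are
# assumptions, not the paper's notation.

CAL = "_cal"

def relevant_labels(ranking):
    """Split a calibrated ranking into the relevant label set:
    everything that precedes the calibration label."""
    cut = ranking.index(CAL)
    return ranking[:cut]

# A hypothetical calibrated ranking for one instance.
ranking = ["sports", "politics", CAL, "finance", "arts"]
```

The single artificial label thus turns a ranking into a multilabel prediction, which is what makes the pairwise decomposition applicable to the multilabel setting.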
Efficient pairwise multilabel classification for large-scale problems in the legal domain
 IN ECML PKDD ’08: PROCEEDINGS OF THE EUROPEAN CONFERENCE ON MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES
, 2008
Abstract
Cited by 19 (3 self)
In this paper we applied multilabel classification algorithms to the EUR-Lex database of legal documents of the European Union. On this document collection, we studied three different multilabel classification problems, the largest being the categorization into the EUROVOC concept hierarchy with almost 4000 classes. We evaluated three algorithms: (i) the binary relevance approach, which independently trains one classifier per label; (ii) the multiclass multilabel perceptron algorithm, which respects dependencies between the base classifiers; and (iii) the multilabel pairwise perceptron algorithm, which trains one classifier for each pair of labels. All algorithms use the simple but very efficient perceptron algorithm as the underlying classifier, which makes them very suitable for large-scale multilabel classification problems. The main challenge we had to face was that the almost 8,000,000 perceptrons that had to be trained in the pairwise setting could no longer be stored in memory. We solved this problem by resorting to the dual representation of the perceptron, which makes the pairwise approach feasible for problems of this size. The results on the EUR-Lex database confirm the good predictive performance of the pairwise approach and demonstrate the feasibility of this approach for large-scale tasks.
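The dual representation mentioned here can be sketched as a toy kernel perceptron (a linear kernel is assumed): instead of an explicit weight vector per classifier, only the per-example mistake counts are stored, which is the property that makes millions of pairwise classifiers storable. Data and names are illustrative, not from the paper:

```python
# Toy sketch of the dual (kernel) perceptron with a linear kernel.
# Only the mistake counts alpha[i] on the training examples are kept,
# rather than an explicit weight vector. Illustrative only.

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def dual_perceptron_train(X, y, epochs=10):
    alpha = [0] * len(X)  # one dual coefficient per training example
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(X, y)):
            # score = sum_j alpha_j * y_j * <x_j, x_i>
            s = sum(alpha[j] * y[j] * dot(X[j], xi) for j in range(len(X)))
            if yi * s <= 0:  # mistake: strengthen this example's influence
                alpha[i] += 1
    return alpha

def dual_predict(alpha, X, y, x):
    s = sum(alpha[j] * y[j] * dot(X[j], x) for j in range(len(X)))
    return 1 if s > 0 else -1

# Linearly separable toy data.
X = [(1.0, 1.0), (2.0, 1.5), (-1.0, -1.0), (-2.0, -0.5)]
y = [1, 1, -1, -1]
alpha = dual_perceptron_train(X, y)
```

In the pairwise setting, each of the many classifiers reduces to such a sparse vector of dual coefficients over the (shared) training examples, rather than a dense weight vector of its own.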
Efficient Voting Prediction for Pairwise Multilabel Classification
, 2009
Abstract
Cited by 16 (6 self)
The pairwise approach to multilabel classification reduces the problem to learning and aggregating preference predictions among the possible labels. A key problem is the need to query a quadratic number of preferences for making a prediction. To solve this problem, we extend the recently proposed QWeighted algorithm for efficient pairwise multiclass voting to the multilabel setting, and evaluate the adapted algorithm on several real-world datasets. We achieve an average-case reduction of classifier evaluations from n² to n + dn log n, where n is the total number of labels and d is the average number of labels per instance, which is typically quite small in real-world datasets.
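A back-of-the-envelope check makes the stated reduction concrete; the natural logarithm and the EUR-Lex-like sizes below are assumptions for illustration:

```python
# Sanity check of the claimed average-case savings. The log base is an
# assumption (natural log used here), and n = 4000, d = 5 are merely
# EUR-Lex-like illustrative sizes, not figures from the paper.
import math

def full_pairwise_evals(n):
    # exhaustive querying: one classifier per unordered label pair
    return n * (n - 1) // 2

def qweighted_evals_estimate(n, d):
    # the stated average case: n + d * n * log(n)
    return n + d * n * math.log(n)

n, d = 4000, 5
full = full_pairwise_evals(n)            # 7,998,000 evaluations
lazy = qweighted_evals_estimate(n, d)    # roughly 170,000 evaluations
```

Even under these rough assumptions, the lazy scheme queries well under five percent of the pairwise classifiers on average.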
Efficient multilabel classification algorithms for large-scale problems in the legal domain
 IN: PROCEEDINGS OF THE LANGUAGE RESOURCES AND EVALUATION CONFERENCE (LREC) WORKSHOP ON SEMANTIC PROCESSING OF LEGAL TEXTS
, 2008
Abstract
Cited by 7 (1 self)
In this paper we evaluate the performance of multilabel classification algorithms on the EUR-Lex database of legal documents of the European Union. On the same set of underlying documents, we defined three different large-scale multilabel problems with up to 4000 classes. On these datasets, we compared three algorithms: (i) the well-known one-against-all approach (OAA); (ii) the multiclass multilabel perceptron algorithm (MMP), which modifies the OAA ensemble by respecting dependencies between the base classifiers in the training protocol of the classifier ensemble; and (iii) the multilabel pairwise perceptron algorithm (MLPP), which unlike the previous algorithms trains one base classifier for each pair of classes. All algorithms use the simple but very efficient perceptron algorithm as the underlying classifier. This makes them very suitable for large-scale multilabel classification problems. While previous work has already shown that the latter approach outperforms the other two in terms of predictive accuracy, its key problem is that it has to store one classifier for each pair of classes. The key contribution of this work is to demonstrate a novel technique that makes the pairwise approach feasible for problems with a large number of classes, such as those studied in this work. Our results on the EUR-Lex database illustrate the effectiveness of the pairwise approach and the efficiency of the MMP algorithm. We also show that it is feasible to efficiently and effectively handle very large multilabel problems.
Two stage architecture for multilabel learning.
 Pattern Recognition
, 2012
Abstract
Cited by 4 (1 self)
A common approach to solving multilabel learning problems is to use problem transformation methods and dichotomizing classifiers, as in the pairwise decomposition strategy. One of the problems with this strategy is the need to query a quadratic number of binary classifiers for making a prediction, which can be quite time-consuming, especially in learning problems with a large number of labels. To tackle this problem, we propose a Two Stage Architecture (TSA) for efficient multilabel learning. We analyze three implementations of this architecture: the Two Stage Voting Method (TSVM), the Two Stage Classifier Chain Method (TSCCM) and the Two Stage Pruned Classifier Chain Method (TSPCCM). Eight different real-world datasets are used to evaluate the performance of the proposed methods. The performance of our approaches is compared with the performance of two algorithm adaptation methods (Multi-Label kNN and Multi-Label C4.5) and five problem transformation methods (Binary Relevance, Classifier Chain, Calibrated Label Ranking with majority voting, the Quick Weighted method for pairwise multilabel learning and the Label Powerset method). The results suggest that TSCCM and TSPCCM outperform the competing algorithms in terms of predictive accuracy, while TSVM has comparable predictive performance. In terms of testing speed, all three methods show better performance as compared to the pairwise methods for multilabel learning.
Efficient decoding of ternary error-correcting output codes for multiclass classification
 In Proceedings of the 20th European Conference on Machine Learning (ECML 2009)
, 2009
Hybrid Decision Tree Architecture Utilizing Local SVMs for Multi-Label Classification
"... Abstract. Multilabel classification (MLC) problems abound in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. Issues that severely limit the applicability of many current machine learning approaches to MLC are the largescale problem ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
Multilabel classification (MLC) problems abound in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. Issues that severely limit the applicability of many current machine learning approaches to MLC are the large-scale problem and the high dimensionality of the label space, which have a strong impact on the computational complexity of learning. These problems are especially pronounced for approaches that transform MLC problems into a set of binary classification problems for which SVMs are used. On the other hand, the most efficient approaches to MLC, based on decision trees, have clearly lower predictive performance. We propose a hybrid decision tree architecture that utilizes local SVMs for efficient multilabel classification. We build decision trees for MLC, where the leaves do not give multilabel predictions directly, but rather contain SVM-based classifiers giving multilabel predictions. A binary relevance architecture is employed in each leaf, where a binary SVM classifier is built for each of the labels relevant to that particular leaf. We use several real-world datasets to evaluate the proposed method and its competitors. On almost every classification problem, our hybrid approach outperforms the predictive performance of SVM-based approaches, while its computational efficiency is significantly improved as a result of the integrated decision tree. Keywords: multilabel classification, hybrid architecture
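The leaves-contain-classifiers idea can be sketched as a toy two-level predictor; the stub tree and threshold rules below stand in for the learned decision tree and the per-label SVMs, and every name here is hypothetical:

```python
# Toy sketch of a hybrid tree: a stub "decision tree" routes an
# instance to a leaf, and each leaf holds a binary-relevance set of
# per-label decision rules (standing in for the paper's SVMs).
# All names, splits, and thresholds are illustrative assumptions.

def route(x):
    """Stub tree: a single split on the first feature."""
    return "leaf_left" if x[0] < 0.5 else "leaf_right"

# Per-leaf binary-relevance models: label -> decision rule.
leaf_models = {
    "leaf_left":  {"a": lambda x: x[1] > 0.2},
    "leaf_right": {"a": lambda x: x[1] > 0.8, "b": lambda x: True},
}

def predict(x):
    models = leaf_models[route(x)]
    return sorted(label for label, rule in models.items() if rule(x))
```

Note that each leaf only carries classifiers for the labels relevant to that leaf, which is where the efficiency gain over a single global binary-relevance ensemble comes from.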
Efficient Prediction Algorithms for Binary Decomposition Techniques
Abstract
Cited by 2 (2 self)
Binary decomposition methods transform multiclass learning problems into a series of two-class learning problems that can be solved with simpler learning algorithms. As the number of such binary learning problems often grows superlinearly with the number of classes, we need efficient methods for computing the predictions. In this paper, we discuss an efficient algorithm that queries only a dynamically determined subset of the trained classifiers, but still predicts the same classes that would have been predicted if all classifiers had been queried. The algorithm is first derived for the simple case of pairwise classification, and then generalized to arbitrary pairwise decompositions of the learning problem in the form of ternary error-correcting output codes under a variety of different code designs and decoding strategies.
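For the pairwise case, the lazy querying idea resembles the QWeighted scheme; a minimal unweighted-voting sketch follows (illustrative, with assumed names; ties are broken arbitrarily):

```python
# Sketch of lazy pairwise querying in the spirit of QWeighted:
# repeatedly give the class with the fewest lost games so far its next
# pairwise evaluation; once that class has played every rival, no other
# class can end up with fewer losses, so it is the voting winner.
# beats(a, b) models the pairwise classifier's vote. Illustrative only.

def lazy_pairwise_winner(classes, beats):
    loss = {c: 0 for c in classes}
    played = {c: set() for c in classes}
    while True:
        c = min(classes, key=lambda k: loss[k])  # most promising class
        rivals = [k for k in classes if k != c and k not in played[c]]
        if not rivals:
            return c  # c played everyone and still has minimal loss
        d = min(rivals, key=lambda k: loss[k])
        loser = d if beats(c, d) else c
        loss[loser] += 1
        played[c].add(d)
        played[d].add(c)

# Transitive toy preferences: the smaller index always wins.
classes = [0, 1, 2, 3]
top = lazy_pairwise_winner(classes, lambda a, b: a < b)
```

On this toy example the dominant class is found after 3 evaluations instead of the full 6, while the returned class matches exhaustive voting up to ties.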
Application-Independent Feature Construction from Noisy Samples
Abstract
Cited by 1 (0 self)
When training classifiers, the presence of noise can severely harm their performance. In this paper, we focus on “non-class” attribute noise and we consider how a frequent fault-tolerant (FFT) pattern mining task can be used to support noise-tolerant classification. Our method is based on an application-independent strategy for feature construction based on the so-called δ-free patterns. Our experiments on noisy training data show accuracy improvement when using the computed features instead of the original ones.