Results 1–8 of 8
Convex calibrated surrogates for low-rank loss matrices with applications to subset ranking losses
In Advances in Neural Information Processing Systems
Abstract

Cited by 3 (3 self)
The design of convex, calibrated surrogate losses, whose minimization entails consistency with respect to a desired target loss, is an important concept to have emerged in the theory of machine learning in recent years. We give an explicit construction of a convex least-squares type surrogate loss that can be designed to be calibrated for any multiclass learning problem for which the target loss matrix has a low-rank structure; the surrogate loss operates on a surrogate target space of dimension at most the rank of the target loss. We use this result to design convex calibrated surrogates for a variety of subset ranking problems, with target losses including the precision@q, expected rank utility, mean average precision, and pairwise disagreement.
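The reduction the abstract describes can be sketched numerically: if the loss matrix factors as L = B Aᵀ, then least-squares regression onto the rows of B, followed by linear decoding, recovers the Bayes-optimal prediction. The matrices and distribution below are made up purely for illustration; they are not from the paper.

```python
import numpy as np

# Hypothetical rank-2 loss matrix L (rows = true labels, columns =
# predictions), factored as L = B @ A.T so that L[y, t] = <b_y, a_t>.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # one row b_y per class label
A = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.5, 0.5]])        # one row a_t per prediction
L = B @ A.T                        # 3x3 loss matrix of rank 2

# Under squared loss, the population minimizer of E||u - b_Y||^2 at a
# point x is u*(x) = E[b_Y | x] = B.T @ p, with p the conditional label
# distribution. Decoding by t* = argmin_t <a_t, u*> then coincides with
# the Bayes-optimal prediction argmin_t (L.T @ p)[t], since
# a_t . (B.T @ p) = sum_y p_y L[y, t] exactly.
p = np.array([0.2, 0.5, 0.3])      # example conditional distribution
u_star = B.T @ p                   # least-squares population minimizer
t_decoded = int(np.argmin(A @ u_star))
t_bayes = int(np.argmin(L.T @ p))
assert t_decoded == t_bayes
```

The surrogate only ever regresses into a 2-dimensional target space here, matching the abstract's claim that the surrogate dimension need not exceed the rank of the loss.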
Convex Calibrated Surrogates for Hierarchical Classification
Abstract
Hierarchical classification problems are multiclass supervised learning problems with a predefined hierarchy over the set of class labels. In this work, we study the consistency of hierarchical classification algorithms with respect to a natural loss, namely the tree distance metric on the hierarchy tree of class labels, via the use of calibrated surrogates. We first show that the Bayes-optimal classifier for this loss classifies an instance according to the deepest node in the hierarchy such that the total conditional probability of the subtree rooted at the node is greater than 1/2. We exploit this insight to develop a new consistent algorithm for hierarchical classification that makes use of an algorithm known to be consistent for the “multiclass classification with reject option” (MCRO) problem as a subroutine. Our experiments on a number of benchmark datasets show that the resulting algorithm, which we term OvA-Cascade, gives improved performance over other state-of-the-art hierarchical classification algorithms.
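The Bayes-optimal rule stated in the abstract (predict the deepest node whose subtree carries conditional probability above 1/2) can be sketched directly. The tree, node names, and probabilities below are hypothetical.

```python
# Sketch of the Bayes-optimal rule for the tree-distance loss: classify at
# the deepest node whose subtree has total conditional probability > 1/2.
# `children` maps internal nodes to their child nodes; leaves carry the
# conditional class probabilities. All names and numbers are made up.

def subtree_prob(node, children, leaf_prob):
    if node not in children:                 # leaf node
        return leaf_prob.get(node, 0.0)
    return sum(subtree_prob(c, children, leaf_prob) for c in children[node])

def bayes_predict(root, children, leaf_prob):
    # Walk down from the root, descending into any child whose subtree
    # mass exceeds 1/2; at most one child can qualify, so this greedy
    # descent finds the deepest node satisfying the condition.
    node = root
    while node in children:
        heavy = [c for c in children[node]
                 if subtree_prob(c, children, leaf_prob) > 0.5]
        if not heavy:
            return node
        node = heavy[0]
    return node

children = {"root": ["animal", "vehicle"], "animal": ["cat", "dog"]}
leaf_prob = {"cat": 0.35, "dog": 0.25, "vehicle": 0.40}
# P(animal subtree) = 0.6 > 1/2, but neither leaf exceeds 1/2, so the
# rule abstains at the internal node "animal" rather than guessing a leaf.
assert bayes_predict("root", children, leaf_prob) == "animal"
```

Predicting an internal node rather than forcing a leaf is exactly the reject-option behavior that motivates the MCRO subroutine in the abstract.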
Predtron: A Family of Online Algorithms for General Prediction Problems
Abstract
Modern prediction problems arising in multilabel learning and learning to rank pose unique challenges to the classical theory of supervised learning. These problems have large prediction and label spaces of a combinatorial nature and involve sophisticated loss functions. We offer a general framework to derive mistake-driven online algorithms and associated loss bounds. The key ingredients in our framework are a general loss function, a general vector space representation of predictions, and a notion of margin with respect to a general norm. Our general algorithm, Predtron, yields the perceptron algorithm and its variants when instantiated on classic problems such as binary classification, multiclass classification, ordinal regression, and multilabel classification. For multilabel ranking and subset ranking, we derive novel algorithms, notions of margins, and loss bounds. A simulation study confirms the behavior predicted by our bounds and demonstrates the flexibility of the design choices in our framework.
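For concreteness, the binary-classification instantiation the abstract mentions is the classic mistake-driven perceptron; a minimal sketch on a toy data stream (the data are made up, and this shows only the special case, not the general Predtron update):

```python
import numpy as np

# The classic perceptron: an online, mistake-driven update that changes
# the weight vector only when the current prediction errs.
def perceptron(stream):
    w = None
    mistakes = 0
    for x, y in stream:                  # labels y in {-1, +1}
        if w is None:
            w = np.zeros_like(x, dtype=float)
        if y * (w @ x) <= 0:             # mistake (or tie): update
            w += y * x
            mistakes += 1
    return w, mistakes

# Linearly separable toy stream, repeated a few times
data = [(np.array([1.0, 0.0]), 1),
        (np.array([-1.0, 0.0]), -1),
        (np.array([0.9, 0.1]), 1)] * 3
w, m = perceptron(data)
assert np.sign(w @ np.array([1.0, 0.0])) == 1
```

The framework's contribution is that the same mistake-driven template, with a general loss, prediction representation, and margin, extends beyond this binary case to the combinatorial problems listed above.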
On the Consistency of Output Code Based Learning Algorithms for Multiclass Learning Problems
Abstract
A popular approach to solving multiclass learning problems is to reduce them to a set of binary classification problems through some output code matrix: the widely used one-vs-all and all-pairs methods, and the error-correcting output code methods of Dietterich and Bakiri (1995), can all be viewed as special cases of this approach. In this paper, we consider the question of statistical consistency of such methods. We focus on settings where the binary problems are solved by minimizing a binary surrogate loss, and derive general conditions on the binary surrogate loss under which the one-vs-all and all-pairs code matrices yield consistent algorithms with respect to the multiclass 0-1 loss. We then consider general multiclass learning problems defined by a general multiclass loss, and derive conditions on the output code matrix and binary surrogates under which the resulting algorithm is consistent with respect to the target multiclass loss. We also consider probabilistic code matrices, where one reduces a multiclass problem to a set of class-probability-labeled binary problems, and show that these can yield benefits in the sense of requiring a smaller number of binary problems to achieve overall consistency. Our analysis makes interesting connections with the theory of proper composite losses (Buja et al., 2005; Reid and Williamson, 2010); these play a role ...
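A minimal sketch of the two code matrices the abstract names, with naive nearest-codeword decoding. The decoding rule and example here are illustrative, not the paper's consistency construction:

```python
import numpy as np

def ova_code(n):
    # One-vs-all: row y is the codeword for class y, +1 on its own
    # binary problem and -1 on all others.
    return 2 * np.eye(n) - 1

def all_pairs_code(n):
    # All-pairs: one column per unordered pair (i, j); class i gets +1,
    # class j gets -1, every other class gets 0 ("don't care").
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    M = np.zeros((n, len(pairs)))
    for k, (i, j) in enumerate(pairs):
        M[i, k], M[j, k] = 1.0, -1.0
    return M

def decode(M, b):
    # Predict the class whose codeword agrees best with the vector of
    # binary predictions b (entries in {-1, +1}).
    return int(np.argmax(M @ b))

M = ova_code(4)
b = np.array([-1.0, 1.0, -1.0, -1.0])   # only binary classifier 1 fired
assert decode(M, b) == 1
```

The paper's question is when such reductions, with the binary problems solved by surrogate minimization, are statistically consistent for the target multiclass loss; the decoding step above is just the simplest concrete instance of the pipeline.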
Team SequeL
Abstract
A commonly used approach to multiclass classification is to replace the 0–1 loss with a convex surrogate so as to make empirical risk minimization computationally tractable. Previous work has uncovered sufficient and necessary conditions for the consistency of the resulting procedures. In this paper, we strengthen these results by showing how the 0–1 excess loss of a predictor can be upper bounded as a function of the excess loss of the predictor measured using the convex surrogate. The bound is developed for the case of cost-sensitive multiclass classification and a convex surrogate loss that goes back to the work of Lee, Lin and Wahba. The bounds are as easy to calculate as in binary classification. Furthermore, we also show that our analysis extends to the analysis of the recently introduced “Simplex Coding” scheme.
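The “Simplex Coding” scheme mentioned above embeds K classes as the vertices of a regular simplex: unit-norm vectors with pairwise inner product −1/(K−1). A sketch of one standard construction (assumed here, not taken from the paper):

```python
import numpy as np

# One standard regular-simplex construction: center the K standard basis
# vectors and rescale to unit norm. The rows sum to zero, so the K
# codewords span only a (K-1)-dimensional subspace.
def simplex_code(K):
    E = np.eye(K)
    return np.sqrt(K / (K - 1)) * (E - np.ones((K, K)) / K)

C = simplex_code(4)
G = C @ C.T
assert np.allclose(np.diag(G), 1.0)            # unit-norm codewords
off_diag = G[~np.eye(4, dtype=bool)]
assert np.allclose(off_diag, -1.0 / 3.0)       # pairwise -1/(K-1)
```

The equal pairwise angles are what make the coding symmetric across classes, which is the geometric property the excess-loss analysis in the abstract leans on.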
Convex Calibration Dimension for Multiclass Loss Matrices
2014
Abstract
We study consistency properties of surrogate loss functions for general multiclass learning problems, defined by a general multiclass loss matrix. We extend the notion of classification calibration, which has been studied for binary and multiclass 0-1 classification problems (and for certain other specific learning problems), to the general multiclass setting, and derive necessary and sufficient conditions for a surrogate loss to be calibrated with respect to a loss matrix in this setting. We then introduce the notion of the convex calibration dimension of a multiclass loss matrix, which measures the smallest ‘size’ of a prediction space in which it is possible to design a convex surrogate that is calibrated with respect to the loss matrix. We derive both upper and lower bounds on this quantity, and use these results to analyze various loss matrices. In particular, we apply our framework to study various subset ranking losses, and use the convex calibration dimension as a tool to show both the existence and non-existence of various types of convex calibrated surrogates for these losses. Our results strengthen recent results of Duchi et al. (2010) and Calauzènes et al. (2012) on the non-existence of certain types of convex calibrated surrogates in subset ranking. We anticipate the convex calibration dimension may prove to be a useful tool in the study and design of surrogate losses for general multiclass learning problems.
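To make the notion of a general multiclass loss matrix concrete, the sketch below builds the pairwise-disagreement (Kendall tau) loss matrix over the six rankings of three items and checks that it is low-rank. This is purely an illustration, not the paper's analysis, and the matrix rank is only a crude proxy for (not equal to) the convex calibration dimension:

```python
import numpy as np
from itertools import permutations, combinations

# Pairwise-disagreement loss over rankings of 3 items, indexed by
# (true ranking, predicted ranking).
perms = list(permutations(range(3)))

def kendall_tau(p, q):
    # number of item pairs that the two rankings order differently
    pos_p = {v: i for i, v in enumerate(p)}
    pos_q = {v: i for i, v in enumerate(q)}
    return sum((pos_p[a] < pos_p[b]) != (pos_q[a] < pos_q[b])
               for a, b in combinations(range(3), 2))

L = np.array([[kendall_tau(p, q) for q in perms] for p in perms], float)
r = np.linalg.matrix_rank(L)    # 4: strictly smaller than the 6 rankings
```

Low-rank structure like this is what makes small prediction spaces plausible for ranking losses, which is the theme the convex calibration dimension formalizes.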
On the Consistency of Ordinal Regression Methods
Abstract
Ordinal regression is a common supervised learning problem sharing properties with both regression and classification. Many of the ordinal regression algorithms that have been proposed can be viewed as methods that minimize a convex surrogate of the zero-one, absolute, or squared errors. We extend the notion of consistency, which has been studied for classification, ranking, and some ordinal regression models, to the general setting of ordinal regression. We study a rich family of these surrogate loss functions and assess their consistency with both positive and negative results. For arbitrary loss functions that are admissible in the context of ordinal regression, we develop an approach that yields consistent surrogate loss functions. Finally, we illustrate our findings on real-world datasets.
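A representative member of the surrogate family the abstract alludes to is a threshold-based ordinal surrogate: a single real-valued score is cut by ordered thresholds into k labels. The sketch below is a generic “all-thresholds” logistic variant with made-up thresholds, not a specific method from the paper:

```python
import numpy as np

# All-thresholds logistic surrogate: threshold j should satisfy
# f > theta_j exactly when the true label y exceeds j, and every
# threshold on the wrong side of y contributes a logistic penalty.
def all_threshold_loss(f, y, thetas):
    signs = np.where(np.arange(1, len(thetas) + 1) < y, 1.0, -1.0)
    return float(np.sum(np.log1p(np.exp(-signs * (f - thetas)))))

def predict(f, thetas):
    # predicted label = 1 + number of thresholds the score exceeds
    return 1 + int(np.sum(f > thetas))

thetas = np.array([-1.0, 0.0, 1.0])   # 4 ordered labels, toy thresholds
assert predict(-2.0, thetas) == 1
assert predict(0.5, thetas) == 3
```

Whether surrogates of this form are consistent for a given target error (zero-one, absolute, squared) is precisely the kind of question the abstract's positive and negative results address.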