Results 1 -
4 of
4
Label Embedding Trees for Large Multi-Class Tasks
"... Multi-class classification becomes challenging at test time when the number of classes is very large and testing against every possible class can become computationally infeasible. This problem can be alleviated by imposing (or learning) a structure over the set of classes. We propose an algorithm f ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
Multi-class classification becomes challenging at test time when the number of classes is very large and testing against every possible class can become computationally infeasible. This problem can be alleviated by imposing (or learning) a structure over the set of classes. We propose an algorithm for learning a treestructure of classifiers which, by optimizing the overall tree loss, provides superior accuracy to existing tree labeling methods. We also propose a method that learns to embed labels in a low dimensional space that is faster than non-embedding approaches and has superior accuracy to existing embedding approaches. Finally we combine the two ideas resulting in the label embedding tree that outperforms alternative methods including One-vs-Rest while being orders of magnitude faster. 1
Fast and balanced: Efficient label tree learning for large scale object recognition
- In NIPS
, 2011
"... We present a novel approach to efficiently learn a label tree for large scale classification with many classes. The key contribution of the approach is a technique to simultaneously determine the structure of the tree and learn the classifiers for each node in the tree. This approach also allows fin ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
We present a novel approach to efficiently learn a label tree for large scale classification with many classes. The key contribution of the approach is a technique to simultaneously determine the structure of the tree and learn the classifiers for each node in the tree. This approach also allows fine grained control over the efficiency vs accuracy trade-off in designing a label tree, leading to more balanced trees. Experiments are performed on large scale image classification with 10184 classes and 9 million images. We demonstrate significant improvements in test accuracy and efficiency with less training time and more balanced trees compared to the previous state of the art by Bengio et al. 1
Efficient Discriminative Learning of Class Hierarchy for Many Class Prediction
"... Abstract. Recently the maximum margin criterion has been employed to learn a discriminative class hierarchical model, which shows promising performance for rapid multi-class prediction. Specifically, at each node of this hierarchy, a separating hyperplane is learned to split its associated classes f ..."
Abstract
- Add to MetaCart
Abstract. Recently the maximum margin criterion has been employed to learn a discriminative class hierarchical model, which shows promising performance for rapid multi-class prediction. Specifically, at each node of this hierarchy, a separating hyperplane is learned to split its associated classes from all of the corresponding training data, leading to a time-consuming training process in computer vision applications with many classes such as large-scale object recognition and scene classification. To address this issue, in this paper we propose a new efficient discriminative class hierarchy learning approach for many class prediction. We first present a general objective function to unify the two stateof-the-art methods for multi-class tasks. When there are many classes, this objective function reveals that some classes are indeed redundant. Thus, omitting these redundant classes will not degrade the prediction performance of the learned class hierarchical model. Based on this observation, we decompose the original optimization problem into a sequence of much smaller sub-problems by developing an adaptive classifier updating method and an active class selection strategy. Specifically, we iteratively update the separating hyperplane by efficiently using the training samples only from a limited number of selected classes that are well separated by the current separating hyperplane. Comprehensive experiments on three large-scale datasets demonstrate that our approach can significantly accelerate the training process of the two state-of-the-art methods while achieving comparable prediction performance in terms of both classification accuracy and testing speed. 1

