Results 1  10
of
128
Pegasos: Primal Estimated subgradient solver for SVM
"... We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a singl ..."
Abstract

Cited by 307 (15 self)
 Add to MetaCart
We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ɛ2) iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total runtime of our method is Õ(d/(λɛ)), where d is a bound on the number of nonzero features in each example. Since the runtime does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach also extends to nonlinear kernels while working solely on the primal objective function, though in this case the runtime does depend linearly on the training set size. Our algorithm is particularly well suited for large text classification problems, where we demonstrate an orderofmagnitude speedup over previous SVM learning methods.
Training a support vector machine in the primal
 Neural Computation
, 2007
"... Most literature on Support Vector Machines (SVMs) concentrate on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and nonlinear SVMs, and that there is no reason for ignoring this possibilty. On the cont ..."
Abstract

Cited by 101 (5 self)
 Add to MetaCart
(Show Context)
Most literature on Support Vector Machines (SVMs) concentrate on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and nonlinear SVMs, and that there is no reason for ignoring this possibilty. On the contrary, from the primal point of view new families of algorithms for large scale SVM training can be investigated.
Large scale transductive svms
 JMLR
"... We show how the ConcaveConvex Procedure can be applied to Transductive SVMs, which traditionally require solving a combinatorial search problem. This provides for the first time a highly scalable algorithm in the nonlinear case. Detailed experiments verify the utility of our approach. Software is a ..."
Abstract

Cited by 69 (5 self)
 Add to MetaCart
We show how the ConcaveConvex Procedure can be applied to Transductive SVMs, which traditionally require solving a combinatorial search problem. This provides for the first time a highly scalable algorithm in the nonlinear case. Detailed experiments verify the utility of our approach. Software is available at
Learning from imbalanced data
 IEEE Trans. on Knowledge and Data Engineering
, 2009
"... Abstract—With the continuous expansion of data availability in many largescale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decisionm ..."
Abstract

Cited by 67 (0 self)
 Add to MetaCart
(Show Context)
Abstract—With the continuous expansion of data availability in many largescale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decisionmaking processes. Although existing knowledge discovery and data engineering techniques have shown great success in many realworld applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the stateoftheart technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data. Index Terms—Imbalanced learning, classification, sampling methods, costsensitive learning, kernelbased learning, active learning, assessment metrics. Ç
WorstCase Analysis of Selective Sampling for Linear Classification
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... A selective sampling algorithm is a learning algorithm for classification that, based on the past observed data, decides whether to ask the label of each new instance to be classified. In this paper, we introduce a general technique for turning linearthreshold classification algorithms from the ..."
Abstract

Cited by 41 (6 self)
 Add to MetaCart
A selective sampling algorithm is a learning algorithm for classification that, based on the past observed data, decides whether to ask the label of each new instance to be classified. In this paper, we introduce a general technique for turning linearthreshold classification algorithms from the general additive family into randomized selective sampling algorithms. For the most popular algorithms in this family we derive mistake bounds that hold for individual sequences of examples. These bounds
Learning on the border: Active learning in imbalanced data classification
 In Proc. ACM Conf. on Information and Knowledge Management (CIKM ’07
, 2007
"... This paper is concerned with the class imbalance problem which has been known to hinder the learning performance of classification algorithms. The problem occurs when there are significantly less number of observations of the target concept. Various realworld classification tasks, such as medical d ..."
Abstract

Cited by 33 (3 self)
 Add to MetaCart
(Show Context)
This paper is concerned with the class imbalance problem which has been known to hinder the learning performance of classification algorithms. The problem occurs when there are significantly less number of observations of the target concept. Various realworld classification tasks, such as medical diagnosis, text categorization and fraud detection suffer from this phenomenon. The standard machine learning algorithms yield better prediction performance with balanced datasets. In this paper, we demonstrate that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes. We also propose an efficient way of selecting informative instances from a smaller pool of samples for active learning which does not necessitate a search through the entire dataset. The proposed method yields an efficient querying system and allows active learning to be applied to very large datasets. Our experimental results show that with an early stopping criteria, active learning achieves a fast solution with competitive prediction performance in imbalanced data classification.
Solving Multiclass Support Vector Machines with LaRank
 In 24th International Conference on Machine Learning
, 2007
"... Optimization algorithms for large margin multiclass recognizers are often too costly to handle ambitious problems with structured outputs and exponential numbers of classes. Optimization algorithms that rely on the full gradient are not effective because, unlike the solution, the gradient is not spa ..."
Abstract

Cited by 27 (3 self)
 Add to MetaCart
(Show Context)
Optimization algorithms for large margin multiclass recognizers are often too costly to handle ambitious problems with structured outputs and exponential numbers of classes. Optimization algorithms that rely on the full gradient are not effective because, unlike the solution, the gradient is not sparse and is very large. The LaRank algorithm sidesteps this difficulty by relying on a randomized exploration inspired by the perceptron algorithm. We show that this approach is competitive with gradient based optimizers on simple multiclass problems. Furthermore, a single LaRank pass over the training examples delivers test error rates that are nearly as good as those of the final solution. 1.
An empirical evaluation of supervised learning in high dimensions
 In International Conference on Machine Learning (ICML
, 2008
"... In this paper we perform an empirical evaluation of supervised learning on highdimensional data. We evaluate performance on three metrics: accuracy, AUC, and squared loss and study the effect of increasing dimensionality on the performance of the learning algorithms. Our findings are consistent with ..."
Abstract

Cited by 24 (0 self)
 Add to MetaCart
(Show Context)
In this paper we perform an empirical evaluation of supervised learning on highdimensional data. We evaluate performance on three metrics: accuracy, AUC, and squared loss and study the effect of increasing dimensionality on the performance of the learning algorithms. Our findings are consistent with previous studies for problems of relatively low dimension, but suggest that as dimensionality increases the relative performance of the learning algorithms changes. To our surprise, the method that performs consistently well across all dimensions is random forests, followed by neural nets, boosted trees, and SVMs. 1.
Efficient name disambiguation for largescale databases
 PKDD
, 2006
"... Name disambiguation can occur when one is seeking a list of publications of an author who has used different name variations and when there are multiple other authors with the same name. We present an efficient integrative framework for solving the name disambiguation problem: a blocking method retr ..."
Abstract

Cited by 21 (2 self)
 Add to MetaCart
(Show Context)
Name disambiguation can occur when one is seeking a list of publications of an author who has used different name variations and when there are multiple other authors with the same name. We present an efficient integrative framework for solving the name disambiguation problem: a blocking method retrieves candidate classes of authors with similar names and a clustering method, DBSCAN, clusters papers by author. The distance metric between papers used in DBSCAN is calculated by an online active selection support vector machine algorithm (LASVM), yielding a simpler model, lower test errors and faster prediction time than a standard SVM. We prove that by recasting transitivity as density reachability in DBSCAN, transitivity is guaranteed for core points. For evaluation, we manually annotated 3,355 papers yielding 490 authors and achieved 90.6 % pairwiseF1. For scalability, authors in the entire CiteSeer dataset, over 700,000 papers, were readily disambiguated. 1
Online tracking and reacquisition using cotrained generative and discriminative trackers
 in Proc. Eur. Conf. Comput. Vision
"... Abstract. Visual tracking is a challenging problem, as an object may change its appearance due to viewpoint variations, illumination changes, and occlusion. Also, an object may leave the field of view and then reappear. In order to track and reacquire an unknown object with limited labeling data, we ..."
Abstract

Cited by 17 (1 self)
 Add to MetaCart
(Show Context)
Abstract. Visual tracking is a challenging problem, as an object may change its appearance due to viewpoint variations, illumination changes, and occlusion. Also, an object may leave the field of view and then reappear. In order to track and reacquire an unknown object with limited labeling data, we propose to learn these changes online and build a model that describes all seen appearance while tracking. To address this semisupervised learning problem, we propose a cotraining based approach to continuously label incoming data and online update a hybrid discriminative generative model. The generative model uses a number of low dimension linear subspaces to describe the appearance of the object. In order to reacquire an object, the generative model encodes all the appearance variations that have been seen. A discriminative classifier is implemented as an online support vector machine, which is trained to focus on recent appearance variations. The online cotraining of this hybrid approach accounts for appearance changes and allows reacquisition of an object after total occlusion. We demonstrate that under challenging situations, this method has strong reacquisition ability and robustness to distracters in background. 1