Results 1 - 4 of 4
Adaptive Kernel Approximation for Large-Scale Non-Linear SVM Prediction
Abstract
The applicability of non-linear support vector machines (SVMs) to large-scale data collections has been limited because their prediction complexity is linear in the number of support vectors. We propose an efficient prediction algorithm with a performance guarantee for non-linear SVMs, termed AdaptSVM. It can selectively collapse the kernel function computation to a reduced set of support vectors, compensated by an additional correction term that can be easily computed online. It also allows adaptive fall-back to the original kernel computation based on its estimated variance and maximum error tolerance. In addition to theoretical analysis, we empirically evaluate on multiple large-scale datasets to show that the proposed algorithm can speed up the prediction process by up to 10^4 times with less than 0.5% accuracy loss.
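To make the cost that the abstract refers to concrete, the following is a minimal sketch of the standard (unapproximated) kernel SVM decision function, not of AdaptSVM itself. The RBF kernel, the `gamma` value, and the toy support vectors and coefficients are assumptions chosen for illustration only.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, coeffs, bias, gamma=0.5):
    """Standard kernel SVM decision value: sum_i coeff_i * K(sv_i, x) + bias.

    One kernel evaluation is needed per support vector, so the cost of a
    single prediction grows linearly with len(support_vectors).
    """
    total = bias
    for sv, c in zip(support_vectors, coeffs):
        total += c * rbf_kernel(sv, x, gamma)
    return total

# Toy model: the values are illustrative and not taken from the paper.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [1.0, -1.0]  # each entry plays the role of alpha_i * y_i
bias = 0.0

value = svm_decision([0.0, 0.0], support_vectors, coeffs, bias)
label = 1 if value >= 0 else -1
```

The loop over `support_vectors` is the part that a reduced-set approximation such as the one described above would aim to shorten.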
A Dependency-based Analysis of Treebank Annotation Errors
Abstract
In this paper, we investigate errors in syntax annotation with the Turku Dependency Treebank, a recently published treebank of Finnish, as study material. This treebank uses the Stanford Dependency scheme as its syntax representation, and its published data contains all data created in the full double annotation as well as timing information, both of which are necessary for this study. First, we examine which syntactic structures are the most error-prone for human annotators, and compare these results to those of a baseline automatic parser. We find that annotation decisions involving highly semantic distinctions, as well as certain morphological ambiguities, are especially difficult for both human annotators and the parser. Second, we train an automatic system that offers for inspection sentences ordered by their likelihood of containing errors. We find that the system achieves a performance that is clearly superior to the random baseline: for instance, by inspecting 10% of all sentences ordered by our system, it is possible to weed out 25% of errors.
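The evaluation quoted at the end of this abstract (share of errors caught when inspecting a fixed fraction of ranked sentences) can be sketched as follows. The function name `errors_found_at`, the scores, and the error flags are hypothetical and made up for illustration; they are not the paper's system or data.

```python
def errors_found_at(scores, has_error, fraction):
    """Share of all erroneous sentences that fall within the top `fraction`
    of sentences when sorted by descending error score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_inspect = int(len(scores) * fraction)
    found = sum(has_error[i] for i in order[:n_inspect])
    total = sum(has_error)
    return found / total if total else 0.0

# Hypothetical per-sentence error scores and gold error flags (1 = has error).
scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.4, 0.6, 0.5, 0.05]
has_error = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

recall_top_20 = errors_found_at(scores, has_error, 0.2)
recall_top_50 = errors_found_at(scores, has_error, 0.5)
```

A random ordering would, in expectation, find a share of errors equal to the inspected fraction, which is the baseline the abstract compares against.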
Improved Learning of . . . : TRAINING WITH LATENT VARIABLES AND NONLINEAR KERNELS
, 2011
Abstract
Structured output prediction in machine learning is the study of learning to predict complex objects consisting of many correlated parts, such as sequences, trees, or matchings. The Structural Support Vector Machine (Structural SVM) algorithm is a discriminative method for structured output learning that allows flexible feature construction with robust control for overfitting. It provides state-of-the-art prediction accuracies for many structured output prediction tasks in natural language processing, computational biology, and information retrieval. This thesis explores improving the learning of structured prediction rules with structural SVMs in two main areas: incorporating latent variables to extend their scope of application and speeding up the training of structural SVMs with nonlinear kernels. In particular, we propose a new formulation of structural SVM, called Latent Structural SVM, that allows the use of latent variables, and an algorithm to solve the associated non-convex optimization problem. We demonstrate the generality of our new algorithm through several structured output prediction problems, showing improved prediction accuracies with new
Efficient Optimization of Performance Measures by Classifier Adaptation
Abstract
In practical applications, machine learning algorithms are often needed to learn classifiers that optimize domain-specific performance measures. Previous research has focused on learning the needed classifier in isolation, yet learning a nonlinear classifier for nonlinear and nonsmooth performance measures is still hard. In this paper, rather than learning the needed classifier by optimizing a specific performance measure directly, we circumvent this problem by proposing a novel two-step approach called CAPO: first train nonlinear auxiliary classifiers with existing learning methods, and then adapt the auxiliary classifiers for specific performance measures. In the first step, auxiliary classifiers can be obtained efficiently by using off-the-shelf learning algorithms. For the second step, we show that the classifier adaptation problem can be reduced to a quadratic programming problem, which is similar to linear SVM^perf and can be solved efficiently. By exploiting nonlinear auxiliary classifiers, CAPO can generate a nonlinear classifier that optimizes a large variety of performance measures, including all performance measures based on the contingency table and AUC, while keeping high computational efficiency. Empirical studies show that CAPO is effective and computationally efficient, and that it is even more efficient than linear SVM^perf. Index Terms: Optimize performance measures, classifier adaptation, ensemble learning, curriculum learning
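As a reminder of what the two families of performance measures named above look like, here is a minimal sketch that computes F1 (a contingency-table measure) and AUC from labels and scores. It illustrates only the measures, not CAPO's adaptation step; the helper names and the toy labels and scores are assumptions made for this example.

```python
def contingency(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1_score(y_true, y_pred):
    """F1 expressed through the contingency table: 2*tp / (2*tp + fp + fn)."""
    tp, fp, fn, _ = contingency(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]

f1 = f1_score(y_true, y_pred)
area = auc(y_true, scores)
```

Both measures depend on the whole set of predictions rather than decomposing into a sum of per-example losses, which is one reason direct optimization is harder than for plain classification error.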

