Results 1-10 of 12
Transductive Inference for Text Classification using Support Vector Machines
, 1999
Cited by 682 (4 self)
This paper introduces Transductive Support Vector Machines (TSVMs) for text classification. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, Transductive Support Vector Machines take into account a particular test set and try to minimize misclassifications of just those particular examples. The paper presents an analysis of why TSVMs are well suited for text classification. These theoretical findings are supported by experiments on three test collections. The experiments show substantial improvements over inductive methods, especially for small training sets, cutting the number of labeled training examples down to a twentieth on some tasks. This work also proposes an algorithm for training TSVMs efficiently, handling 10,000 examples and more.
Optimization Techniques for Semi-Supervised Support Vector Machines
Cited by 36 (5 self)
Due to its wide applicability, the problem of semi-supervised classification is attracting increasing attention in machine learning. Semi-Supervised Support Vector Machines (S3VMs) are based on applying the margin maximization principle to both labeled and unlabeled examples. Unlike SVMs, their formulation leads to a non-convex optimization problem. A suite of algorithms has recently been proposed for solving S3VMs. This paper reviews key ideas in this literature. The performance and behavior of various S3VM algorithms are studied together, under a common experimental setting.
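To make the non-convexity concrete, here is a minimal Python sketch of one common linear form of the S3VM objective: a hinge loss on labeled points plus a symmetric hinge loss on unlabeled points. The function name, the weights `C` and `C_star`, and the use of NumPy are assumptions for illustration; they are not taken from the paper listed above.

```python
import numpy as np

def s3vm_objective(w, b, X_lab, y_lab, X_unl, C=1.0, C_star=1.0):
    """Illustrative linear S3VM objective (assumed form, not a specific paper's)."""
    f_lab = X_lab @ w + b          # decision values on labeled points
    f_unl = X_unl @ w + b          # decision values on unlabeled points
    margin_term = 0.5 * np.dot(w, w)
    # Standard hinge loss: penalizes labeled points inside the margin.
    labeled_loss = np.maximum(0.0, 1.0 - y_lab * f_lab).sum()
    # Symmetric hinge loss: penalizes unlabeled points near the boundary,
    # whichever side they fall on. This term is what breaks convexity.
    unlabeled_loss = np.maximum(0.0, 1.0 - np.abs(f_unl)).sum()
    return margin_term + C * labeled_loss + C_star * unlabeled_loss
```

With a single unlabeled point at x = 1 and no labeled data, the objective is lower at w = 1 and w = -1 than at their midpoint w = 0, which a convex function could not be.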
A Continuation Method for Semi-Supervised SVMs
 In International Conference on Machine Learning
, 2006
Cited by 31 (3 self)
Semi-Supervised Support Vector Machines (S3VMs) are an appealing method for using unlabeled data in classification: their objective function favors decision boundaries which do not cut clusters. However, their main problem is that the optimization problem is non-convex and has many local minima, which often results in suboptimal performance. In this paper we propose to use a global optimization technique known as continuation to alleviate this problem. Compared to other algorithms minimizing the same objective function, our continuation method often leads to lower test errors.
Branch and Bound for Semi-Supervised Support Vector Machines
Cited by 18 (4 self)
Semi-supervised SVMs (S3VMs) attempt to learn low-density separators by maximizing the margin over labeled and unlabeled examples. The associated optimization problem is non-convex. To examine the full potential of S3VMs modulo local minima problems in current implementations, we apply branch and bound techniques for obtaining exact, globally optimal solutions. Empirical evidence suggests that the globally optimal solution can return excellent generalization performance in situations where other implementations fail completely. While our current implementation is only applicable to small datasets, we discuss variants that can potentially lead to practically useful algorithms.
Using Labeled and Unlabeled Data to Learn Drifting Concepts
 In Workshop notes of IJCAI-01 Workshop on Learning from Temporal and Spatial Data
, 2001
Cited by 11 (3 self)
For many learning tasks where data is collected over an extended period of time, one has to cope with two problems: the distribution underlying the data is likely to change, and only little labeled training data is available at each point in time. A typical example is information filtering, i.e. the adaptive classification of documents with respect to a particular user interest. Both the interest of the user and the document content change over time. A filtering system should be able to adapt to such concept changes. Since users often give little feedback, a filtering system should also be able to achieve good performance even if only few labeled training examples are provided. This paper proposes a method to recognize and handle concept changes with support vector machines and to use unlabeled data to reduce the need for labeled data. The method maintains windows on the training data, whose size is automatically adjusted so that the estimated generalization error is minimized. The approach is theoretically well-founded as well as effective and efficient in practice. Since it does not require complicated parameterization, it is simpler to use and more robust than comparable heuristics. Experiments with simulated concept drift scenarios based on real-world text data compare the new method with other window management approaches and show that it can effectively select an appropriate window size in a robust way. In order to achieve acceptable performance with fewer labeled training examples, the proposed method exploits unlabeled examples in a transductive way.
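The window adjustment described in this abstract can be sketched as a search over candidate window sizes. The sketch below is only an illustration under stated assumptions: `select_window_size` and the `estimate_error` callback are hypothetical names, and the abstract does not say how the generalization error is estimated, so the estimator is left to the caller.

```python
from typing import Callable, Sequence, TypeVar

Batch = TypeVar("Batch")

def select_window_size(
    batches: Sequence[Batch],
    estimate_error: Callable[[Sequence[Batch]], float],
) -> int:
    """Return how many of the most recent batches to keep for training.

    `estimate_error` is a hypothetical callback: it trains on the given
    window and returns an estimate of the generalization error.
    """
    if not batches:
        raise ValueError("need at least one batch")
    best_size, best_err = 1, float("inf")
    # Try every window that ends at the most recent batch.
    for size in range(1, len(batches) + 1):
        err = estimate_error(batches[-size:])
        # Strict comparison keeps the smallest window on ties.
        if err < best_err:
            best_size, best_err = size, err
    return best_size
```

After a concept change, older batches should raise the estimated error, so a search of this kind would tend to pick a shorter window.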
On Semi-Supervised Kernel Methods
Cited by 3 (0 self)
Semi-supervised learning is an emerging computational paradigm for learning from limited supervision by utilizing large amounts of inexpensive, unsupervised observations. Not only does this paradigm carry appeal as a model for natural learning, but it also has an increasing practical need in most if not all applications of machine learning – those where abundant amounts of data can be cheaply and automatically collected but manual labeling for the purposes of training learning algorithms is often slow, expensive, and error-prone. In this thesis, we develop families of algorithms for semi-supervised inference. These algorithms are based on intuitions about the natural structure and geometry of probability distributions that underlie typical datasets for learning. The classical framework of Regularization in Reproducing Kernel Hilbert Spaces (which is the basis of state-of-the-art supervised algorithms such as SVMs) is extended in several ways to utilize unlabeled data. These extensions are embodied in the following contributions: (1) Manifold Regularization is based on the assumption that high-dimensional
Semi-Supervised Learning in Initially Labeled Non-Stationary Environments with Gradual Drift
 International Joint Conference on Neural Networks (IJCNN 2012)
, 2012
Cited by 1 (1 self)
Semi-supervised learning (SSL) in non-stationary environments has received relatively little attention in machine learning, despite a growing number of applications that can benefit from a properly configured SSL algorithm. Previous works in learning non-stationary data have analyzed such cases where both labeled and unlabeled instances are received at every time step and/or in regular intervals; however, to the best of our knowledge, no work has investigated the case where labeled instances are received only at the initial time step, followed by unlabeled instances provided in subsequent time steps. In this proof-of-concept work, we propose a new framework for learning in a non-stationary environment that provides only unlabeled data after the initial time step, which we refer to as an initially labeled environment. The proposed framework generates labels for previously unlabeled data at each time step to be combined with incoming unlabeled data (possibly from a drifting distribution) using a compacted polytope sample extraction algorithm. We have conducted two experiments to demonstrate the feasibility and reliability of the approach. This proof-of-concept is presented in two dimensions; however, the algorithm can be extended to higher dimensions with appropriate modifications. Keywords: alpha shape; concept drift; non-stationary environment; shape offsets; semi-supervised learning.
Dipl.Math. Dipl.Inform.
A common task in the field of machine learning is the classification of objects. The basis for such a task is usually a training set consisting of patterns and associated class labels. A typical example is the automatic classification of stars and galaxies in the field of astronomy. Here, the training set could consist of images and associated labels, which indicate whether a particular image shows a star or a galaxy. For such a learning scenario, one aims at generating models that can automatically classify new, unseen images. In the field of machine learning, various classification schemes have been proposed. One of the most popular is the concept of support vector machines, which often yields excellent classification results given sufficient labeled data. However, for a variety of real-world tasks, the acquisition of sufficient labeled data can be quite time-consuming. In contrast to labeled training data, unlabeled data can often be obtained easily in huge quantities. Semi- and unsupervised techniques aim at taking these unlabeled patterns into account to generate appropriate models. In the literature, various ways of extending support vector machines to these scenarios have been proposed. One of
Transductive Kernel Map Learning and its Application to Image Annotation
 Author manuscript, published in "BMVC, United Kingdom (2012)". DOI: 10.5244/C.26.68
, 2013
We introduce in this paper a novel image annotation approach based on maximum margin classification and a new class of kernels. The method goes beyond the naive use of existing kernels and their restricted combinations in order to design "model-free" transductive kernels applicable to interconnected image databases. The main contribution of our method is the minimization of an energy function mixing i) a reconstruction term that factorizes a matrix of interconnected image data as a product of a learned dictionary and a learned kernel map, ii) a fidelity term that ensures consistent label predictions with those provided in a training set, and iii) a smoothness term which guarantees similar labels for neighboring data and allows us to iteratively diffuse kernel maps and labels from labeled to unlabeled images. Solving this minimization problem makes it possible to learn both a decision criterion and a kernel map that guarantee linear separability in a high-dimensional space and good generalization performance. Experiments conducted on image annotation show that our obtained kernel achieves results at least comparable to related state-of-the-art methods on the MSRC and Corel5k databases.
Transductive Inference & Kernel Design for Object Class Segmentation
 Author manuscript, published in "ICIP 2012, United States (2012)"
, 2013
Transductive inference techniques are becoming standard in machine learning due to their relative success in solving many real-world applications. Among them, kernel-based methods are particularly interesting, but their success remains highly dependent on the choice of kernels. The latter are usually handcrafted or designed in order to better capture similarity in training data. In this paper, we introduce a novel transductive learning algorithm for kernel design and classification. Our approach is based on the minimization of an energy function mixing i) a reconstruction term that factorizes a matrix of input data as a product of a learned dictionary and a learned kernel map, ii) a fidelity term that ensures consistent label predictions with those provided in a ground truth, and iii) a smoothness term which guarantees similar labels for neighboring data and allows us to iteratively diffuse kernel maps and labels from labeled to unlabeled data. Solving this minimization problem makes it possible to learn both a decision criterion and a kernel map that guarantee linear separability in a high-dimensional space and good generalization performance. Experiments conducted on object class segmentation show improvements with respect to the baseline as well as related work on the challenging VOC database.