Results 1  10
of
173
Learning with local and global consistency
 Advances in Neural Information Processing Systems 16
, 2004
"... We consider the general problem of learning from labeled and unlabeled data, which is often called semisupervised learning or transductive inference. A principled approach to semisupervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic stru ..."
Abstract

Cited by 457 (21 self)
 Add to MetaCart
We consider the general problem of learning from labeled and unlabeled data, which is often called semisupervised learning or transductive inference. A principled approach to semisupervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data. 1
Manifold regularization: A geometric framework for learning from labeled and unlabeled examples
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning al ..."
Abstract

Cited by 343 (13 self)
 Add to MetaCart
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning algorithms and standard methods including Support Vector Machines and Regularized Least Squares can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graphbased approaches) we obtain a natural outofsample extension to novel examples and so are able to handle both transductive and truly semisupervised settings. We present experimental evidence suggesting that our semisupervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework.
Cluster kernels for semisupervised learning
 Advances in Neural Information Processing Systems
, 2002
"... We propose a framework to incorporate unlabeled data in kernel classifier, based on the idea that two points in the same cluster are more likely to have the same label. This is achieved by modifying the eigenspectrum of the kernel matrix. Experimental results assess the validity of this approach. 1 ..."
Abstract

Cited by 154 (10 self)
 Add to MetaCart
We propose a framework to incorporate unlabeled data in kernel classifier, based on the idea that two points in the same cluster are more likely to have the same label. This is achieved by modifying the eigenspectrum of the kernel matrix. Experimental results assess the validity of this approach. 1
Learning from Labeled and Unlabeled Data with Label Propagation
, 2002
"... We investigate the use of unlabeled data to help labeled data in classification. We propose a simple iterative algorithm, label propagation, to propagate labels through the dataset along high density areas defined by unlabeled data. We give the analysis of the algorithm, show its solution, and its c ..."
Abstract

Cited by 111 (0 self)
 Add to MetaCart
We investigate the use of unlabeled data to help labeled data in classification. We propose a simple iterative algorithm, label propagation, to propagate labels through the dataset along high density areas defined by unlabeled data. We give the analysis of the algorithm, show its solution, and its connection to several other algorithms. We also show how to learn parameters by minimum spanning tree heuristic and entropy minimization, and the algorithm's ability to do feature selection. Experiment results are promising.
On Manifold Regularization
, 2005
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learni ..."
Abstract

Cited by 73 (0 self)
 Add to MetaCart
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning algorithms and standard methods including Support Vector Machines and Regularized Least Squares can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graph based approaches) we obtain a natural outofsample extension to novel examples and are thus able to handle both transductive and truly semisupervised settings. We present experimental evidence suggesting that our semisupervised algorithms are able to use unlabeled data effectively. In the absence of labeled examples, our framework gives rise to a regularized form of spectral clustering with an outofsample extension.
Semisupervised protein classification using cluster kernels
 Advances in Neural Information Processing Systems 16
, 2004
"... kernels ..."
Transfer Learning for Image Classification with Sparse Prototype Representations
"... To learn a new visual category from few examples, prior knowledge from unlabeled data as well as previous related categories may be useful. We develop a new method for transfer learning which exploits available unlabeled data and an arbitrary kernel function; we form a representation based on kernel ..."
Abstract

Cited by 48 (8 self)
 Add to MetaCart
To learn a new visual category from few examples, prior knowledge from unlabeled data as well as previous related categories may be useful. We develop a new method for transfer learning which exploits available unlabeled data and an arbitrary kernel function; we form a representation based on kernel distances to a large set of unlabeled data points. To transfer knowledge from previous related problems we observe that a category might be learnable using only a small subset of reference prototypes. Related problems may share a significant number of relevant prototypes; we find such a concise representation by performing a joint loss minimization over the training sets of related problems with a shared regularization penalty that minimizes the total number of prototypes involved in the approximation. This optimization problem can be formulated as a linear program that can be solved efficiently. We conduct experiments on a newstopic prediction task where the goal is to predict whether an image belongs to a particular news topic. Our results show that when only few examples are available for training a target topic, leveraging knowledge learnt from other topics can significantly improve performance.
On semisupervised classification
 In
, 2005
"... A graphbased prior is proposed for parametric semisupervised classification. The prior utilizes both labelled and unlabelled data; it also integrates features from multiple views of a given sample (e.g., multiple sensors), thus implementing a Bayesian form of cotraining. An EM algorithm for train ..."
Abstract

Cited by 39 (8 self)
 Add to MetaCart
A graphbased prior is proposed for parametric semisupervised classification. The prior utilizes both labelled and unlabelled data; it also integrates features from multiple views of a given sample (e.g., multiple sensors), thus implementing a Bayesian form of cotraining. An EM algorithm for training the classifier automatically adjusts the tradeoff between the contributions of: (a) the labelled data; (b) the unlabelled data; and (c) the cotraining information. Active label query selection is performed using a mutual information based criterion that explicitly uses the unlabelled data and the cotraining information. Encouraging results are presented on public benchmarks and on measured data from single and multiple sensors. 1
Unlabeled Data Can Degrade Classification Performance of Generative Classifiers
 in Fifteenth International Florida Artificial Intelligence Society Conference
, 2002
"... This paper analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that unlabeled data can degrade the performance of a classifier when there are discrepancies ..."
Abstract

Cited by 37 (7 self)
 Add to MetaCart
This paper analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that unlabeled data can degrade the performance of a classifier when there are discrepancies between modeling assumptions used to build the classifier and the actual model that generates the data
Active learning for anomaly and rarecategory detection
 In Advances in Neural Information Processing Systems 18
, 2004
"... We introduce a novel activelearning scenario in which a user wants to work with a learning algorithm to identify useful anomalies. These are distinguished from the traditional statistical definition of anomalies as outliers or merely illmodeled points. Our distinction is that the usefulness of ano ..."
Abstract

Cited by 34 (0 self)
 Add to MetaCart
We introduce a novel activelearning scenario in which a user wants to work with a learning algorithm to identify useful anomalies. These are distinguished from the traditional statistical definition of anomalies as outliers or merely illmodeled points. Our distinction is that the usefulness of anomalies is categorized subjectively by the user. We make two additional assumptions. First, there exist extremely few useful anomalies to be hunted down within a massive dataset. Second, both useful and useless anomalies may sometimes exist within tiny classes of similar anomalies. The challenge is thus to identify “rare category ” records in an unlabeled noisy set with help (in the form of class labels) from a human expert who has a small budget of datapoints that they are prepared to categorize. We propose a technique to meet this challenge, which assumes a mixture model fit to the data, but otherwise makes no assumptions on the particular form of the mixture components. This property promises wide applicability in reallife scenarios and for various statistical models. We give an overview of several alternative methods, highlighting their strengths and weaknesses, and conclude with a detailed empirical analysis. We show that our method can quickly zoom in on an anomaly set containing a few tens of points in a dataset of hundreds of thousands. 1