Results 11 – 20 of 202
Semi-supervised graph-based hyperspectral image classification
IEEE Transactions on Geoscience and Remote Sensing, 2007
Cited by 50 (6 self)
This paper presents a semi-supervised graph-based method for the classification of hyperspectral images. The method is designed to handle the special characteristics of hyperspectral images, namely high input dimension of pixels, low number of labeled samples, and spatial variability of the spectral signature. To alleviate these problems, the method incorporates three ingredients. First, being a kernel-based method, it combats the curse of dimensionality efficiently. Second, following a semi-supervised approach, it exploits the wealth of unlabeled samples in the image and naturally gives relative importance to the labeled ones through a graph-based methodology. Finally, it incorporates contextual information through a full family of composite kernels. Noting that the graph method relies on inverting a huge kernel matrix formed by both labeled and unlabeled samples, we introduce the Nyström method in the formulation to speed up the classification process. The presented semi-supervised graph-based method is compared to state-of-the-art support vector machines (SVMs) in the classification of hyperspectral data. The proposed method produces better classification maps, which capture the intrinsic structure collectively revealed by labeled and unlabeled points. Good and stable accuracy is produced in ill-posed classification problems (high-dimensional spaces and low number of labeled samples). Also, the introduction of the composite kernels framework drastically improves results, and the new fast formulation scales almost linearly in the computational cost.
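The Nyström speed-up the abstract mentions replaces the full n × n kernel matrix with a low-rank approximation built from a small set of landmark samples, avoiding the cost of inverting the whole matrix. A minimal NumPy sketch of the idea (RBF kernel, randomly sampled landmarks; function names and parameters are illustrative, not from the paper):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Squared Euclidean distances between all rows of X and Y
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_approx(X, m, gamma=0.5, seed=0):
    """Approximate the full n x n kernel matrix with a rank <= m
    matrix built from m randomly sampled landmark points."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)        # n x m: all points vs landmarks
    W = C[idx, :]                           # m x m: landmarks vs landmarks
    return C @ np.linalg.pinv(W) @ C.T      # n x n approximation, rank <= m

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
K = rbf_kernel(X, X)                        # exact kernel matrix
K_hat = nystrom_approx(X, m=80)             # Nystrom approximation
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

Only the m × m block W is (pseudo-)inverted, so downstream computations can work with the low-rank factors instead of the full matrix.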
Co-EM Support Vector Learning
In Proceedings of the International Conference on Machine Learning, 2004
Cited by 50 (5 self)
Multi-view algorithms, such as co-training and co-EM, utilize unlabeled data when the available attributes can be split into independent and compatible subsets. Co-EM outperforms co-training for many problems, but it requires the underlying learner to estimate class probabilities and to learn from probabilistically labeled data. Therefore, co-EM has so far only been studied with naive Bayesian learners. We cast linear classifiers into a probabilistic framework and develop a co-EM version of the Support Vector Machine.
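A compact sketch of the co-EM loop itself, using plain logistic regression in NumPy as the probabilistic linear learner rather than the paper's SVM (all names and the toy two-view data are illustrative):

```python
import numpy as np

def fit_logreg(X, y, iters=200, lr=0.2):
    """Logistic regression by gradient ascent; y in [0, 1] may be
    SOFT (probabilistic) labels, which is what co-EM requires."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += lr * X.T @ (y - p) / len(y)
    return beta

def predict_proba(X, beta):
    return 1.0 / (1.0 + np.exp(-X @ beta))

def co_em(X1, X2, y_lab, rounds=5):
    """Minimal co-EM loop over two feature views. Unlike co-training,
    each view's classifier probabilistically labels ALL unlabeled
    points for the other view in every round."""
    n_lab = len(y_lab)
    beta1 = fit_logreg(X1[:n_lab], y_lab.astype(float))
    for _ in range(rounds):
        # View 1 soft-labels the unlabeled pool for view 2 ...
        y2 = np.concatenate([y_lab, predict_proba(X1, beta1)[n_lab:]])
        beta2 = fit_logreg(X2, y2)
        # ... and view 2 returns the favor for view 1.
        y1 = np.concatenate([y_lab, predict_proba(X2, beta2)[n_lab:]])
        beta1 = fit_logreg(X1, y1)
    return beta1, beta2

# Toy data: two conditionally independent 2-D views of the same label
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 300)
X1 = rng.standard_normal((300, 2)) + 1.5 * (2 * y[:, None] - 1)
X2 = rng.standard_normal((300, 2)) + 1.5 * (2 * y[:, None] - 1)
beta1, beta2 = co_em(X1, X2, y[:20])   # only the first 20 points labeled
pred = (predict_proba(X1, beta1) + predict_proba(X2, beta2)) / 2 > 0.5
acc = (pred == y.astype(bool)).mean()
```

The soft labeling is what restricts co-EM to probabilistic learners; the paper's contribution is making an SVM fit that mold.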
On semi-supervised classification
2005
Cited by 49 (10 self)
A graph-based prior is proposed for parametric semi-supervised classification. The prior utilizes both labelled and unlabelled data; it also integrates features from multiple views of a given sample (e.g., multiple sensors), thus implementing a Bayesian form of co-training. An EM algorithm for training the classifier automatically adjusts the trade-off between the contributions of (a) the labelled data, (b) the unlabelled data, and (c) the co-training information. Active label query selection is performed using a mutual-information-based criterion that explicitly uses the unlabelled data and the co-training information. Encouraging results are presented on public benchmarks and on measured data from single and multiple sensors.
Active learning for anomaly and rare-category detection
In Advances in Neural Information Processing Systems 18, 2004
Cited by 46 (0 self)
We introduce a novel active-learning scenario in which a user wants to work with a learning algorithm to identify useful anomalies. These are distinguished from the traditional statistical definition of anomalies as outliers or merely ill-modeled points. Our distinction is that the usefulness of anomalies is categorized subjectively by the user. We make two additional assumptions. First, there exist extremely few useful anomalies to be hunted down within a massive dataset. Second, both useful and useless anomalies may sometimes exist within tiny classes of similar anomalies. The challenge is thus to identify “rare category” records in an unlabeled noisy set with help (in the form of class labels) from a human expert who has a small budget of data points that they are prepared to categorize. We propose a technique to meet this challenge, which assumes a mixture model fit to the data but otherwise makes no assumptions on the particular form of the mixture components. This property promises wide applicability in real-life scenarios and for various statistical models. We give an overview of several alternative methods, highlighting their strengths and weaknesses, and conclude with a detailed empirical analysis. We show that our method can quickly zoom in on an anomaly set containing a few tens of points in a dataset of hundreds of thousands.
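The abstract assumes only that some mixture model is fit to the data. One simple strategy in that spirit is to fit a mixture and hand the expert the points the model explains worst. A toy NumPy sketch with a spherical-Gaussian EM (all modeling choices here are illustrative, not the paper's algorithm):

```python
import numpy as np

def fit_spherical_gmm(X, k=2, iters=60):
    """EM for a k-component spherical Gaussian mixture. Means are
    seeded at the points nearest the data centroid, a deterministic
    choice that avoids seeding a component on an outlier."""
    n, d = X.shape
    mu = X[np.argsort(((X - X.mean(0)) ** 2).sum(1))[:k]].copy()
    mu += 1e-3 * np.arange(k)[:, None]   # break ties between seeds
    var = np.ones(k)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities under each spherical component
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)          # n x k
        logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - d2 / (2 * var)
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: mixing weights, means, shared-per-component variances
        nk = r.sum(0) + 1e-12
        pi, mu = nk / n, (r.T @ X) / nk[:, None]
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * d2).sum(0) / (d * nk) + 1e-6
    return pi, mu, var

def log_likelihood(X, pi, mu, var):
    d = X.shape[1]
    d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - d2 / (2 * var)
    m = logp.max(1)
    return m + np.log(np.exp(logp - m[:, None]).sum(1))

# Background: two large clusters; anomalies: a tiny far-away cluster
rng = np.random.default_rng(0)
bg = np.vstack([rng.normal([-3, 0], 1, (500, 2)),
                rng.normal([3, 0], 1, (500, 2))])
rare = rng.normal([0, 8], 0.3, (10, 2))
X = np.vstack([bg, rare])               # rare points get indices >= 1000
ll = log_likelihood(X, *fit_spherical_gmm(X, k=2))
queries = np.argsort(ll)[:10]           # query the worst-explained points
```

Querying the lowest-likelihood points surfaces the tiny rare cluster for the expert with a budget of only ten labels.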
Large-scale text categorization by batch mode active learning
In Proceedings of the International World Wide Web Conference, 2006
Cited by 45 (8 self)
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the human effort of labeling text documents for building reliable classification models. In the past, there have been many studies on applying active learning methods to automatic text categorization, which try to select the most informative documents for manual labeling. Most of these studies focused on selecting a single unlabeled document in each iteration; as a result, the text categorization model has to be retrained after each labeled document is solicited. In this paper, we present a novel active learning algorithm that selects a batch of text documents for manual labeling in each iteration. The key to batch mode active learning is how to reduce the redundancy among the selected examples so that each example provides unique information for model updating. To this end, we use the Fisher information matrix as the measurement of model uncertainty and choose the set of documents that can efficiently minimize the Fisher information matrix of a classification model. Extensive experiments with three different datasets have shown that our algorithm is more effective than state-of-the-art active learning techniques for text categorization and can be a promising tool for large-scale text categorization on the World Wide Web.
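The Fisher-information idea can be illustrated with a simplified greedy variant: for a logistic model the information carried by a batch S is I(S) = Σ_{i∈S} p_i(1−p_i) x_i x_iᵀ, and greedily maximizing its log-determinant favors points that are both uncertain and mutually non-redundant. This D-optimal-style greedy is a sketch in the spirit of, not identical to, the paper's objective:

```python
import numpy as np

def greedy_fisher_batch(X, p, k, eps=1e-3):
    """Greedily pick k points, each time adding the one that most
    increases log det of the accumulated Fisher information
    I = sum_i p_i(1-p_i) x_i x_i^T  (logistic model)."""
    n, d = X.shape
    A = eps * np.eye(d)          # small regularizer keeps A invertible
    chosen = []
    w = p * (1 - p)              # per-point uncertainty weight
    for _ in range(k):
        Ainv = np.linalg.inv(A)
        best, best_gain = -1, -np.inf
        for i in range(n):
            if i in chosen:
                continue
            # matrix determinant lemma:
            # log det(A + w x x^T) - log det(A) = log(1 + w x^T A^-1 x)
            gain = np.log1p(w[i] * X[i] @ Ainv @ X[i])
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        A = A + w[best] * np.outer(X[best], X[best])
    return chosen

# Two identical informative points plus one in an orthogonal direction:
# a redundancy-aware batch should not pick the duplicate.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
p = np.array([0.5, 0.5, 0.5])        # all equally uncertain
batch = greedy_fisher_batch(X, p, k=2)
```

After the first pick, the duplicate's determinant gain collapses while the orthogonal point's stays large, so the batch covers both directions instead of repeating one.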
Semi-supervised Multi-label Learning by Solving a Sylvester Equation
Cited by 45 (0 self)
Multi-label learning refers to problems where an instance can be assigned to more than one category. In this paper, we present a novel Semi-supervised algorithm for Multi-label learning by solving a Sylvester Equation (SMSE). Two graphs are first constructed on the instance level and the category level, respectively. For the instance level, a graph is defined over both labeled and unlabeled instances, where each node represents one instance and each edge weight reflects the similarity between the corresponding pair of instances. Similarly, for the category level, a graph is also built based on ...
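SMSE's core computational step, solving a Sylvester equation A X + X B = C (with A and B built from the instance-level and category-level graphs), can be done for small problems with the Kronecker-product "vec trick". A self-contained NumPy sketch (the matrices below are random stand-ins, not the paper's graph matrices):

```python
import numpy as np

def solve_sylvester(A, B, C):
    """Solve A X + X B = C via the vec trick:
    (I (x) A + B^T (x) I) vec(X) = vec(C), with column-major vec.
    Fine for small examples; large problems need a dedicated
    solver such as Bartels-Stewart."""
    n, m = C.shape
    M = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
    x = np.linalg.solve(M, C.flatten(order="F"))
    return x.reshape((n, m), order="F")

# Shifted random matrices so the spectra of A and -B cannot overlap
# (the condition for the Sylvester equation to have a unique solution)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)
B = rng.standard_normal((3, 3)) + 5 * np.eye(3)
C = rng.standard_normal((5, 3))
X = solve_sylvester(A, B, C)
```

In SMSE the unknown X plays the role of the instance-by-category label matrix, coupled along its rows by one graph and along its columns by the other.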
Unlabeled Data Can Degrade Classification Performance of Generative Classifiers
In Fifteenth International Florida Artificial Intelligence Society Conference, 2002
Cited by 43 (7 self)
This paper analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that unlabeled data can degrade the performance of a classifier when there are discrepancies between the modeling assumptions used to build the classifier and the actual model that generates the data.
Covariance Kernels from Bayesian Generative Models
In Advances in Neural Information Processing Systems 14, 2000
Cited by 41 (3 self)
We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector Machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse but unlabeled data is abundant.
Dissimilarity in graph-based semi-supervised classification
In Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS), 2007
Cited by 39 (2 self)
Label dissimilarity specifies that a pair of examples probably have different class labels. We present a semi-supervised classification algorithm that learns from dissimilarity and similarity information on labeled and unlabeled data. Our approach uses a novel graph-based encoding of dissimilarity that results in a convex problem, and it can handle both binary and multiclass classification. Experiments on several tasks are promising.
Semi-Supervised Learning: From Gaussian Fields to Gaussian Processes
School of CS, CMU, 2003
Cited by 39 (1 self)
We show that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian. We derive hyperparameter learning with evidence maximization and give an empirical study of various ways to parameterize the graph weights.
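The construction described here is easy to state concretely: take the graph Laplacian L = D − W and use a regularized inverse as the GP covariance, so that strongly connected nodes are strongly correlated. A small NumPy sketch on a 4-node chain graph (the jitter value is an illustrative choice; the framework learns such hyperparameters by evidence maximization):

```python
import numpy as np

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(1)) - W

def gp_kernel_from_graph(W, sigma2=1e-2):
    """GP covariance from the graph: K = (L + sigma2*I)^{-1}.
    The small diagonal term makes L invertible (L alone is singular:
    the constant vector lies in its null space)."""
    L = laplacian(W)
    return np.linalg.inv(L + sigma2 * np.eye(L.shape[0]))

# Chain graph 0-1-2-3: covariance should decay with graph distance
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
K = gp_kernel_from_graph(W)
```

Since L is positive semidefinite, K is a valid (positive definite) covariance, and neighboring nodes such as 0 and 1 end up more correlated than distant ones such as 0 and 3.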