Results 1–10 of 55
Semi-Supervised Learning Literature Survey
, 2006
Abstract

Cited by 451 (8 self)
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Co-Training and Expansion: Towards Bridging Theory and Practice
, 2004
Abstract

Cited by 51 (3 self)
Co-training is a method for combining labeled and unlabeled data when examples can be thought of as containing two distinct sets of features. It has had a number of practical successes, yet previous theoretical analyses have needed very strong assumptions on the data that are unlikely to be satisfied in practice.
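The two-view setting in this abstract can be illustrated with a minimal co-training loop. Everything below is a toy sketch, not the construction analysed in the paper: the synthetic two-view data, the nearest-centroid base learners, and the confidence-margin rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-view data: each view alone separates the two classes,
# which is the co-training assumption described above.
n = 200
y = rng.integers(0, 2, n)
view1 = rng.normal(0.0, 1.0, (n, 2)) + 3.0 * y[:, None]
view2 = rng.normal(0.0, 1.0, (n, 2)) - 3.0 * y[:, None]

# Start with five labeled examples per class; the rest are unlabeled.
seed = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
known = {int(i): int(y[i]) for i in seed}
unlabeled = set(range(n)) - set(known)

def centroid_predict(X, train, query):
    """Nearest-centroid classifier; returns predictions and a confidence margin."""
    c0 = X[[i for i in train if train[i] == 0]].mean(axis=0)
    c1 = X[[i for i in train if train[i] == 1]].mean(axis=0)
    d0 = np.linalg.norm(X[query] - c0, axis=1)
    d1 = np.linalg.norm(X[query] - c1, axis=1)
    return (d1 < d0).astype(int), np.abs(d0 - d1)

# Co-training: each view pseudo-labels its single most confident
# unlabeled example and adds it to the shared labeled pool.
for _ in range(30):
    for X in (view1, view2):
        pool = sorted(unlabeled)
        pred, conf = centroid_predict(X, known, pool)
        best = int(np.argmax(conf))
        known[pool[best]] = int(pred[best])   # pseudo-label (may be noisy)
        unlabeled.discard(pool[best])

pseudo = [i for i in known if i not in set(seed)]
acc = float(np.mean([known[i] == y[i] for i in pseudo]))
print(f"{len(pseudo)} pseudo-labels, accuracy {acc:.2f}")
```

The "most confident first" greedy selection is the fragile part in practice: with noisier data or views that are not individually sufficient, the pseudo-labels degrade, which is exactly the gap between practice and the strong assumptions this paper discusses.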
Understanding the Yarowsky Algorithm
 Computational Linguistics
, 2004
Abstract

Cited by 44 (0 self)
This paper analyzes the Yarowsky algorithm as optimizing an objective function. More specifically, a number of variants of the Yarowsky algorithm (though not the original algorithm itself) are shown to optimize either likelihood or a closely related objective function K.
Active Learning with Multiple Views
, 2002
Abstract

Cited by 41 (1 self)
Active learners alleviate the burden of labeling large amounts of data by detecting and asking the user to label only the most informative examples in the domain. We focus here on active learning for multi-view domains, in which there are several disjoint subsets of features (views), each of which is sufficient to learn the target concept. In this paper we make several contributions. First, we introduce Co-Testing, which is the first approach to multi-view active learning. Second, we extend the multi-view learning framework by also exploiting weak views, which are adequate only for learning a concept that is more general/specific than the target concept. Finally, we empirically show that Co-Testing outperforms existing active learners on a variety of real-world domains such as wrapper induction, Web page classification, advertisement removal, and discourse tree parsing.
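A rough sketch of the contention-point idea behind multi-view active learning: train one learner per view and query the oracle only where the views disagree. The 1-D threshold learners and the synthetic data are assumptions for illustration; Co-Testing proper uses real base learners and several query-selection strategies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two 1-D views, each individually sufficient for a threshold concept.
n = 300
y = rng.integers(0, 2, n)
v1 = rng.normal(4.0 * y, 1.0)      # view 1: class 1 shifted right
v2 = rng.normal(-4.0 * y, 1.0)     # view 2: class 1 shifted left

labeled = list(np.where(y == 0)[0][:2]) + list(np.where(y == 1)[0][:2])

def threshold_clf(view):
    """Midpoint-threshold learner for one 1-D view, fit on `labeled`."""
    m0 = view[[i for i in labeled if y[i] == 0]].mean()
    m1 = view[[i for i in labeled if y[i] == 1]].mean()
    t = (m0 + m1) / 2.0
    if m1 > m0:
        return (view > t).astype(int)
    return (view < t).astype(int)

# Co-Testing loop: query the oracle only at contention points,
# i.e. examples on which the two views disagree.
queries = 0
for _ in range(15):
    p1, p2 = threshold_clf(v1), threshold_clf(v2)
    contention = [i for i in range(n) if i not in labeled and p1[i] != p2[i]]
    if not contention:
        break                       # views agree everywhere: stop querying
    labeled.append(contention[0])   # oracle reveals y at this point
    queries += 1

both_correct = float(np.mean((p1 == y) & (p2 == y)))
print(f"queries: {queries}, both-views accuracy: {both_correct:.2f}")
```

Contention points are informative because at least one view must be wrong there, so each label either corrects a hypothesis or disambiguates between them.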
Weakly Supervised Natural Language Learning Without Redundant Views
 In Proceedings of HLT-NAACL
, 2003
Abstract

Cited by 36 (5 self)
We investigate single-view algorithms as an alternative to multi-view algorithms for weakly supervised learning for natural language processing tasks without a natural feature split. In particular, we apply co-training, self-training, and EM to one such task and find that both self-training and FS-EM, a new variation of EM that incorporates feature selection, outperform co-training and are comparatively less sensitive to parameter changes.
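As a contrast with the two-view methods above, single-view self-training needs no feature split: one learner repeatedly labels its own most confident pool examples. The nearest-centroid learner and the synthetic data below are illustrative assumptions, not the paper's NLP task or its feature-selection EM variant.

```python
import numpy as np

rng = np.random.default_rng(6)

# One view only: no feature split is needed for self-training.
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(0.0, 1.0, (n, 2)) + 3.0 * y[:, None]

seed = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
known = {int(i): int(y[i]) for i in seed}
pool = [i for i in range(n) if i not in known]

# Self-training: label the single most confident pool example each round.
for _ in range(40):
    c0 = X[[i for i in known if known[i] == 0]].mean(axis=0)
    c1 = X[[i for i in known if known[i] == 1]].mean(axis=0)
    d0 = np.linalg.norm(X[pool] - c0, axis=1)
    d1 = np.linalg.norm(X[pool] - c1, axis=1)
    best = int(np.argmax(np.abs(d0 - d1)))        # confidence = distance margin
    known[pool[best]] = int(d1[best] < d0[best])  # learner labels its own data
    pool.pop(best)

pseudo = [i for i in known if i not in set(seed)]
acc = float(np.mean([known[i] == y[i] for i in pseudo]))
print(f"{len(pseudo)} self-labels, accuracy {acc:.2f}")
```

The loop is the co-training loop with the second view deleted, which is why the comparison in this paper is informative: whatever benefit remains cannot come from view redundancy.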
Semi-supervised regression with co-training style algorithms
, 2007
Abstract

Cited by 28 (5 self)
The traditional setting of supervised learning requires a large amount of labeled training examples in order to achieve good generalization. However, in many practical applications, unlabeled training examples are readily available but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning has attracted much attention. Previous research on semi-supervised learning mainly focuses on semi-supervised classification. Although regression is almost as important as classification, semi-supervised regression is largely understudied. In particular, although co-training is a main paradigm in semi-supervised learning, few works have been devoted to co-training style semi-supervised regression algorithms. In this paper, a co-training style semi-supervised regression algorithm, i.e. COREG, is proposed. This algorithm uses two regressors, each of which labels the unlabeled data for the other regressor, where the confidence in labeling an unlabeled example is estimated through the amount of reduction in mean squared error over the labeled neighborhood of that example. Analysis and experiments show that COREG can effectively exploit unlabeled data to improve regression estimates.
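The confidence criterion in this abstract (reduction in mean squared error over a candidate's labeled neighborhood) can be sketched on a toy 1-D problem. This is a compressed illustration, not the published COREG: the k values, the candidate window, and the fixed number of rounds are assumptions, and the paper's regressors differ in their distance metrics rather than only in k.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 1-D regression task: labels are expensive, inputs are cheap.
X = rng.uniform(0.0, 6.0, 120)
y = np.sin(X)                      # ground truth, used only to seed labels

L = list(range(10))                # initially labeled indices
U = list(range(10, 120))           # unlabeled pool
lab1 = {i: y[i] for i in L}        # training set of regressor h1
lab2 = {i: y[i] for i in L}        # training set of regressor h2

def knn(train, k, xq):
    """k-nearest-neighbour regression on the 1-D inputs."""
    idx = sorted(train, key=lambda i: abs(X[i] - xq))[:k]
    return float(np.mean([train[i] for i in idx]))

def confidence(train, k, c):
    """COREG-style confidence: drop in leave-one-out squared error over the
    labeled neighbourhood of candidate c after adding (X[c], self-prediction)."""
    neigh = sorted(train, key=lambda i: abs(X[i] - X[c]))[:k]
    aug = dict(train)
    aug[c] = knn(train, k, X[c])
    def loo_err(t):
        return np.mean([(t[i] - knn({j: t[j] for j in t if j != i}, k, X[i])) ** 2
                        for i in neigh])
    return loo_err(train) - loo_err(aug), aug[c]

# COREG loop: each regressor labels its most confident candidate
# for the *other* regressor.
for _ in range(5):
    for own, other, k in ((lab1, lab2, 3), (lab2, lab1, 5)):
        scored = [(confidence(own, k, c), c) for c in U[:20]]
        (gain, yhat), c = max(scored)
        other[c] = yhat               # pseudo-label handed to the peer
        U.remove(c)

pred = lambda xq: 0.5 * (knn(lab1, 3, xq) + knn(lab2, 5, xq))
preds = np.array([pred(x) for x in X])
mse = float(np.mean((preds - y) ** 2))
print(f"final training sets: {len(lab1)}, {len(lab2)}; mse {mse:.4f}")
```

The leave-one-out check is what makes the confidence estimate regression-specific: unlike in classification, there is no label margin to threshold, so COREG asks whether the pseudo-labeled point would actually have helped predict its labeled neighbours.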
Efficient co-regularised least squares regression
 In ICML’06
, 2006
Abstract

Cited by 28 (0 self)
In many applications, unlabelled examples are inexpensive and easy to obtain. Semi-supervised approaches try to utilise such examples to reduce the predictive error. In this paper, we investigate a semi-supervised least squares regression algorithm based on the co-learning approach. Similar to other semi-supervised algorithms, our base algorithm has cubic runtime complexity in the number of unlabelled examples. To be able to handle larger sets of unlabelled examples, we devise a semi-parametric variant that scales linearly in the number of unlabelled examples. Experiments show a significant error reduction by co-regularisation and a large runtime improvement for the semi-parametric approximation. Last but not least, we propose a distributed procedure that can be applied without collecting all data at a single site.
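In the linear case, the co-learning idea for least squares has a closed form: penalise disagreement of the two views' predictors on the unlabelled points and solve one joint linear system. The objective below (squared loss per view, ridge term nu, disagreement term lam) is a common formulation of co-regularised least squares, but the sizes, penalties, and data are illustrative assumptions, not this paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(3)

# A shared 2-D latent signal generates both 3-D views and the target,
# so either view suffices for linear prediction.
n_l, n_u, d = 8, 200, 3
Z_l = rng.normal(size=(n_l, 2))
Z_u = rng.normal(size=(n_u, 2))
A1, A2 = rng.normal(size=(2, d)), rng.normal(size=(2, d))
X1, X2 = Z_l @ A1, Z_l @ A2          # labelled views
U1, U2 = Z_u @ A1, Z_u @ A2          # unlabelled views
y = Z_l @ np.array([1.0, -1.5])      # labelled targets

# Minimise  |X1 w1 - y|^2 + |X2 w2 - y|^2 + nu (|w1|^2 + |w2|^2)
#           + lam |U1 w1 - U2 w2|^2        (co-regularisation term)
nu, lam = 1e-3, 0.1
I = np.eye(d)
A = np.block([
    [X1.T @ X1 + nu * I + lam * (U1.T @ U1), -lam * (U1.T @ U2)],
    [-lam * (U2.T @ U1), X2.T @ X2 + nu * I + lam * (U2.T @ U2)],
])
b = np.concatenate([X1.T @ y, X2.T @ y])
w = np.linalg.solve(A, b)            # normal equations of the joint objective
w1, w2 = w[:d], w[d:]

yhat = 0.5 * (X1 @ w1 + X2 @ w2)     # average the two views' predictions
train_mse = float(np.mean((yhat - y) ** 2))
print(f"training mse: {train_mse:.6f}")
```

In this parametric sketch the system is only (d1+d2)-dimensional regardless of the pool size; the cubic cost in unlabelled examples mentioned in the abstract arises in the kernelised version, where the function expansion runs over the unlabelled points, which is the regime the paper's semi-parametric approximation targets.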
Multi-view regression via canonical correlation analysis
 In Proc. of Conference on Learning Theory
, 2007
Abstract

Cited by 28 (5 self)
In the multi-view regression problem, we have a regression problem where the input variable (which is a real vector) can be partitioned into two different views, where it is assumed that either view of the input is sufficient to make accurate predictions — this is essentially (a significantly weaker version of) the co-training assumption for the regression problem. We provide a semi-supervised algorithm which first uses unlabeled data to learn a norm (or, equivalently, a kernel) and then uses labeled data in a ridge regression algorithm (with this induced norm) to provide the predictor. The unlabeled data is used via canonical correlation analysis (CCA, which is closely related to PCA for two random variables) to derive an appropriate norm over functions. We are able to characterize the intrinsic dimensionality of the subsequent ridge regression problem (which uses this norm) by the correlation coefficients provided by CCA in a rather simple expression. Interestingly, the norm used by the ridge regression algorithm is derived from CCA, unlike in standard kernel methods where a special a priori norm is assumed (i.e. a Banach space is assumed). We discuss how this result shows that unlabeled data can decrease the sample complexity.
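A small numerical sketch of the pipeline described here: estimate CCA from unlabelled two-view data, keep the top canonical directions, then ridge-regress on the few labelled points in that subspace. The whitening-plus-SVD route to CCA and all sizes and penalties are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(4)

# A k-dim latent signal is shared by both views; the target depends on the
# latent signal only, so the CCA subspace carries all the predictive signal.
# (Data is zero-mean by construction, so no centering step is shown.)
n_u, n_l, d, k = 1000, 30, 6, 2
Z = rng.normal(size=(n_u + n_l, k))
X1 = Z @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(n_u + n_l, d))
X2 = Z @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(n_u + n_l, d))
y = Z[:, 0]

U1, U2 = X1[:n_u], X2[:n_u]          # unlabelled portion: used only for CCA

def inv_sqrt(C):
    """Inverse matrix square root via eigendecomposition (C symmetric PSD)."""
    evals, evecs = np.linalg.eigh(C)
    return evecs @ np.diag((evals + 1e-8) ** -0.5) @ evecs.T

# CCA: whiten each view, then SVD of the whitened cross-covariance.
W1 = inv_sqrt(U1.T @ U1 / n_u)
W2 = inv_sqrt(U2.T @ U2 / n_u)
Uc, s, Vt = np.linalg.svd(W1 @ (U1.T @ U2 / n_u) @ W2)
proj = W1 @ Uc[:, :k]                # top-k canonical directions for view 1

# Ridge regression on the labelled data, in the CCA-induced coordinates.
Phi = X1[n_u:] @ proj
beta = np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(k), Phi.T @ y[n_u:])
mse = float(np.mean((Phi @ beta - y[n_u:]) ** 2))
print(f"canonical correlations: {np.round(s, 3)}, labelled mse: {mse:.4f}")
```

The singular values s are the sample canonical correlations: here the first k are close to 1 (the shared latent directions) and the rest are small, which is the low intrinsic dimensionality the abstract's characterization refers to.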
Bootstrapping POS taggers using unlabelled data
 In Proceedings of CoNLL-2003
, 2003
Abstract

Cited by 24 (1 self)
This paper investigates bootstrapping part-of-speech taggers using co-training, in which two taggers are iteratively retrained on each other’s output. Since the output of the taggers is noisy, there is a question of which newly labelled examples to add to the training set. We investigate selecting examples by directly maximising tagger agreement on unlabelled data, a method which has been theoretically and empirically motivated in the co-training literature. Our results show that agreement-based co-training can significantly improve tagging performance for small seed datasets. Further results show that this form of co-training considerably outperforms self-training. However, we find that simply retraining on all the newly labelled data can, in some cases, yield comparable results to agreement-based co-training, with only a fraction of the computational cost.
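The selection criterion ("directly maximising tagger agreement on unlabelled data") can be sketched on a toy classification stand-in: score each candidate pseudo-labelled example by the agreement of the retrained pair on the pool, and keep the best. The nearest-centroid "taggers", the feature split, and the candidate window are illustrative assumptions; the paper works with real POS taggers over sequences.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy stand-in for two taggers: nearest-centroid classifiers over two
# disjoint feature subsets of the same examples.
n, d = 400, 6
y = rng.integers(0, 2, n)
X = rng.normal(0.0, 1.0, (n, d)) + 2.0 * y[:, None]
feats1, feats2 = [0, 1, 2], [3, 4, 5]

lab = list(np.where(y == 0)[0][:4]) + list(np.where(y == 1)[0][:4])
lab_y = [int(y[i]) for i in lab]
unl = [i for i in range(n) if i not in lab]

def predict(feats, train, train_y, query):
    c0 = X[np.ix_([t for t, ty in zip(train, train_y) if ty == 0], feats)].mean(axis=0)
    c1 = X[np.ix_([t for t, ty in zip(train, train_y) if ty == 1], feats)].mean(axis=0)
    Q = X[np.ix_(query, feats)]
    d0 = np.linalg.norm(Q - c0, axis=1)
    d1 = np.linalg.norm(Q - c1, axis=1)
    return (d1 < d0).astype(int)

def agreement(train, train_y):
    """Fraction of the unlabelled pool on which the two taggers agree."""
    return float(np.mean(predict(feats1, train, train_y, unl)
                         == predict(feats2, train, train_y, unl)))

# Score each candidate: pseudo-label it with tagger 1, retrain both
# taggers, and measure their agreement on the pool.
p1 = predict(feats1, lab, lab_y, list(range(n)))
scores = [(agreement(lab + [c], lab_y + [int(p1[c])]), c) for c in unl[:50]]
best_score, best_c = max(scores)
base = agreement(lab, lab_y)
print(f"agreement {base:.3f} -> best candidate {best_c}: {best_score:.3f}")
```

Retraining both learners once per candidate is what makes this criterion expensive, which is why the abstract's observation that naive "add everything" retraining is sometimes competitive matters in practice.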
Enhancing relevance feedback in image retrieval using unlabeled data
 ACM Transactions on Information Systems
, 2006
Abstract

Cited by 23 (8 self)
Relevance feedback is an effective scheme bridging the gap between high-level semantics and low-level features in content-based image retrieval (CBIR). In contrast to previous methods which rely on labeled images provided by the user, this paper attempts to enhance the performance of relevance feedback by exploiting unlabeled images existing in the database. Concretely, this paper integrates the merits of semi-supervised learning and active learning into the relevance feedback process. In detail, in each round of relevance feedback, two simple learners are trained from the labeled data, i.e. images from user query and user feedback. Each learner then labels some unlabeled images in the database for the other learner. After retraining with the additional labeled data, the learners classify the images in the database again and their classifications are merged. Images judged to be positive with high confidence are returned as the retrieval result, while those judged with low confidence are put into the pool which is used in the next round of relevance feedback. Experiments show that using semi-supervised learning and active learning simultaneously in CBIR is beneficial, and the proposed method achieves better performance than some existing methods.