Co-regularization Based Semi-supervised Domain Adaptation
Cited by 2 (0 self)
This paper presents a co-regularization based approach to semi-supervised domain adaptation. Our proposed approach (EA++) builds on the notion of augmented space (introduced in EASYADAPT (EA) [1]) and harnesses unlabeled data in the target domain to further assist the transfer of information from source to target. This semi-supervised approach to domain adaptation is extremely simple to implement and can be applied as a pre-processing step to any supervised learner. Our theoretical analysis (in terms of Rademacher complexity) of EA and EA++ shows that the hypothesis class of EA++ has lower complexity (compared to EA) and hence results in tighter generalization bounds. Experimental results on sentiment analysis tasks reinforce our theoretical findings and demonstrate the efficacy of the proposed method when compared to EA as well as a few other representative baseline approaches.
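The "pre-processing step" mentioned in the abstract above can be illustrated with a minimal sketch. The abstract itself does not spell out the feature mappings; the ones used here (source as <x, x, 0>, target as <x, 0, x>, unlabeled target as <0, x, -x>) are recalled from the EA and EA++ papers and should be treated as an assumption. The function names are hypothetical.

```python
import numpy as np

def augment_source(X):
    """Assumed EA map for labeled source rows: <x, x, 0>."""
    return np.hstack([X, X, np.zeros_like(X)])

def augment_target(X):
    """Assumed EA map for labeled target rows: <x, 0, x>."""
    return np.hstack([X, np.zeros_like(X), X])

def augment_unlabeled(X):
    """Assumed EA++ map for unlabeled target rows: <0, x, -x>.

    With a weight vector w = <g, s, t>, the score of such a row is
    (s - t) . x, so pushing it toward zero encourages the source and
    target hypotheses to agree on unlabeled data.
    """
    return np.hstack([np.zeros_like(X), X, -X])

# Tiny illustration with one 2-dimensional example.
X = np.array([[1.0, 2.0]])
src = augment_source(X)
tgt = augment_target(X)
unl = augment_unlabeled(X)
```

The augmented rows can then be passed to any off-the-shelf supervised learner, which is what makes the approach a pure pre-processing step.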
Definition
Synonyms: Learning from labeled and unlabeled data, transductive learning
Asymptotic Analysis of Generative Semi-Supervised Learning
Semi-supervised learning has emerged as a popular framework for improving modeling accuracy while controlling labeling cost. Based on an extension of stochastic composite likelihood we quantify the asymptotic accuracy of generative semi-supervised learning. In doing so, we complement distribution-free analysis by providing an alternative framework to measure the value associated with different labeling policies and resolve the fundamental question of how much data to label and in what manner. We demonstrate our approach with both simulation studies and real-world experiments using naive Bayes for text classification and MRFs and CRFs for structured prediction in NLP.
Co-Training as a Human Collaboration Policy
We consider the task of human collaborative category learning, where two people work together to classify test items into appropriate categories based on what they learn from a training set. We propose a novel collaboration policy based on the Co-Training algorithm in machine learning, in which the two people play the role of the base learners. The policy restricts each learner’s view of the data and limits their communication to only the exchange of their labelings on test items. In a series of empirical studies, we show that the Co-Training policy leads collaborators to jointly produce unique and potentially valuable classification outcomes that are not generated under other collaboration policies. We further demonstrate that these observations can be explained with appropriate machine learning models.
Learning large-margin halfspaces with more malicious noise
We describe a simple algorithm that runs in time poly(n, 1/γ, 1/ε) and learns an unknown n-dimensional γ-margin halfspace to accuracy 1 − ε in the presence of malicious noise, when the noise rate is allowed to be as high as Θ(εγ√(log(1/γ))). Previous efficient algorithms could only learn to accuracy ε in the presence of malicious noise of rate at most Θ(εγ). Our algorithm does not work by optimizing a convex loss function. We show that no algorithm for learning γ-margin halfspaces that minimizes a convex proxy for misclassification error can tolerate malicious noise at a rate greater than Θ(εγ); this may partially explain why previous algorithms could not achieve the higher noise tolerance of our new algorithm.
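To get a feel for the gap between the two noise rates quoted in the abstract above, the following sketch evaluates both expressions with the constants hidden by Θ(·) set to 1 and the logarithm taken as natural. Both choices are assumptions made purely for illustration, and the function names are hypothetical.

```python
import math

def previous_rate(eps, gamma):
    """Theta(eps * gamma), with the hidden constant assumed to be 1."""
    return eps * gamma

def improved_rate(eps, gamma):
    """Theta(eps * gamma * sqrt(log(1/gamma))), constant assumed to be 1.

    Requires 0 < gamma < 1 so that log(1/gamma) is positive.
    """
    return eps * gamma * math.sqrt(math.log(1.0 / gamma))

def improvement_factor(gamma):
    """Ratio of the two rates; it depends only on the margin gamma."""
    return math.sqrt(math.log(1.0 / gamma))

# Example: for a margin of e^-4 the ratio is sqrt(4) = 2 under these assumptions.
factor = improvement_factor(math.exp(-4))
```

The ratio grows only as the square root of log(1/γ), so under these assumptions the gain in tolerable noise is modest for moderate margins and becomes more noticeable as the margin shrinks.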
Efficient Semi-supervised and Active Learning of Disjunctions
We provide efficient algorithms for learning disjunctions in the semi-supervised setting under a natural regularity assumption introduced by Balcan & Blum (2005). We prove bounds on the sample complexity of our algorithms under a mild restriction on the data distribution. We also give an active learning algorithm with improved sample complexity and extend all our algorithms to the random classification noise setting.

