Results 1 - 10 of 1,113
Learning with Labeled and Unlabeled Data
, 2001
"... In this paper, on the one hand, we aim to give a review on literature dealing with the problem of supervised learning aided by additional unlabeled data. On the other hand, being a part of the author's first year PhD report, the paper serves as a frame to bundle related work by the author as we ..."
Cited by 202 (3 self)
Intrusion detection with unlabeled data using clustering
- In Proceedings of ACM CSS Workshop on Data Mining Applied to Security (DMSA-2001)
, 2001
"... Intrusions pose a serious security risk in a network environment. Although systems can be hardened against many types of intrusions, often intrusions are successful, making systems for detecting these intrusions critical to the security of these systems. New intrusion types, of which detectio ..."
Cited by 191 (6 self)
detection systems are unaware, are the most difficult to detect. Current signature-based methods and learning algorithms which rely on labeled data to train generally cannot detect these new intrusions. In addition, labeled training data in order to train misuse and anomaly detection systems is typically
Enhancing Chinese Word Segmentation Using Unlabeled Data
"... This paper investigates improving supervised word segmentation accuracy with unlabeled data. Both large-scale in-domain data and small-scale document text are considered. We present a unified solution to include features derived from unlabeled data to a discriminative learning model. For the large-s ..."
Cited by 20 (0 self)
Contrastive estimation: Training log-linear models on unlabeled data
- In Proc. of ACL
, 2005
"... Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabele ..."
Cited by 160 (16 self)
on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe a novel approach, contrastive estimation. We show that the new technique can be intuitively understood as exploiting implicit negative evidence and is computationally efficient. Applied to a sequence
Exploiting Unlabeled Data in Ensemble Methods
"... An adaptive semi-supervised ensemble method, ASSEMBLE, is proposed that constructs classification ensembles based on both labeled and unlabeled data. ASSEMBLE alternates between assigning "pseudo-classes" to the unlabeled data using the existing ensemble and constructing the next base clas ..."
Cited by 64 (0 self)
method for the base classifier. ASSEMBLE can be used in conjunction with any cost-sensitive classification algorithm for both two-class and multi-class problems. ASSEMBLE using decision trees won the NIPS 2001 Unlabeled Data Competition. In addition, strong results on several benchmark datasets using
Learning with constrained and unlabeled data
- In CVPR
, 2005
"... Classification problems abundantly arise in many computer vision tasks – being of supervised, semi-supervised or unsupervised nature. Even when class labels are not available, a user still might favor certain grouping solutions over others. This bias can be expressed either by providing a clustering ..."
Cited by 28 (3 self)
clustering criterion or cost function and, in addition to that, by specifying pairwise constraints on the assignment of objects to classes. In this work, we discuss a unifying formulation for labelled and unlabelled data that can incorporate constrained data for model fitting. Our approach models
Bilingual Co-Training for Sentiment Classification of Chinese Product Reviews
"... The lack of reliable Chinese sentiment resources limits research progress on Chinese sentiment classification. However, there are many freely available English sentiment resources on the Web. This article focuses on the problem of cross-lingual sentiment classification, which leverages only availabl ..."
Cited by 4 (0 self)
a bilingual co-training approach to make use of both the English view and the Chinese view based on additional unlabeled Chinese data. Experimental results on two test sets show the effectiveness of the proposed approach, which can outperform basic methods and transductive methods.
Exploiting Unlabeled Data in Ensemble Methods
"... An adaptive semi-supervised ensemble method, ASSEMBLE, is proposed that constructs classification ensembles based on both labeled and unlabeled data. ASSEMBLE alternates between assigning "pseudo-classes" to the unlabeled data using the existing ensemble and constructing the next base clas ..."
method for the base classifier. ASSEMBLE can be used in conjunction with any classification algorithm. ASSEMBLE using decision trees won the NIPS 2001 Unlabeled Data Competition. In addition, strong results on several benchmark datasets using both decision trees and neural networks support the proposed
Machine Learning with Labeled and Unlabeled Data
"... The field of semi-supervised learning has been expanding rapidly in the past few years, with a sheer increase in the number of related publications. In this paper we present the SSL problem in contrast with supervised and unsupervised learning. In addition, we propose a taxonomy with which ..."
A PAC-style model for learning from labeled and unlabeled data
- In Proceedings of the 18th Annual Conference on Learning Theory
, 2005
"... There has been growing interest in practice in using unlabeled data together with labeled data in machine learning, and a number of different approaches have been developed. However, the assumptions these methods are based on are often quite distinct and not captured by standard theoretic ..."
Cited by 58 (8 self)
that these numbers depend on. Our model can be viewed as an extension of the standard PAC model, where in addition to a concept class C, one also proposes a type of compatibility that one believes the target concept should have with the underlying distribution. In this view, unlabeled data can be helpful because