Combining Labeled and Unlabeled Data with Co-Training (1998) [640 citations — 18 self]

Abstract:

We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page. We assume that either view of the example would be sufficient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment a much smaller set of labeled examples. Specifically, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training s...
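The strategy outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of a small naive Bayes learner, the rule of promoting the single most confident unlabeled example per view per round, the names `NaiveBayes`, `train_views`, and `co_train`, and the toy web-page data are all assumptions made for illustration. View 1 stands for the words on a page and view 2 for the words in hyperlinks pointing to it.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes over word lists (Laplace smoothing)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for words, y in zip(docs, labels):
            self.counts[y].update(words)
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, words):
        n = sum(self.prior.values())
        v = len(self.vocab) + 1  # one extra slot for unseen words
        log_scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            s = math.log(self.prior[c] / n)
            for w in words:
                s += math.log((self.counts[c][w] + 1) / (total + v))
            log_scores[c] = s
        m = max(log_scores.values())
        exp = {c: math.exp(s - m) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}


def train_views(labeled):
    """Train one classifier per view on the current labeled set."""
    ys = [y for _, y in labeled]
    h1 = NaiveBayes().fit([x[0] for x, _ in labeled], ys)
    h2 = NaiveBayes().fit([x[1] for x, _ in labeled], ys)
    return h1, h2


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: list of ((view1, view2), label); unlabeled: list of (view1, view2)."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        h1, h2 = train_views(labeled)
        for view, h in ((0, h1), (1, h2)):
            # Score the remaining pool using only this classifier's view.
            scored = []
            for x in pool:
                proba = h.predict_proba(x[view])
                label = max(proba, key=proba.get)
                scored.append((proba[label], label, x))
            scored.sort(key=lambda t: t[0], reverse=True)
            # Promote the most confident predictions to training examples.
            for _conf, label, x in scored[:per_round]:
                labeled.append((x, label))
                pool.remove(x)
    h1, h2 = train_views(labeled)
    return h1, h2, labeled


# Hypothetical toy data: (words on the page, words in inbound hyperlinks).
seed = [
    ((["homework", "exam"], ["course", "cs101"]), "course"),
    ((["research", "publications"], ["professor", "home"]), "other"),
]
unlabeled = [
    (["homework", "syllabus"], ["course"]),
    (["publications", "cv"], ["professor"]),
    (["syllabus", "lecture"], ["cs101"]),
    (["cv", "students"], ["home"]),
]
h1, h2, grown = co_train(seed, unlabeled)
```

In this toy run the page-word classifier has never seen "lecture" in the seed data; it can only learn that word from an example that was labeled through the other view, which is the effect the two-view setting is meant to exploit.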

Citations

4735 Maximum Likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977
3011 Pattern Classification and Scene Analysis – Duda, Hart - 1973
273 Unsupervised word sense disambiguation rivaling supervised methods – Yarowsky - 1995
249 Learning to extract symbolic knowledge from the World Wide Web – Craven, DiPasquo, et al. - 1998
216 Pattern Classification and Scene Analysis – Duda, Hart - 1973
213 A comparison of two learning algorithms for text categorization – Lewis, Ringuette - 1994
206 Efficient noise-tolerant learning from statistical queries – Kearns - 1993
125 Supervised learning from incomplete data via an EM approach – Ghahramani, Jordan - 1994
71 The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter – Castelli, Cover - 1996
67 On the complexity of teaching – Goldman, Kearns - 1995
65 Informedia: news-on-demand multimedia information acquisition and retrieval – Hauptmann, Witbrock - 1997
51 The exponential value of labeled samples – Castelli, Cover - 1995
43 Random sampling in cut, flow, and network design problems – Karger - 1994
29 Learning from a mixture of labeled and unlabeled examples with parametric side information – Ratsaby, Venkatesh - 1995
20 A computational model of teaching – Jackson, Tomkins - 1992
7 PAC learning with constant-partition classification noise and applications to decision tree induction – Decatur - 1997
7 Improving Acoustic Models by Watching Television – Witbrock, Hauptmann - 1998
3 Efficient noise-tolerant learning from statistical queries – unknown authors - 1993
1 PAC learning with constant-partition classification noise and applications to decision tree induction – Decatur - 1997
1 Random sampling in cut, flow, and network design problems – Karger - 1994
1 Pattern Classification and Scene Analysis – Duda, Hart - 1973
1 Random sampling in cut, flow, and network – Karger - 1997
1 Efficient noise-tolerant learning from statistical queries – Kearns - 1993