## Transductive Learning via Spectral Graph Partitioning (2003)

### Download Links

- [www.cs.cornell.edu]
- [www.aaai.org]
- [www.hpl.hp.com]
- [pdf.aminer.org]
- DBLP

### Other Repositories/Bibliography

Venue: ICML

Citations: 201 (0 self)

### BibTeX

@INPROCEEDINGS{Joachims03transductivelearning,
  author    = {Thorsten Joachims},
  title     = {Transductive Learning via Spectral Graph Partitioning},
  booktitle = {Proceedings of the International Conference on Machine Learning (ICML)},
  year      = {2003},
  pages     = {290--297}
}

### Abstract

We present a new method for transductive learning, which can be seen as a transductive version of the k nearest-neighbor classifier.

### Citations

10096 | Statistical Learning Theory
- Vapnik
- 1998
Citation Context: ...ial case. 1. Introduction For some applications, the examples for which a prediction is needed are already known when training the classifier. This kind of prediction is called Transductive Learning (Vapnik, 1998). An example of such a task is relevance feedback in information retrieval. In relevance feedback, users can give positive and negative examples for the kinds of documents they are interested in. The...

2800 | Normalized Cuts and Image Segmentation
- Shi, Malik
- 2000
Citation Context: ...the ratiocut (Hagen & Kahng, 1992). However, the traditional ratiocut problem is unsupervised, i.e. there are no constraints (11) and (12). Solving the unconstrained ratiocut is known to be NP hard (Shi & Malik, 2000). However, efficient methods based on the spectrum of the graph exist that give good approximations to the solution (Hagen & Kahng, 1992). The following will generalize these methods to the case of c...

1332 | Combining Labeled and Unlabeled Data with Co-Training
- Blum
- 1998
Citation Context: ...refined by (Bennett, 1999) and (Joachims, 1999). Other methods are based on s-t mincuts (Blum & Chawla, 2001) or on multi-way cuts (Kleinberg & Tardos, 1999). Related is also the idea of Co-Training (Blum & Mitchell, 1998), which exploits structure resulting from two redundant representations. We will study what these approaches have in common and where they have problems. In particular, we will focus on s-t Mincuts, ...

867 | Text classification from labeled and unlabeled documents using EM
- Nigam, McCallum, et al.
- 2000
Citation Context: ...of a random walk over the cut (Meila & Shi, 2001). This might also lead to a connection to the generative modeling approach of Nigam et al., where the label of each test example is a latent variable (Nigam et al., 2000). 6. Experiments To evaluate the SGT, we performed experiments on six datasets and report results for all of them. The datasets are the ten most frequent categories from the Reuters-21578 text classi...

730 | Transductive inference for text classification using support vector machines
- Joachims
- 1999
Citation Context: ...exploit structure in their distribution. Several methods have been designed with this goal in mind. Vapnik introduced transductive SVMs (Vapnik, 1998) which were later refined by (Bennett, 1999) and (Joachims, 1999). Other methods are based on s-t mincuts (Blum & Chawla, 2001) or on multi-way cuts (Kleinberg & Tardos, 1999). Related is also the idea of Co-Training (Blum & Mitchell, 1998), which exploits structu...

433 | Learning to Classify Text Using Support Vector Machines
- Joachims
- 2001
Citation Context: ...known accurately. However, this is typically not the case and we use an estimate based on the training set in this work. If the true fraction is used, the TSVM achieves a performance of 62.3. While (Joachims, 2002) proposes measures to detect when the wrong fraction was used, this can only be done after running the TSVM. Repeatedly trying different fractions is prohibitively expensive. How Effective is the SGT f...

339 | Co-clustering documents and words using bipartite spectral graph partitioning
- Dhillon
- 2001
Citation Context: ...A the Laplacian of the graph with adjacency matrix A and diagonal degree matrix B, B_ii = Σ_j A_ij. We require that the graph is undirected, so that L is symmetric positive semi-definite. Following (Dhillon, 2001) and ignoring the constraints, the unsupervised ratiocut optimization problem can equivalently be written as min_z (z^T L z) / (z^T z) with z_i ∈ {γ+, γ−} (14), where γ+ = √(|{i : z_i < 0}| / |{i : z_i > 0}|) and ...
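
The relaxation quoted here (dropping the discrete constraint on z) turns the ratiocut into an eigenproblem on the graph Laplacian. A minimal NumPy sketch on a hypothetical four-node graph (the adjacency matrix is an illustrative assumption, not data from the paper):

```python
import numpy as np

# Hypothetical undirected graph: two tight 2-node clusters joined by one weak edge.
A = np.array([[0, 2, 0, 0],
              [2, 0, 1, 0],
              [0, 1, 0, 2],
              [0, 0, 2, 0]], dtype=float)

B = np.diag(A.sum(axis=1))   # diagonal degree matrix, B_ii = sum_j A_ij
L = B - A                    # unnormalized graph Laplacian (symmetric PSD)

# Relaxed ratiocut: among vectors orthogonal to the all-ones eigenvector
# (eigenvalue 0), z^T L z / z^T z is minimized by the eigenvector of the
# second-smallest eigenvalue (the Fiedler vector).
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]

# Thresholding the Fiedler vector at zero recovers the two clusters
# (the overall sign of the eigenvector is arbitrary).
labels = np.where(fiedler > 0, 1, -1)
print(labels)
```

Rounding the relaxed solution back to a discrete labeling this way is the standard spectral heuristic; the paper's contribution is incorporating the label constraints (11) and (12) into this eigenproblem.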

286 | Learning from labeled and unlabeled data using graph mincuts
- Blum, Chawla
- 2001
Citation Context: ...have been designed with this goal in mind. Vapnik introduced transductive SVMs (Vapnik, 1998) which were later refined by (Bennett, 1999) and (Joachims, 1999). Other methods are based on s-t mincuts (Blum & Chawla, 2001) or on multi-way cuts (Kleinberg & Tardos, 1999). Related is also the idea of Co-Training (Blum & Mitchell, 1998), which exploits structure resulting from two redundant representations. We will study...

229 | New spectral methods for ratio cut partitioning and clustering
- Hagen, Kahng
- 1992
Citation Context: ...e cut. min_y cut(G+, G−) / (|{i : y_i = 1}| · |{i : y_i = −1}|) (10) s.t. y_i = 1, if i ∈ Y_l and positive (11); y_i = −1, if i ∈ Y_l and negative (12); y ∈ {+1, −1}^n (13). This problem is related to the ratiocut (Hagen & Kahng, 1992). However, the traditional ratiocut problem is unsupervised, i.e. there are no constraints (11) and (12). Solving the unconstrained ratiocut is known to be NP hard (Shi & Malik, 2000). However, effic...
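
Objective (10) divides the weight crossing the cut by the product of the two class sizes, so balanced cuts through weak edges score best. A small sketch evaluating this quantity for candidate labelings (the graph and labelings are illustrative assumptions):

```python
import numpy as np

def ratiocut(A, y):
    """Ratio-cut value: cut(G+, G-) / (|{i: y_i=+1}| * |{i: y_i=-1}|)."""
    pos, neg = y == 1, y == -1
    cut = A[np.ix_(pos, neg)].sum()          # total edge weight crossing the cut
    return cut / (pos.sum() * neg.sum())

# Toy graph: two tight pairs joined by one unit-weight edge.
A = np.array([[0, 2, 0, 0],
              [2, 0, 1, 0],
              [0, 1, 0, 2],
              [0, 0, 2, 0]], dtype=float)

balanced   = np.array([ 1,  1, -1, -1])   # cuts only the weak edge
unbalanced = np.array([ 1, -1, -1, -1])   # cuts a strong edge

print(ratiocut(A, balanced))    # 1 / (2*2) = 0.25
print(ratiocut(A, unbalanced))  # 2 / (1*3) ≈ 0.667
```

The balanced cut wins even though both labelings sever the graph, which is exactly the bias the ratiocut normalization is designed to provide over a plain mincut.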

221 | Partially Labeled Classification with Markov Random Walks
- Szummer, Jaakkola
- 2001
Citation Context: ...spectrum. Szummer and Jaakkola apply short random walks on the kNN graph for labeling test examples, exploiting that a random walk will less likely cross cluster boundaries, but stay within clusters (Szummer & Jaakkola, 2001). There might be an interesting connection to the SGT, since the normalized cut minimizes the transition probability of a random walk over the cut (Meila & Shi, 2001). This might also lead to a conne...
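
The short-random-walk intuition quoted here can be made concrete with a row-stochastic transition matrix P = B⁻¹A: after a few steps, most probability mass started inside a cluster is still inside it. A sketch on an assumed toy graph (not data from either paper):

```python
import numpy as np

# Toy kNN-style graph: clusters {0,1} and {2,3} joined by one weak edge.
A = np.array([[0, 2, 0, 0],
              [2, 0, 1, 0],
              [0, 1, 0, 2],
              [0, 0, 2, 0]], dtype=float)

P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix B^-1 A

t = 3
Pt = np.linalg.matrix_power(P, t)      # t-step transition probabilities

# Probability that a 3-step walk started at node 0 ends inside its own cluster:
stay = Pt[0, :2].sum()
print(round(stay, 3))  # → 0.778
```

Because the only escape route is the low-weight edge, the walk stays home with high probability, which is why a label read off from short-walk hitting probabilities tends to respect cluster boundaries.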

171 | A random walks view of spectral segmentation
- Meilă, Shi
- 2001
Citation Context: ...but stay within clusters (Szummer & Jaakkola, 2001). There might be an interesting connection to the SGT, since the normalized cut minimizes the transition probability of a random walk over the cut (Meila & Shi, 2001). This might also lead to a connection to the generative modeling approach of Nigam et al., where the label of each test example is a latent variable (Nigam et al., 2000). 6. Experiments To evaluate ...

166 | Approximation algorithms for classification problems with pairwise relationships: Metric labeling and markov random fields
- Kleinberg, Tardos
- 1999
Citation Context: ...Vapnik introduced transductive SVMs (Vapnik, 1998) which were later refined by (Bennett, 1999) and (Joachims, 1999). Other methods are based on s-t mincuts (Blum & Chawla, 2001) or on multi-way cuts (Kleinberg & Tardos, 1999). Related is also the idea of Co-Training (Blum & Mitchell, 1998), which exploits structure resulting from two redundant representations. We will study what these approaches have in common and where ...

155 | Cluster kernels for semi-supervised learning
- Chapelle, Weston, et al.
- 2003
Citation Context: ...values and eigenvectors of L and store them in D and V. • To normalize the spectrum of the graph, replace the eigenvalues in D with some monotonically increasing function. We use D_ii = i² (see also (Chapelle et al., 2002)). Fixing the spectrum of the graph in this way abstracts, for example, from different magnitudes of edge weights, and focuses on the ranking among the smallest cuts. The following steps have to be done for...
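
One way to read the normalization step quoted above: keep the eigenvectors of L but overwrite the eigenvalues with the fixed sequence i², so only the ranking among cuts survives, not their magnitudes. A sketch under that reading (the toy graph is an assumption; the paper applies this inside the full SGT pipeline):

```python
import numpy as np

# Toy graph: two 2-node clusters joined by one weak edge.
A = np.array([[0, 2, 0, 0],
              [2, 0, 1, 0],
              [0, 1, 0, 2],
              [0, 0, 2, 0]], dtype=float)
B = np.diag(A.sum(axis=1))
L = B - A

eigvals, V = np.linalg.eigh(L)          # D would hold eigvals, V the eigenvectors

# Normalize the spectrum: substitute the monotone sequence D_ii = i^2,
# keeping the eigenvectors (and hence the cut structure) unchanged.
D = np.diag(np.arange(1, L.shape[0] + 1) ** 2.0)
L_norm = V @ D @ V.T                    # Laplacian with normalized spectrum

print(np.round(eigvals, 3))                     # original spectrum (smallest is 0)
print(np.round(np.linalg.eigvalsh(L_norm), 3))  # normalized spectrum: 1, 4, 9, 16
```

Since V is orthonormal, L_norm has exactly the substituted eigenvalues, so two graphs whose edge weights differ only in scale map to the same normalized operator.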

58 | Combining support vector and mathematical programming methods for classification
- Bennett
- 1999
Citation Context: ...set and potentially exploit structure in their distribution. Several methods have been designed with this goal in mind. Vapnik introduced transductive SVMs (Vapnik, 1998) which were later refined by (Bennett, 1999) and (Joachims, 1999). Other methods are based on s-t mincuts (Blum & Chawla, 2001) or on multi-way cuts (Kleinberg & Tardos, 1999). Related is also the idea of Co-Training (Blum & Mitchell, 1998), w...

24 | Transductive inference for estimating values of functions
- Chapelle, Vapnik, et al.
- 1999
Citation Context: ...st examples so that the margin is maximized. A large-margin SVM can be shown to have low leave-one-out error (Vapnik, 1998). Other transductive learning algorithms like transductive ridge-regression (Chapelle et al., 1999) and mincuts (Blum & Chawla, 2001) minimize leave-one-out error as well. However, leave-one-out is not the only measure of self-consistency. The co-training algorithm (Blum & Mitchell, 1998) maximize...

19 | A constrained eigenvalue problem (Linear Algebra and its Applications)
- Gander, Golub, et al.
- 1989