## Semi-supervised graph clustering: a kernel approach (2008)

Citations: 47 (2 self)

### BibTeX

```bibtex
@MISC{Kulis08semi-supervisedgraph,
  author = {Brian Kulis and Sugato Basu and Inderjit Dhillon and Raymond Mooney},
  title = {Semi-supervised graph clustering: a kernel approach},
  year = {2008}
}
```

### Abstract

Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors. In this paper, we unify vector-based and graph-based approaches. We first show that a recently-proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the weighted kernel k-means objective (Dhillon et al., in Proceedings of the 10th International Conference on Knowledge Discovery and Data Mining, 2004a). A recent theoretical connection between weighted kernel k-means and several graph clustering objectives enables us to perform semi-supervised clustering of data given either as vectors or as a graph. For graph data, this result leads to algorithms for optimizing several new semi-supervised graph clustering objectives. For vector data, the kernel approach also enables us to find clusters with non-linear boundaries in the input data space. Furthermore, we show that recent work on spectral learning (Kamvar et al., in Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2003) may be viewed as a special case of our formulation. We empirically show that our algorithm is able to outperform current state-of-the-art semi-supervised algorithms on both vector-based and graph-based data sets.
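The construction summarized in the abstract — folding pairwise constraints into a kernel matrix and then running kernel k-means — can be sketched in a few lines. The following is an illustrative simplification, not the paper's implementation: it uses a plain linear kernel, uniform point weights, a hypothetical penalty weight `w`, and a diagonal shift to keep the matrix positive semidefinite.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=50):
    """Plain (unweighted) kernel k-means on a precomputed kernel matrix K."""
    n = K.shape[0]
    labels = np.arange(n) % k           # deterministic init; every cluster non-empty
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                continue
            # ||phi(x_i) - m_c||^2 = K_ii - (2/|c|) sum_j K_ij + (1/|c|^2) sum_{j,l} K_jl
            dist[:, c] = (diag
                          - 2.0 * K[:, idx].sum(axis=1) / idx.size
                          + K[np.ix_(idx, idx)].sum() / idx.size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def constrained_kernel(X, must_link, cannot_link, w=10.0):
    """Linear kernel plus constraint penalties, shifted to stay positive semidefinite."""
    K = X @ X.T
    for i, j in must_link:
        K[i, j] += w; K[j, i] += w      # reward placing i, j in the same cluster
    for i, j in cannot_link:
        K[i, j] -= w; K[j, i] -= w      # penalize placing i, j in the same cluster
    sigma = max(0.0, -np.linalg.eigvalsh(K).min())
    return K + sigma * np.eye(len(X))   # diagonal shift; see note below
```

With a handful of must-link and cannot-link pairs on well-separated data, `kernel_kmeans(constrained_kernel(X, must, cannot), k)` returns a partition consistent with the constraints. The diagonal shift only adds a constant to the unweighted objective (σ(n − k) over any k-way partition), so it restores positive semidefiniteness without changing which partitions are optimal.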

### Citations

8565 | Elements of information theory - Cover, Thomas - 1991 |

3921 | Pattern Classification and Scene Analysis - Duda, Hart - 1973 |

2590 | Normalized cuts and image segmentation - Shi, Malik - 1997 |

Citation Context: ...r our proposed kernel-based semi-supervised clustering algorithm SS-Kernel-kmeans, and also describe related research. 2.1. Graph Clustering and Kernel k-means In graph clustering (Chan et al., 1994; Shi & Malik, 2000), the input is assumed to be a graph G = (V, E, A), where V is the set of vertices, E is the set of edges, and A is the edge affinity matrix. Aij represents the edge-weight between vertex i and j. Let...

943 | An Introduction to Support Vector Machines - Cristianini, Shawe-Taylor - 2000 |

Citation Context: ...mapped space without explicitly knowing the mapping of xi and xj to φ(xi) and φ(xj) respectively. It can easily be shown that any positive semidefinite matrix K can be thought of as a kernel matrix (Cristianini & Shawe-Taylor, 2000). [footnote 1: k disjoint data subsets, whose union is the whole data set] Using the kernel matrix K, the distance computation is rewritten as: Kii − 2 Σ_{xj∈πc} αj K...
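The formula truncated in this excerpt is the standard kernel-space distance: with uniform weights, ‖φ(xi) − mc‖² = Kii − (2/|πc|) Σ_{xj∈πc} Kij + (1/|πc|²) Σ_{xj,xl∈πc} Kjl (the paper's αj generalize this to weighted points). A quick numerical check of the uniform-weight identity, using a linear kernel so the feature-space side can also be computed explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))   # 8 points in R^3
K = X @ X.T                       # linear kernel: phi(x) = x
idx = np.array([0, 2, 5])         # indices of cluster pi_c
i = 4                             # point whose distance to m_c we compute

# Kernel side: K_ii - (2/|c|) sum_j K_ij + (1/|c|^2) sum_{j,l} K_jl
d_kernel = (K[i, i]
            - 2.0 * K[i, idx].mean()
            + K[np.ix_(idx, idx)].mean())

# Feature side: squared distance to the cluster mean, computed explicitly
d_explicit = np.sum((X[i] - X[idx].mean(axis=0)) ** 2)

assert np.isclose(d_kernel, d_explicit)
```

The same computation works for any positive semidefinite K, which is the point of the excerpt: the mapping φ never has to be formed.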

504 | Distance metric learning with application to clustering with side-information - Xing, Ng, et al. - 2003 |

Citation Context: ...porate prior information about clusters into the algorithm in order to improve the clustering results. A number of recent papers have explored this problem (Wagstaff et al., 2001; Klein et al., 2002; Xing et al., 2003; Kamvar et al., 2003; Bar-Hillel et al., 2003; Basu et al., 2004). As is common for most semi-supervised clustering algorithms, we assume that we have pairwise must-link constraints (pairs of points ...

368 | The pyramid matching kernel: Discriminative classification with sets of image features - Grauman, Darrell |

326 | Constrained k-means clustering with background knowledge - Wagstaff, Cardie, et al. - 2001 |

Citation Context: ...supervised clustering, the goal is to incorporate prior information about clusters into the algorithm in order to improve the clustering results. A number of recent papers have explored this problem (Wagstaff et al., 2001; Klein et al., 2002; Xing et al., 2003; Kamvar et al., 2003; Bar-Hillel et al., 2003; Basu et al., 2004). As is common for most semi-supervised clustering algorithms, we assume that we have pairwise ...

280 | Semi-Supervised Learning - Chapelle, Schölkopf, et al. - 2006 |

219 | Correlation clustering - Bansal, Blum, et al. - 2004 |

183 | A probabilistic framework for semi-supervised clustering - Basu, Bilenko, et al. - 2004 |

Citation Context: ...er to improve the clustering results. A number of recent papers have explored this problem (Wagstaff et al., 2001; Klein et al., 2002; Xing et al., 2003; Kamvar et al., 2003; Bar-Hillel et al., 2003; Basu et al., 2004). As is common for most semi-supervised clustering algorithms, we assume that we have pairwise must-link constraints (pairs of points that should belong in the same cluster) and cannot-link constrain...

180 | Integrating constraints and metric learning in semi-supervised clustering - Bilenko, Basu, et al. - 2004 |

179 | Multiclass spectral clustering - Yu, Shi - 2003 |

166 | A Random Walks View of Spectral Segmentation - Meila, Shi - 2001 |

158 | Approximation algorithms for classification problems with pairwise relationships: Metric labeling and Markov random fields - Kleinberg, Tardos - 1999 |

155 | KEGG: Kyoto Encyclopedia of Genes and Genomes - Ogata, Goto, et al. - 1999 |

Citation Context: ...rom the UCI Machine Learning Repository². It has 317 points in a 16-dimensional space. 3. GeneNetwork: An interaction network between 216 yeast genes, where each gene is labeled with one of 3 KEGG (Ogata et al., 1999) functional pathway labels. This data is a subgraph of a high-quality probabilistic functional network of yeast genes (Lee et al., 2004): each edge weight in this network represents a probability of ...

154 | Impact of similarity measures on web-page clustering - Strehl, Ghosh, et al. - 2000 |

Citation Context: ...quality of clusters, by measuring the amount of statistical information shared by the random variables representing the cluster distribution and the underlying class distribution of the data points (Strehl et al., 2000). 4.3. Results and Discussion Figure 2 shows the results on the TwoCircle data set. This synthetic dataset is used to demonstrate the ef...
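The NMI measure referenced here can be computed directly from two labelings. A minimal sketch, assuming the geometric-mean normalization NMI(U, V) = I(U; V) / √(H(U) H(V)) — one of several normalization variants in the literature:

```python
import numpy as np

def nmi(a, b):
    """Normalized mutual information between two labelings (geometric-mean form)."""
    a, b = np.asarray(a), np.asarray(b)
    n = a.size
    mi = 0.0
    for u in np.unique(a):
        for v in np.unique(b):
            n_uv = np.count_nonzero((a == u) & (b == v))
            if n_uv == 0:
                continue            # 0 * log(0) term contributes nothing
            n_u = np.count_nonzero(a == u)
            n_v = np.count_nonzero(b == v)
            mi += (n_uv / n) * np.log(n * n_uv / (n_u * n_v))
    def entropy(x):
        _, counts = np.unique(x, return_counts=True)
        p = counts / x.size
        return float(-(p * np.log(p)).sum())
    h = entropy(a) * entropy(b)
    return mi / np.sqrt(h) if h > 0 else 0.0
```

A label permutation of a perfect clustering scores 1.0, while statistically independent labelings score 0, which is what makes NMI invariant to how cluster indices are numbered.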

152 | From instance-level constraints to space-level constraints: Making the most of prior knowledge in data clustering - Klein, Kamvar, et al. - 2002 |

Citation Context: ...the goal is to incorporate prior information about clusters into the algorithm in order to improve the clustering results. A number of recent papers have explored this problem (Wagstaff et al., 2001; Klein et al., 2002; Xing et al., 2003; Kamvar et al., 2003; Bar-Hillel et al., 2003; Basu et al., 2004). As is common for most semi-supervised clustering algorithms, we assume that we have pairwise must-link constraint...

144 | Semi-supervised clustering by seeding - Basu, Banerjee, et al. - 2002 |

133 | Learning distance functions using equivalence relations - Bar-Hillel, Hertz, et al. - 2003 |

Citation Context: ...into the algorithm in order to improve the clustering results. A number of recent papers have explored this problem (Wagstaff et al., 2001; Klein et al., 2002; Xing et al., 2003; Kamvar et al., 2003; Bar-Hillel et al., 2003; Basu et al., 2004). As is common for most semi-supervised clustering algorithms, we assume that we have pairwise must-link constraints (pairs of points that should belong in the same cluster) and ca...

132 | A probabilistic functional network of yeast genes - Lee, Date, et al. - 2004 |

122 | Kernel k-means: spectral clustering and normalized cuts - Dhillon, Guan, et al. - 2004 |

Citation Context: ...1. We show that the HMRF semi-supervised clustering objective with squared Euclidean distance and cluster-size weighted penalties is a special case of the weighted kernel k-means objective function (Dhillon et al., 2004a). Given input data in the form of vectors and pairwise constraints, we show how to construct a kernel such that running kernel k-means results in a monotonic decrease of the semi-supervised cluster...

95 | Clustering with qualitative information - Charikar, Guruswami, et al. - 2005 |

89 | Active Semi-Supervision for Pairwise Constrained Clustering - Basu, Banerjee, et al. - 2004 |

79 | Clustering based on conditional distributions in an auxiliary space - Sinkkonen, Kaski |

78 | Weighted Graph Cuts without Eigenvectors: A Multilevel Approach - Dhillon, Guan, et al. |

70 | Spectral learning - Kamvar, Klein, et al. - 2003 |

Citation Context: ...ors or as a graph. For vector data, the kernel approach also enables us to find clusters with nonlinear boundaries in the input data space. Furthermore, we show that recent work on spectral learning (Kamvar et al., 2003) may be viewed as a special case of our formulation. We empirically show that our algorithm is able to outperform current state-of-the-art semi-supervised algorithms on both vector-based and graph-bas...

64 | Semi-supervised clustering using genetic algorithms - Demiriz, Embrechts - 1999 |

58 | Clustering with constraints: Feasibility issues and the k-means algorithm - Davidson, Ravi - 2005 |

51 | A Unified View of Kernel k-Means, Spectral Clustering and Graph Partitioning - Dhillon, Guan, et al. - 2005 |

Citation Context: ...1. We show that the HMRF semi-supervised clustering objective with squared Euclidean distance and cluster-size weighted penalties is a special case of the weighted kernel k-means objective function (Dhillon et al., 2004a). Given input data in the form of vectors and pairwise constraints, we show how to construct a kernel such that running kernel k-means results in a monotonic decrease of the semi-supervised cluster...

47 | Correlation clustering with partial information - Demaine, Immorlica |

37 | Semi-supervised learning with penalized probabilistic clustering - Lu, Leen - 2004 |

32 | Hierarchical clustering with constraints: theory and practice - Davidson, Ravi - 2005 |

27 | A Discriminative Learning Framework with Pairwise Constraints for Video Object Classification - Yan, Zhang, et al. - 2004 |

23 | Locally linear metric adaptation for semi-supervised clustering - Chang, Yeung - 2004 |

22 | Learning with constrained and unlabelled data - Lange - 2005 |

18 | Model-based clustering with probabilistic constraints - Law, Topchy, et al. - 2005 |

16 | Spectral k-way ratio cut partitioning - Chan, Schlag, et al. - 1994 |

Citation Context: ...ssary background for our proposed kernel-based semi-supervised clustering algorithm SS-Kernel-kmeans, and also describe related research. 2.1. Graph Clustering and Kernel k-means In graph clustering (Chan et al., 1994; Shi & Malik, 2000), the input is assumed to be a graph G = (V, E, A), where V is the set of vertices, E is the set of edges, and A is the edge affinity matrix. Aij represents the edge-weight between ...

10 | Fast low-rank semidefinite programming for embedding and clustering - Kulis, Surendran, et al. - 2007 |

3 | Efficiently learning the metric using side-information - Bie, Momma, et al. - 2003 |

2 | A comparison of inference techniques for semi-supervised clustering with hidden Markov random fields - Bilenko, Basu - 2004 |

1 | Kernels and regularization on graphs - Smola, Kondor - 2003 |