## Fast subtree kernels on graphs

Citations: 16 (2 self)

### BibTeX

@MISC{Shervashidze_fastsubtree,
    author = {Nino Shervashidze and Karsten M. Borgwardt},
    title = {Fast subtree kernels on graphs},
    year = {}
}

### Abstract

In this article, we propose fast subtree kernels on graphs. On graphs with n nodes, m edges, and maximum degree d, these kernels, which compare subtrees of height h, can be computed in O(mh), whereas the classic subtree kernel by Ramon & Gärtner scales as O(n^2 4^d h). Key to this efficiency is the observation that the Weisfeiler-Lehman test of isomorphism from graph theory elegantly computes a subtree kernel as a byproduct. Our fast subtree kernels can deal with labeled graphs, scale up easily to large graphs, and outperform state-of-the-art graph kernels on several classification benchmark datasets in terms of accuracy and runtime.
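The idea in the abstract can be illustrated with a minimal Python sketch. It assumes one common formulation of the kernel: at each of h rounds of Weisfeiler-Lehman relabeling, count the pairs of nodes (one from each graph) that carry the same label, and sum these counts. The function name `wl_subtree_kernel`, the adjacency-dict input format, and the choice to count the initial labels at round 0 are assumptions made for illustration, not details taken from the paper. The sketch also uses a comparison sort on neighbour labels, so it does not reproduce the O(mh) bound exactly.

```python
from collections import Counter


def wl_subtree_kernel(adj1, labels1, adj2, labels2, h):
    """Sum, over rounds 0..h, the number of node pairs with equal labels.

    adj*:    dict mapping node -> list of neighbouring nodes
    labels*: dict mapping node -> initial label (labels must be mutually sortable)
    """
    graphs = [(adj1, dict(labels1)), (adj2, dict(labels2))]
    kernel = 0
    for iteration in range(h + 1):
        # Compare the label histograms of the two graphs at this round.
        counts1 = Counter(graphs[0][1].values())
        counts2 = Counter(graphs[1][1].values())
        kernel += sum(counts1[l] * counts2[l] for l in counts1.keys() & counts2.keys())
        if iteration == h:
            break
        # Relabel: a node's signature is its own label plus the sorted
        # multiset of its neighbours' labels. The compression table is
        # shared by both graphs so equal signatures get equal new labels.
        compress = {}
        relabeled = []
        for adj, labels in graphs:
            new_labels = {}
            for node, neighbours in adj.items():
                signature = (labels[node], tuple(sorted(labels[u] for u in neighbours)))
                new_labels[node] = compress.setdefault(signature, len(compress))
            relabeled.append((adj, new_labels))
        graphs = relabeled
    return kernel


# Hypothetical toy input: a 3-node path compared with itself.
path = {0: [1], 1: [0, 2], 2: [1]}
uniform = {0: "A", 1: "A", 2: "A"}
print(wl_subtree_kernel(path, uniform, path, uniform, 1))  # 14
```

With uniform labels, round 0 contributes 3 × 3 = 9 matching pairs; after one relabeling the two end nodes share one label and the middle node has another, contributing 2 × 2 + 1 × 1 = 5.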

### Citations

145 | Marginalized kernels between labeled graphs
- Kashima, Tsuda, et al.
- 2003
Citation Context ...of kernel machines that reaches deep into graph mining. Several different graph kernels have been defined in machine learning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels base...

129 | On graph kernels: Hardness results and efficient alternatives
- Gaertner, Flach, et al.
- 2003
Citation Context ...of kernel machines that reaches deep into graph mining. Several different graph kernels have been defined in machine learning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels base...

52 | Cyclic pattern kernels for predictive graph mining
- Horváth, Gärtner, et al.
- 2004
Citation Context ...erent graph kernels have been defined in machine learning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels based on walks [12] and on limited-size subgraphs [11], it is unclear how...

51 | Structure-Activity Relationship of Mutagenic Aromatic and Heteroaromatic Nitro Compounds. Correlation with molecular orbital energies and hydrophobicity
- Debnath, Compadre, et al.
- 1991
Citation Context ...ler-Lehman kernel that we use in the following graph classification tasks. 4.2 Graph classification Datasets We employed the following datasets in our experiments: MUTAG, NCI1, NCI109, and D&D. MUTAG [3] is a dataset of 188 mutagenic aromatic and heteroaromatic nitro compounds labeled according to whether or not they have a mutagenic effect on the Gram-negative bacterium Salmonella typhimurium. We al...

35 | Shortest-path kernels on graphs
- Borgwardt, Kriegel
- 2005
Citation Context ...s that reaches deep into graph mining. Several different graph kernels have been defined in machine learning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels based on walks [12...

35 | Expressivity versus efficiency of graph kernels
- Ramon, Gärtner
- 2003
Citation Context ...ning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels based on walks [12] and on limited-size subgraphs [11], it is unclear how to compute subtree kernels efficiently. As a consequ...

34 | Extensions of marginalized graph kernels
- Mahé, Ueda, et al.
- 2004
Citation Context ...[9] and [1] refine the above definition for applications in chemoinformatics and hand-written digit recognition. Mahé and Vert [9] define extensions of the classic subtree kernel that avoid tottering [8] and consider unbalanced subtrees. Both [9] and [1] propose to consider α-ary subtrees with at most α children per node. This restricts the set of matchings to matchings of up to α nodes, but the runt...

28 | Graph kernels based on tree patterns for molecules
- Mahé, Vert
- 2009
Citation Context ...ning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels based on walks [12] and on limited-size subgraphs [11], it is unclear how to compute subtree kernels efficiently. As a consequ...

27 | Distinguishing enzyme structures from non-enzymes without alignments
- Dobson, Doig
- 2007
Citation Context ...s of NCI1 and NCI109, which classify compounds based on whether or not they are active in an anti-cancer screen ([13] and http://pubchem.ncbi.nlm.nih.gov). D&D is a dataset of 1178 protein structures [4]. Each protein is represented by a graph, in which the nodes are amino acids and two nodes are connected by an edge if they are less than 6 Angstroms apart. The prediction task is to classify the prot...
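The graph construction described for D&D in the context above (amino acids as nodes, an edge between any two that are less than 6 Angstroms apart) can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not say how the position of an amino acid is defined, so the sketch assumes one 3D coordinate per residue, and the name `contact_graph` and its input format are hypothetical.

```python
from itertools import combinations
from math import dist


def contact_graph(coords, threshold=6.0):
    """Return edges (i, j), i < j, between residues closer than `threshold`.

    coords: list of (x, y, z) positions in Angstroms, one per amino acid
            (which atom or centroid represents a residue is an assumption).
    """
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(coords), 2)
        if dist(p, q) < threshold  # strict "less than 6 Angstroms"
    ]


# Hypothetical toy input: three residues on a line.
print(contact_graph([(0, 0, 0), (3, 0, 0), (10, 0, 0)]))  # [(0, 1)]
```

Checking all pairs is quadratic in the number of residues; a spatial index would be the usual choice for large structures, but is left out to keep the sketch short.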

26 | Fast computation of graph kernels
- Vishwanathan, Borgwardt, et al.
- 2006
Citation Context ...[2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels based on walks [12] and on limited-size subgraphs [11], it is unclear how to compute subtree kernels efficiently. As a consequence, they have been applied to relatively small graphs representing chemical compounds [9] o...

26 | Comparison of descriptor spaces for chemical compound retrieval and classification
- Wale, Karypis
- 2006
Citation Context ... bacterium Salmonella typhimurium. We also conducted experiments on two balanced subsets of NCI1 and NCI109, which classify compounds based on whether or not they are active in an anti-cancer screen ([13] and http://pubchem.ncbi.nlm.nih.gov). D&D is a dataset of 1178 protein structures [4]. Each protein is represented by a graph, in which the nodes are amino acids and two nodes are connected by an edg...

22 | Efficient graphlet kernels for large graph comparison
- Shervashidze, Vishwanathan, et al.
Citation Context ...erent graph kernels have been defined in machine learning which can be categorized into three classes: graph kernels based on walks [5, 7] and paths [2], graph kernels based on limited-size subgraphs [6, 11], and graph kernels based on subtree patterns [9, 10]. While fast computation techniques have been developed for graph kernels based on walks [12] and on limited-size subgraphs [11], it is unclear how...

19 | A reduction of a graph to a canonical form and an algebra arising during this reduction
- Weisfeiler
- 1968
Citation Context ...clarity of presentation. 3 Fast subtree kernels 3.1 The Weisfeiler-Lehman test of isomorphism Our algorithm for computing a fast subtree kernel builds upon the Weisfeiler-Lehman test of isomorphism [14], more specifically its 1-dimensional variant, also known as “naive vertex refinement”, which we describe in the following. Assume we are given two graphs G and G′ and we would like to test whether t...

11 | Graph kernels between point clouds
- Bach
- 2008
Citation Context ...subgraphs [11], it is unclear how to compute subtree kernels efficiently. As a consequence, they have been applied to relatively small graphs representing chemical compounds [9] or handwritten digits [1], with approximately twenty nodes on average. But could one speed up subtree kernels to make them usable on graphs with hundreds of nodes, as they arise in protein structure models or in program flow ...