Results 1  10
of
102
Kernels and Distances for Structured Data
 Machine Learning
, 2004
"... This paper brings together two strands of machine learning of increasing importance: kernel methods and highly structured data. We propose a general method for constructing a kernel following the syntactic structure of the data, as defined by its type signature in a higherorder logic. Our main theo ..."
Abstract

Cited by 49 (3 self)
 Add to MetaCart
This paper brings together two strands of machine learning of increasing importance: kernel methods and highly structured data. We propose a general method for constructing a kernel following the syntactic structure of the data, as defined by its type signature in a higherorder logic. Our main theoretical result is the positive definiteness of any kernel thus defined. We report encouraging experimental results on a range of realworld datasets. By converting our kernel to a distance pseudometric for 1nearest neighbour, we were able to improve the best accuracy from the literature on the Diterpene dataset by more than 10%.
Graph Kernels and Gaussian Processes for Relational Reinforcement Learning
 Machine Learning
, 2003
"... Relational reinforcement learning is a Qlearning technique for relational stateaction spaces. It aims to enable agents to learn how to act in an environment that has no natural representation as a tuple of constants. In this case, the learning algorithm used to approximate the mapping between stat ..."
Abstract

Cited by 40 (9 self)
 Add to MetaCart
Relational reinforcement learning is a Qlearning technique for relational stateaction spaces. It aims to enable agents to learn how to act in an environment that has no natural representation as a tuple of constants. In this case, the learning algorithm used to approximate the mapping between stateaction pairs and their so called Q(uality)value has to be not only very reliable, but it also has to be able to handle the relational representation of stateaction pairs. In this paper we investigate...
Graph Kernels
, 2007
"... We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexit ..."
Abstract

Cited by 37 (4 self)
 Add to MetaCart
We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with n vertices from O(n 6) to O(n 3). We find a spectral decomposition approach even more efficient when computing entire kernel matrices. For labeled graphs we develop conjugate gradient and fixedpoint methods that take O(dn 3) time per iteration, where d is the size of the label set. By extending the necessary linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for ddimensional edge kernels, and O(n 4) in the infinitedimensional case; on sparse graphs these algorithms only take O(n 2) time per iteration in all cases. Experiments on graphs from bioinformatics and other application domains show that these techniques can speed up computation of the kernel by an order of magnitude or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to Rconvolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semidefinite.
ShortestPath Kernels on Graphs
 In Proceedings of the 2005 International Conference on Data Mining
, 2005
"... Data mining algorithms are facing the challenge to deal with an increasing number of complex objects. For graph data, a whole toolbox of data mining algorithms becomes available by defining a kernel function on instances of graphs. Graph kernels based on walks, subtrees and cycles in graphs have bee ..."
Abstract

Cited by 35 (5 self)
 Add to MetaCart
Data mining algorithms are facing the challenge to deal with an increasing number of complex objects. For graph data, a whole toolbox of data mining algorithms becomes available by defining a kernel function on instances of graphs. Graph kernels based on walks, subtrees and cycles in graphs have been proposed so far. As a general problem, these kernels are either computationally expensive or limited in their expressiveness. We try to overcome this problem by defining expressive graph kernels which are based on paths. As the computation of all paths and longest paths in a graph is NPhard, we propose graph kernels based on shortest paths. These kernels are computable in polynomial time, retain expressivity and are still positive definite. In experiments on classification of graph models of proteins, our shortestpath kernels show significantly higher classification accuracy than walkbased kernels. 1
Expressivity versus efficiency of graph kernels
 Proceedings of the First International Workshop on Mining Graphs, Trees and Sequences
, 2003
"... Abstract. Recently, kernel methods have become a popular tool for machine learning and data mining. As most ‘realworld ’ data is structured, research in kernel methods has begun investigating kernels for various kinds of structured data. One of the most widely used tools for modeling structured dat ..."
Abstract

Cited by 35 (0 self)
 Add to MetaCart
Abstract. Recently, kernel methods have become a popular tool for machine learning and data mining. As most ‘realworld ’ data is structured, research in kernel methods has begun investigating kernels for various kinds of structured data. One of the most widely used tools for modeling structured data are graphs. In this paper we study the tradeoff between expressivity and efficiency of graph kernels. First, we motivate the need for this discussion by showing that fully general graph kernels can not even be approximated efficiently. We also discuss generalizations of graph kernels defined in literature and show that they are either not positive definite or not very useful. Finally, we propose a new graph kernel based on subtree patterns. We argue that while a little more computationally expensive, this kernel is more expressive than kernels based on walks. 1
Binetcauchy kernels on dynamical systems and its application to the analysis of dynamic scenes
 International Journal of Computer Vision
, 2005
"... Abstract. We derive a family of kernels on dynamical systems by applying the BinetCauchy theorem to trajectories of states. Our derivation provides a unifying framework for all kernels on dynamical systems currently used in machine learning, including kernels derived from the behavioral framework, ..."
Abstract

Cited by 32 (12 self)
 Add to MetaCart
Abstract. We derive a family of kernels on dynamical systems by applying the BinetCauchy theorem to trajectories of states. Our derivation provides a unifying framework for all kernels on dynamical systems currently used in machine learning, including kernels derived from the behavioral framework, diffusion processes, marginalized kernels, kernels on graphs, and the kernels on sets arising from the subspace angle approach. In the case of linear timeinvariant systems, we derive explicit formulae for computing the proposed BinetCauchy kernels by solving Sylvester equations, and relate the proposed kernels to existing kernels based on cepstrum coefficients and subspace angles. Besides their theoretical appeal, these kernels can be used efficiently in the comparison of video sequences of dynamic scenes that can be modeled as the output of a linear timeinvariant dynamical system. One advantage of our kernels is that they take the initial conditions of the dynamical systems into account. As a first example, we use our kernels to compare video sequences of dynamic textures. As a second example, we apply our kernels to the problem of clustering short clips of a movie. Experimental evidence shows superior performance of our kernels. Keywords: BinetCauchy theorem, ARMA models and dynamical systems, Sylvester
Graph kernels for molecular structureactivity relationship analysis with support vector machines
 Journal of Chemical Information and Modeling
"... The support vector machine algorithm together with graph kernel functions has recently been introduced to model structureactivity relationships (SAR) of molecules from their 2D structure, without the need for explicit molecular descriptor computation. We propose two extensions to this approach with ..."
Abstract

Cited by 29 (5 self)
 Add to MetaCart
The support vector machine algorithm together with graph kernel functions has recently been introduced to model structureactivity relationships (SAR) of molecules from their 2D structure, without the need for explicit molecular descriptor computation. We propose two extensions to this approach with the double goal to reduce the computational burden associated with the model and to enhance its predictive accuracy: description of the molecules by a Morgan index process and definition of a secondorder Markov model for random walks on 2D structures. Experiments on two mutagenicity data sets validate the proposed extensions, making this approach a possible complementary alternative to other modeling strategies. 1.
Image classification with segmentation graph kernels
 In Proc. CVPR
, 2007
"... We propose a family of kernels between images, defined as kernels between their respective segmentation graphs. The kernels are based on soft matching of subtreepatterns of the respective graphs, leveraging the natural structure of images while remaining robust to the associated segmentation proces ..."
Abstract

Cited by 26 (10 self)
 Add to MetaCart
We propose a family of kernels between images, defined as kernels between their respective segmentation graphs. The kernels are based on soft matching of subtreepatterns of the respective graphs, leveraging the natural structure of images while remaining robust to the associated segmentation process uncertainty. Indeed, output from morphological segmentation is often represented by a labelled graph, each vertex corresponding to a segmented region, with edges joining neighboring regions. However, such image representations have mostly remained underused for learning tasks, partly because of the observed instability of the segmentation process and the inherent hardness of inexact graph matching with uncertain graphs. Our kernels count common virtual substructures amongst images, which enables to perform efficient supervised classification of natural images with a support vector machine. Moreover, the kernel machinery allows us to take advantage of recent advances in kernelbased learning: i) semisupervised learning reduces the required number of labelled images, while ii) multiple kernel learning algorithms efficiently select the most relevant similarity measures between images within our family. 1.
Fast computation of graph kernels
 In
, 2007
"... Using extensions of linear algebra concepts to Reproducing Kernel Hilbert Spaces (RKHS), we define a unifying framework for random walk kernels on graphs. Reduction to a Sylvester equation allows us to compute many of these kernels in O(n 3) worstcase time. This includes kernels whose previous wors ..."
Abstract

Cited by 26 (7 self)
 Add to MetaCart
Using extensions of linear algebra concepts to Reproducing Kernel Hilbert Spaces (RKHS), we define a unifying framework for random walk kernels on graphs. Reduction to a Sylvester equation allows us to compute many of these kernels in O(n 3) worstcase time. This includes kernels whose previous worstcase time complexity was O(n 6), such as the geometric kernels of Gärtner et al. [1] and the marginal graph kernels of Kashima et al. [2]. Our algebra in RKHS allow us to exploit sparsity in directed and undirected graphs more effectively than previous methods, yielding subcubic computational complexity when combined with conjugate gradient solvers or fixedpoint iterations. Experiments on graphs from bioinformatics and other application domains show that our algorithms are often more than 1000 times faster than existing approaches. 1