## Graph Kernels and Gaussian Processes for Relational Reinforcement Learning (2003)

Venue: Machine Learning

Citations: 43 (9 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Gärtner03graphkernels,
  author    = {Thomas Gärtner and Kurt Driessens and Jan Ramon},
  title     = {Graph Kernels and Gaussian Processes for Relational Reinforcement Learning},
  booktitle = {Machine Learning},
  year      = {2003},
  pages     = {146--163},
  publisher = {Springer}
}
```


### Abstract

Relational reinforcement learning is a Q-learning technique for relational state-action spaces. It aims to enable agents to learn to act in environments that have no natural representation as tuples of constants. In this case, the learning algorithm used to approximate the mapping between state-action pairs and their so-called Q(uality)-values must not only be very reliable, but must also be able to handle the relational representation of state-action pairs. In this paper we investigate...

### Citations

9927 | Statistical Learning Theory
- VAPNIK
- 1998
Citation Context: ...definite kernel on X if, for all n ∈ Z⁺, x_1, ..., x_n ∈ X, and c_1, ..., c_n ∈ R, it follows that ∑_{i,j=1}^{n} c_i c_j k(x_i, x_j) ≥ 0. 3.1. Kernel Machines. The usual supervised learning model (Vapnik, 1995) considers a set X of individuals and a set Y of labels, such that the relation between individuals and labels is a fixed but unknown probability measure on the set X × Y. The common theme in many di... |
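The positive-definiteness condition quoted above can be checked numerically on a sample: since ∑ c_i c_j k(x_i, x_j) = cᵀKc, the condition holds iff the Gram matrix K has no negative eigenvalues. A minimal sketch, assuming the standard Gaussian RBF kernel (the kernel and data here are illustrative, not from the paper):

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian RBF kernel, a standard positive definite kernel on R^d.
    return np.exp(-gamma * np.sum((x - y) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))  # ten individuals x_1, ..., x_10 in R^3

# Gram matrix K[i, j] = k(x_i, x_j); positive definiteness of k means
# c^T K c >= 0 for all c, i.e. K has no negative eigenvalues.
K = np.array([[rbf_kernel(a, b) for b in X] for a in X])
eigvals = np.linalg.eigvalsh(K)  # real eigenvalues (K is symmetric)
```

For a valid kernel, `eigvals.min()` is non-negative up to floating-point tolerance.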

4160 | Reinforcement Learning: An Introduction
- Sutton, Barto
- 1998
Citation Context: ...rnels can compete with, and often improve on, regression trees and instance-based regression as a generalisation algorithm for relational reinforcement learning. 1 Introduction. Reinforcement learning [26], in a nutshell, is about controlling an autonomous agent in an unknown environment, often called the state space. The agent has no prior knowledge about the environment and can only obtain some know... |

2239 | Learning with Kernels
- SCHÖLKOPF, SMOLA
- 2002
Citation Context: ...t-order distance-based algorithms as well as first-order regression trees have been used as learning algorithms to approximate the mapping between state-action pairs and their Q-value. Kernel methods [24] are among the most successful recent developments within the machine learning community. The computational attractiveness of kernel methods is due to the fact that they can be applied in high dimensi... |

2099 | Matrix computations - Golub, Loan - 1996 |

1720 |
An Introduction to Support Vector Machines and other Kernel-Based Learning Methods
- CRISTIANINI, SHAWE-TAYLOR
- 2000
Citation Context: ...s do have some nice closure properties. In particular, they are closed under sum, direct sum, multiplication by a scalar, product, tensor product, zero extension, pointwise limits, and exponentiation [4, 15]. 3.3 Kernels for Structured Data. The best-known kernel for representation spaces that are not mere attribute-value tuples is the convolution kernel proposed by Haussler [15]. The basic idea of convolu... |

1408 |
Learning from Delayed Rewards
- Watkins
- 1989
Citation Context: ... agent can get about the environment is the state in which it currently is and whether it received a reward. The aim of reinforcement learning is to act such that this reward is maximised. Q-learning [27], one particular form of reinforcement learning, tries to map every state-action pair to a real number (Q-value) reflecting the quality of that action in that state, based on the experience so f... |
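The mapping described in this context is usually learned with Watkins' update rule Q(s, a) ← Q(s, a) + α(r + γ max_{a'} Q(s', a') − Q(s, a)). A minimal tabular sketch; `q_update` and its parameters are illustrative, not the RRL system itself:

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a').
    best_next = max(Q[(s_next, a2)] for a2 in actions) if actions else 0.0
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

Q = defaultdict(float)  # all Q-value estimates start at 0
q_update(Q, s=0, a="move", r=1.0, s_next=1, actions=["move", "stay"])
```

Relational RL replaces the table with a regression model over relational state-action descriptions, but the update target is the same.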

1404 | Reinforcement Learning: A Survey
- Kaelbling, Littman, et al.
- 1996
Citation Context: ...ons or even entire state-space regions with low confidence on their Q-value predictions could be given a higher exploration priority. This approach is similar to interval-based exploration techniques [17], where the upper bound of an estimation interval is used to guide the exploration into highly promising regions of the state-action space. In the case of RRL-Kbr these upper bounds could be replaced wit... |

845 |
Theory of Reproducing kernels
- Aronszajn
- 1950
Citation Context: ...→ R, a feature transformation φ : X → H into some Hilbert space H exists, such that k(x, x′) = ⟨φ(x), φ(x′)⟩ for all x, x′ ∈ X, can be checked by verifying that the function is positive definite [1]. This means that any set, whether a linear space or not, that admits a positive definite kernel can be embedded into a linear space. Thus, throughout the paper, we take 'valid' to mean 'positive defi... |

762 |
Graph Theory
- Diestel
- 1997
Citation Context: ...s to extend the applicability of kernel methods to structured data. This section gives a brief overview of graphs and graph kernels. For a more in-depth discussion of graphs the reader is referred to [5, 19]. For a discussion of different graph kernels see [13]. 4.1 Labelled Directed Graphs. Generally, a graph G is described by a finite set of vertices V, a finite set of edges E, and a labelling function ℓ. For l... |

413 | Text Classification Using the String Kernel
- Lodhi, Cristianini, et al.
- 2001
Citation Context: ...s for structured data in the literature; however, these usually focus on a very restricted syntax and are more or less domain-specific. Examples are string and tree kernels. Traditionally, string kernels [20] have focused on applications in text mining and measure the similarity of two strings by the number of common (not necessarily contiguous) substrings. These string kernels have not been applied in other ... |

403 | Convolutional kernels on discrete structures
- Haussler
- 1999
Citation Context: ...s do have some nice closure properties. In particular, they are closed under sum, direct sum, multiplication by a scalar, product, tensor product, zero extension, pointwise limits, and exponentiation [4, 15]. 3.3 Kernels for Structured Data. The best-known kernel for representation spaces that are not mere attribute-value tuples is the convolution kernel proposed by Haussler [15]. The basic idea of convolu... |

340 | Frequent subgraph discovery - Kuramochi, Karypis - 2001 |

279 | Convolution kernels for natural language
- Collins, Duffy
- 2001
Citation Context: ...g kernels have been defined for other domains, e.g., recognition of translation initiation sites in DNA and mRNA sequences [28]. Again, these kernels have not been applied in other domains. Tree kernels [3] can be applied to ordered trees where the number of children of a node is determined by the label of the node. They compute the similarity of trees based on their common subtrees. Tree kernels have b... |

206 |
Combinatorial Optimization: Theory and Algorithms
- Korte, Vygen
- 2006
Citation Context: ...s to extend the applicability of kernel methods to structured data. This section gives a brief overview of graphs and graph kernels. For a more in-depth discussion of graphs the reader is referred to [5, 19]. For a discussion of different graph kernels see [13]. 4.1 Labelled Directed Graphs. Generally, a graph G is described by a finite set of vertices V, a finite set of edges E, and a labelling function ℓ. For l... |

157 |
Product Graphs: Structure and Recognition
- Imrich, Klavžar
- 2000
Citation Context: ...idea of counting the number of walks in product graphs. Note that the definitions given here are more complicated than those given in [13], as parallel edges have to be considered here. Product graphs [16] are a very interesting tool in discrete mathematics. The four most important graph products are the Cartesian, the strong, the direct, and the lexicographic product. While the most fundamental one is... |
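The walk-counting idea quoted above can be illustrated under common assumptions: the adjacency matrix of the direct product graph is the Kronecker product of the factors' adjacency matrices, and a geometric series over its powers counts common walks of all lengths. A minimal sketch (no parallel edges or labels; `product_graph_kernel` and the decay `lam` are illustrative, not the paper's exact kernel):

```python
import numpy as np

def product_graph_kernel(A1, A2, lam=0.1):
    # Direct product graph: vertex pairs (u, v) are adjacent iff the u's are
    # adjacent in G1 and the v's are adjacent in G2.
    Ax = np.kron(A1, A2)
    n = Ax.shape[0]
    # Geometric series sum_k lam^k Ax^k = (I - lam * Ax)^{-1} counts common
    # walks of every length, weighted by lam^length (needs lam < 1/spectral radius).
    S = np.linalg.inv(np.eye(n) - lam * Ax)
    return S.sum()  # sum over all start/end vertex pairs

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # a single undirected edge
k_val = product_graph_kernel(A, A, lam=0.1)
```

Matrix inversion here stands in for the limit of the power series; the paper's exponential and geometric graph kernels weight walk counts in closely related ways.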

157 | Marginalized kernels between labeled graphs - Kashima, Tsuda, et al. - 2003 |

143 | On graph kernels: Hardness results and efficient alternatives
- Gärtner, Flach, et al.
- 2003
Citation Context: ...rocess is defined by a mean function and a covariance function, implicitly specifying the prior. The choice of covariance functions is thereby limited only to positive definite kernels. Graph kernels [13] have recently been introduced as one way of extending the applicability of kernel methods beyond mere attribute-value representations. The idea behind graph kernels is to base the similarity of two g... |

121 | A survey of kernels for structured data - Gärtner - 2003 |

115 | Relational reinforcement learning
- Dzeroski, DeRaedt, et al.
- 2001
Citation Context: ...ng found by a learning algorithm that is able to generalise to unseen states. Ideally, an incrementally learnable regression algorithm is used to learn this mapping. Relational reinforcement learning [9, 8] (RRL) is a Q-learning technique that can be applied whenever the state-action space cannot easily be represented by tuples of constants but has an inherently relational representation instead. In th... |

112 | Engineering support vector machine kernels that recognize translation initiation sites
- Zien, Rätsch, et al.
- 2000
Citation Context: ...hese string kernels have not been applied in other domains. However, other string kernels have been defined for other domains, e.g., recognition of translation initiation sites in DNA and mRNA sequences [28]. Again, these kernels have not been applied in other domains. Tree kernels [3] can be applied to ordered trees where the number of children of a node is determined by the label of the node. They comp... |

111 | Bayesian Qlearning
- Dearden, Friedman, et al.
- 1998
Citation Context: ...tion is most closely related to our work. Dearden et al. describe two techniques that take advantage of the probability distribution supplied by Bayesian regression to guide the exploration in Q-learning (Dearden et al., 1998). Empirical evidence suggests that these exploration strategies perform better than model-free exploration strategies such as Boltzmann exploration (which is currently used in the RRL-system). One app... |

110 | Kernel-based reinforcement learning
- Ormoneit, Sen
- 2002
Citation Context: ...ons of the RRL-system. Section 3 describes kernel methods in general and Gaussian processes in particular. Section 4 proposes graph kernels as covariance functions that are able to deal with the structural nature of state-action pairs in RRL. (Footnote 1: In [22] the term 'kernel' is not used to refer to a positive definite function but to a probability density function.) Section 5 shows how states and actions in... |

107 | Ridge regression learning algorithm in dual variables
- Saunders, Gammerman, et al.
- 1998
Citation Context: ...3.1.1. Regularized Least Squares. Choosing the square loss function, i.e., V(y_i, f(x_i)) = (y_i − f(x_i))², we obtain the optimization problem of the regularized least squares algorithm (Rifkin, 2002; Saunders et al., 1998): min_{f(·)∈H} (1/n) ∑_{i=1}^{n} (y_i − f(x_i))² + (C/n) ‖f(·)‖²_H. Plugging in our knowledge about the form of solutio... |
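The regularized least squares problem quoted in this context has a well-known closed-form solution in dual variables. A minimal sketch, assuming the common kernel ridge parameterisation α = (K + cI)⁻¹y (the exact scaling of the regularizer varies between texts, and `fit_dual`/`predict` are illustrative names):

```python
import numpy as np

def fit_dual(K, y, c=1.0):
    # By the representer theorem the minimiser is f(x) = sum_j alpha_j k(x_j, x);
    # one common closed form for the dual coefficients is alpha = (K + c I)^{-1} y.
    n = K.shape[0]
    return np.linalg.solve(K + c * np.eye(n), y)

def predict(alpha, k_star):
    # k_star[j] = k(x_j, x_new) for a new input x_new.
    return k_star @ alpha

K = np.array([[1.0, 0.5], [0.5, 1.0]])  # toy Gram matrix
y = np.array([1.0, -1.0])
alpha = fit_dual(K, y, c=0.1)
```

Solving the linear system rather than forming the inverse explicitly is the numerically preferred route.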

102 |
Bayesian Gaussian processes for regression and classification. Doctoral dissertation
- Gibbs
- 1997
Citation Context: ...example and t_i the target value. Bayesian approaches aim at modeling the distribution of t_{N+1} given the example description x_{N+1}, i.e.: P(t_{N+1} | (x_1, t_1), ..., (x_N, t_N), x_{N+1}). Gaussian processes (Gibbs, 1997) assume that the observed target values t_N = [t_1 · · · t_N] have a joint Gaussian distribution and are obtained from the true target values by additive Gaussian noise: P(t_N | x_1, ..., x_N, C_N) = (1/Z) ... |
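The Gaussian process regression sketched in this context yields a Gaussian predictive distribution with the standard equations mean = k*ᵀC_N⁻¹t and var = k(x*, x*) − k*ᵀC_N⁻¹k*, where C_N = K + σ²I. A minimal illustration with made-up data (`gp_predict` is not the paper's code):

```python
import numpy as np

def gp_predict(K, k_star, k_ss, t, noise=0.1):
    # Standard GP predictive equations at a new input x*:
    #   mean = k*^T C_N^{-1} t,   var = k(x*, x*) - k*^T C_N^{-1} k*,
    # with C_N = K + noise^2 * I the noisy covariance of the observed targets.
    C = K + noise ** 2 * np.eye(K.shape[0])
    mean = k_star @ np.linalg.solve(C, t)
    var = k_ss - k_star @ np.linalg.solve(C, k_star)
    return mean, var

# Toy data: training inputs 0 and 1 under the RBF covariance k(x, y) = exp(-(x - y)^2),
# queried at x* = 0.5 (so k(x*, x*) = 1).
K = np.array([[1.0, np.exp(-1.0)], [np.exp(-1.0), 1.0]])
k_star = np.array([np.exp(-0.25), np.exp(-0.25)])
t = np.array([0.0, 1.0])
mean, var = gp_predict(K, k_star, 1.0, t)
```

The predictive variance `var` is exactly the confidence information that the exploration strategies discussed elsewhere on this page could exploit.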

92 | Everything Old is New Again: A Fresh Look at Historical Approaches to - Rifkin - 2002 |

89 | Practical Reinforcement Learning in Continuous Spaces
- Smart, Kaelbling
Citation Context: ...n RRL, but also other regression algorithms can be used. Future work will investigate how reinforcement techniques such as local linear models [23] and the use of convex hulls to make safe predictions [25] can be applied in RRL. A promising direction for future work is also to exploit the probabilistic predictions made available in RRL by the algorithm suggested in this paper. The obvious use of these ... |

60 | Bayes meets bellman: The gaussian process approach to temporal difference learning
- Engel, Mannor, et al.
- 2003
Citation Context: ...or structured data (less relevant to the work presented here) can be found in (Gärtner, 2003). Engel et al. independently developed a method using Gaussian processes for temporal difference learning (Engel et al., 2003), with a regression technique that eliminates almost linearly dependent examples based on Cholesky factorization. Both Cholesky factorization and QR-factorization have advantages a... |

58 | Relational reinforcement learning - Dˇzeroski, Raedt, et al. |

56 | Cyclic pattern kernels for predictive graph mining - Horváth, Gärtner, et al. - 2004 |

37 | Speeding up relational reinforcement learning through the use of an incremental first order decision tree learner
- Driessens, Ramon, et al.
- 2001
Citation Context: ...ng found by a learning algorithm that is able to generalise to unseen states. Ideally, an incrementally learnable regression algorithm is used to learn this mapping. Relational reinforcement learning [9, 8] (RRL) is a Q-learning technique that can be applied whenever the state-action space cannot easily be represented by tuples of constants but has an inherently relational representation instead. In th... |

37 | Tetris is hard, even to approximate - Demaine, Hohenberger, et al. - 2003 |

37 | Gaussian processes in reinforcement learning - Rasmussen, Kuss |

34 | Batch value function approximation via support vectors - Dietterich, Wang - 2001 |

33 | Real-time robot learning with locally weighted statistical learning
- Schaal, Atkeson, et al.
- 2000
Citation Context: ...nels it is not only possible to apply Gaussian processes in RRL, but also other regression algorithms can be used. Future work will investigate how reinforcement techniques such as local linear models [23] and the use of convex hulls to make safe predictions [25] can be applied in RRL. A promising direction for future work is also to exploit the probabilistic predictions made available in RRL by the al... |


30 | Integrating guidance into relational reinforcement learning - Driessens, Dzeroski - 2004 |

26 | Integrating experimentation and guidance in relational reinforcement learning
- Driessens, Dzeroski
- 2002
Citation Context: ...tate, action)-pairs and the blocks world kernel as described in the previous section. The RRL-system was trained in worlds where the number of blocks varied between 3 and 5, and given "guided" traces [6] in a world with 10 blocks. The Q-function and the related policy were tested at regular intervals on 100 randomly generated starting states in worlds where the number of blocks varied from 3 to 10 blo... |

26 |
Exponential and geometric kernels for graphs
- Gärtner
- 2002
Citation Context: ...e kernels, however, can be applied to the kind of graphs encountered in our representation of the blocks world (see Section 5). Kernels that can be applied there have independently been introduced in [11] and [18] and will be presented in the next section. 4 Graph Kernels. Graph kernels are an important means to extend the applicability of kernel methods to structured data. This section gives a brief o... |

26 |
Kernels for graph classification
- Kashima, Inokuchi
- 2002
Citation Context: ..., however, can be applied to the kind of graphs encountered in our representation of the blocks world (see Section 5). Kernels that can be applied there have independently been introduced in [11] and [18] and will be presented in the next section. 4 Graph Kernels. Graph kernels are an important means to extend the applicability of kernel methods to structured data. This section gives a brief overview o... |

25 | Automated approaches for classifying structure - Deshpande, Karypis - 2002 |

22 | Relational instance based regression for relational reinforcement learning
- Driessens, Ramon
- 2003
Citation Context: ...Tg has shown itself to be sensitive with respect to the order in which the (state, action, q-value) examples are presented and often needs more training episodes to find a competitive policy. RRL-Rib [7] uses relational instance-based regression for Q-function generalisation. The instance-based regression offers a robustness to RRL not found in RRL-Tg but requires a first-order distance to be defined ... |

22 | Positive Definite Rational Kernels - Cortes, Haffner, et al. - 2003 |

18 |
Matrix methods for engineers and scientists
- Barnett
- 1979
Citation Context: ...Gaussian processes are particularly well suited for reinforcement learning, as the inverse of the covariance matrix C can be computed incrementally, using the so-called partitioned inverse equations [2]. While computing the inverse directly is of cubic time complexity, incrementing the inverse is only of quadratic time complexity. Also, the probability distribution over target values can be used to ... |
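The partitioned inverse update mentioned above can be sketched via the standard block-matrix (Schur complement) identity: given C⁻¹, the inverse of the matrix bordered by one new row and column costs only O(n²). `grow_inverse` is an illustrative name, not the paper's implementation:

```python
import numpy as np

def grow_inverse(C_inv, b, c):
    """Inverse of the bordered matrix [[C, b], [b^T, c]], given C_inv = C^{-1}.

    Only matrix-vector products are needed, so the update is O(n^2) rather
    than the O(n^3) of inverting the (n+1) x (n+1) matrix from scratch.
    """
    u = C_inv @ b                 # C^{-1} b
    s = c - b @ u                 # scalar Schur complement
    n = C_inv.shape[0]
    out = np.empty((n + 1, n + 1))
    out[:n, :n] = C_inv + np.outer(u, u) / s
    out[:n, n] = -u / s
    out[n, :n] = -u / s
    out[n, n] = 1.0 / s
    return out

C = np.array([[2.0, 0.5], [0.5, 1.0]])   # current covariance matrix
b = np.array([0.3, 0.2])                 # covariances with the new example
grown = grow_inverse(np.linalg.inv(C), b, 1.5)
```

This is what makes Gaussian process regression incrementally learnable: each new (state, action, q-value) example extends C_N⁻¹ in quadratic time.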

7 | Representations for learning control policies
- Forbes, Andre
- 2002
Citation Context: ...rithm presented in this paper could be improved by using only an approximate inverse in the Gaussian process. The size of the kernel matrix could be reduced by so-called instance-averaging techniques [10]. While the explicit construction of average instances is far from trivial, the kernel between such average instances and test instances can still be computed easily without ever constructing av... |




3 | PAC-Bayesian Pattern Classification with Kernels - Graepel - 2002 |

2 |
Kernel-based multi-relational data mining
- Gärtner
Citation Context: ...ing tasks. A kernel for instances represented by terms in a higher-order logic can be found in [14]. For an extensive overview of these and other kernels on structured data, the reader is referred to [12]. None of these kernels, however, can be applied to the kind of graphs encountered in our representation of the blocks world (see Section 5). Kernels that can be applied there have independently been ... |

2 |
Introduction to Gaussian processes. Available at http://wol.ra.phy.cam.ac.uk/mackay
- MacKay
- 1997
Citation Context: ...Hilbert space H exists, such that k(x, x′) = ⟨φ(x), φ(x′)⟩ for all x, x′ ∈ X. Kernel methods have so far been successfully applied to various tasks in attribute-value learning. Gaussian processes [21] are an incrementally learnable 'Bayesian' regression algorithm. Rather than parameterising some set of possible target functions and specifying a prior over these parameters, Gaussian processes direc... |