Results 1 - 10 of 19
Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation
 IEEE Transactions on Knowledge and Data Engineering
, 2006
Cited by 116 (14 self)
Abstract: This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted and undirected graph. It is based on a Markov-chain model of random walk through the database. More precisely, we compute quantities (the average commute time, the pseudoinverse of the Laplacian matrix of the graph, etc.) that provide similarities between any pair of nodes, having the nice property of increasing when the number of paths connecting those elements increases and when the “length” of paths decreases. It turns out that the square root of the average commute time is a Euclidean distance and that the pseudoinverse of the Laplacian matrix is a kernel matrix (its elements are inner products closely related to commute times). A principal component analysis (PCA) of the graph is introduced for computing the subspace projection of the node vectors in a manner that preserves as much variance as possible in terms of the Euclidean commute-time distance. This graph PCA provides a nice interpretation to the “Fiedler vector,” widely used for graph partitioning. The model is evaluated on a collaborative-recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. Experimental results on the MovieLens database show that the Laplacian-based similarities perform well in comparison with other methods. The model, which nicely fits into the so-called “statistical relational learning” framework, could also be used to compute document or word similarities, and, more generally, it could be applied to machine-learning and pattern-recognition tasks involving a relational database. Index Terms: Graph analysis, graph and database mining, collaborative recommendation, graph kernels, spectral clustering, Fiedler vector, proximity measures, statistical relational learning.
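The quantities named in this abstract can be illustrated with a minimal sketch. It assumes the standard identity n(i, j) = V_G (l+_ii + l+_jj - 2 l+_ij), where l+ denotes entries of the Laplacian pseudoinverse and V_G is the sum of all node degrees. The function name and the toy graph are hypothetical and do not come from the paper.

```python
import numpy as np

def commute_time_matrix(A):
    """Average commute times between all node pairs.

    A is assumed to be the symmetric, non-negative weight matrix
    of a connected undirected graph.
    """
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A        # graph Laplacian L = D - A
    L_plus = np.linalg.pinv(L)      # Moore-Penrose pseudoinverse of L
    volume = degrees.sum()          # V_G: sum of all degrees
    d = np.diag(L_plus)
    # n(i, j) = V_G * (l+_ii + l+_jj - 2 * l+_ij)
    return volume * (d[:, None] + d[None, :] - 2.0 * L_plus)

# Hypothetical toy graph: the path 0 - 1 - 2 with unit weights.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
C = commute_time_matrix(A)

# The square root of the commute time is a Euclidean distance;
# clipping guards against tiny negative round-off values.
ect_distance = np.sqrt(np.maximum(C, 0.0))
```

On this path graph the commute time between the two end nodes is 8 (volume 4 times effective resistance 2). The dense pseudoinverse is used only for readability and would not scale to a graph the size of MovieLens.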
The slashdot zoo: Mining a social network with negative edges
 In WWW
, 2009
"... christian.bauckhage ..."
Learning Spectral Graph Transformations for Link Prediction
Cited by 19 (2 self)
We present a unified framework for learning link prediction and edge weight prediction functions in large networks, based on the transformation of a graph’s algebraic spectrum. Our approach generalizes several graph kernels and dimensionality reduction methods and provides a method to estimate their parameters efficiently. We show how the parameters of these prediction functions can be learned by reducing the problem to a one-dimensional regression problem whose runtime only depends on the method’s reduced rank and that can be inspected visually. We derive variants that apply to undirected, weighted, unweighted, unipartite and bipartite graphs. We evaluate our method experimentally using examples from social networks, collaborative filtering, trust networks, citation networks, authorship graphs and hyperlink networks.
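One possible reading of the one-dimensional regression described in this abstract is sketched below. It assumes a symmetric training adjacency matrix, a full eigendecomposition instead of a reduced-rank one, and a polynomial as the fitted function. The function names, the polynomial family, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_spectral_transformation(A_train, B_target, degree=3):
    """Fit a polynomial f so that U f(Lambda) U^T approximates B_target."""
    eigvals, U = np.linalg.eigh(A_train)            # A_train = U diag(eigvals) U^T
    targets = np.diag(U.T @ B_target @ U)           # one target value per eigenvalue
    coeffs = np.polyfit(eigvals, targets, degree)   # one-dimensional curve fit
    return eigvals, U, coeffs

def predict_scores(eigvals, U, coeffs):
    """Apply the fitted function to the spectrum and rebuild a score matrix."""
    transformed = np.polyval(coeffs, eigvals)
    return (U * transformed) @ U.T                  # U diag(f(eigvals)) U^T

# Hypothetical toy data: a 4-node path graph, with the number of
# length-2 paths (A @ A) standing in for the edges to be predicted.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
B = A @ A
eigvals, U, coeffs = fit_spectral_transformation(A, B, degree=2)
scores = predict_scores(eigvals, U, coeffs)
```

Because the toy target is exactly A squared, the fitted quadratic recovers it. On real data the scatter of eigenvalues against targets is what one would plot to choose a suitable function family.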
An experimental investigation of graph kernels on a collaborative recommendation task
 Proceedings of the 6th International Conference on Data Mining (ICDM 2006)
, 2006
Cited by 19 (6 self)
This paper presents a survey as well as a systematic empirical comparison of seven graph kernels and two related similarity matrices (simply referred to as graph kernels), namely the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regularized Laplacian kernel, the commute-time kernel, the random-walk-with-restart similarity matrix, and finally, three graph kernels introduced in this paper: the regularized commute-time kernel, the Markov diffusion kernel, and the cross-entropy diffusion matrix. The kernel-on-a-graph approach is simple and intuitive. It is illustrated by applying the nine graph kernels to a collaborative-recommendation task and to a semi-supervised classification task, both on several databases. The graph methods compute proximity measures between nodes that help study the structure of the graph. Our comparisons suggest that the regularized commute-time and the Markov diffusion kernels perform best, closely followed by the regularized Laplacian kernel.
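Four of the kernels listed in this abstract have short closed forms and can be sketched as follows. The sketch assumes a symmetric adjacency matrix of a connected graph and the commonly used definitions shown in the comments. The function names and the parameter name alpha are illustrative choices, and the kernels introduced in the paper itself are not covered.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian L = D - A of a symmetric adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def exponential_diffusion_kernel(A, alpha):
    """K = exp(alpha * A), computed from the eigendecomposition of A."""
    w, U = np.linalg.eigh(A)
    return (U * np.exp(alpha * w)) @ U.T

def von_neumann_kernel(A, alpha):
    """K = (I - alpha * A)^(-1); needs alpha < 1 / spectral_radius(A)."""
    rho = np.max(np.abs(np.linalg.eigvalsh(A)))
    if not 0.0 < alpha * rho < 1.0:
        raise ValueError("alpha must lie in (0, 1 / spectral radius)")
    return np.linalg.inv(np.eye(A.shape[0]) - alpha * A)

def regularized_laplacian_kernel(A, alpha):
    """K = (I + alpha * L)^(-1)."""
    return np.linalg.inv(np.eye(A.shape[0]) + alpha * laplacian(A))

def commute_time_kernel(A):
    """K = L^+, the Moore-Penrose pseudoinverse of the Laplacian."""
    return np.linalg.pinv(laplacian(A))

# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
K_exp = exponential_diffusion_kernel(A, alpha=0.5)
K_vn = von_neumann_kernel(A, alpha=0.2)
K_rl = regularized_laplacian_kernel(A, alpha=0.5)
K_ct = commute_time_kernel(A)
```

Each result is a symmetric matrix whose entry (i, j) serves as a similarity score between nodes i and j. In a recommendation setting one would rank items by their scores in the row of a given user node.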
A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances
 in Proceedings of the 14th SIGKDD International Conference on Knowledge Discovery and Data Mining
Cited by 14 (7 self)
This work introduces a new family of link-based dissimilarity measures between nodes of a weighted directed graph. This measure, called the randomized shortest-path (RSP) dissimilarity, depends on a parameter θ and has the interesting property of reducing, on one end, to the standard shortest-path distance when θ is large and, on the other end, to the commute-time (or resistance) distance when θ is small (near zero). Intuitively, it corresponds to the expected cost incurred by a random walker in order to reach a destination node from a starting node while maintaining a constant entropy (related to θ) spread in the graph. The parameter θ therefore gradually biases the simple random walk on the graph towards the shortest-path policy. By adopting a statistical physics approach and computing a sum over all the possible paths (discrete path integral), it is shown that the RSP dissimilarity from every node to a particular node of interest can be computed efficiently by solving two linear systems of n equations, where n is the number of nodes. On the other hand, the dissimilarity between every pair of nodes is obtained by inverting an n × n matrix. The proposed measure can be used for various graph mining tasks such as computing betweenness centrality, finding dense communities, etc., as shown in the experimental section.
The sum-over-paths covariance kernel: A novel covariance measure between nodes of a directed graph
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2009
Cited by 9 (6 self)
Abstract: This work introduces a link-based covariance measure between the nodes of a weighted directed graph where a cost is associated with each arc. To this end, a probability distribution on the (usually infinite) countable set of paths through the graph is defined by minimizing the total expected cost between all pairs of nodes while fixing the total relative entropy spread in the graph. This results in a Boltzmann distribution on the set of paths such that long (high-cost) paths occur with a low probability while short (low-cost) paths occur with a high probability. The sum-over-paths (SoP) covariance measure between nodes is then defined according to this probability distribution: two nodes are considered highly correlated if they often co-occur on the same – preferably short – paths. The resulting covariance matrix between nodes (say n nodes in total) is a Gram matrix and therefore defines a valid kernel on the graph. It is obtained by inverting an n × n matrix depending on the costs assigned to the arcs. In the same spirit, a betweenness score is also defined, measuring the expected number of times a node occurs on a path. The proposed measures could be used for various graph mining tasks such as computing betweenness centrality, semi-supervised classification of nodes, visualization, etc., as shown in the experimental section. Index Terms: Graph mining, kernel on a graph, shortest path, correlation measure, betweenness measure, resistance distance, commute-time distance, biased random walk, semi-supervised classification.
Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization
Cited by 8 (0 self)
We study the application of spectral clustering, prediction and visualization methods to graphs with negatively weighted edges. We show that several characteristic matrices of graphs can be extended to graphs with positively and negatively weighted edges, giving signed spectral clustering methods, signed graph kernels and network visualization methods that apply to signed graphs. In particular, we review a signed variant of the graph Laplacian. We derive our results by considering random walks, graph clustering, graph drawing and electrical networks, showing that they all result in the same formalism for handling negatively weighted edges. We illustrate our methods using examples from social networks with negative edges and bipartite rating graphs.
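The signed variant of the graph Laplacian mentioned here can be sketched under the usual definition, in which the degree of a node is the sum of the absolute values of its edge weights. The function name and the two toy triangles are hypothetical.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian: diag(sum_j |A_ij|) - A for a symmetric signed A."""
    A = np.asarray(A, dtype=float)
    abs_degrees = np.abs(A).sum(axis=1)   # degrees count |weight|, not weight
    return np.diag(abs_degrees) - A

# Hypothetical toy graphs: two signed triangles.
# Balanced: nodes 0 and 1 are friends and both are foes of node 2.
balanced = np.array([[ 0,  1, -1],
                     [ 1,  0, -1],
                     [-1, -1,  0]])
# Unbalanced: a single negative edge, so the product of edge signs is negative.
unbalanced = np.array([[0,  1,  1],
                       [1,  0, -1],
                       [1, -1,  0]])

min_eig_balanced = np.linalg.eigvalsh(signed_laplacian(balanced)).min()
min_eig_unbalanced = np.linalg.eigvalsh(signed_laplacian(unbalanced)).min()
```

With absolute degrees the matrix stays positive semidefinite even when edge weights are negative. Its smallest eigenvalue is zero for the balanced triangle and strictly positive for the unbalanced one, which is what makes the matrix usable for spectral clustering of signed graphs.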
Efficient Formulations for 1-SVM and their Application to Recommendation Tasks
Cited by 2 (0 self)
Abstract: The present paper proposes new approaches for recommendation tasks based on one-class support vector machines (1-SVMs) with graph kernels generated from a Laplacian matrix. We introduce new formulations for the 1-SVM that can manipulate graph kernels quite efficiently. We demonstrate that the proposed formulations fully utilize the sparse structure of the Laplacian matrix, which enables the proposed approaches to be applied to recommendation tasks having a large number of customers and products in practical computational times. Results of various numerical experiments demonstrating the high performance of the proposed approaches are presented. Index Terms: support vector machine, Laplacian matrix, graph kernel, quadratic programming problem, collaborative filtering, recommender system.
Alternative Similarity Functions for Graph Kernels
Cited by 2 (1 self)
Given a bipartite graph of collaborative ratings, the task of recommendation and rating prediction can be modeled with graph kernels. We interpret these graph kernels as the inverted squared Euclidean distance in a space defined by the underlying graph and show that this inverted squared Euclidean similarity function can be replaced by other similarity functions. We evaluate several such similarity functions in the context of collaborative item recommendation and rating prediction, using the exponential diffusion kernel, the von Neumann kernel, and the random forest kernel as a basis. We find that the performance of graph kernels for these tasks can be increased by using these alternative similarity functions.
Taku Kudo
Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate that the semantic drift of bootstrapping has the same root as the topic drift of Kleinberg’s HITS, using a simplified graph-based reformulation of bootstrapping. We confirm that two graph-based algorithms, the von Neumann kernels and the regularized Laplacian, can reduce semantic drift in the task of word sense disambiguation (WSD) on the Senseval-3 English Lexical Sample Task. The proposed algorithms achieve superior performance to Espresso and previous graph-based WSD methods, even though the proposed algorithms have fewer parameters and are easy to calibrate.