Results 1 - 10 of 74
A Survey of Collaborative Filtering Techniques
2009
"... As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenge ..."
Cited by 216 (0 self)
As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, privacy protection, etc., and their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (which combine CF with other recommendation techniques), with examples of representative algorithms in each category and an analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state of the art, we attempt to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
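For illustration of the memory-based category the abstract names, here is a minimal sketch of user-based CF with Pearson-correlated neighbors; the toy rating matrix, the neighborhood size k, and the predict helper are invented for this example, not taken from the survey.

```python
import numpy as np

# Toy user-item rating matrix (0 = unrated); rows are users, columns are items.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 4, 4],
], dtype=float)

def predict(R, u, i, k=2):
    """Predict user u's rating of item i from the k most similar users
    who rated i, using Pearson correlation over co-rated items."""
    mask_u = R[u] > 0
    sims = []
    for v in range(R.shape[0]):
        if v == u or R[v, i] == 0:
            continue
        co = mask_u & (R[v] > 0)          # items rated by both users
        if co.sum() < 2:
            continue
        s = np.corrcoef(R[u, co], R[v, co])[0, 1]
        if not np.isnan(s):
            sims.append((s, v))
    sims.sort(reverse=True)
    top = sims[:k]
    mean_u = R[u, mask_u].mean()
    if not top:
        return mean_u                     # fall back to the user's mean
    num = sum(s * (R[v, i] - R[v, R[v] > 0].mean()) for s, v in top)
    den = sum(abs(s) for s, v in top)
    return mean_u + num / den

print(predict(R, u=0, i=2))
```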
Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation
IEEE Transactions on Knowledge and Data Engineering
"... ABSTRACT This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted, undirected, graph. It is based on a Markov-chain model of random walk through the database. More precisely, we compute quantities (the average commu ..."
Cited by 194 (19 self)
This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted, undirected graph. It is based on a Markov-chain model of a random walk through the database. More precisely, we compute quantities (the average commute time, the pseudoinverse of the Laplacian matrix of the graph, etc.) that provide similarities between any pair of nodes, with the nice property of increasing when the number of paths connecting those elements increases and when the "length" of the paths decreases. It turns out that the square root of the average commute time is a Euclidean distance and that the pseudoinverse of the Laplacian matrix is a kernel (it contains inner products closely related to commute times). A procedure for computing the subspace projection of the node vectors of the graph that preserves as much variance as possible in terms of the commute-time distance, a principal components analysis (PCA) of the graph, is also introduced. This graph PCA provides a nice interpretation of the "Fiedler vector", widely used for graph partitioning. The model is evaluated on a collaborative-recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. Experimental results on the MovieLens database show that the Laplacian-based similarities perform well in comparison with other methods. The model, which nicely fits into the so-called "statistical relational learning" framework, could also be used to compute document or word similarities and, more generally, could be applied to machine-learning and pattern-recognition tasks involving a database.
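The quantities the abstract names are short to compute. Here is a small sketch, on an invented unweighted toy graph (not the authors' code), of the average commute time n(i, j) = vol(G) (l+_ii + l+_jj - 2 l+_ij) obtained from the pseudoinverse L+ of the graph Laplacian.

```python
import numpy as np

# Adjacency matrix of a small undirected, unweighted toy graph.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # graph Laplacian
Lp = np.linalg.pinv(L)       # Moore-Penrose pseudoinverse; a valid kernel
vol = A.sum()                # volume of the graph (sum of degrees)

def avg_commute_time(i, j):
    # n(i, j) = vol(G) * (l+_ii + l+_jj - 2 * l+_ij)
    return vol * (Lp[i, i] + Lp[j, j] - 2 * Lp[i, j])

# The square root of the average commute time is a Euclidean distance.
ct = avg_commute_time(0, 3)
print(ct, np.sqrt(ct))
```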
ClustKNN: a highly scalable hybrid model- & memory-based CF algorithm
In Proc. of WebKDD-06, KDD Workshop on Web Mining and Web Usage Analysis, at the 12th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2006
"... Collaborative Filtering (CF)-based recommender systems are indispensable tools to find items of interest from the unmanageable number of available items. Moreover, companies who deploy a CF-based recommender system may be able to increase revenue by drawing customers ’ attention to items that they a ..."
Cited by 31 (1 self)
Collaborative Filtering (CF)-based recommender systems are indispensable tools for finding items of interest from the unmanageable number of available items. Moreover, companies who deploy a CF-based recommender system may be able to increase revenue by drawing customers' attention to items that they are likely to buy. However, the sheer number of customers and items typical in e-commerce systems demands specially designed CF algorithms that can gracefully cope with the vast size of the data. Many algorithms proposed thus far, where the principal concern is recommendation quality, may be too expensive to operate in a large-scale system. We propose ClustKnn, a simple and intuitive algorithm that is well suited for large data sets. The method first compresses data tremendously by building a straightforward but efficient clustering model. Recommendations are then generated quickly by using a simple nearest-neighbor-based approach. We demonstrate the feasibility of ClustKnn both analytically and empirically. We also show, by comparing with a number of other popular CF algorithms, that apart from being highly scalable and intuitive, ClustKnn provides very good recommendation accuracy as well.
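A hedged sketch of the two-stage idea (not the paper's implementation): scikit-learn's KMeans stands in for the clustering model, the toy ratings are random, and sparsity handling is omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
R = rng.integers(0, 6, size=(200, 40)).astype(float)  # toy dense ratings

# Model-building phase: compress 200 users into k surrogate users (centroids).
k = 10
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(R)
centroids = km.cluster_centers_

def predict(u_ratings, item, n_neighbors=3):
    """Predict a rating by averaging the item's value over the centroids
    closest to the active user, instead of scanning all users."""
    d = np.linalg.norm(centroids - u_ratings, axis=1)
    nearest = np.argsort(d)[:n_neighbors]
    return centroids[nearest, item].mean()

print(predict(R[0], item=5))
```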
An experimental investigation of graph kernels on a collaborative recommendation task
Proceedings of the 6th International Conference on Data Mining (ICDM 2006), 2006
"... This paper presents a survey as well as a systematic empirical comparison of seven graph kernels and two related similarity matrices (simply referred to as graph kernels), namely the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regul ..."
Cited by 27 (7 self)
This paper presents a survey as well as a systematic empirical comparison of seven graph kernels and two related similarity matrices (simply referred to as graph kernels), namely the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regularized Laplacian kernel, the commute-time kernel, the random-walk-with-restart similarity matrix, and, finally, three graph kernels introduced in this paper: the regularized commute-time kernel, the Markov diffusion kernel, and the cross-entropy diffusion matrix. The kernel-on-a-graph approach is simple and intuitive. It is illustrated by applying the nine graph kernels to a collaborative-recommendation task and to a semisupervised classification task, both on several databases. The graph methods compute proximity measures between nodes that help study the structure of the graph. Our comparisons suggest that the regularized commute-time and the Markov diffusion kernels perform best, closely followed by the regularized Laplacian kernel.
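For concreteness, two of the kernels named above can be built in a few lines; a minimal sketch on an invented toy graph, with an arbitrary decay parameter alpha: the exponential diffusion kernel expm(alpha A) and the regularized Laplacian kernel (I + alpha L)^{-1}.

```python
import numpy as np
from scipy.linalg import expm

# Adjacency matrix of a small undirected toy graph.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)
L = np.diag(A.sum(axis=1)) - A   # graph Laplacian

alpha = 0.5                      # arbitrary decay parameter
K_exp_diff = expm(alpha * A)                       # exponential diffusion kernel
K_reg_lap = np.linalg.inv(np.eye(4) + alpha * L)   # regularized Laplacian kernel

# Entry (i, j) acts as a proximity measure between nodes i and j.
print(K_reg_lap[0, 2], K_exp_diff[0, 2])
```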
Predicting Trust and Distrust in Social Networks
"... Abstract—As user-generated content and interactions have overtaken the web as the default mode of use, questions of whom and what to trust have become increasingly important. Fortunately, online social networks and social media have made it easy for users to indicate whom they trust and whom they do ..."
Cited by 22 (1 self)
As user-generated content and interactions have overtaken the web as the default mode of use, questions of whom and what to trust have become increasingly important. Fortunately, online social networks and social media have made it easy for users to indicate whom they trust and whom they do not. However, this does not solve the problem, since each user is likely to know only a tiny fraction of other users; we must have methods for inferring trust and distrust between users who do not know one another. In this paper, we present a new method for computing both trust and distrust (i.e., positive and negative trust). We do this by combining an inference algorithm that relies on a probabilistic interpretation of trust based on random graphs with a modified spring-embedding algorithm. Our algorithm correctly classifies hidden trust edges as positive or negative with high accuracy. These results are useful in a wide range of social web applications where trust is important to user behavior and satisfaction.
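The paper's algorithm is not reproduced here; the following is only a loose sketch of the spring-embedding half of the idea, with invented toy edges and thresholds: positive edges pull nodes together, negative edges push them apart, and a hidden edge is guessed positive when the embedded distance falls below an arbitrary threshold tau.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
# Signed toy edges (u, v, +1 trust / -1 distrust); invented, not the paper's data.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, -1), (3, 4, 1), (0, 5, -1)]

pos = rng.normal(size=(n, 2))          # random 2-D layout to start

for _ in range(300):                   # crude force-directed iterations
    for u, v, s in edges:
        d = pos[v] - pos[u]
        dist = np.linalg.norm(d) + 1e-9
        if s > 0:
            step = 0.2 * d             # trust: spring pulls nodes together
        else:
            step = -0.05 * d / dist    # distrust: unit-strength push apart
        pos[u] += step
        pos[v] -= step

def classify(u, v, tau=1.0):
    """Guess the sign of a hidden edge from embedded distance (tau arbitrary)."""
    return 1 if np.linalg.norm(pos[u] - pos[v]) < tau else -1

print(classify(0, 2), classify(1, 5))
```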
A Novel Way of Computing Dissimilarities between Nodes of a Graph, with Application to Collaborative Filtering
2004
"... This work presents some general procedures for computing dissimilarities between elements of a database or, more generally, nodes of a weighted, undirected, graph. It is based on a Markov-chain model of random walk through the database. The model assigns transition probabilities to the links betw ..."
Cited by 20 (1 self)
This work presents some general procedures for computing dissimilarities between elements of a database or, more generally, nodes of a weighted, undirected graph. It is based on a Markov-chain model of a random walk through the database. The model assigns transition probabilities to the links between elements, so that a random walker can jump from element to element. A quantity called the average first-passage cost measures the average cost incurred by a random walker for reaching element k for the first time when starting from element i.
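With unit costs per step, the average first-passage cost reduces to the average first-passage time, which can be obtained by solving one linear system; a minimal sketch on an invented toy graph (the general cost-weighted version would replace the vector of ones with per-step costs).

```python
import numpy as np

# Adjacency of a small weighted, undirected toy graph.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
P = A / A.sum(axis=1, keepdims=True)   # transition probabilities p_ij

def first_passage_times(P, k):
    """Average number of steps to first reach node k from every node,
    solving m(k|i) = 1 + sum_j p_ij m(k|j) with m(k|k) = 0."""
    n = P.shape[0]
    idx = [i for i in range(n) if i != k]
    Q = P[np.ix_(idx, idx)]            # walk restricted to non-absorbing nodes
    m = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    out = np.zeros(n)
    out[idx] = m
    return out

print(first_passage_times(P, k=3))
```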
Improving Recommendation Accuracy by Clustering Social Networks with Trust
"... Social trust relationships between users in social networks speak to the similarity in opinions between the users, both in general and in important nuanced ways. They have been used in the past to make recommendations on the web. New trust metrics allow us to easily cluster users based on trust. In ..."
Cited by 19 (4 self)
Social trust relationships between users in social networks speak to the similarity in opinions between the users, both in general and in important nuanced ways. They have been used in the past to make recommendations on the web. New trust metrics allow us to easily cluster users based on trust. In this paper, we investigate the use of trust clusters as a new way of improving recommendations. Previous work on the use of clusters has shown the technique to be relatively unsuccessful, but those clusters were based on similarity rather than trust. Our results show that when trust clusters are integrated into memory-based collaborative filtering algorithms, they lead to statistically significant improvements in accuracy. In this paper we discuss our methods, experiments, results, and potential future applications of the technique.
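A hedged sketch of how a trust cluster could be plugged into a memory-based scheme (the cluster assignment, toy ratings, and cluster_predict helper are all invented; the paper's actual integration is richer): candidate neighbors are drawn only from the active user's trust cluster.

```python
import numpy as np

# Toy ratings (0 = unrated) and a toy trust-cluster assignment per user.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 0, 5],
    [1, 0, 4, 4],
], dtype=float)
cluster = np.array([0, 0, 1, 1])   # e.g. output of a trust-based clustering

def cluster_predict(u, i):
    """Average item i over users in u's trust cluster who rated it,
    instead of searching all users for similar neighbors."""
    members = (cluster == cluster[u]) & (R[:, i] > 0)
    members[u] = False
    if not members.any():
        return np.nan              # no in-cluster neighbor rated i
    return R[members, i].mean()

print(cluster_predict(u=0, i=2))
```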
Detecting Noise in Recommender System Databases
In Proceedings of the International Conference on Intelligent User Interfaces (IUI'06), 2006
"... In this paper, we propose a framework that enables the detection of noise in recommender system databases. We consider two classes of noise: natural and malicious noise. The issue of natural noise arises from imperfect user behaviour (e.g. erroneous/careless preference selection) and the various rat ..."
Cited by 18 (1 self)
In this paper, we propose a framework that enables the detection of noise in recommender system databases. We consider two classes of noise: natural and malicious noise. Natural noise arises from imperfect user behaviour (e.g. erroneous or careless preference selection) and the various rating collection processes that are employed. Malicious noise concerns the deliberate attempt to bias system output in some particular manner. We argue that both classes of noise are important and can adversely affect recommendation performance. Our objective is to devise techniques that enable system administrators to identify and remove from the recommendation process any such noise that is present in the data. We provide an empirical evaluation of our approach and demonstrate that it is successful with respect to key performance indicators.
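As a toy illustration of the flagging idea, not the paper's framework: predict each rating with a simple baseline (user mean plus item deviation) and flag entries whose residual exceeds a threshold as candidate noise; all names and values here are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
R = rng.integers(1, 6, size=(50, 20)).astype(float)   # toy complete ratings

def flag_noise(R, threshold=2.0):
    """Flag ratings that deviate strongly from a baseline prediction
    (user mean + item deviation); large residuals are candidates for
    natural or malicious noise."""
    user_mean = R.mean(axis=1, keepdims=True)
    item_dev = (R - user_mean).mean(axis=0, keepdims=True)
    baseline = user_mean + item_dev
    return np.argwhere(np.abs(R - baseline) > threshold)

print(flag_noise(R)[:5])   # first few (user, item) pairs flagged
```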
A Comparative Study of Collaborative Filtering Algorithms
arXiv:1205.3193, 2012
"... Collaborative filtering is a rapidly advancing research area. Every year several new techniques are proposed and yet it is not clear which of the techniques work best and under what conditions. In this paper we conduct a study comparing several collabo-rative filtering techniques – both classic and ..."
Cited by 17 (4 self)
Collaborative filtering is a rapidly advancing research area. Every year several new techniques are proposed, and yet it is not clear which of the techniques work best and under what conditions. In this paper we conduct a study comparing several collaborative filtering techniques, both classic and recent state-of-the-art, in a variety of experimental contexts. Specifically, we report conclusions controlling for number of items, number of users, sparsity level, performance criteria, and computational complexity. Our conclusions identify what algorithms work well and in what conditions, and contribute both to the industrial deployment of collaborative filtering algorithms and to the research community.
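One way to set up such controlled comparisons is to subsample the rating matrix to fixed user/item counts and a fixed sparsity level; a sketch under that assumption (not the authors' protocol), with invented sizes.

```python
import numpy as np

rng = np.random.default_rng(3)
R = rng.integers(1, 6, size=(1000, 500)).astype(float)  # toy dense ratings

def subsample(R, n_users, n_items, sparsity, rng):
    """Draw a user/item subsample and mask entries to a target sparsity,
    so algorithms can be compared under controlled conditions."""
    users = rng.choice(R.shape[0], n_users, replace=False)
    items = rng.choice(R.shape[1], n_items, replace=False)
    S = R[np.ix_(users, items)].copy()
    mask = rng.random(S.shape) < sparsity   # fraction of entries to hide
    S[mask] = 0.0                           # 0 = unobserved
    return S

S = subsample(R, n_users=200, n_items=100, sparsity=0.95, rng=rng)
print((S == 0).mean())   # observed sparsity level
```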
An experimental investigation of kernels on graphs for collaborative . . .
Neural Networks, 2012
"... ..."