Results 1–10 of 78
Evaluating recommendation systems
In Recommender Systems Handbook, 2011
"... Abstract Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer that wishes to employ a recommendation system must choose between a set of candidate approaches. A f ..."
Abstract

Cited by 85 (2 self)
 Add to MetaCart
Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer who wishes to employ a recommendation system must choose between a set of candidate approaches. A first step towards selecting an appropriate algorithm is to decide which properties of the application to focus upon when making this choice. Indeed, recommendation systems have a variety of properties that may affect user experience, such as accuracy, robustness, scalability, and so forth. In this paper we discuss how to compare recommenders based on a set of properties that are relevant for the application. We focus on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms. We describe experimental settings appropriate for making choices between algorithms. We review three types of experiments: offline settings, where recommendation approaches are compared without user interaction; user studies, where a small group of subjects experiments with the system and reports on the experience; and large-scale online experiments, where real user populations interact with the system. In each of these cases we describe the types of questions that can be answered and suggest protocols for experimentation. We also discuss how to draw trustworthy conclusions from the conducted experiments. We then review a large set of properties and explain how to evaluate systems given the relevant properties. We also survey a large set of evaluation metrics in the context of the property that they evaluate.
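The offline setting described in this abstract can be illustrated with a toy hold-out experiment. Everything below is an illustrative sketch, not the paper's protocol: the ratings are synthetic, and the global-mean and item-mean predictors are baselines chosen only to show how two candidate algorithms are compared on the same held-out data.

```python
import random

# Toy ratings: (user, item, rating) triples; in practice these come from logs.
random.seed(0)
ratings = [(u, i, random.choice([1, 2, 3, 4, 5]))
           for u in range(20) for i in range(10) if random.random() < 0.5]

# Offline protocol: hide a test split, train on the rest, score predictions.
random.shuffle(ratings)
cut = int(0.8 * len(ratings))
train_set, test_set = ratings[:cut], ratings[cut:]

global_mean = sum(r for _, _, r in train_set) / len(train_set)

item_sums = {}
for _, i, r in train_set:
    s, n = item_sums.get(i, (0.0, 0))
    item_sums[i] = (s + r, n + 1)

def predict_item_mean(item):
    # Fall back to the global mean for items unseen in training.
    s, n = item_sums.get(item, (0.0, 0))
    return s / n if n else global_mean

def rmse(predict):
    se = sum((predict(i) - r) ** 2 for _, i, r in test_set)
    return (se / len(test_set)) ** 0.5

# Compare two candidate "algorithms" on the same held-out ratings.
print(rmse(lambda i: global_mean), rmse(predict_item_mean))
```

The same harness extends to any predictor with the same signature, which is the point of the comparative setup: fix the data split and the metric, vary only the algorithm.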
DSybil: Optimal Sybil-Resistance for Recommendation Systems, 2009
"... Recommendation systems can be attacked in various ways, and the ultimate attack form is reached with a sybil attack, where the attacker creates a potentially unlimited number of sybil identities to vote. Defending against sybil attacks is often quite challenging, and the nature of recommendation sys ..."
Abstract

Cited by 31 (4 self)
 Add to MetaCart
(Show Context)
Recommendation systems can be attacked in various ways, and the ultimate attack form is reached with a sybil attack, where the attacker creates a potentially unlimited number of sybil identities to vote. Defending against sybil attacks is often quite challenging, and the nature of recommendation systems makes it even harder. This paper presents DSybil, a novel defense for diminishing the influence of sybil identities in recommendation systems. DSybil provides strong provable guarantees that hold even under the worst-case attack and are optimal. DSybil can defend against an unlimited number of sybil identities over time. DSybil achieves its strong guarantees by (i) exploiting the heavy-tail distribution of the typical voting behavior of the honest identities, and (ii) carefully identifying whether the system is already getting “enough help” from the (weighted) voters already taken into account or whether more “help” is needed. Our evaluation shows that DSybil would continue to provide high-quality recommendations even when a million-node botnet uses an optimal strategy to launch a sybil attack.
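The second mechanism named in the abstract, the "enough help" check over trust-weighted voters, can be sketched as follows. This is a deliberately simplified toy, not the DSybil algorithm, and it does not reproduce DSybil's guarantees; the threshold, initial trust, and multiplicative update rule are all illustrative assumptions.

```python
# Toy illustration (not DSybil itself): votes are weighted by per-identity
# trust, a recommendation needs "enough help" (total trust above a threshold),
# and only voters on objects the user consumed and liked gain trust.
TRUST_THRESHOLD = 1.0
INITIAL_TRUST = 0.1

trust = {}  # identity -> weight; fresh sybils start as low as anyone else

def enough_help(voters):
    return sum(trust.get(v, INITIAL_TRUST) for v in voters) >= TRUST_THRESHOLD

def feedback(voters, liked):
    # Multiplicative update: trust grows only through votes that actually helped.
    for v in voters:
        w = trust.get(v, INITIAL_TRUST)
        trust[v] = w * 2 if liked else w / 2

# A fresh sybil swarm with no helpful history carries little aggregate weight...
print(enough_help([f"sybil{k}" for k in range(5)]))   # 5 * 0.1 < 1.0 -> False
# ...while honest voters who repeatedly helped accumulate weight.
for _ in range(4):
    feedback(["alice", "bob"], liked=True)
print(enough_help(["alice", "bob"]))                  # 2 * 1.6 >= 1.0 -> True
```

The real system additionally bounds the total damage any number of sybils can cause over time, which this sketch does not attempt.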
Robustness of collaborative recommendation based on association rule mining. In RecSys, 2007
"... ..."
(Show Context)
The influence limiter: Provably manipulation-resistant recommender systems
In Proceedings of the ACM Recommender Systems Conference (RecSys '07), 2007
"... This appendix should be read in conjunction with the article by Resnick and Sami [1]. Here, we include the proofs that were omitted from the main article due to shortage of space. A.1 Lemma 5 Lemma 5: For the quadratic scoring rule (MSE) loss, for all q,u ∈ [0,1], GF(qu) ≥ D(qu) 2. Proof of Lem ..."
Abstract

Cited by 27 (8 self)
 Add to MetaCart
(Show Context)
This appendix should be read in conjunction with the article by Resnick and Sami [1]. Here, we include the proofs that were omitted from the main article due to shortage of space.

A.1 Lemma 5

Lemma 5: For the quadratic scoring rule (MSE) loss, for all q, u ∈ [0, 1], GF(q, u) ≥ D(q, u)/2.

Proof of Lemma 5: Because both D(q, u) = D(1 − q, 1 − u) and GF(q, u) = GF(1 − q, 1 − u), we can assume u ≥ q without loss of generality. Keeping q fixed, we want to show that the result holds for all u. Note that D(q, q) = GF(q, q) = 0. Thus, differentiating with respect to u, it is sufficient to prove that GF′(q, u) ≥ D′(q, u)/2 for all u ≥ q, u ≤ 1. We change variables by setting y = u − q. We use the notation D′(y) to denote D′(q, u) at u = q + y, treating q as fixed and implicit. Likewise, we use the notation GF′(y). For brevity, we use q̄ to denote (1 − q).

D(q, u) = q[(q̄ − y)² − q̄²] + q̄[(q + y)² − q²] = q[y² − 2yq̄] + q̄[y² + 2qy] = y²  ⇒  D′(y) = 2y

GF(q, u) = q log(1 + y² − 2q̄y) + q̄ log(1 + y² + 2qy)
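As a sanity check on the formulas as reconstructed above, a small numeric sweep can confirm the claimed inequality GF(q, u) ≥ D(q, u)/2 on interior points. The function bodies below follow that reconstruction (with q̄ = 1 − q and y = u − q); the grid range and tolerance are our own choices, kept away from the boundary so the log arguments stay positive.

```python
import math

def D(q, u):
    # Quadratic (MSE) divergence; with y = u - q this simplifies to y**2.
    y = u - q
    qb = 1 - q
    return q * ((qb - y) ** 2 - qb ** 2) + qb * ((q + y) ** 2 - q ** 2)

def GF(q, u):
    y = u - q
    qb = 1 - q
    return q * math.log(1 + y * y - 2 * qb * y) + qb * math.log(1 + y * y + 2 * q * y)

# Spot-check Lemma 5: GF(q,u) >= D(q,u)/2 on a grid of interior points.
violations = [
    (q / 100, u / 100)
    for q in range(5, 96) for u in range(5, 96)
    if GF(q / 100, u / 100) < D(q / 100, u / 100) / 2 - 1e-9
]
print(len(violations))  # 0 if the inequality holds on the grid
```

The inequality is tight near q = 1/2 and u close to q, where GF − D/2 shrinks to fourth order in y, which is why a small tolerance is used.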
Learning in School and Out. Educational Researcher, 1987
"... In this letter, we outline a new approach to modeling, analyzing, and combating manipulative attacks on recommender systems. ..."
Abstract

Cited by 20 (0 self)
 Add to MetaCart
(Show Context)
In this letter, we outline a new approach to modeling, analyzing, and combating manipulative attacks on recommender systems.
Stability of Collaborative Filtering Recommendation Algorithms
"... The paper explores stability as a new measure of recommender systems performance. Stability is defined to measure the extent to which a recommendation algorithm provides predictions that are consistent with each other. Specifically, for a stable algorithm, adding some of the algorithm’s own predicti ..."
Abstract

Cited by 20 (1 self)
 Add to MetaCart
The paper explores stability as a new measure of recommender systems performance. Stability is defined to measure the extent to which a recommendation algorithm provides predictions that are consistent with each other. Specifically, for a stable algorithm, adding some of the algorithm’s own predictions to the algorithm’s training data (for example, if these predictions were confirmed as accurate by users) would not invalidate or change the other predictions. While stability is an interesting theoretical property that can provide additional understanding about recommendation algorithms, we believe stability to be a desired practical property for recommender systems designers as well, because unstable recommendations can potentially decrease users’ trust in recommender systems and, as a result, reduce users’ acceptance of recommendations. In this paper, we also provide an extensive empirical evaluation of stability for six popular recommendation algorithms on four real-world datasets. Our results suggest that stability performance of individual recommendation algorithms is consistent across a variety of datasets and settings. In particular, we find that model-based recommendation algorithms consistently demonstrate higher stability than neighborhood-based collaborative filtering techniques. In addition, we perform a comprehensive empirical analysis of many important factors (e.g., the sparsity of original rating data, normalization of input data, the number of new incoming ratings, the distribution of incoming ratings, the distribution of evaluation data, etc.) and report the impact they have on recommendation stability.
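The definition above suggests a simple two-phase measurement, which can be sketched as a toy. This is an illustration of the idea only, not the paper's evaluation protocol: the data is three hand-written ratings, the predictor is an item-mean rule of our own choosing, and "instability" is measured as the RMSE shift between the two phases.

```python
# Toy stability check: phase 1 trains a predictor, phase 2 feeds one of its
# own predictions back as training data (as if confirmed by the user), and we
# measure how much the remaining predictions shift.
known = {("u1", "i1"): 4.0, ("u1", "i2"): 2.0, ("u2", "i1"): 5.0}
unknown = [("u2", "i2"), ("u3", "i1"), ("u3", "i2")]

def item_mean_predict(data, user, item):
    vals = [r for (u, i), r in data.items() if i == item]
    return sum(vals) / len(vals) if vals else 3.0

# Phase 1: predict every unknown entry.
p1 = {ui: item_mean_predict(known, *ui) for ui in unknown}

# Phase 2: pretend the first prediction was confirmed and added to the data.
confirmed = {unknown[0]: p1[unknown[0]]}
extended = {**known, **confirmed}
p2 = {ui: item_mean_predict(extended, *ui) for ui in unknown[1:]}

# Instability = RMSE between the two phases on the still-unknown entries.
shift = (sum((p1[ui] - p2[ui]) ** 2 for ui in p2) / len(p2)) ** 0.5
print(round(shift, 3))  # 0.0: feeding back its own mean leaves the mean unchanged
```

The item-mean rule is perfectly stable here because its own prediction equals the mean it would recompute, which loosely mirrors the paper's finding that aggregate, model-based methods tend to be more stable than neighborhood-based ones.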
Modeling topic-specific credibility on Twitter. In IUI, 2012
"... This paper presents and evaluates three computational models for recommending credible topicspecific information in Twitter. The first model focuses on credibility at the user level, harnessing various dynamics of information flow in the underlying social graph to compute a credibility rating. The ..."
Abstract

Cited by 16 (9 self)
 Add to MetaCart
(Show Context)
This paper presents and evaluates three computational models for recommending credible topic-specific information in Twitter. The first model focuses on credibility at the user level, harnessing various dynamics of information flow in the underlying social graph to compute a credibility rating. The second model applies a content-based strategy to compute a finer-grained credibility score for individual tweets. Lastly, we discuss a third model which combines facets from both models in a hybrid method, using both averaging and filtering hybrid strategies. To evaluate our novel credibility models, we perform an evaluation on 7 topic-specific data sets mined from the Twitter streaming API, with specific focus on a data set of 37K users who tweeted about the topic “Libya”. Results show that the social model outperforms hybrid and content-based prediction models in terms of predictive accuracy over a set of manually collected credibility ratings on the “Libya” dataset.
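The two hybrid strategies named in the abstract, averaging and filtering, can be sketched as follows. Both component scorers here are hypothetical placeholders (the paper's actual social-graph and content models are not reproduced); the lookup table, the URL heuristic, the weight, and the cutoff are all illustrative assumptions.

```python
# Sketch of combining a user-level and a tweet-level credibility score in
# [0, 1] via the two hybrid strategies: averaging and filtering.
def user_credibility(user):
    # Placeholder for a social-graph model: a fixed lookup with a neutral default.
    return {"reuters": 0.9, "anon123": 0.2}.get(user, 0.5)

def tweet_credibility(text):
    # Placeholder for a content-based model: a crude heuristic on the text.
    score = 0.5
    if "http" in text:
        score += 0.2          # presence of a link nudges the score up
    if text.isupper():
        score -= 0.2          # all-caps shouting nudges it down
    return max(0.0, min(1.0, score))

def hybrid_average(user, text, w=0.5):
    # Averaging strategy: weighted blend of the two component scores.
    return w * user_credibility(user) + (1 - w) * tweet_credibility(text)

def hybrid_filter(user, text, min_user=0.4):
    # Filtering strategy: only score content from sufficiently credible users.
    return tweet_credibility(text) if user_credibility(user) >= min_user else 0.0

print(hybrid_average("reuters", "report: http://t.co/x"))  # 0.5*0.9 + 0.5*0.7
print(hybrid_filter("anon123", "report: http://t.co/x"))   # user below cutoff -> 0.0
```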
The information cost of manipulation-resistance in recommender systems. In RecSys '08: Proceedings of the 2008 ACM Conference on Recommender Systems
"... Attackers may seek to manipulate recommender systems in order to promote or suppress certain items. Existing defenses based on analysis of ratings also discard useful information from honest raters. In this paper, we show that this is unavoidable and provide a lower bound on how much information mus ..."
Abstract

Cited by 16 (2 self)
 Add to MetaCart
(Show Context)
Attackers may seek to manipulate recommender systems in order to promote or suppress certain items. Existing defenses based on analysis of ratings also discard useful information from honest raters. In this paper, we show that this is unavoidable and provide a lower bound on how much information must be discarded. We use an information-theoretic framework to exhibit a fundamental tradeoff between manipulation-resistance and optimal use of genuine ratings in recommender systems. We define a recommender system to be (n, c)-robust if an attacker with n sybil identities cannot cause more than a limited amount, c units, of damage to predictions. We prove that any robust recommender system must also discard Ω(log n) units of useful information from each genuine rater.
Temporal collaborative filtering with adaptive neighbourhoods. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2009
"... Recommender Systems, based on collaborative filtering (CF), aim to accurately predict user tastes, by minimising the mean error achieved on hidden test sets of user ratings, after learning from a training set. However, deployed recommender systems do not operate on, and should not be optimised to pr ..."
Abstract

Cited by 16 (3 self)
 Add to MetaCart
(Show Context)
Recommender Systems, based on collaborative filtering (CF), aim to accurately predict user tastes, by minimising the mean error achieved on hidden test sets of user ratings, after learning from a training set. However, deployed recommender systems do not operate on, and should not be optimised to predict, a static set of user ratings because the underlying dataset is continuously growing and changing. The aim of a recommender system is therefore to iteratively predict users’ preferences over a dynamic dataset, and system administrators are confronted with the problem of having to continuously tune the parameters calibrating their CF algorithm for best performance. In this work, we first formalise CF as a time-dependent, iterative prediction problem. We then perform a temporal analysis of the Netflix dataset, and evaluate the temporal performance of a baseline model and the k-Nearest Neighbour algorithm. We show that, due to the dynamic nature of the data, certain prediction methods that improve prediction accuracy on the Netflix probe set do not show similar improvements over a set of iterative train-test experiments with growing data. We then address the problem of parameter selection and update, and propose a method to automatically assign and update per-user neighbourhood sizes that (on the temporal scale) outperforms setting global parameters.
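The iterative train-test protocol with growing data described above can be sketched as follows. This is an illustrative harness only, not the paper's experiment: the timestamped ratings are synthetic, the batch size is arbitrary, and a global-mean predictor stands in for the CF models.

```python
import random

# Synthetic time-ordered stream of (time, item, rating) triples.
random.seed(1)
stream = [(t, random.randint(0, 9), random.choice([1, 2, 3, 4, 5]))
          for t in range(500)]

def global_mean(data):
    return sum(r for _, _, r in data) / len(data) if data else 3.0

# Each iteration trains on all data seen so far and tests on the next slice,
# so the training set grows as it would for a deployed system.
window, errors = 100, []
for start in range(window, len(stream), window):
    train = stream[:start]                 # everything observed up to 'start'
    test = stream[start:start + window]    # the next time slice
    mu = global_mean(train)
    se = sum((mu - r) ** 2 for _, _, r in test)
    errors.append((se / len(test)) ** 0.5)

# The result is one RMSE per iteration, not a single static number.
print([round(e, 2) for e in errors])
```

Plotting (or tuning parameters against) this sequence, rather than a single hidden-test score, is the shift in perspective the abstract argues for.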
Credibility in context: An analysis of feature distributions in Twitter. In ASE/IEEE International Conference on Social Computing (SocialCom), 2012
"... Abstract—Twitter is a major forum for rapid dissemination of userprovided content in real time. As such, a large proportion of the information it contains is not particularly relevant to many users and in fact is perceived as unwanted ’noise ’ by many. There has been increased research interest in ..."
Abstract

Cited by 11 (4 self)
 Add to MetaCart
(Show Context)
Twitter is a major forum for rapid dissemination of user-provided content in real time. As such, a large proportion of the information it contains is not particularly relevant to many users and in fact is perceived as unwanted ‘noise’ by many. There has been increased research interest in predicting whether tweets are relevant, newsworthy or credible, using a variety of models and methods. In this paper, we focus on an analysis that highlights the utility of the individual features in Twitter such as hashtags, retweets and mentions for predicting credibility. We first describe a context-based evaluation of the utility of a set of features for predicting manually provided credibility assessments on a corpus of microblog tweets. This is followed by an evaluation of the distribution/presence of each feature across 8 diverse crawls of tweet data. Last, an analysis of feature distribution across dyadic pairs of tweets and retweet chains of various lengths is described. Our results show that the best indicators of credibility include URLs, mentions, retweets and tweet length and that features occur more prominently in data describing emergency and unrest situations.
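Extracting the per-tweet features analysed above (hashtags, mentions, URLs, retweets, length) from raw tweet text can be sketched with a few regular expressions. The feature names and the "RT @" retweet convention are illustrative assumptions, not the paper's exact extraction pipeline.

```python
import re

def tweet_features(text):
    # Count the surface features discussed in the paper from raw tweet text.
    return {
        "hashtags": len(re.findall(r"#\w+", text)),
        "mentions": len(re.findall(r"@\w+", text)),
        "urls": len(re.findall(r"https?://\S+", text)),
        "is_retweet": text.startswith("RT @"),   # classic retweet convention
        "length": len(text),
    }

f = tweet_features("RT @bbc: clashes reported in #Libya http://t.co/abc")
print(f)
```

Feature vectors like this one are what a credibility classifier would consume, with URLs, mentions, retweets and length being the strongest indicators according to the results above.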