Evaluating recommendation systems
 In Recommender systems handbook
, 2011
Cited by 82 (2 self)
Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer who wishes to employ a recommendation system must choose between a set of candidate approaches. A first step towards selecting an appropriate algorithm is to decide which properties of the application to focus upon when making this choice. Indeed, recommendation systems have a variety of properties that may affect user experience, such as accuracy, robustness, scalability, and so forth. In this paper we discuss how to compare recommenders based on a set of properties that are relevant for the application. We focus on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms. We describe experimental settings appropriate for making choices between algorithms. We review three types of experiments, starting with an offline setting, where recommendation approaches are compared without user interaction, then reviewing user studies, where a small group of subjects experiment with the system and report on the experience, and finally describing large-scale online experiments, where real user populations interact with the system. In each of these cases we describe types of questions that can be answered, and suggest protocols for experimentation. We also discuss how to draw trustworthy conclusions from the conducted experiments. We then review a large set of properties, and explain how to evaluate systems given relevant properties. We also survey a large set of evaluation metrics in the context of the property that they evaluate.
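The offline, comparative setting described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the chapter: the toy ratings, the two baseline predictors (`predict_global`, `predict_item`), and the choice of RMSE as the evaluation metric.

```python
import math

def rmse(predict, held_out):
    """Root-mean-square error of a predictor over (user, item, rating) triples."""
    squared = [(predict(user, item) - rating) ** 2 for user, item, rating in held_out]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy data: training ratings and a held-out set used only for scoring.
train = [("u1", "a", 5), ("u1", "b", 3), ("u2", "a", 4), ("u2", "c", 2), ("u3", "b", 4)]
held_out = [("u1", "c", 2), ("u3", "a", 5)]

# Candidate 1: always predict the global mean rating.
global_mean = sum(rating for _, _, rating in train) / len(train)

# Candidate 2: predict the mean rating of the item, falling back to the global mean.
by_item = {}
for _, item, rating in train:
    by_item.setdefault(item, []).append(rating)
item_mean = {item: sum(rs) / len(rs) for item, rs in by_item.items()}

def predict_global(user, item):
    return global_mean

def predict_item(user, item):
    return item_mean.get(item, global_mean)

# Comparative study: score both candidates on the same held-out ratings.
scores = {
    "global_mean": rmse(predict_global, held_out),
    "item_mean": rmse(predict_item, held_out),
}
best = min(scores, key=scores.get)
```

The point of the sketch is the protocol rather than the predictors: both candidates are scored on identical held-out data with the same metric, so the result is a relative ranking and not an absolute benchmark.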
Factor in the neighbors: Scalable and accurate collaborative filtering
 ACM TKDD
Cited by 70 (1 self)
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The most common approach to CF is based on neighborhood models, which are based on similarities between products or users. In this work we introduce a new neighborhood model with an improved prediction accuracy. The model works by minimizing a global cost function. Further accuracy improvements are achieved by extending the model to exploit both explicit and implicit feedback by the users. Past models were limited by the need to compute all pairwise similarities between items or users, whose number grows quadratically with input size. In particular, this limitation vastly complicates adopting user similarity models, due to the typically large number of users. Our new model solves these limitations by factoring the neighborhood model, thus making both item-item and user-user implementations scale linearly with the size of the data. The methods are tested on the Netflix data, with encouraging results. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task. Our study reveals a very significant improvement in quality of top-K recommendation.
HERTZMANN A.: Color compatibility from large datasets
 In ACM SIGGRAPH 2011 papers (2011), SIGGRAPH ’11
Cited by 27 (1 self)
This paper studies color compatibility theories using large datasets, and develops new tools for choosing colors. There are three parts to this work. First, using online datasets, we test new and existing theories of human color preferences. For example, we test whether certain hues or hue templates may be preferred by viewers. Second, we learn quantitative models that score the quality of a set of five colors, called a color theme. Such models can be used to rate the quality of a new color theme. Third, we demonstrate simple prototypes that apply a learned model to tasks in color design, including improving existing themes and extracting themes from images.
SIMILARITY BASED ON RATING DATA
Cited by 21 (3 self)
This paper describes an algorithm to measure the similarity of two multimedia objects, such as songs or movies, using users' preferences. Much of the previous work on query-by-example (QBE) or music similarity uses detailed analysis of the object’s content. This is difficult and it is often impossible to capture how consumers react to the music. We argue that a large collection of users' preferences is more accurate, at least in comparison to our benchmark system, at finding similar songs. We describe an algorithm based on the song’s rating data, and show how this approach works by measuring its performance using an objective metric based on whether the same artist performed both songs. Our similarity results are based on 1.5 million musical judgments by 380,000 users. We test our system by generating playlists using a content-based system, our rating-based system, and a random list of songs. Music listeners greatly preferred the ratings-based playlists over the content-based and random playlists.
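As a rough illustration of similarity derived from rating data alone, the sketch below compares songs by the cosine of their rating vectors over shared raters. This is a generic stand-in and not the algorithm of the paper, whose details the abstract does not give; the song names, user IDs, and ratings are hypothetical.

```python
import math

# Hypothetical ratings: song -> {user: rating}. No audio content is used.
ratings = {
    "song_a": {"u1": 5, "u2": 4, "u3": 1},
    "song_b": {"u1": 4, "u2": 5, "u3": 2},
    "song_c": {"u1": 1, "u2": 2, "u3": 5},
}

def cosine_similarity(a, b):
    """Cosine of two rating vectors, restricted to users who rated both songs."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in shared))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(song):
    """Return the other song whose ratings are closest to those of `song`."""
    others = [s for s in ratings if s != song]
    return max(others, key=lambda s: cosine_similarity(ratings[song], ratings[s]))

sim_ab = cosine_similarity(ratings["song_a"], ratings["song_b"])
sim_ac = cosine_similarity(ratings["song_a"], ratings["song_c"])
closest_to_a = most_similar("song_a")
```

Here `song_a` and `song_b` are liked and disliked by the same users, so they score as more similar than `song_a` and `song_c`, without any analysis of the songs themselves.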
A Spatio-Temporal Approach to Collaborative Filtering
Cited by 19 (2 self)
In this paper, we propose a novel spatio-temporal model for collaborative filtering applications. Our model is based on low-rank matrix factorization that uses a spatio-temporal filtering approach to estimate user and item factors. The spatial component regularizes the factors by exploiting correlation across users and/or items, modeled as a function of some implicit feedback (e.g., who rated what) and/or some side information (e.g., user demographics, browsing history). In particular, we incorporate correlation in factors through a Markov random field prior in a probabilistic framework, whereby the neighborhood weights are functions of user and item covariates. The temporal component ensures that the user/item factors adapt to process changes that occur through time and is implemented in a state space framework with fast estimation through Kalman filtering. Our spatio-temporal filtering (ST-KF hereafter) approach provides a single joint model to simultaneously incorporate both spatial and temporal structure in ratings and therefore provides an accurate method to predict future ratings. To ensure scalability of ST-KF, we employ a mean-field approximation for inference. Incorporating user/item covariates in estimating neighborhood weights also helps in dealing with both cold-start and warm-start problems seamlessly in a single unified modeling framework; covariates predict factors for new users and items through the neighborhood. We illustrate our method on simulated data, benchmark data and data obtained from a relatively new recommender system application arising in the context of Yahoo! Front Page.
Graphical Models for Inference with Missing Data
Cited by 7 (2 self)
We address the problem of recoverability, i.e., deciding whether there exists a consistent estimator of a given relation Q, when data are missing not at random. We employ a formal representation called ‘Missingness Graphs’ to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured. Using this representation, we derive conditions that the graph should satisfy to ensure recoverability and devise algorithms to detect the presence of these conditions in the graph.
Missing Data Problems in Machine Learning
, 2008
Cited by 7 (0 self)
Learning, inference, and prediction in the presence of missing data are pervasive problems in machine learning and statistical data analysis. This thesis focuses on the problems of collaborative prediction with nonrandom missing data and classification with missing features. We begin by presenting and elaborating on the theory of missing data due to Little and Rubin. We place a particular emphasis on the missing at random assumption in the multivariate setting with arbitrary patterns of missing data. We derive inference and prediction methods in the presence of random missing data for a variety of probabilistic models including finite mixture models, Dirichlet process mixture models, and factor analysis. Based on this foundation, we develop several novel models and inference procedures for both the collaborative prediction problem and the problem of classification with missing features. We develop models and methods for collaborative prediction with nonrandom missing data by combining standard models for complete data with models of the missing data process. Using a novel recommender system data set and experimental protocol, we show that each proposed method achieves a substantial increase in rating prediction performance compared to models that assume missing ratings are missing at random.
Statistical significance of the Netflix challenge
 URL http://arxiv.org/abs/1207.5649
Collaborative filtering with interlaced generalized linear models
, 2008
Cited by 7 (0 self)
Collaborative filtering (CF) is a data analysis task appearing in many challenging applications, in particular data mining in Internet and e-commerce. CF can often be formulated as identifying patterns in a large and mostly empty rating matrix. In this paper, we focus on predicting unobserved ratings. This task is often a part of a recommendation procedure. We propose a new CF approach called interlaced generalized linear models (GLM); it is based on a factorization of the rating matrix and uses probabilistic modeling to represent uncertainty in the ratings. The advantage of this approach is that different configurations, encoding different intuitions about the rating process, can easily be tested while keeping the same learning procedure. The GLM formulation is the keystone to derive an efficient learning procedure, applicable to large datasets. We illustrate the technique on three public domain datasets. © 2008 Elsevier B.V. All rights reserved.
Missing data as a causal inference problem
 Forthcoming, Proceedings of NIPS
, 2013
Cited by 6 (3 self)
We address the problem of deciding whether there exists an unbiased estimator of a given relation Q, when data are missing not at random. We employ a formal representation called ‘Missingness Graphs’ to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured. Using this representation, we define the notion of recoverability, which ensures that, for a given missingness graph G and a given query Q, an algorithm exists that produces an unbiased estimate of Q. That is, in the limit of large samples, the algorithm should produce an estimate of Q as if no data were missing. We further present conditions that the graph should satisfy in order for recoverability to hold and devise algorithms to detect the presence of these conditions.