Evaluating collaborative filtering recommender systems
- ACM Transactions on Information Systems
, 2004
Cited by 365 (9 self)
© ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM
Restricted Boltzmann machines for collaborative filtering
- In Machine Learning, Proceedings of the Twenty-fourth International Conference (ICML 2007). ACM
, 2007
Cited by 74 (11 self)
Most of the existing approaches to collaborative filtering cannot handle very large data sets. In this paper we show how a class of two-layer undirected graphical models, called Restricted Boltzmann Machines (RBM’s), can be used to model tabular data, such as users’ ratings of movies. We present efficient learning and inference procedures for this class of models and demonstrate that RBM’s can be successfully applied to the Netflix data set, containing over 100 million user/movie ratings. We also show that RBM’s slightly outperform carefully-tuned SVD models. When the predictions of multiple RBM models and multiple SVD models are linearly combined, we achieve an error rate that is well over 6% better than the score of Netflix’s own system.
Factorization meets the neighborhood: a multifaceted collaborative filtering model
- In Proc. of the 14th ACM SIGKDD conference
, 2008
Cited by 68 (6 self)
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two most successful approaches to CF are latent factor models, which directly profile both users and products, and neighborhood models, which analyze similarities between products or users. In this work we introduce some innovations to both approaches. The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model. Further accuracy improvements are achieved by extending the models to exploit both explicit and implicit feedback by the users. The methods are tested on the Netflix data. Results are better than those previously published on that dataset. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task.
Unifying user-based and item-based collaborative filtering approaches by similarity fusion
- In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
, 2006
Cited by 37 (3 self)
Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large number of ratings from similar users or similar items are not available, due to the sparsity inherent to rating data. Consequently, prediction quality can be poor. This paper reformulates the memory-based collaborative filtering problem in a generative probabilistic framework, treating individual user-item ratings as predictors of missing ratings. The final rating is estimated by fusing predictions from three sources: predictions based on ratings of the same item by other users, predictions based on different item ratings made by the same user, and, third, ratings predicted based on data from other but similar users rating other but similar items. Existing user-based and item-based approaches correspond to the two simple cases of our framework. The complete model is however more robust to data sparsity, because the different types of ratings are used in concert, while additional ratings from similar users towards similar items are employed as a background model to smooth the predictions. Experiments demonstrate that the proposed methods are indeed more robust against data sparsity and give better recommendations.
Slope One Predictors for Online Rating-Based Collaborative Filtering
- In SIAM Data Mining (SDM05)
, 2005
Cited by 33 (3 self)
Rating-based collaborative filtering is the process of predicting how a user would rate a given item from other user ratings. We propose three related slope one schemes with predictors of the form f(x) = x + b, which precompute the average difference between the ratings of one item and another for users who rated both. Slope one algorithms are easy to implement, efficient to query, reasonably accurate, and they support both online queries and dynamic updates, which makes them good candidates for real-world systems. The basic SLOPE ONE scheme is suggested as a new reference scheme for collaborative filtering. By factoring in items that a user liked separately from items that a user disliked, we achieve results competitive with slower memory-based schemes over the standard benchmark EachMovie and MovieLens data sets while better fulfilling the desiderata of CF applications.
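As an illustration of the slope one idea described in this abstract, the following is a minimal Python sketch of the basic (unweighted) scheme only. It precomputes the average rating difference between each pair of items over the users who rated both, then predicts with f(x) = x + b averaged over the items the target user has already rated. The function names, the dictionary layout, and the toy ratings are assumptions made for this example and are not taken from the paper; the weighted and liked/disliked variants are not shown.

```python
from collections import defaultdict


def slope_one_deviations(ratings):
    """Average difference (rating of i minus rating of j) over users who rated both.

    ratings: dict mapping user -> dict mapping item -> rating (hypothetical layout).
    Returns a dict mapping (i, j) -> average deviation.
    """
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[(i, j)] += r_i - r_j
                    count[(i, j)] += 1
    return {pair: diff_sum[pair] / count[pair] for pair in diff_sum}


def predict(user_ratings, target, dev):
    """Basic slope one: average of (x + b) over the user's rated items.

    Returns None when no rated item was co-rated with the target item.
    """
    preds = [
        r_j + dev[(target, j)]
        for j, r_j in user_ratings.items()
        if j != target and (target, j) in dev
    ]
    return sum(preds) / len(preds) if preds else None


# Toy data, invented for illustration only.
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
dev = slope_one_deviations(ratings)
print(predict(ratings["lucy"], "A", dev))  # prints 5.25
```

In this toy data, item A is rated 0.5 higher than B on average and 3.0 higher than C, so the two per-item predictions for lucy are 2.5 and 8.0, and their plain average is 5.25. Because the deviations are precomputed, a query only touches the items the user has rated, which is consistent with the abstract's claim that the schemes are efficient to query.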
Collaborative Filtering Recommender Systems
, 2007
Cited by 30 (1 self)
One of the potent personalization technologies powering the adaptive web is collaborative filtering. Collaborative filtering (CF) is the process of filtering or evaluating items through the opinions of other people. CF technology brings together the opinions of large interconnected communities on the web, supporting filtering of substantial quantities of data. In this chapter we introduce the core concepts of collaborative filtering, its primary uses for users of the adaptive web, the theory and practice of CF algorithms, and design decisions regarding rating systems and acquisition of ratings. We also discuss how to evaluate CF systems, and the evolution of rich interaction interfaces. We close the chapter with discussions of the challenges of privacy particular to a CF recommendation service and important open research questions in the field.
Privacy-Preserving Collaborative Filtering Using Randomized Perturbation Techniques
, 2003
Cited by 27 (2 self)
Collaborative Filtering (CF) techniques are becoming increasingly popular with the evolution of the Internet. E-commerce sites use CF systems to suggest products to customers based on like-minded customers' preferences. People use CF systems to cope with information overload. To conduct collaborative filtering, data from customers are needed. However, collecting high-quality data from customers is not an easy task because many customers are so concerned about their privacy that they might decide to give false information. CF systems using these data might produce inaccurate recommendations. We propose a randomized perturbation technique to protect users' privacy while still producing accurate recommendations. Although the randomized perturbation techniques add randomness to the original data to prevent the data collector from learning the private user data, our scheme can still provide recommendations with decent accuracy. We conducted several experiments to compare the recommendations on the randomized data with those on the original data. Using these experimental results, we analyzed how different parameters affect the accuracy. Our results show that the CF systems using the randomized perturbation techniques provide accurate recommendations while preserving the users' privacy.
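To make the intuition behind randomized perturbation concrete, here is a small Python sketch under stated assumptions. The abstract does not specify the noise distribution or which aggregates the CF algorithm computes, so this example assumes zero-mean uniform noise and shows only that an item's mean rating can still be estimated from disguised values. The function name, the noise range, and the synthetic ratings are all hypothetical.

```python
import random


def perturb(ratings, noise_range, rng):
    """Disguise each rating by adding zero-mean uniform noise (an assumed choice)."""
    return [r + rng.uniform(-noise_range, noise_range) for r in ratings]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic data: one 1-5 rating of a single item from each of 10,000 users.
true_ratings = [rng.randint(1, 5) for _ in range(10_000)]

# Each user would send only the disguised value to the data collector.
disguised = perturb(true_ratings, 2.0, rng)

true_mean = sum(true_ratings) / len(true_ratings)
estimated_mean = sum(disguised) / len(disguised)
print(abs(estimated_mean - true_mean))  # small, because the noise averages out
```

Any single disguised value reveals little about the underlying rating, while an aggregate over many users stays close to the true value because the noise has zero mean. A wider noise range hides individual ratings better but makes the aggregate noisier, which is the kind of privacy and accuracy trade-off the abstract says the experiments examine.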
MMM2: Mobile Media Metadata for Media Sharing
- In Extended Abstracts of the Conference on Human Factors in Computing Systems
, 2005
Cited by 24 (2 self)
Cameraphones are rapidly becoming a global platform for everyday digital imaging, especially for networked sharing of media from mobile devices. However, their constrained user interfaces and the current network and application infrastructure encumber the basic tasks of transferring, finding, and sharing captured media. We have deployed a prototype context-aware cameraphone application for mobile media sharing (MMM2) that aims to overcome these difficulties. MMM2 leverages the point of capture and of sharing to gather metadata, and uses metadata to support sharing. Based on the early results of the first 6 weeks of a six-month trial involving 60 users, indications are that with MMM2 users are actively capturing and sharing photos. The ability to automatically upload photos from a cameraphone to a web-based photo management application and to automatically suggest sharing recipients at the time of capture based on Bluetooth-sensed co-presence and sharing frequency promise to reduce the current difficulty of mobile media sharing.
Author Keywords: cameraphones, mobile media metadata, photo sharing, context-aware
Factor in the neighbors: Scalable and accurate collaborative filtering
- ACM TKDD
Cited by 18 (1 self)
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The most common approach to CF is based on neighborhood models, which are based on similarities between products or users. In this work we introduce a new neighborhood model with an improved prediction accuracy. The model works by minimizing a global cost function. Further accuracy improvements are achieved by extending the model to exploit both explicit and implicit feedback by the users. Past models were limited by the need to compute all pairwise similarities between items or users, which grow quadratically with input size. In particular, this limitation vastly complicates adopting user similarity models, due to the typically large number of users. Our new model solves these limitations by factoring the neighborhood model, thus making both item-item and user-user implementations scale linearly with the size of the data. The methods are tested on the Netflix data, with encouraging results. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task. Our study reveals a very significant improvement in quality of top-K recommendation.
You are what you say: Privacy risks of public mentions
- In Proc. 29th Annual ACM SIGIR Conference on Research and Development in Information Retrieval
, 2006
Cited by 17 (1 self)
In today’s data-rich networked world, people express many aspects of their lives online. It is common to segregate different aspects in different places: you might write opinionated rants about movies in your blog under a pseudonym while participating in a forum or web site for scholarly discussion of medical ethics under your real name. However, it may be possible to link these separate identities, because the movies, journal articles, or authors you mention are from a sparse relation space whose properties (e.g., many items related to by only a few users) allow re-identification. This re-identification violates people’s intentions to separate aspects of their life and can have negative consequences; it also may allow other privacy violations, such as obtaining a stronger identifier like name and address. This paper examines this general problem in a specific setting: re-identification of users from a public web movie forum in a private movie ratings dataset. We present three major results. First, we develop algorithms that can re-identify a large proportion of public users in a sparse relation space. Second, we evaluate whether private dataset owners can protect user privacy by hiding data; we show that this requires extensive and undesirable changes to the dataset, making it impractical. Third, we evaluate two methods for users in a public forum to protect their own privacy: suppression and misdirection. Suppression doesn’t work here either. However, we show that a simple misdirection strategy works well: mention a few popular items that you haven’t rated.

