Results 1 - 10
of
124
Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 2005
"... This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches. This paper also describes vario ..."
Abstract
-
Cited by 379 (2 self)
- Add to MetaCart
This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches. This paper also describes various limitations of current recommendation methods and discusses possible extensions that can improve recommendation capabilities and make recommender systems applicable to an even broader range of applications. These extensions include, among others, an improvement of understanding of users and items, incorporation of the contextual information into the recommendation process, support for multcriteria ratings, and a provision of more flexible and less intrusive types of recommendations.
Maximum-Margin Matrix Factorization
- Advances in Neural Information Processing Systems 17
, 2005
"... We present a novel approach to collaborative prediction, using low-norm instead of low-rank factorizations. The approach is inspired by, and has strong connections to, large-margin linear discrimination. We show how to learn low-norm factorizations by solving a semi-definite program, and discuss ..."
Abstract
-
Cited by 90 (16 self)
- Add to MetaCart
We present a novel approach to collaborative prediction, using low-norm instead of low-rank factorizations. The approach is inspired by, and has strong connections to, large-margin linear discrimination. We show how to learn low-norm factorizations by solving a semi-definite program, and discuss generalization error bounds for them.
Fast maximum margin matrix factorization for collaborative prediction
- In Proceedings of the 22nd International Conference on Machine Learning (ICML
, 2005
"... Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite programming (SDP) and learned using standard SDP solvers. However, cu ..."
Abstract
-
Cited by 85 (7 self)
- Add to MetaCart
Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite programming (SDP) and learned using standard SDP solvers. However, current SDP solvers can only handle MMMF problems on matrices of dimensionality up to a few hundred. Here, we investigate a direct gradient-based optimization method for MMMF and demonstrate it on large collaborative prediction problems. We compare against results obtained by Marlin (2004) and find that MMMF substantially outperforms all nine methods he tested. 1.
Factorization meets the neighborhood: a multifaceted collaborative filtering model
- In Proc. of the 14th ACM SIGKDD conference
, 2008
"... Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborating Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two more successful approaches to CF are latent f ..."
Abstract
-
Cited by 68 (6 self)
- Add to MetaCart
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborating Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two more successful approaches to CF are latent factor models, which directly profile both users and products, and neighborhood models, which analyze similarities between products or users. In this work we introduce some innovations to both approaches. The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model. Further accuracy improvements are achieved by extending the models to exploit both explicit and implicit feedback by the users. The methods are tested on the Netflix data. Results are better than those previously published on that dataset. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task.
A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation
- In KDD
, 2004
"... Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an informationtheoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, co-clust ..."
Abstract
-
Cited by 63 (17 self)
- Add to MetaCart
Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an informationtheoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation based constraints can be considered based on the statistics that need to be preserved. Analysis of the coclustering problem leads to the minimum Bregman information principle, which generalizes the maximum entropy principle, and yields an elegant meta algorithm that is guaranteed to achieve local optimality. Our methodology yields new algorithms and also encompasses several previously known clustering and co-clustering algorithms based on alternate minimization.
Unifying user-based and item-based collaborative filtering approaches by similarity fusion
- In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
, 2006
"... Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large number of ratings from similar users or similar items are not available, due to the sparsity inherent to rating data. Cons ..."
Abstract
-
Cited by 37 (3 self)
- Add to MetaCart
Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large number of ratings from similar users or similar items are not available, due to the sparsity inherent to rating data. Consequently, prediction quality can be poor. This paper reformulates the memory-based collaborative filtering problem in a generative probabilistic framework, treating individual user-item ratings as predictors of missing ratings. The final rating is estimated by fusing predictions from three sources: predictions based on ratings of the same item by other users, predictions based on different item ratings made by the same user, and, third, ratings predicted based on data from other but similar users rating other but similar items. Existing user-based and item-based approaches correspond to the two simple cases of our framework. The complete model is however more robust to data sparsity, because the different types of ratings are used in concert, while additional ratings from similar users towards similar items are employed as a background model to smooth the predictions. Experiments demonstrate that the proposed methods are indeed more robust against data sparsity and give better recommendations.
Collaborative Filtering Recommender Systems
, 2007
"... One of the potent personalization technologies powering the adaptive web is collaborative filtering. Collaborative filtering (CF) is the process of filtering or evaluating items through the opinions of other people. CF technology brings together the opinions of large interconnected communities on ..."
Abstract
-
Cited by 30 (1 self)
- Add to MetaCart
One of the potent personalization technologies powering the adaptive web is collaborative filtering. Collaborative filtering (CF) is the process of filtering or evaluating items through the opinions of other people. CF technology brings together the opinions of large interconnected communities on the web, supporting filtering of substantial quantities of data. In this chapter we introduce the core concepts of collaborative filtering, its primary uses for users of the adaptive web, the theory and practice of CF algorithms, and design decisions regarding rating systems and acquisition of ratings. We also discuss how to evaluate CF systems, and the evolution of rich interaction interfaces. We close the chapter with discussions of the challenges of privacy particular to a CF recommendation service and important open research questions in the field.
A scalable collaborative filtering framework based on co-clustering
- Fifth IEEE International Conference on Data Mining
, 2005
"... Collaborative filtering-based recommender systems, which automatically predict preferred products of a user using known preferences of other users, have become extremely popular in recent years due to the increase in web-based activities such as e-commerce and online content distribution. Current co ..."
Abstract
-
Cited by 28 (1 self)
- Add to MetaCart
Collaborative filtering-based recommender systems, which automatically predict preferred products of a user using known preferences of other users, have become extremely popular in recent years due to the increase in web-based activities such as e-commerce and online content distribution. Current collaborative filtering techniques such as correlation and SVD based methods provide good accuracy, but are computationally very expensive and can only be deployed in static off-line settings where the known preference information does not change with time. However, a number of practical scenarios require dynamic real-time collaborative filtering that can allow new users, items and ratings to enter the system at a rapid rate. In this paper, we consider a novel collaborative filtering approach based on a recently proposed weighted co-clustering algorithm [3] that involves simultaneous clustering of users and items. We design incremental and parallel versions of the co-clustering algorithm and use it to build an efficient real-time collaborative filtering framework. Empirical evaluation of our approach on large movie and book rating datasets demonstrates that it is possible to obtain an accuracy comparable to that of the correlation and matrix factorization based approaches at a much lower computational cost. 1
Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience
- IN VLDB '09: PROCEEDINGS OF THE VLDB ENDOWMENT
, 2009
"... Increasingly, organizations capture, transform and analyze enormous data sets. Prominent examples include internet companies and e-science. The Map-Reduce scalable dataflow paradigm has become popular for these applications. Its simple, explicit dataflow programming model is favored by some over the ..."
Abstract
-
Cited by 27 (4 self)
- Add to MetaCart
Increasingly, organizations capture, transform and analyze enormous data sets. Prominent examples include internet companies and e-science. The Map-Reduce scalable dataflow paradigm has become popular for these applications. Its simple, explicit dataflow programming model is favored by some over the traditional high-level declarative approach: SQL. On the other hand, the extreme simplicity of Map-Reduce leads to much low-level hacking to deal with the many-step, branching dataflows that arise in practice. Moreover, users must repeatedly code standard operations such as join by hand. These practices waste time, introduce bugs, harm readability, and impede optimizations. Pig is a high-level dataflow system that aims at a sweet spot between SQL and Map-Reduce. Pig offers SQL-style high-level data manipulation constructs, which can be assembled in an explicit dataflow and interleaved with custom Map- and Reduce-style functions or executables. Pig programs are compiled into sequences of Map-Reduce jobs, and executed in the Hadoop Map-Reduce environment. Both Pig and Hadoop are open-source projects administered by the Apache Software Foundation. This paper describes the challenges we faced in developing Pig, and reports performance comparisons between Pig execution and raw Map-Reduce execution. 1.
Learning with Matrix Factorization
, 2004
"... Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or highdimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning ..."
Abstract
-
Cited by 20 (3 self)
- Add to MetaCart
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or highdimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent

