Nonlinear Matrix Factorization with Gaussian Processes
Abstract

Cited by 43 (1 self)
A popular approach to collaborative filtering is matrix factorization. In this paper we develop a nonlinear probabilistic matrix factorization using Gaussian process latent variable models. We use stochastic gradient descent (SGD) to optimize the model. SGD allows us to apply Gaussian processes to data sets with millions of observations without approximate methods. We apply our approach to benchmark movie recommender data sets. The results show better than previous state-of-the-art performance.
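The SGD training recipe described in this abstract can be illustrated with a plain matrix factorization (not the paper's Gaussian-process model): for each observed rating, take a gradient step on the corresponding row of the user and item factor matrices. The sketch below is a toy version; the function name, hyperparameters, and toy data are illustrative assumptions.

```python
import numpy as np

def sgd_mf(ratings, n_users, n_items, rank=5, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit R ~ U @ V.T by SGD over observed (user, item, rating) triples."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]  # error on a single observed rating
            # simultaneous regularized gradient step on the two factor rows
            U[u], V[i] = (U[u] + lr * (err * V[i] - reg * U[u]),
                          V[i] + lr * (err * U[u] - reg * V[i]))
    return U, V

# Toy data: 3 users, 3 items, 4 observed ratings.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 2, 1.0)]
U, V = sgd_mf(ratings, n_users=3, n_items=3)
```

Because each update touches only one rating and two factor rows, the cost per epoch is linear in the number of observations, which is what makes this style of optimization attractive for large data sets.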
Spectral Regularization Algorithms for Learning Large Incomplete Matrices
, 2009
Abstract

Cited by 33 (1 self)
We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example, it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank-40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques.
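The Soft-Impute iteration itself is short: overwrite the missing entries with the current estimate, take an SVD, soft-threshold the singular values, and repeat. The sketch below is a minimal version on a small dense matrix using a full SVD rather than the paper's specialized low-rank solver; the threshold, mask layout, and toy matrix are illustrative assumptions.

```python
import numpy as np

def soft_impute(X, mask, lam=0.1, n_iter=100):
    """Iteratively fill missing entries with a soft-thresholded SVD.

    X: matrix with arbitrary values at unobserved positions
    mask: boolean array, True where X is observed
    lam: singular-value threshold (nuclear-norm regularization strength)
    """
    Z = np.where(mask, X, 0.0)  # start with missing entries set to 0
    for _ in range(n_iter):
        # observed entries come from X, missing entries from the current Z
        U, s, Vt = np.linalg.svd(np.where(mask, X, Z), full_matrices=False)
        s = np.maximum(s - lam, 0.0)  # soft-threshold the singular values
        Z = (U * s) @ Vt              # low-rank reconstruction
    return Z

# Rank-one toy matrix with two entries hidden.
A = np.outer([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
mask = np.ones_like(A, dtype=bool)
mask[0, 2] = mask[2, 0] = False
Z = soft_impute(A, mask)
```

Warm starts fit naturally here: when sweeping a grid of `lam` values, the converged `Z` for one value is a good initializer for the next.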
Collaborative filtering and the missing at random assumption
 In Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI)
, 2007
Abstract

Cited by 30 (4 self)
Rating prediction is an important application, and a popular research topic in collaborative filtering. However, both the validity of learning algorithms and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties than ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.
Fast nonparametric matrix factorization for large-scale collaborative filtering. The 32nd SIGIR conference
, 2009
Abstract

Cited by 21 (2 self)
With the sheer growth of online user data, it becomes challenging to develop preference learning algorithms that are sufficiently flexible in modeling but also affordable in computation. In this paper we develop nonparametric matrix factorization methods by allowing the latent factors of two low-rank matrix factorization methods, the singular value decomposition (SVD) and probabilistic principal component analysis (pPCA), to be data-driven, with the dimensionality increasing with data size. We show that the formulations of the two nonparametric models are very similar, and their optimizations share similar procedures. Compared to traditional parametric low-rank methods, nonparametric models are appealing for their flexibility in modeling complex data dependencies. However, this modeling advantage comes at a computational price: it is highly challenging to scale them to large-scale problems, hampering their use in applications such as collaborative filtering. In this paper we introduce novel optimization algorithms, which are simple to implement and make learning both nonparametric matrix factorization models highly efficient on large-scale problems. Our experiments on EachMovie and Netflix, the two largest public benchmarks to date, demonstrate that the nonparametric models make more accurate predictions of user ratings, and are computationally comparable or sometimes even faster in training, in comparison with previous state-of-the-art parametric matrix factorization models.
Applying collaborative filtering techniques to movie search for better ranking and browsing
 In KDD
, 2007
Abstract

Cited by 17 (0 self)
In general web search engines, such as Google and Yahoo! Search, document relevance for the given query and item authority are two major components of the ranking system. However, many information search tools in e-commerce sites ignore item authority in their ranking systems. In part, this may stem from the relative difficulty of generating item authorities due to the different characteristics of documents (or items) between e-commerce sites and the web. Links between documents in an e-commerce site often represent a relationship rather than a recommendation; for example, two documents (items) may be connected because both are produced by the same company. We propose a new ranking method, which combines recommender systems with information search tools for better search and browsing. Our method uses a collaborative filtering algorithm to generate personal item authorities for each user and combines them with item proximities for better ranking. To demonstrate our approach, we build a prototype movie search engine called MAD6 (Movies, Actors and Directors; 6 degrees of separation).
Collaborative Prediction and Ranking with Non-Random Missing Data
Abstract

Cited by 15 (2 self)
A fundamental aspect of rating-based recommender systems is the observation process, the process by which users choose the items they rate. Nearly all research on collaborative filtering and recommender systems is founded on the assumption that missing ratings are missing at random. The statistical theory of missing data shows that incorrect assumptions about missing data can lead to biased parameter estimation and prediction. In a recent study, we demonstrated strong evidence for violations of the missing at random condition in a real recommender system. In this paper we present the first study of the effect of non-random missing data on collaborative ranking, and extend our previous results regarding the impact of non-random missing data on collaborative prediction.
A Simple Algorithm for Nuclear Norm Regularized Problems
Abstract

Cited by 12 (2 self)
Optimization problems with a nuclear norm regularization, such as low-norm matrix factorizations, have seen many applications recently. We propose a new approximation algorithm building upon the recent sparse approximate SDP solver of (Hazan, 2008). The experimental efficiency of our method is demonstrated on large matrix completion problems such as the Netflix dataset. The algorithm comes with strong convergence guarantees, and can be interpreted as a first theoretically justified variant of Simon-Funk-type SVD heuristics. The method is free of tuning parameters, and very easy to parallelize.
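For intuition, the sketch below shows a Frank-Wolfe-style iteration in the spirit of Hazan-type solvers applied to nuclear-norm-constrained matrix completion: each step adds a single rank-one atom built from the top singular pair of the negative gradient. This is a hedged illustration, not the paper's exact algorithm; the function name, step rule, and the trace bound `t` are assumptions.

```python
import numpy as np

def fw_complete(X_obs, mask, t=6.0, n_iter=1000):
    """Approximately minimize the squared error on observed entries
    subject to ||Z||_* <= t, via rank-one Frank-Wolfe updates."""
    Z = np.zeros_like(X_obs)
    for k in range(n_iter):
        G = 2.0 * np.where(mask, Z - X_obs, 0.0)  # gradient of the observed loss
        U, s, Vt = np.linalg.svd(-G)              # top singular pair of -G ...
        S = t * np.outer(U[:, 0], Vt[0])          # ... gives the best rank-one atom
        Z = Z + (2.0 / (k + 2)) * (S - Z)         # standard diminishing step size
    return Z

# Rank-one toy matrix, one entry hidden.
A = np.outer([1.0, 2.0], [1.0, 1.0, 2.0])
mask = np.ones_like(A, dtype=bool)
mask[1, 2] = False
Z = fw_complete(A, mask)
```

Each iteration touches only a top singular pair rather than a full SVD, which is what makes this family of methods cheap per step and easy to scale.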
Mixed Membership Matrix Factorization
Abstract

Cited by 7 (2 self)
Discrete mixed membership modeling and continuous latent factor modeling (also known as matrix factorization) are two popular, complementary approaches to dyadic data analysis. In this work, we develop a fully Bayesian framework for integrating the two approaches into unified Mixed Membership Matrix Factorization (M³F) models. We introduce two M³F models, derive Gibbs sampling inference procedures, and validate our methods on the EachMovie, MovieLens, and Netflix Prize collaborative filtering datasets. We find that, even when fitting fewer parameters, the M³F models outperform state-of-the-art latent factor approaches on all benchmarks, yielding the greatest gains in accuracy on sparsely rated, high-variance items.
Collaborative filtering with interlaced generalized linear models
, 2008
Abstract

Cited by 5 (0 self)
Collaborative filtering (CF) is a data analysis task appearing in many challenging applications, in particular data mining on the Internet and in e-commerce. CF can often be formulated as identifying patterns in a large and mostly empty rating matrix. In this paper, we focus on predicting unobserved ratings. This task is often a part of a recommendation procedure. We propose a new CF approach called interlaced generalized linear models (GLM); it is based on a factorization of the rating matrix and uses probabilistic modeling to represent uncertainty in the ratings. The advantage of this approach is that different configurations, encoding different intuitions about the rating process, can easily be tested while keeping the same learning procedure. The GLM formulation is the keystone for deriving an efficient learning procedure, applicable to large datasets. We illustrate the technique on three public domain datasets.
Nonparametric Bayesian matrix completion
 In SAM
, 2010
Abstract

Cited by 4 (0 self)
The Beta-Binomial processes are considered for inferring missing values in matrices. The model moves beyond the low-rank assumption, modeling the matrix columns as residing in a nonlinear subspace. Large-scale problems are considered via efficient Gibbs sampling, yielding predictions as well as a measure of confidence in each prediction. Algorithm performance is considered for several datasets, with encouraging performance relative to existing approaches.