Results 1 - 10 of 33
Dynamic topic models
- In ICML, 2006
Cited by 245 (15 self)
Scientists need new tools to explore and browse large collections of scholarly literature. Thanks to organizations such as JSTOR, which scan and index the original bound archives of many journals, modern scientists can search digital libraries spanning hundreds of years. A scientist, suddenly
Maximum-Margin Matrix Factorization
- Advances in Neural Information Processing Systems 17, 2005
Cited by 90 (16 self)
We present a novel approach to collaborative prediction, using low-norm instead of low-rank factorizations. The approach is inspired by, and has strong connections to, large-margin linear discrimination. We show how to learn low-norm factorizations by solving a semi-definite program, and discuss generalization error bounds for them.
Fast maximum margin matrix factorization for collaborative prediction
- In Proceedings of the 22nd International Conference on Machine Learning (ICML), 2005
Cited by 85 (7 self)
Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite-dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite program (SDP) and learned using standard SDP solvers. However, current SDP solvers can only handle MMMF problems on matrices of dimensionality up to a few hundred. Here, we investigate a direct gradient-based optimization method for MMMF and demonstrate it on large collaborative prediction problems. We compare against results obtained by Marlin (2004) and find that MMMF substantially outperforms all nine methods he tested.
Learning a meta-level prior for feature relevance from multiple related tasks
- In Proceedings of International Conference on Machine Learning (ICML), 2007
Cited by 22 (1 self)
In many prediction tasks, selecting relevant features is essential for achieving good generalization performance. Most feature selection algorithms consider all features to be a priori equally likely to be relevant. In this paper, we use transfer learning (learning on an ensemble of related tasks) to construct an informative prior on feature relevance. We assume that features themselves have meta-features that are predictive of their relevance to the prediction task, and model their relevance as a function of the meta-features using hyperparameters (called meta-priors). We present a convex optimization algorithm for simultaneously learning the meta-priors and feature weights from an ensemble of related prediction tasks that share a similar relevance structure. Our approach transfers the meta-priors among different tasks, allowing it to deal with settings where tasks have non-overlapping features or where feature relevance varies over the tasks. We show that transfer learning of feature relevance improves performance on two real data sets which illustrate such settings: (1) predicting ratings in a collaborative filtering task, and (2) distinguishing arguments of a verb in a sentence.
Learning with Matrix Factorization
- 2004
Cited by 20 (3 self)
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent
Non-linear Matrix Factorization with Gaussian Processes
Cited by 20 (1 self)
A popular approach to collaborative filtering is matrix factorization. In this paper we develop a non-linear probabilistic matrix factorization using Gaussian process latent variable models. We use stochastic gradient descent (SGD) to optimize the model. SGD allows us to apply Gaussian processes to data sets with millions of observations without approximate methods. We apply our approach to benchmark movie recommender data sets. The results show better than previous state-of-the-art performance.
Collaborative prediction using ensembles of maximum margin matrix factorizations
- In ICML, 2006
Cited by 18 (0 self)
Fast gradient-based methods for Maximum Margin Matrix Factorization (MMMF) were recently shown to have great promise (Rennie & Srebro, 2005), including significantly outperforming the previous state-of-the-art methods on some standard collaborative prediction benchmarks (including MovieLens). In this paper, we investigate ways to further improve the performance of MMMF, by casting it within an ensemble approach. We explore and evaluate a variety of alternative ways to define such ensembles. We show that our resulting ensembles can perform significantly better than a single MMMF model, along multiple evaluation metrics. In fact, we find that ensembles of partially trained MMMF models can sometimes even give better predictions in total training time comparable to a single MMMF model.
Distributed Collaborative Filtering for Peer-to-Peer File Sharing Systems
- In Proceedings of the 21st Annual ACM Symposium on Applied Computing, 2006
Cited by 14 (2 self)
Peer-to-peer networks are becoming increasingly popular for sharing information such as multimedia files. Since this information is stored locally at the different peers, it is necessary to facilitate the search in an intelligent way. Collaborative filtering is one such search technique: it incorporates user preferences, which can be learned from the users' download activities. To be effective, collaborative filtering requires a large database that captures these activities. Within a peer-to-peer network, however, such a database is not readily available. Here, we propose a collaborative filtering approach that is self-organizing and operates in a distributed way. Information about the similarity between multimedia files (items) is stored locally at these items in so-called item-based buddy tables. We propose to use the language model (popular within information retrieval) to build recommendations for the different users based on the buddy tables of those items a user has downloaded previously (indicating the preference of the user). We have tested and compared our distributed collaborative filtering approach to centralized collaborative filtering and showed that it has similar performance. It is therefore a promising technique to facilitate the search for information in peer-to-peer networks.
Collaborative filtering and the missing at random assumption
- To be published in Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence, 2007
Cited by 12 (3 self)
Rating prediction is an important application, and a popular research topic in collaborative filtering. However, both the validity of learning algorithms, and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties than ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.
Applying collaborative filtering techniques to movie search for better ranking and browsing
- In KDD, 2007
Cited by 9 (0 self)
We propose a new ranking method, which combines recommender systems with information search tools for better search and browsing. Our method uses a collaborative filtering algorithm to generate personal item authorities for each user and combines them with item proximities for better ranking. To demonstrate our approach, we build a prototype movie search and browsing engine called MAD6 (Movies, Actors and Directors; 6 degrees of separation). We conduct offline and online tests of our ranking algorithm. For offline testing, we use Yahoo! Search queries that resulted in a click on a Yahoo! Movies or Internet Movie Database (IMDB) movie URL. Our online test involved 44 Yahoo! employees providing subjective assessments of result quality. In both tests, our ranking methods show significantly better recall and quality than IMDB search and the current Yahoo! Movies search.

