Infinite hidden relational models
- In Proceedings of the 22nd International Conference on Uncertainty in Artificial Intelligence (UAI)
, 2006
Cited by 28 (14 self)
Relational learning analyzes the probabilistic constraints between the attributes of entities and relationships. We extend the expressiveness of relational models by introducing for each entity (or object) an infinite-dimensional latent variable as part of a Dirichlet process (DP) mixture model. We discuss inference in the model, which is based on a DP Gibbs sampler, i.e., the Chinese restaurant process. We extend the Chinese restaurant process to be applicable to relational modeling. We discuss how information is propagated in the network of latent variables, reducing the necessity for extensive structural learning. In the context of a recommendation engine, our approach realizes a principled solution for recommendations based on features of items, features of users and relational information. Our approach is evaluated in three applications: a recommendation system based on the MovieLens data set, the prediction of gene function using relational information and a medical recommendation system.
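For readers unfamiliar with the Chinese restaurant process mentioned in this abstract, the following is a minimal sketch of the standard (non-relational) process only. It is not the paper's relational extension or its Gibbs sampler. The function name `sample_crp` and the parameters `n_customers`, `alpha` and `seed` are hypothetical names chosen for illustration.

```python
import random


def sample_crp(n_customers, alpha, seed=None):
    """Draw cluster ("table") assignments from a standard Chinese restaurant process.

    alpha is the DP concentration parameter and must be positive.
    Illustrative sketch only; names and signature are assumptions.
    """
    rng = random.Random(seed)
    assignments = []  # assignments[i] = table index of customer i
    table_sizes = []  # table_sizes[k] = number of customers seated at table k
    for _ in range(n_customers):
        # An existing table k is chosen with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[k] += 1    # join an existing table
        assignments.append(k)
    return assignments, table_sizes


if __name__ == "__main__":
    assignments, sizes = sample_crp(20, alpha=1.0, seed=0)
    print(assignments)
    print(sizes)
```

Larger values of `alpha` tend to open more tables, which is how a DP mixture lets the number of latent states grow with the data instead of being fixed in advance.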
Dirichlet-enhanced spam filtering based on biased samples
- Advances in Neural Information Processing Systems 19
, 2007
Cited by 17 (3 self)
We study a setting that is motivated by the problem of filtering spam messages for many users. Each user receives messages according to an individual, unknown distribution, reflected only in the unlabeled inbox. The spam filter for a user is required to perform well with respect to this distribution. Labeled messages from publicly available sources can be utilized, but they are governed by a distinct distribution, not adequately representing most inboxes. We devise a method that minimizes a loss function with respect to a user’s personal distribution based on the available biased sample. A nonparametric hierarchical Bayesian model furthermore generalizes across users by learning a common prior which is imposed on new email accounts. Empirically, we observe that bias-corrected learning outperforms naive reliance on the assumption of independent and identically distributed data; Dirichlet-enhanced generalization across users outperforms a single (“one size fits all”) filter as well as independent filters for all users.
Learning Infinite Hidden Relational Models
Cited by 2 (0 self)
Relational learning analyzes the probabilistic constraints between the attributes of entities and relationships. We extend the expressiveness of relational models by introducing for each entity (or object) an infinite-state latent variable as part of a Dirichlet process (DP) mixture model. The model can be viewed as a relational generalization of the hidden Markov random field. Information propagates through the interconnected network via the latent variables, reducing the need for extensive structure learning. For inference, we explore a Gibbs sampling method based on the Chinese restaurant process. The performance of our model is demonstrated in three applications: movie recommendation, gene function prediction, and a medical recommendation system.
Dirichlet Enhanced Latent Semantic Analysis
- In Conference in Artificial Intelligence and Statistics
, 2005
This paper describes nonparametric Bayesian treatments for analyzing records containing occurrences of items. The introduced model retains the strength of previous approaches that explore the latent factors of each record (e.g. topics of documents), and further uncovers the clustering structure of records, which reflects the statistical dependencies of the latent factors. The nonparametric model induced by a Dirichlet process (DP) flexibly adapts model complexity to reveal the clustering structure of the data. To avoid the problems of dealing with infinite dimensions, we further replace the DP prior by a simpler alternative, namely Dirichlet-multinomial allocation (DMA), which maintains the main modelling properties of the DP. Instead of relying on Markov chain Monte Carlo (MCMC) for inference, this paper applies efficient variational inference based on DMA. The proposed approach yields encouraging empirical results on both a toy problem and text data.
A Uniform Convergence Bound for the Area Under the ROC Curve
, 2005
"... The web site for the AISTATS 2005 workshop may be found at ..."

