Results 11 – 20 of 771
Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments
 In UAI ’01, 437–444
, 2001
Abstract

Cited by 171 (9 self)
Recommender systems leverage product and community information to target products to consumers. Researchers have developed collaborative recommenders, content-based recommenders, and a few hybrid systems. We propose a unified probabilistic framework for merging collaborative and content-based recommendations. We extend Hofmann’s (1999) aspect model to incorporate three-way co-occurrence data among users, items, and item content. The relative influence of collaboration data versus content data is not imposed as an exogenous parameter, but rather emerges naturally from the given data sources. However, global probabilistic models coupled with standard EM learning algorithms tend to drastically overfit in the sparse-data situations typical of recommendation applications. We show that secondary content information can often be used to overcome sparsity. Experiments on data from the ResearchIndex library of Computer Science publications show that appropriate mixture models incorporating secondary data produce significantly better quality recommenders than nearest neighbors (NN). Global probabilistic models also allow more general inferences than local methods like NN.
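The three-way aspect model described above ties users, items, and content words to a shared latent class; a toy generative sampler (with made-up parameter values, not the paper's fitted model) illustrates the structure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy three-way aspect model: a latent class z generates a user, an
# item, and a content word independently given z.
# All parameter values below are illustrative assumptions.
p_z = np.array([0.5, 0.5])              # p(z)
p_user_z = np.array([[0.7, 0.2, 0.1],   # p(u|z=0)
                     [0.1, 0.3, 0.6]])  # p(u|z=1)
p_item_z = np.array([[0.8, 0.2],        # p(i|z=0)
                     [0.3, 0.7]])       # p(i|z=1)
p_word_z = np.array([[0.6, 0.3, 0.1],   # p(w|z=0)
                     [0.1, 0.2, 0.7]])  # p(w|z=1)

def sample_triple():
    """Draw one (user, item, word) co-occurrence from the model."""
    z = rng.choice(2, p=p_z)
    u = rng.choice(3, p=p_user_z[z])
    i = rng.choice(2, p=p_item_z[z])
    w = rng.choice(3, p=p_word_z[z])
    return u, i, w

triples = [sample_triple() for _ in range(10000)]
```

Because users, items, and words are conditionally independent given z, content co-occurrences can inform user–item predictions even when rating data is sparse, which is the sparsity argument the abstract makes.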
On the Feasibility of Peer-to-Peer Web Indexing and Search
 In IPTPS ’03
, 2003
Abstract

Cited by 167 (13 self)
This paper discusses the feasibility of peer-to-peer full-text keyword search of the Web. Two classes of keyword search techniques are in use or have been proposed: flooding of queries over an overlay network (as in Gnutella), and intersection of index lists stored in a distributed hash table. We present a simple feasibility analysis based on the resource constraints and search workload. Our study suggests that the peer-to-peer network does not have enough capacity to make naive use of either search technique attractive for Web search. The paper presents a number of existing and novel optimizations for P2P search based on distributed hash tables, estimates their effects on performance, and concludes that in combination these optimizations would bring the problem to within an order of magnitude of feasibility. The paper suggests a number of compromises that might achieve the last order of magnitude.
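The style of feasibility analysis the abstract describes, comparing per-query communication cost against per-node capacity, can be sketched in a few lines; every constant below is a hypothetical placeholder, not a measurement from the paper:

```python
import math

# Back-of-envelope feasibility check: compare the per-query network cost
# of intersecting posting lists in a DHT against a per-node bandwidth
# budget. Every constant here is an assumed placeholder value.
POSTING_ENTRY_BYTES = 8            # assumed size of one compressed posting
AVG_POSTING_LIST_LEN = 1_000_000   # assumed documents per query term
TERMS_PER_QUERY = 2                # assumed terms intersected per query
QUERIES_PER_SEC = 1_000            # assumed aggregate query rate
NUM_NODES = 1_000                  # assumed peers in the overlay
NODE_BUDGET_BPS = 1_000_000        # assumed per-node budget (bytes/s)

def bytes_per_query():
    # Naive DHT intersection ships all but one posting list over the
    # network before intersecting them.
    return (TERMS_PER_QUERY - 1) * AVG_POSTING_LIST_LEN * POSTING_ENTRY_BYTES

def load_per_node_bps():
    return bytes_per_query() * QUERIES_PER_SEC / NUM_NODES

def feasibility_gap():
    """Orders of magnitude by which the naive scheme misses the budget
    (positive means infeasible)."""
    return math.log10(load_per_node_bps() / NODE_BUDGET_BPS)
```

With these placeholder numbers the naive scheme overshoots the budget by roughly one order of magnitude, which is the regime the paper's optimizations are meant to close.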
Expectation-propagation for the generative aspect model
 In UAI ’02, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence
Abstract

Cited by 157 (5 self)
The generative aspect model is an extension of the multinomial model for text that allows word probabilities to vary stochastically across documents. Previous results with aspect models have been promising, but hindered by the computational difficulty of carrying out inference and learning. This paper demonstrates that the simple variational methods of Blei et al. (2001) can lead to inaccurate inferences and biased learning for the generative aspect model. We develop an alternative approach that leads to higher accuracy at comparable cost. An extension of Expectation Propagation is used for inference and then embedded in an EM algorithm for learning. Experimental results are presented for both synthetic and real data sets.
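The core operation behind Expectation Propagation is matching the first two moments of an intractable distribution with a tractable (here Gaussian) approximation; a toy mixture illustrates the step, with all numbers made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Moment matching, the basic EP operation: replace an intractable
# distribution (here a two-component Gaussian mixture) by the single
# Gaussian with the same mean and variance.
w = np.array([0.7, 0.3])      # illustrative mixture weights
mu = np.array([-1.0, 2.0])    # component means
var = np.array([0.5, 0.5])    # component variances

# Mean and variance of the mixture, in closed form.
m = w @ mu
v = w @ (var + mu**2) - m**2

# Sanity check against Monte Carlo draws from the mixture.
comp = rng.choice(2, size=200_000, p=w)
samples = rng.normal(mu[comp], np.sqrt(var[comp]))
```

In full EP this matching is applied one factor at a time while the remaining factors are held fixed; the sketch shows only the projection step, not the iterative message passing.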
Exponential family harmoniums with an application to . . .
Abstract

Cited by 150 (22 self)
Directed graphical models with one layer of observed random variables and one or more layers of hidden random variables have been the dominant modelling paradigm in many research fields. Although this approach has met with considerable success, the causal semantics of these models can make it difficult to infer the posterior distribution over the hidden variables. In this paper we propose an alternative two-layer model based on exponential family distributions and the semantics of undirected models. Inference in these “exponential family harmoniums” is fast while learning is performed by minimizing contrastive divergence. A member of this family is then studied as an alternative probabilistic model for latent semantic indexing. In experiments it is shown that they perform well on document retrieval tasks and provide an elegant solution to searching with keywords.
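Contrastive-divergence learning, mentioned above, can be sketched for the simplest binary two-layer harmonium (a restricted Boltzmann machine); this is a generic CD-1 sketch on random toy data, not the paper's exponential-family construction:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One CD-1 update for a binary harmonium (RBM).

    v0: (n_visible,) observed binary vector;
    W: (n_visible, n_hidden) weights; b, c: visible/hidden biases.
    """
    # Up: posterior over hidden units given data (exact in a harmonium).
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Down-up: one Gibbs step for the "negative" statistics.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate gradient of the contrastive divergence objective.
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += lr * (v0 - v1)
    c += lr * (ph0 - ph1)
    return W, b, c

n_v, n_h = 6, 3
W = 0.01 * rng.standard_normal((n_v, n_h))
b, c = np.zeros(n_v), np.zeros(n_h)
data = (rng.random((20, n_v)) < 0.5).astype(float)  # toy binary data
for v in data:
    W, b, c = cd1_step(v, W, b, c)
```

The speed the abstract claims comes from the up-pass being exact: hidden units are conditionally independent given the visibles, so no iterative inference is needed per data point.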
Collective entity resolution in relational data
 ACM Transactions on Knowledge Discovery from Data (TKDD)
, 2006
Abstract

Cited by 146 (12 self)
Many databases contain uncertain and imprecise references to real-world entities. The absence of identifiers for the underlying entities often results in a database which contains multiple references to the same entity. This can lead not only to data redundancy, but also to inaccuracies in query processing and knowledge extraction. These problems can be alleviated through the use of entity resolution. Entity resolution involves discovering the underlying entities and mapping each database reference to these entities. Traditionally, entities are resolved using pairwise similarity over the attributes of references. However, there is often additional relational information in the data. Specifically, references to different entities may co-occur. In these cases, collective entity resolution, in which entities for co-occurring references are determined jointly rather than independently, can improve entity resolution accuracy. We propose a novel relational clustering algorithm that uses both attribute and relational information for determining the underlying domain entities, and we give an efficient implementation. We investigate the impact that different relational similarity measures have on entity resolution quality. We evaluate our collective entity resolution algorithm on multiple real-world databases. We show that it improves entity resolution performance over both attribute-based baselines and over algorithms that consider relational information but do not resolve entities collectively. In addition, we perform detailed experiments on synthetically generated data to identify data characteristics that favor collective relational resolution over purely attribute-based algorithms.
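The collective idea, where merging one pair of references changes the relational evidence for other pairs, can be sketched with a toy greedy clustering; the data, similarity measures, threshold, and mixing weight below are all illustrative assumptions, not the paper's algorithm:

```python
from difflib import SequenceMatcher

# Toy bibliographic references (all data here is made up): each has a
# name string and the set of reference ids it co-occurs with.
refs = {
    0: ("J. Smith", {1}),
    1: ("A. Jones", {0}),
    2: ("John Smith", {3}),
    3: ("A. Jones", {2}),
    4: ("B. Lee", set()),
}

# Start with every reference in its own cluster.
cluster_of = {r: r for r in refs}

def attr_sim(r1, r2):
    """Pairwise string similarity over the name attribute."""
    return SequenceMatcher(None, refs[r1][0], refs[r2][0]).ratio()

def rel_sim(r1, r2):
    """Jaccard overlap of the clusters the two references co-occur with."""
    n1 = {cluster_of[m] for m in refs[r1][1]}
    n2 = {cluster_of[m] for m in refs[r2][1]}
    if not n1 and not n2:
        return 0.0
    return len(n1 & n2) / len(n1 | n2)

def combined_sim(r1, r2, alpha=0.5):
    # alpha is an assumed mixing weight between the evidence sources.
    return (1 - alpha) * attr_sim(r1, r2) + alpha * rel_sim(r1, r2)

def resolve(threshold=0.45, rounds=3):
    """Greedy collective resolution: merging one pair can raise the
    relational similarity of other pairs, so iterate a few rounds."""
    for _ in range(rounds):
        for r1 in refs:
            for r2 in refs:
                if (r1 < r2 and cluster_of[r1] != cluster_of[r2]
                        and combined_sim(r1, r2) >= threshold):
                    old, new = cluster_of[r2], cluster_of[r1]
                    for r in refs:
                        if cluster_of[r] == old:
                            cluster_of[r] = new
    return cluster_of
```

On this toy data, "J. Smith" and "John Smith" are not similar enough on attributes alone; only after their "A. Jones" co-authors merge does the relational evidence push them over the threshold, which is the collective effect the abstract describes.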
Non-negative tensor factorization with applications to statistics and computer vision
 In Proceedings of the International Conference on Machine Learning (ICML)
, 2005
Abstract

Cited by 139 (5 self)
We derive algorithms for finding a non-negative n-dimensional tensor factorization (n-NTF) which includes the non-negative matrix factorization (NMF) as a particular case when n = 2. We motivate the use of n-NTF in three areas of data analysis: (i) connection to latent class models in statistics, (ii) sparse image coding in computer vision, and (iii) model selection problems. We derive a “direct” positive-preserving gradient descent algorithm and an alternating scheme based on repeated multiple rank-1 problems.
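The n = 2 special case mentioned in the abstract is ordinary NMF; a minimal sketch using the standard Lee–Seung multiplicative updates (not the paper's gradient or rank-1 schemes) looks like:

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(X, k, iters=500, eps=1e-9):
    """Lee-Seung multiplicative updates minimising ||X - WH||_F^2.

    The n = 2 case of non-negative tensor factorization: X (m x n),
    W (m x k), H (k x n), all entries kept nonnegative by construction.
    """
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Exactly low-rank nonnegative data, so a rank-2 factorization fits well.
X = rng.random((8, 2)) @ rng.random((2, 10))
W, H = nmf(X, k=2)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Because the updates multiply by nonnegative ratios, W and H stay nonnegative throughout, which is the "positive-preserving" property the abstract's gradient scheme also maintains for the tensor case.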
Parameter estimation for text analysis
, 2004
Abstract

Cited by 119 (0 self)
Abstract. This report presents parameter estimation methods for discrete probability distributions, which are of particular interest in text modeling. Starting with maximum likelihood, maximum a posteriori, and Bayesian estimation, central concepts like conjugate distributions and Bayesian networks are reviewed. As an application, the model of latent Dirichlet allocation (LDA) is explained in detail with a full derivation of an approximate inference algorithm based on Gibbs sampling, including a discussion of Dirichlet hyperparameter estimation. Finally, analysis methods for LDA models are discussed.
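The collapsed Gibbs sampler that the report derives for LDA can be sketched minimally; the toy corpus, sizes, and hyperparameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def lda_gibbs(docs, V, K, alpha=0.1, beta=0.01, iters=200):
    """Collapsed Gibbs sampler for LDA.

    docs: list of word-id lists; V: vocabulary size; K: topics.
    Each topic assignment z is resampled from
    p(z=k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta).
    """
    n_dk = np.zeros((len(docs), K))   # document-topic counts
    n_kw = np.zeros((K, V))           # topic-word counts
    n_k = np.zeros(K)                 # topic totals
    z = [[0] * len(d) for d in docs]
    for d, doc in enumerate(docs):    # random initialisation
        for i, w in enumerate(doc):
            k = rng.integers(K)
            z[d][i] = k
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]           # remove the current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k           # add the new assignment back
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return n_kw

# Toy corpus: words 0-2 co-occur in half the docs, words 3-5 in the rest.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4]] * 4
topic_word = lda_gibbs(docs, V=6, K=2)
```

The Dirichlet parameters alpha and beta enter only as pseudo-counts in the sampling ratio, which is why the collapsed form needs no explicit theta or phi variables.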
Orthogonal nonnegative matrix tri-factorizations for clustering
 In SIGKDD
, 2006
Abstract

Cited by 117 (22 self)
Currently, most research on nonnegative matrix factorization (NMF) focuses on 2-factor X = FG^T factorization. We provide a systematic analysis of 3-factor X = FSG^T NMF. While unconstrained 3-factor NMF is equivalent to unconstrained 2-factor NMF, constrained 3-factor NMF brings new features to constrained 2-factor NMF. We study the orthogonality constraint because it leads to a rigorous clustering interpretation. We provide new rules for updating F, S, and G and prove the convergence of these algorithms. Experiments on 5 datasets and a real-world case study are performed to show the capability of bi-orthogonal 3-factor NMF to simultaneously cluster the rows and columns of the input data matrix. We provide a new approach to evaluating the quality of clustering on words using class aggregate distribution and multi-peak distribution. We also provide an overview of various NMF extensions and examine their relationships.
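For the unconstrained 3-factor problem X ≈ FSG^T, plain multiplicative updates (a generic sketch, without the paper's orthogonality constraints or convergence analysis) can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

def tri_nmf(X, k1, k2, iters=500, eps=1e-9):
    """Multiplicative updates for unconstrained 3-factor NMF:
    X (m x n) ≈ F (m x k1) @ S (k1 x k2) @ G.T, with G (n x k2).
    Each factor is scaled by the ratio of the positive and negative
    parts of its gradient, keeping all entries nonnegative."""
    m, n = X.shape
    F = rng.random((m, k1))
    S = rng.random((k1, k2))
    G = rng.random((n, k2))
    for _ in range(iters):
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
    return F, S, G

# Exactly rank-2 nonnegative data, so a 2x2 middle factor fits well.
X = rng.random((10, 2)) @ rng.random((2, 8))
F, S, G = tri_nmf(X, k1=2, k2=2)
err = np.linalg.norm(X - F @ S @ G.T) / np.linalg.norm(X)
```

With orthogonality imposed on F and G, the paper's constrained updates differ from these; the point of the middle factor S is to absorb scale so that F and G can act as row- and column-cluster indicators.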
Unsupervised Activity Perception in Crowded and Complicated Scenes Using Hierarchical Bayesian Models
 Submission to IEEE Trans. on Pattern Analysis and Machine Intelligence
, 2007
Abstract

Cited by 116 (16 self)
We propose a novel unsupervised learning framework to model activities and interactions in crowded and complicated scenes. Under our framework, hierarchical Bayesian models are used to connect three elements in visual surveillance: low-level visual features, simple “atomic” activities, and interactions. Atomic activities are modeled as distributions over low-level visual features, and multi-agent interactions are modeled as distributions over atomic activities. These models are learnt in an unsupervised way. Given a long video sequence, moving pixels are clustered into different atomic activities and short video clips are clustered into different interactions. In this paper, we propose three hierarchical Bayesian models: the Latent Dirichlet Allocation (LDA) mixture model, the Hierarchical Dirichlet Process (HDP) mixture model, and the two-dimensional HDP (2D-HDP) model. They advance existing language models, such as LDA [1] and HDP [2]. Directly using existing LDA and HDP models under our framework, only moving pixels can be clustered into atomic activities. Our models can cluster both moving pixels and video clips into atomic activities and interactions. The LDA mixture model assumes that it is already known how many different types of atomic activities and interactions occur in the scene. The HDP mixture model automatically decides the number of categories of atomic activities. 2D-HDP automatically decides the numbers of …
Exploiting latent semantic information in statistical language modeling
 Proc. IEEE, vol. 88
, 2000
Abstract

Cited by 114 (7 self)
Statistical language models used in large-vocabulary speech recognition must properly encapsulate the various constraints, both local and global, present in the language. While local constraints are readily captured through n-gram modeling, global constraints, such as long-term semantic dependencies, have been more difficult to handle within a data-driven formalism. This paper focuses on the use of latent semantic analysis, a paradigm that automatically uncovers the salient semantic relationships between words and documents in a given corpus. In this approach, (discrete) words and documents are mapped onto a (continuous) semantic vector space, in which familiar clustering techniques can be applied. This leads to the specification of a powerful framework for automatic semantic classification, as well as the derivation of several language model families with various smoothing properties. Because of their large-span nature, these language models are well suited to complement conventional n-grams. An integrative formulation is proposed for harnessing this synergy, in which the latent semantic information is used to adjust the standard n-gram probability. Such hybrid language modeling compares favorably with the corresponding n-gram baseline: experiments conducted on the Wall Street Journal domain show a reduction in average word error rate of over 20%. This paper concludes with a discussion of intrinsic trade-offs, such as the influence of training data selection on the resulting performance. Keywords—Latent semantic analysis, multi-span integration, n-grams, speech recognition, statistical language modeling.
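The mapping of discrete words and documents into a continuous semantic space can be sketched with a plain truncated SVD on a toy term-document matrix (a generic LSA illustration, not the paper's integrated n-gram model):

```python
import numpy as np

# Toy term-document count matrix (rows = words, cols = documents).
# The corpus and sizes are illustrative assumptions: docs 0-1 use
# words 0-2, docs 2-3 use words 3-5.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
    [0, 0, 2, 2],
], dtype=float)

# Truncated SVD: keep only the top-k singular directions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
word_vecs = U[:, :k] * s[:k]    # words in the k-dim semantic space
doc_vecs = Vt[:k].T * s[:k]     # documents in the same space

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents about the same topic end up close in the latent space.
sim_same = cos(doc_vecs[0], doc_vecs[1])
sim_diff = cos(doc_vecs[0], doc_vecs[2])
```

In the hybrid modeling the abstract describes, a document-history vector in this space would be used to rescale n-gram probabilities toward semantically related words; the sketch shows only the underlying space.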