Results 1-10 of 46
A Unified View of Matrix Factorization Models
Cited by 58 (0 self)
Abstract. We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman coclustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has special structure that makes a Newton projection feasible, even when there are equality constraints on the factors, which allows for matrix coclustering; and (iii) alternating projections can be generalized to simultaneously factor a set of matrices that share dimensions. These observations immediately yield new optimization algorithms for the above factorization methods, and suggest novel generalizations of these methods such as incorporating row and column biases, and adding or relaxing clustering constraints.
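The alternating idea mentioned in this abstract can be illustrated in its simplest form. The sketch below is not the paper's algorithm: it assumes squared loss (one particular Bregman divergence), a rank-1 model X ≈ u vᵀ, and no constraints on the factors. The names `rank1_als` and `squared_error` are hypothetical.

```python
def rank1_als(X, iters=50):
    """Alternating least squares for X ~ u v^T under squared loss.

    Assumes X is a non-empty, non-zero list of equal-length rows.
    """
    n, m = len(X), len(X[0])
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Fix v and solve for u in closed form (one projection step).
        vv = sum(x * x for x in v)
        u = [sum(X[i][j] * v[j] for j in range(m)) / vv for i in range(n)]
        # Fix u and solve for v in closed form (the other projection step).
        uu = sum(x * x for x in u)
        v = [sum(X[i][j] * u[i] for i in range(n)) / uu for j in range(m)]
    return u, v


def squared_error(X, u, v):
    """Sum of squared residuals between X and the rank-1 model u v^T."""
    return sum(
        (X[i][j] - u[i] * v[j]) ** 2
        for i in range(len(X))
        for j in range(len(X[0]))
    )


# A matrix that is exactly rank 1: the outer product of [1, 2, 3] and [4, 5].
X = [[a * b for b in (4.0, 5.0)] for a in (1.0, 2.0, 3.0)]
u, v = rank1_als(X)
print(squared_error(X, u, v))
```

Each half-step is an ordinary least-squares solve with the other factor held fixed, which is why the per-factor update has a closed form here. Other divergences generally need an iterative inner solver instead.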
A Probabilistic Framework for Relational Clustering
 KDD'07
Cited by 41 (1 self)
Relational clustering has attracted more and more attention due to its phenomenal impact in various important applications which involve multi-type interrelated data objects, such as Web mining, search marketing, bioinformatics, citation analysis, and epidemiology. In this paper, we propose a probabilistic model for relational clustering, which also provides a principled framework to unify various important clustering tasks including traditional attribute-based clustering, semi-supervised clustering, coclustering and graph clustering. The proposed model seeks to identify cluster structures for each type of data object and interaction patterns between different types of objects. Under this model, we propose parametric hard and soft relational clustering algorithms under a large number of exponential family distributions. The algorithms are applicable to relational data of various structures and at the same time unify a number of state-of-the-art clustering algorithms: coclustering algorithms, k-partite graph clustering, and semi-supervised clustering based on hidden Markov random fields.
A General Model for Multiple View Unsupervised Learning
, 2008
Cited by 41 (2 self)
Multiple view data, which have multiple representations from different feature spaces or graph spaces, arise in various data mining applications such as information retrieval, bioinformatics and social network analysis. Since different representations could have very different statistical properties, learning a consensus pattern from multiple representations is a challenging problem. In this paper, we propose a general model for multiple view unsupervised learning. The proposed model introduces the concept of a mapping function to make the patterns from different pattern spaces comparable, so that an optimal pattern can be learned from the multiple patterns of multiple representations. Under this model, we formulate two specific models for
Predictive discrete latent factor models for large scale dyadic data
 In KDD ’07
, 2007
Cited by 36 (2 self)
We propose a novel statistical method to predict large scale dyadic response variables in the presence of covariate information. Our approach simultaneously incorporates the effect of covariates and estimates local structure that is induced by interactions among the dyads through a discrete latent factor model. The discovered latent factors provide a predictive model that is both accurate and interpretable. We illustrate our method by working in a framework of generalized linear models, which include commonly used regression techniques like linear regression, logistic regression and Poisson regression as special cases. We also provide scalable generalized EM-based algorithms for model fitting using both "hard" and "soft" cluster assignments. We demonstrate the generality and efficacy of our approach through large scale simulation studies and analysis of datasets obtained from certain real-world movie recommendation and internet advertising applications.
Multiway clustering on relation graphs
 In Proc. of the 7th SIAM Intl. Conf. on Data Mining
, 2006
Cited by 36 (3 self)
A number of real-world domains such as social networks and e-commerce involve heterogeneous data that describes relations between multiple classes of entities. Understanding the natural structure of this type of heterogeneous relational data is essential both for exploratory analysis and for performing various predictive modeling tasks. In this paper, we propose a principled multiway clustering framework for relational data, wherein different types of entities are simultaneously clustered based not only on their intrinsic attribute values, but also on the multiple relations between the entities. To achieve this, we introduce a relation graph model that describes all the known relations between the different entity classes, in which each relation between a given set of entity classes is represented in the form of a multimodal tensor over an appropriate domain. Our multiway clustering formulation is driven by the objective of capturing the maximal “information” in the original relation graph, i.e., accurately approximating the set of tensors corresponding to the various relations. This formulation is applicable to all Bregman divergences (a broad family of loss functions that includes squared Euclidean distance and KL-divergence), and also permits analysis of mixed data types using convex combinations of appropriate Bregman loss functions. Furthermore, we present a large family of structurally different multiway clustering schemes that preserve various linear summary statistics of the original data. We accomplish the above generalizations by extending a recently proposed key theoretical result, namely the minimum Bregman information principle [1], to the relation graph setting. We also describe an efficient multiway clustering algorithm based on alternate minimization that generalizes a number of other recently proposed clustering methods. Empirical results on datasets obtained from real-world domains (e.g., movie recommendations, newsgroup articles) demonstrate the generality and efficacy of our framework.
Detecting Communities in Social Networks using MaxMin Modularity
Cited by 33 (4 self)
Many datasets can be described in the form of graphs or networks where nodes in the graph represent entities and edges represent relationships between pairs of entities. A common property of these networks is their community structure, considered as clusters of densely connected groups of vertices, with only sparser connections between groups. The identification of such communities relies on some notion of clustering or density measure, which defines the communities that can be found. However, previous community detection methods usually apply the same structural measure to all kinds of networks, despite their distinct features. In this paper, we present a new community mining measure, MaxMin Modularity, which considers both connected pairs and criteria defined by domain experts in finding communities, and then specify a hierarchical clustering algorithm to detect communities in networks. When applied to real-world networks for which the community structures are already known, our method shows improvement over previous algorithms. In addition, when applied to randomly generated networks for which we only have approximate information about communities, it gives promising results, which show the algorithm’s robustness against noise.
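For background on the kind of structural measure this abstract refers to, the sketch below computes the standard (Newman) modularity of a partition of an undirected, unweighted graph. This is the classical measure, not the MaxMin variant proposed in the paper, whose definition is not given in this listing. The function name `modularity` and the toy graph are assumptions for illustration.

```python
def modularity(edges, community):
    """Standard modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (node, node) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    Q = sum over communities of (internal_edges / m) - (degree_sum / (2m))^2.
    """
    m = len(edges)
    internal = {}    # number of edges with both endpoints in the community
    degree_sum = {}  # total degree of the nodes in the community
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}
print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0
```

Splitting the graph at the bridge scores higher than lumping all nodes together, which is the behaviour a density-based community measure is meant to reward.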
All-at-once Optimization for Coupled Matrix and Tensor Factorizations
, 1105
Cited by 28 (3 self)
Joint analysis of data from multiple sources has the potential to improve our understanding of the underlying structures in complex data sets. For instance, in restaurant recommendation systems, recommendations can be based on rating histories of customers. In addition to rating histories, customers’ social networks (e.g., Facebook friendships) and restaurant category information (e.g., Thai or Italian) can also be used to make better recommendations. The task of fusing data, however, is challenging since data sets can be incomplete and heterogeneous, i.e., data consist of both matrices, e.g., the person by person social network matrix or the restaurant by category matrix, and higher-order tensors, e.g., the “ratings” tensor of the form restaurant by meal by person. In this paper, we are particularly interested in fusing data sets with the goal of capturing their underlying latent structures. We formulate this problem as a coupled matrix and tensor factorization (CMTF) problem where heterogeneous data sets are modeled by fitting outer-product models to higher-order tensors and matrices in a coupled manner. Unlike traditional approaches solving this problem using alternating algorithms, we propose an all-at-once optimization approach called CMTF-OPT (CMTF-OPTimization), which is a gradient-based optimization approach for joint analysis of matrices and higher-order tensors. We also extend the algorithm to handle coupled incomplete data sets. Using numerical experiments, we demonstrate that the proposed all-at-once approach is more accurate than the alternating least squares approach.
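To make the idea of coupling concrete, the sketch below evaluates one common form of a coupled objective: a third-order tensor X modeled by the outer-product factors A, B, C, and a matrix Y modeled as A Vᵀ, with the factor A shared between the two. The exact objective, weighting, and handling of missing entries in the paper are not stated in this listing, so the formula, the equal weights, and the names `cmtf_objective` and `build_coupled_data` are assumptions for illustration only.

```python
def cmtf_objective(X, Y, A, B, C, V):
    """Assumed coupled objective with the factor A shared by both terms:

    0.5 * ||X - [[A, B, C]]||^2 + 0.5 * ||Y - A V^T||^2

    X is an I x J x K nested list, Y is I x M, and the factor matrices
    A (I x R), B (J x R), C (K x R), V (M x R) are lists of rows.
    """
    R = len(A[0])
    tensor_term = 0.0
    for i in range(len(A)):
        for j in range(len(B)):
            for k in range(len(C)):
                model = sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
                tensor_term += (X[i][j][k] - model) ** 2
    matrix_term = 0.0
    for i in range(len(A)):
        for m in range(len(V)):
            model = sum(A[i][r] * V[m][r] for r in range(R))
            matrix_term += (Y[i][m] - model) ** 2
    return 0.5 * tensor_term + 0.5 * matrix_term


def build_coupled_data(A, B, C, V):
    """Generate noise-free X and Y that the factors reproduce exactly."""
    R = len(A[0])
    X = [[[sum(a[r] * b[r] * c[r] for r in range(R)) for c in C] for b in B] for a in A]
    Y = [[sum(a[r] * v[r] for r in range(R)) for v in V] for a in A]
    return X, Y


A = [[1.0, 0.0], [0.5, 2.0]]
B = [[1.0, 1.0], [0.0, 3.0], [2.0, 1.0]]
C = [[1.0, 2.0], [1.0, 0.0]]
V = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
X, Y = build_coupled_data(A, B, C, V)
print(cmtf_objective(X, Y, A, B, C, V))  # exact factors give zero loss
```

An all-at-once method, as the abstract describes it, would minimize a function of this kind over all factors simultaneously with a gradient-based optimizer, whereas an alternating method would update one factor at a time with the others fixed.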
A Classification for Community Discovery Methods in Complex Networks
, 2011
Cited by 16 (6 self)
Many real-world networks are intimately organized according to a community structure. Much research effort has been devoted to developing methods and algorithms that can efficiently highlight this hidden structure of a network, yielding a vast literature on what is today called community detection. Since network representation can be very complex and can contain different variants in the traditional graph model, each algorithm in the literature focuses on some of these properties and establishes, explicitly or implicitly, its own definition of community. According to this definition, each proposed algorithm then extracts the communities, which typically reflect only part of the features of real communities. The aim of this survey is to provide a ‘user manual’ for the community discovery problem. Given a meta definition of what a community in a social network is, our aim is to organize the main categories of community discovery methods based on the definition of community they adopt. Given a desired definition of community and the features of a problem (size of network, direction of edges, multidimensionality, and so on), this review paper is designed to provide a set of approaches that researchers could focus on. The proposed classification of community discovery methods is also useful for putting into perspective the many open
Unifying Dependent Clustering and Disparate Clustering for Nonhomogeneous Data
Cited by 16 (8 self)
Modern data mining settings involve a combination of attribute-valued descriptors over entities as well as specified relationships between these entities. We present an approach to cluster such nonhomogeneous datasets by using the relationships to impose either dependent clustering or disparate clustering constraints. Unlike prior work that views constraints as boolean criteria, we present a formulation that allows constraints to be satisfied or violated in a smooth manner. This enables us to achieve dependent clustering and disparate clustering using the same optimization framework by merely maximizing versus minimizing the objective function. We present results on both synthetic data and several real-world datasets.
Knowledge transformation from word space to document space
 In Proc. of SIGIR’ 08
, 2008
Cited by 13 (5 self)
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real-world applications, however, we usually have knowledge on the word side and wish to transform this knowledge to the document (concept) side. In this paper, we provide a mechanism for this knowledge transformation. To the best of our knowledge, this is the first model for this type of knowledge transformation. This model uses a nonnegative matrix factorization model X = F S G^T, where X is the word-document semantic matrix, F is the posterior probability of a word belonging to a word cluster and represents knowledge in the word space, G is the posterior probability of a document belonging to a document cluster and represents knowledge in the document space, and S is a scaled matrix factor which provides a condensed view of X. We show how knowledge on words can improve document clustering, i.e., knowledge in the word space is transformed into the document space. We perform extensive experiments to validate our approach.
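The three-factor form X = F S G^T can be fitted, in its plain unconstrained version, with standard multiplicative updates for squared error. The sketch below is not the paper's method: it omits the prior word knowledge and the probabilistic interpretation of F and G, and only shows the bare nonnegative tri-factorization. The names `tri_factorize`, `frob_error`, and the toy matrix are hypothetical.

```python
import random


def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(P):
    return [list(col) for col in zip(*P)]


def frob_error(X, F, S, G):
    """Squared Frobenius error between X and F S G^T."""
    R = matmul(matmul(F, S), transpose(G))
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))


def scale(M, numer, denom, eps=1e-12):
    """Elementwise multiplicative update M * numer / denom (keeps M nonnegative)."""
    return [
        [m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
        for mr, nr, dr in zip(M, numer, denom)
    ]


def tri_factorize(X, k_words, k_docs, iters=200, seed=0):
    """Fit nonnegative F (words x k_words), S, G (docs x k_docs) with X ~ F S G^T."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    F = [[rng.random() + 0.1 for _ in range(k_words)] for _ in range(n)]
    S = [[rng.random() + 0.1 for _ in range(k_docs)] for _ in range(k_words)]
    G = [[rng.random() + 0.1 for _ in range(k_docs)] for _ in range(m)]
    Xt = transpose(X)
    for _ in range(iters):
        SGt = matmul(S, transpose(G))  # k_words x docs
        F = scale(F, matmul(X, transpose(SGt)), matmul(matmul(F, SGt), transpose(SGt)))
        FS = matmul(F, S)              # words x k_docs
        G = scale(G, matmul(Xt, FS), matmul(G, matmul(transpose(FS), FS)))
        Ft = transpose(F)
        S = scale(
            S,
            matmul(matmul(Ft, X), G),
            matmul(matmul(matmul(Ft, F), S), matmul(transpose(G), G)),
        )
    return F, S, G


# Toy word-document matrix with two word groups and two document groups.
X = [[1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
F, S, G = tri_factorize(X, k_words=2, k_docs=2)
print(frob_error(X, F, S, G))
```

In this reading, the small matrix S summarizes how strongly each word cluster is associated with each document cluster, which matches the abstract's description of S as a condensed view of X.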