Results 1–5 of 5
Infinite multiple membership relational modeling for complex networks
In 2011 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2011
Cited by 17 (2 self)
Learning latent structure in complex networks has become an important problem, fueled by the many types of networked data originating from practically all fields of science. In this paper, we propose a new nonparametric Bayesian multiple-membership latent feature model for networks. Contrary to existing multiple-membership models, which scale quadratically in the number of vertices, the proposed model scales linearly in the number of links, admitting multiple-membership analysis in large-scale networks. We demonstrate a connection between the single-membership relational model and multiple-membership models, and we show on "real"-size benchmark network data that accounting for multiple memberships improves the learning of latent structure as measured by link prediction, while explicitly accounting for multiple memberships results in a more compact representation of the latent structure of networks.
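The linear-in-links scaling claimed above can be illustrated with a common likelihood trick (an assumption for illustration, not necessarily this paper's exact construction): under a Poisson link model with rate z_i' Λ z_j, the non-link part of the log-likelihood collapses into a single aggregate term, so evaluation touches only the observed links rather than all vertex pairs. The names Z, Lam, and the O(N^2) baseline below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 40, 3
Z = rng.integers(0, 2, size=(N, K)).astype(float)
Z[np.arange(N), rng.integers(0, K, size=N)] = 1.0  # every node gets >= 1 feature
Lam = rng.gamma(1.0, 0.5, size=(K, K))             # positive feature-pair rates

R = Z @ Lam @ Z.T                # full N x N rate matrix (for checking only)
A = rng.poisson(R)               # synthetic multigraph adjacency counts
links = list(zip(*np.nonzero(A)))

def loglik_quadratic(A, Z, Lam):
    """Poisson log-likelihood over all N^2 ordered pairs: O(N^2 K)."""
    R = Z @ Lam @ Z.T
    return float(np.sum(A * np.log(R) - R))

def loglik_linear(links, A, Z, Lam):
    """Same quantity, touching only observed links: O(L K^2 + K^2)."""
    m = Z.sum(axis=0)                     # per-feature node counts
    total_rate = float(m @ Lam @ m)       # sum of all N^2 rates in O(K^2)
    s = sum(A[i, j] * np.log(Z[i] @ Lam @ Z[j]) for i, j in links)
    return float(s) - total_rate
```

Both functions omit the constant log A_ij! term, so they agree exactly; only the second remains cheap when the network is sparse.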
Stochastic Blockmodel with Cluster Overlap, Relevance Selection, and Similarity-Based Smoothing
Cited by 1 (1 self)
Abstract—Stochastic blockmodels provide a rich, probabilistic framework for modeling relational data by expressing the objects being modeled in terms of a latent vector representation. This representation can be a latent indicator vector denoting cluster membership (hard clustering), a vector of cluster membership probabilities (soft clustering), or, more generally, a real-valued vector (latent space representation). Recently, a new class of overlapping stochastic blockmodels has been proposed in which objects are allowed to have hard memberships in multiple clusters (in the form of a latent binary vector). This aspect captures the properties of many real-world networks in domains such as biology and social networks, where objects can simultaneously hold memberships in multiple clusters owing to the multiple roles they may play. In this paper, we improve upon this model in three key ways:
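The latent-binary-vector idea above can be sketched concretely (a minimal illustration with assumed names W, b, and a logistic link, not this paper's notation): each node carries a binary membership vector, and the link probability combines every pair of clusters the two endpoints share.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def link_prob(z_i, z_j, W, b):
    # z_i, z_j: binary cluster-membership vectors (overlap allowed)
    # W: cluster-pair interaction weights; b: baseline log-odds of a link
    return sigmoid(z_i @ W @ z_j + b)

W = np.diag([2.0, 2.0, 2.0])   # within-cluster affinity only
b = -3.0                       # sparse baseline link rate

# Nodes sharing cluster 1 vs. nodes with disjoint memberships:
overlap  = link_prob(np.array([1., 1., 0.]), np.array([0., 1., 0.]), W, b)
disjoint = link_prob(np.array([1., 0., 0.]), np.array([0., 0., 1.]), W, b)
```

With this choice of W, sharing even one cluster raises the link probability above the baseline, which is the qualitative behavior the overlapping-membership models are after.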
The Hierarchical Local Partition Process
We consider the problem in which K different types of data are collected to characterize an associated inference task, with this performed for M distinct tasks. It is assumed that the parameters associated with the model for data type (modality) k may be represented in the form of a mixture model, with the M tasks representing M draws from the mixture. We wish to simultaneously infer mixture models across all K modality types, using data from all M tasks. Considering tasks m1 and m2, we wish to impose the belief that if the data associated with modality k are drawn from the same mixture component (implying a similarity between tasks m1 and m2), then it is more probable that the associated data from modality j ≠ k will also be drawn from the same component. On the other hand, it is anticipated that there may be “random effects” that manifest idiosyncratic behavior for a subset of the modalities, even when similarity exists between the other modalities. The model employs a hierarchical Bayesian formalism based on the local partition process. Inference is examined using both Markov chain Monte Carlo (MCMC) sampling and variational Bayesian (VB) analysis. The method is illustrated first with simulated data and then with data from two real applications. Concerning the latter, we consider the analysis of gene-expression data and the sorting of annotated images.
Identification of MCMC Samples for Clustering
Abstract. For clustering problems, many studies report only MAP assignments instead of using the whole set of samples from an MCMC sampler, because it is not straightforward to recognize clusters from the full sample set. We therefore propose an identification algorithm that constructs groups of related clusters. The identification exploits spectral clustering to group the clusters. Although a naive spectral clustering algorithm is intractable in terms of memory and computation time, we develop a memory- and time-efficient spectral clustering method for the samples of an MCMC sampler. In experiments, we show that our algorithm is tractable on real data where the naive algorithm is not. For search-query log data, we also show representative vocabularies of clusters, which cannot be obtained from MAP assignments alone.
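The core idea of aggregating MCMC samples rather than keeping only the MAP assignment can be sketched as follows (a generic spectral bisection of the co-assignment matrix, assumed here for illustration; it is not the paper's memory- and time-efficient variant):

```python
import numpy as np

# Three hypothetical MCMC sweeps of cluster assignments for 6 items.
# Label values need not align across sweeps; only co-assignment matters.
samples = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [2, 2, 2, 3, 3, 3],
])

# Co-assignment similarity: fraction of sweeps placing items i, j together.
S = np.mean(samples[:, :, None] == samples[:, None, :], axis=0)

# Spectral bisection via the Fiedler vector of the graph Laplacian.
d = S.sum(axis=1)
L = np.diag(d) - S
w, v = np.linalg.eigh(L)        # eigenvalues ascending
fiedler = v[:, 1]               # second-smallest eigenvector
groups = (fiedler > 0).astype(int)
```

Items 0–2 and items 3–5 end up in opposite groups, recovering the split that the individual sweeps only agree on noisily.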
Efficient Inference in the Infinite Multiple Membership Relational Model
Summary: The Indian Buffet Process (IBP) is a stochastic process on binary features that has been applied to modeling communities in complex networks [4, 5, 6]. Inference in the IBP is challenging, as the number of possible configurations grows as 2^(KN), where K is the number of latent features and N the number of nodes in the network. We consider the performance of three MCMC sampling approaches for the IBP: standard Gibbs sampling, joint Gibbs sampling, and non-conjugate split/merge sampling. Our results indicate that including joint sampling significantly improves parameter inference over standard Gibbs sampling, while split/merge sampling appears useful for improving inference as measured by the burn-in time of the sampler. Introduction: Recently, the Indian Buffet Process (IBP) [1] has been applied to modeling overlapping communities in networks [4, 5, 6]. We focus on the model proposed in [6], which is given by the following generative process: Z ∼ IBP(α), σ ∼ Beta(β+c, β−c),
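The Z ∼ IBP(α) step in the generative process above can be sketched by simulating the buffet metaphor directly (a standard prior sampler, written here as an illustration; function and variable names are my own): customer i takes each existing dish k with probability m_k / i and then tries Poisson(α / i) new dishes.

```python
import numpy as np

def sample_ibp(alpha, N, rng):
    """Draw a binary feature matrix Z (N x K, K random) from the IBP prior."""
    dish_counts = []          # m_k: how many customers took dish k so far
    rows = []                 # per customer: set of dish indices taken
    for i in range(1, N + 1):
        # take each existing dish with probability m_k / i
        chosen = {k for k, m in enumerate(dish_counts) if rng.random() < m / i}
        for k in chosen:
            dish_counts[k] += 1
        # try a Poisson(alpha / i) number of brand-new dishes
        for _ in range(rng.poisson(alpha / i)):
            dish_counts.append(1)
            chosen.add(len(dish_counts) - 1)
        rows.append(chosen)
    Z = np.zeros((N, len(dish_counts)), dtype=int)
    for i, chosen in enumerate(rows):
        Z[i, list(chosen)] = 1
    return Z

Z = sample_ibp(2.0, 10, np.random.default_rng(0))
```

Because dishes are appended in discovery order, the first customer's features occupy the leading columns, and every column is used by at least one customer; the expected number of columns is α times the N-th harmonic number.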