## Topic modeling with network regularization (2008)

### Download Links

- [sifaka.cs.uiuc.edu]
- [www-personal.umich.edu]
- [www2008.org]
- [wwwconference.org]
- [www-connex.lip6.fr]

### Other Repositories/Bibliography

- DBLP

Venue: Proc. of the 17th WWW Conference

Citations: 61 (6 self)

### BibTeX

@INPROCEEDINGS{Mei08topicmodeling,

author = {Qiaozhu Mei and Deng Cai and Duo Zhang and Chengxiang Zhai},

title = {Topic modeling with network regularization},

booktitle = {Proc. of the 17th WWW Conference},

year = {2008}

}

### Abstract

In this paper, we formally define the problem of topic modeling with network structure (TMN). We propose a novel solution to this problem, which regularizes a statistical topic model with a harmonic regularizer based on a graph structure in the data. The proposed method combines topic modeling and social network analysis, and leverages the power of both statistical topic models and discrete regularization. The output of this model can summarize topics in text well, map a topic onto the network, and discover topical communities. With appropriate instantiations of the topic model and the graph-based regularizer, our model can be applied to a wide range of text mining problems such as author-topic analysis, community discovery, and spatial text mining. Empirical experiments on two data sets with different genres show that our approach is effective and outperforms both text-oriented methods and network-oriented methods alone. The proposed model is general; it can be applied to any text collection with a mixture of topics and an associated network structure.
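As a rough illustration of the harmonic regularizer mentioned in the abstract, the sketch below evaluates the form quoted in the citation contexts of this record: R(C, G) = 1/2 Σ over edges ⟨u,v⟩ of w(u, v) Σ_j (f(θj, u) − f(θj, v))², and checks that it agrees with the equivalent Laplacian form 1/2 Σ_j f_j^T Δ f_j with Δ = D − W. The function names, the edge-list representation, and the toy numbers are assumptions made for illustration only; this is not the authors' implementation, and it covers only the regularizer, not the full NetPLSA objective or its parameter estimation.

```python
# Minimal sketch (pure Python) of the graph harmonic regularizer.
# Hypothetical names and toy data; not taken from the paper's code.

def harmonic_regularizer(edges, f):
    """Edge form: 1/2 * sum over edges of w(u,v) * sum_j (f[u][j] - f[v][j])^2.

    edges: list of (u, v, weight) tuples, each undirected edge listed once.
    f:     f[u][j] is the weight of topic j on vertex u, e.g. p(theta_j | u).
    """
    total = 0.0
    for u, v, w in edges:
        total += w * sum((fu - fv) ** 2 for fu, fv in zip(f[u], f[v]))
    return 0.5 * total


def laplacian_form(edges, f):
    """Laplacian form: 1/2 * sum_j f_j^T (D - W) f_j."""
    n, k = len(f), len(f[0])
    # Symmetric edge-weight matrix W.
    W = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        W[u][v] += w
        W[v][u] += w
    # Graph Laplacian: degree on the diagonal minus the edge weights.
    lap = [[(sum(W[u]) if u == v else 0.0) - W[u][v] for v in range(n)]
           for u in range(n)]
    total = 0.0
    for j in range(k):
        fj = [f[u][j] for u in range(n)]  # weights of topic j on every vertex
        total += sum(fj[u] * lap[u][v] * fj[v]
                     for u in range(n) for v in range(n))
    return 0.5 * total


# Toy graph: three vertices in a chain, two topics (made-up numbers).
edges = [(0, 1, 1.0), (1, 2, 2.0)]
f = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

r_edges = harmonic_regularizer(edges, f)
r_lap = laplacian_form(edges, f)
print(r_edges, r_lap)
```

In this toy graph the heavily weighted edge between vertices 1 and 2 joins vertices with very different topic weights, so it contributes most of the penalty; a regularized model would be pushed toward smoother topic weights across that edge.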

### Citations

8091 | Maximum Likelihood from Incomplete Data via the EM Algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...objective function degenerates to the log-likelihood function of PLSA with no regularization. The standard way of parameter estimation for PLSA is to apply the Expectation Maximization (EM) algorithm [8] which iteratively computes a local maximum of L(C). Specifically, in the E-step, it computes the expectation of the complete likelihood $Q(\Psi; \Psi_n)$, where $\Psi$ denotes all the parameters, and $\Psi_n$ denotes th...

2590 | Normalized cuts and image segmentation
- Shi, Malik
- 1997
Citation Context ...u may not get semantically coherent ones”. To verify this, we compare our results with a pure graph-based clustering method. Specifically, we compare with the Normalized Cut (NC) clustering algorithm [24], which is one of the standard spectral clustering algorithms. By feeding the algorithm (footnote 4) with the coauthor matrix, we also extract four clusters (communities). Footnote 4: http://www.cis.upenn.edu/~jshi/softwa...

2366 | Latent Dirichlet allocation
- Blei, Ng, et al.
- 2003
Citation Context ... we can leverage the associated network structure to discover interesting topic and/or network patterns. Statistical topic models have recently been successfully applied to multiple text mining tasks [10, 4, 28, 26, 20, 15, 27] to discover a number of topics from text. Some recent work has incorporated into topic modeling context information [20], such as time [27], geographic location [19], and authorship [26, 23, 19], to ...

942 | The EM Algorithm and Extensions
- McLachlan, Krishnan
- 1996
Citation Context ...e affected by where she lived. Since the topic “weather” is very broad, we guide the mixture model with some prior knowledge, so that it could extract several topics which we expect to see. Following [18], this is done by changing the MLE estimation of p(w|θ) in M step (Equation 11) into a maximum a posteriori (MAP) estimation. We extract 7 topics from the Weather dataset. We use “wind” and “hurricane” as...

785 | Probabilistic latent semantic indexing
- Hofmann
- 1999
Citation Context ... we can leverage the associated network structure to discover interesting topic and/or network patterns. Statistical topic models have recently been successfully applied to multiple text mining tasks [10, 4, 28, 26, 20, 15, 27] to discover a number of topics from text. Some recent work has incorporated into topic modeling context information [20], such as time [27], geographic location [19], and authorship [26, 23, 19], to ...

764 | A view of the EM algorithm that justifies incremental sparse and other variants
- Neal, Hinton
- 1998
Citation Context ...riables. This significantly increases the cost of parameter estimation of NetPLSA. In this section, we propose a simpler algorithm for parameter estimation based on the generalized EM algorithm (GEM) [21]. According to GEM, we do not have to find the local maximum of $Q(\Psi_{n+1}; \Psi_n)$ at every M step; instead, we only need to find a better value of $\Psi$ in the M-step, i.e., to ensure $Q(\Psi_{n+1}; \Psi_n) \geq Q(\Psi_n; \Psi_n)$. T...

734 | Laplacian Eigenmaps for Dimensionality Reduction and
- Belkin
Citation Context ...pic modeling with graph-based harmonic regularization is a novel approach. The graph-based regularizer is related to existing work in machine learning, especially graph-based semi-supervised learning [33, 29, 3, 32] and spectral clustering [6, 24]. The optimization framework we propose is closely related to [34], which is probably the first work combining a generative model with graph-based regularizer. Our work...

623 | The small-world phenomenon: an algorithmic perspective
- Kleinberg
Citation Context ...twork structures is also a deficiency in some other text mining techniques such as document clustering. On the other hand, social network analysis (SNA) focuses on the topology structure of a network [11, 13, 1, 12], addressing questions such as “what the diameter of a network is [13]”, “how a network evolves [13, 1]”, “how information diffuses on the network [9, 14]”, and “what are the communities on a network ...

490 | Semi-supervised learning using Gaussian fields and harmonic functions
- Zhu, Ghahramani, et al.
- 2003
Citation Context ...twork regularized statistical topic model as NetSTM. To illustrate this framework, in this paper we use PLSA as the statistical topic model and a regularizer similar to the graph harmonic function in [33], i.e., $R(C, G) = \frac{1}{2} \sum_{\langle u,v \rangle \in E} w(u, v) \sum_{j=1}^{k} \left( f(\theta_j, u) - f(\theta_j, v) \right)^2$. Correspondingly, we call this model NetPLSA. Note that the regularizer in Equation 4 is an extension of the graph harmonic functi...

434 | Learning with local and global consistency
- Zhou, Bousquet, et al.
Citation Context ...pic modeling with graph-based harmonic regularization is a novel approach. The graph-based regularizer is related to existing work in machine learning, especially graph-based semi-supervised learning [33, 29, 3, 32] and spectral clustering [6, 24]. The optimization framework we propose is closely related to [34], which is probably the first work combining a generative model with graph-based regularizer. Our work...

299 | Graphs over time: densification laws, shrinking diameters and possible explanations
- Leskovec, Kleinberg, et al.
- 2005
Citation Context ...twork structures is also a deficiency in some other text mining techniques such as document clustering. On the other hand, social network analysis (SNA) focuses on the topology structure of a network [11, 13, 1, 12], addressing questions such as “what the diameter of a network is [13]”, “how a network evolves [13, 1]”, “how information diffuses on the network [9, 14]”, and “what are the communities on a network ...

297 | Trawling the Web for Emerging Cyber-Communities
- Kumar, Raghavan, et al.
- 1999
Citation Context ...twork structures is also a deficiency in some other text mining techniques such as document clustering. On the other hand, social network analysis (SNA) focuses on the topology structure of a network [11, 13, 1, 12], addressing questions such as “what the diameter of a network is [13]”, “how a network evolves [13, 1]”, “how information diffuses on the network [9, 14]”, and “what are the communities on a network ...

265 | Group formation in large social networks: membership, growth, and evolution
- Backstrom, Huttenlocher, et al.

253 | Information diffusion through blogspace
- Gruhl, Guha, et al.
Citation Context ...n the topology structure of a network [11, 13, 1, 12], addressing questions such as “what the diameter of a network is [13]”, “how a network evolves [13, 1]”, “how information diffuses on the network [9, 14]”, and “what are the communities on a network [11, 1].” However, these techniques usually do not leverage the rich text information. In many scenarios, text information is very helpful for SNA tasks. ...

233 | The author-topic model for authors and documents
- Rosen-Zvi, Griffiths, et al.
- 2004
Citation Context ..., 26, 20, 15, 27] to discover a number of topics from text. Some recent work has incorporated into topic modeling context information [20], such as time [27], geographic location [19], and authorship [26, 23, 19], to facilitate contextual text mining. Topics discovered in this way can be used to infer research communities [26, 23] or information diffusion over geographic locations [19]. However, they do not c...

186 | The missing link: A probabilistic model of document content and hypertext connectivity
- Cohn
Citation Context ...erstand the diffusion of social networks [9, 14]. However, the rich textual information associated with the social network is ignored in most cases. Although there has been some existing explorations [7, 17, 16, 2], there has not been a unified way to combine textual contents with social networks. Indeed, [31] proposes a probabilistic model to extract e-communities based on the content of communication document...

162 | Ucinet for Windows: Software for Social Network Analysis
- Borgatti, Everett, et al.
- 2002
Citation Context ...ithm which tries to put two vertices which are connected by an edge closer, and Gower Metric Scaling will locate two vertices closer if they are intensely connected directly or through other vertices [5]. Therefore, in both layout views (Figure 2 (a) and (b)), authors closer to each other are more likely to be in the same community. Clearly, from Figure 2 (b), we can guess that there are 3 to 4 major...

152 | Topic and role discovery in social networks with experiments on Enron and academic email
- McCallum, Wang, et al.
- 2007
Citation Context ...erstand the diffusion of social networks [9, 14]. However, the rich textual information associated with the social network is ignored in most cases. Although there has been some existing explorations [7, 17, 16, 2], there has not been a unified way to combine textual contents with social networks. Indeed, [31] proposes a probabilistic model to extract e-communities based on the content of communication document...

150 | The intelligent surfer: Probabilistic combination of link and content information
- Richardson, Domingos
Citation Context ...e the mining process of topics in text and social networks (e.g., combining topic modeling with network analysis). Although methods have been proposed to combine page contents and links in web search [22], none of them is tuned for text mining. In this paper, we formally define the major tasks of Topic Modeling with Network Structure (TMN), and propose a unified framework to combine statistical topic ...

138 | Topics over time: a non-Markov continuous-time model of topical trends
- Wang, McCallum
- 2006

125 | Spectral k-way ratio-cut partitioning and clustering
- Chan, Schlag, et al.
- 1993
Citation Context ...f the collection. When λ = 1, this objective function boils down to $\frac{1}{2} \sum_{j=1}^{k} f_j^T \Delta f_j$. Embedded with additional constraints, this is related to the objective of spectral clustering (i.e., ratio cut [6]). By minimizing O(C, G), we will extract document clusters solely based on the network structure. An interesting simplified case is when every vertex only contains one document (thus substitute u, v ...

116 | Pachinko allocation: Dag-structured mixture models of topic correlations
- Li, McCallum
- 2006
Citation Context ... we can leverage the associated network structure to discover interesting topic and/or network patterns. Statistical topic models have recently been successfully applied to multiple text mining tasks [10, 4, 28, 26, 20, 15, 27] to discover a number of topics from text. Some recent work has incorporated into topic modeling context information [20], such as time [27], geographic location [19], and authorship [26, 23, 19], to ...

116 | Probabilistic author-topic models for information discovery
- Steyvers, Smyth, et al.
- 2004

78 | Cascading Behavior in Large Blog Graphs
- Leskovec, McGlohon, et al.
- 2007
Citation Context ...n the topology structure of a network [11, 13, 1, 12], addressing questions such as “what the diameter of a network is [13]”, “how a network evolves [13, 1]”, “how information diffuses on the network [9, 14]”, and “what are the communities on a network [11, 1].” However, these techniques usually do not leverage the rich text information. In many scenarios, text information is very helpful for SNA tasks. ...

69 | A probabilistic approach to spatiotemporal theme pattern mining on weblogs
- Mei, Liu, et al.
- 2006
Citation Context ...ning tasks [10, 4, 28, 26, 20, 15, 27] to discover a number of topics from text. Some recent work has incorporated into topic modeling context information [20], such as time [27], geographic location [19], and authorship [26, 23, 19], to facilitate contextual text mining. Topics discovered in this way can be used to infer research communities [26, 23] or information diffusion over geographic locations...

58 | A cross-collection mixture model for comparative text mining
- Zhai, Velivelli, et al.
- 2004

52 | Multi-way distributional clustering via pairwise interactions
- Bekkerman, El-Yaniv, et al.
- 2005
Citation Context ...erstand the diffusion of social networks [9, 14]. However, the rich textual information associated with the social network is ignored in most cases. Although there has been some existing explorations [7, 17, 16, 2], there has not been a unified way to combine textual contents with social networks. Indeed, [31] proposes a probabilistic model to extract e-communities based on the content of communication document...

43 | Spectral clustering for multi-type relational data
- Long, Zhang, et al.
- 2006

39 | Probabilistic models for discovering e-communities
- Zhou, Manavoglu, et al.
- 2006
Citation Context ... social network is ignored in most cases. Although there has been some existing explorations [7, 17, 16, 2], there has not been a unified way to combine textual contents with social networks. Indeed, [31] proposes a probabilistic model to extract e-communities based on the content of communication documents, but they leave aside the network structure in their model. Cohn and Hofmann proposed a model w...

37 | Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning
- Zhu, Lafferty
Citation Context ...ated to existing work in machine learning, especially graph-based semi-supervised learning [33, 29, 3, 32] and spectral clustering [6, 24]. The optimization framework we propose is closely related to [34], which is probably the first work combining a generative model with graph-based regularizer. Our work is different from theirs, as their task is semi-supervised classification, while we focus on unsu...

22 | A mixture model for contextual text mining
- Mei, Zhai
- 2006
Citation Context ...opic maps, and meaningful geographic topic distributions. (Section 6: Related Work) Statistical topic modeling and social network analysis have little overlap in existing literature. Statistical topic modeling [10, 4, 28, 26, 19, 20, 15] uses a multinomial word distribution to represent a topic, and explains the generation of the text collection with a mixture of such topics. However, none of these existing models considers the natur...

13 | Topic evolution and social interactions: how authors effect research
- Zhou, Ji, et al.
- 2006
Citation Context ...vised learning. The concrete applications we introduced in Section 4 are also related to existing work on author-topic analysis [26, 20], spatiotemporal text mining [19, 20], and blog mining [9, 19]. [30] explores co-author network to estimate the Markov transition probabilities between topics, which uses the network structure as a post processing step of topic modeling. Our work leverages the generat...

8 | Adjusting mixture weights of Gaussian mixture model via regularized probabilistic latent semantic analysis
- Si, Jin
Citation Context ...ting models considers the natural network structure in the data. In the basic models such as PLSA [10] and LDA [4], there is no constraint other than “sum-to-one” on the topic-document distributions. [25] uses a regularizer based on KL divergence, by discouraging the topic distribution of a document from deviating the average topic distribution in the collection. We propose a different method, by regu...

6 | Discrete Regularization, Semi-Supervised Learning, ser. Adaptive computation and machine learning
- Zhou, Schölkopf
Citation Context ...opics). It can be rewritten as $R(C, G) = \frac{1}{2} \sum_{j=1}^{k} f_j^T \Delta f_j$, where $f_j$ is a $|V|$-dimensional vector of the weights of the j-th topic on each vertex (e.g., $\{p(\theta_j|v)\}_v$). $\Delta$ is the graph Laplacian matrix [33, 32]. We have $\Delta = D - W$, where $W$ is the matrix of edge weights, and $D$ is a diagonal matrix where $d(u, u) = \sum_v w(u, v)$. This framework is a general one that can leverage the power of both the topic model ...