Topics over time: A non-Markov continuous-time model of topical trends
- in SIGKDD, 2006
Cited by 73 (7 self)
This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. Unlike other recent work that relies on Markov assumptions or discretization of time, here each topic is associated with a continuous distribution over timestamps, and for each generated document, the mixture distribution over topics is influenced by both word co-occurrences and the document’s timestamp. Thus, the meaning of a particular topic can be relied upon as constant, but the topics’ occurrence and correlations change significantly over time. We present results on nine months of personal email, 17 years of NIPS research papers and over 200 years of presidential state-of-the-union addresses, showing improved topics, better timestamp prediction, and interpretable trends.
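As a rough illustration of the idea in the abstract above, the sketch below shows how a per-topic continuous distribution over timestamps can shift a document's topic weights. The choice of a Beta density over timestamps normalized to (0, 1), the function names, and the parameter values are all assumptions made for illustration; the abstract does not specify them, and this is not the paper's inference procedure.

```python
import math

def beta_pdf(t, a, b):
    """Density of Beta(a, b) at t, for 0 < t < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))

def topic_weights(word_evidence, time_params, t):
    """Multiply word-based evidence by each topic's timestamp density, then normalize."""
    raw = [p * beta_pdf(t, a, b) for p, (a, b) in zip(word_evidence, time_params)]
    total = sum(raw)
    return [r / total for r in raw]

# Two hypothetical topics: one concentrated early in the time span, one late.
time_params = [(2.0, 8.0), (8.0, 2.0)]
# Hypothetical word evidence that cannot separate the topics on its own.
word_evidence = [0.5, 0.5]

early = topic_weights(word_evidence, time_params, 0.1)
late = topic_weights(word_evidence, time_params, 0.9)
```

With identical word evidence, the timestamp alone decides which topic dominates: the early document leans toward the first topic and the late document toward the second.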
FacetNet: A Framework for Analyzing Communities and Their Evolutions in Dynamic Networks
Cited by 22 (10 self)
We discover communities from social network data, and analyze the community evolution. These communities are inherent characteristics of human interaction in online social networks, as well as paper citation networks. Also, communities may evolve over time, due to changes to individuals’ roles and social status in the network as well as changes to individuals’ research interests. We present an innovative algorithm that deviates from the traditional two-step approach to analyze community evolutions. In the traditional approach, communities are first detected for each time slice, and then compared to determine correspondences. We argue that this approach is inappropriate in applications with noisy data. In this paper, we propose FacetNet for analyzing communities and their evolutions through a robust unified
Community Evolution in Dynamic Multi-Mode Networks
- KDD'08, 2008
Cited by 20 (8 self)
A multi-mode network typically consists of multiple heterogeneous social actors among which various types of interactions could occur. Identifying communities in a multi-mode network can help understand the structural properties of the network, address the data shortage and unbalanced problems, and assist tasks like targeted marketing and finding influential actors within or between groups. In general, a network and the membership of groups often evolve gradually. In a dynamic multi-mode network, both actor membership and interactions can evolve, which poses a challenging problem of identifying community evolution. In this work, we try to address this issue by employing the temporal information to analyze a multi-mode network. A spectral framework and its scalability issue are carefully studied. Experiments on both synthetic data and real-world large scale networks demonstrate the efficacy of our algorithm and suggest its generality in solving problems with complex relationships.
Recovering temporally rewiring networks: A model-based approach
- In ICML07, 2007
Cited by 19 (5 self)
A plausible representation of relational information among entities in dynamic systems such as a living cell or a social community is a stochastic network which is topologically rewiring and semantically evolving over time. While there is a rich literature on modeling static or temporally invariant networks, much less has been done toward modeling the dynamic processes underlying rewiring networks, and on recovering such networks when they are not observable. We present a class of hidden temporal exponential random graph models (htERGMs) to study the yet unexplored topic of modeling and recovering temporally rewiring networks from time series of node attributes such as activities of social actors or expression levels of genes. We show that one can reliably infer the latent time-specific topologies of the evolving networks from the observation. We report empirical results on both synthetic data and a Drosophila lifecycle gene expression data set, in comparison with a static counterpart of htERGM.
Social Ties and their Relevance to Churn in Mobile Telecom Networks
- 2008
Cited by 18 (0 self)
Social Network Analysis has emerged as a key paradigm in modern sociology, technology, and information sciences. The paradigm stems from the view that the attributes of an individual in a network are less important than their ties (relationships) with other individuals in the network. Exploring the nature and strength of these ties can help understand the structure and dynamics of social networks and explain real-world phenomena, ranging from organizational efficiency to the spread of information and disease. In this paper, we examine the communication patterns of millions of mobile phone users, allowing us to study the underlying social network in a large-scale communication network. Our primary goal is to address the role of social ties in the formation and growth of groups, or communities, in a mobile network. In particular, we study the evolution of churners in an operator’s network spanning over a period of four months. Our analysis explores the propensity of a subscriber to churn out of a service provider’s network depending on the number of ties (friends) that have already churned. Based on our findings, we propose a spreading activation-based technique that predicts potential churners by examining the current set of churners and their underlying social network. The efficiency of the prediction is expressed as a lift curve, which indicates the fraction of all churners that can be caught when a certain fraction of subscribers were contacted.
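The two ideas named at the end of this abstract, spreading activation from known churners and a lift curve for evaluation, can be sketched as follows. The decay factor, the number of steps, the equal split of energy among contacts, and the toy graph are all assumptions for illustration; the abstract does not give the paper's actual parameters or update rule.

```python
def spread_activation(neighbors, churners, decay=0.5, steps=2):
    """Push 'energy' outward from known churners along social ties.

    neighbors: dict mapping each subscriber to a list of contacts.
    Returns a dict of accumulated energy per subscriber.
    """
    energy = {node: 0.0 for node in neighbors}
    frontier = {c: 1.0 for c in churners}
    for _ in range(steps):
        next_frontier = {}
        for node, amount in frontier.items():
            contacts = neighbors.get(node, [])
            if not contacts:
                continue
            # Assumed rule: a decayed share is split equally among contacts.
            share = decay * amount / len(contacts)
            for contact in contacts:
                next_frontier[contact] = next_frontier.get(contact, 0.0) + share
        for node, amount in next_frontier.items():
            energy[node] = energy.get(node, 0.0) + amount
        frontier = next_frontier
    return energy

def lift_curve(scores, actual_churners):
    """Return (fraction contacted, fraction of churners caught) points.

    Assumes at least one actual churner.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    total = len(actual_churners)
    caught = 0
    points = []
    for i, node in enumerate(ranked, start=1):
        if node in actual_churners:
            caught += 1
        points.append((i / len(ranked), caught / total))
    return points

# Hypothetical five-subscriber network in which "a" has already churned.
ties = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
energy = spread_activation(ties, churners={"a"})
candidates = {n: e for n, e in energy.items() if n != "a"}
points = lift_curve(candidates, actual_churners={"b", "c"})
```

Each point on the curve pairs the fraction of subscribers contacted, in descending order of energy, with the fraction of eventual churners reached so far.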
Relational learning via latent social dimensions
- in KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge, 2009
Cited by 15 (9 self)
Social media such as blogs, Facebook, and Flickr present data in a network format rather than in the classical IID form. To address the interdependency among data instances, relational learning has been proposed, and collective inference based on network connectivity is adopted for prediction. However, the connections in social media are often multi-dimensional. An actor can connect to another actor due to different factors, e.g., alumni, colleagues, living in the same city, or sharing similar interests. Collective inference normally does not differentiate these connections. In this work, we propose to extract latent social dimensions based on network information first, and then utilize them as features for discriminative learning. These social dimensions describe different affiliations of social actors hidden in the network, and the subsequent discriminative learning can automatically determine which affiliations are better aligned with the class labels. Such a scheme is preferred when multiple diverse relations are associated with the same network. We conduct extensive experiments on social media data (one from a real-world blog site and the other from a popular content sharing site). Our model outperforms representative relational learning methods based on collective inference, especially when little labeled data is available. The sensitivity of this model and its connection to existing methods are also carefully examined.
The time-series link prediction problem with applications in communication surveillance
- INFORMS Journal on Computing, 2009
Cited by 8 (0 self)
The ability to predict linkages among data objects is central to many data mining tasks, such as product recommendation and social network analysis. A substantial literature has been devoted to the link prediction problem either as an implicitly embedded problem in specific applications or as a generic data mining task. This literature has mostly adopted a static graph representation where a snapshot of the network is analyzed to predict hidden or future links. However, this representation is only appropriate for investigating whether a certain link will ever occur, and does not apply to the many applications in which the prediction of repeated link occurrences is of main interest (e.g., communication network surveillance). In this paper, we introduce the time series link prediction problem, taking into consideration temporal evolutions of link occurrences to predict link occurrence probabilities at a particular time. Using the Enron email data and high-energy particle physics literature coauthorship data, we demonstrate that time series models of single link occurrences achieve link prediction performance comparable to that of commonly used static graph link prediction algorithms. Furthermore, combining static graph link prediction algorithms with time series models produces significantly better predictions than static graph link prediction methods alone, demonstrating the great potential of integrated methods that exploit both inter-link structural dependencies and intra-link temporal dependencies. Key words: analysis of algorithms; communication networks; link prediction; statistical analysis; time series analysis.
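The combination described in this abstract, a per-link temporal forecast blended with a static structural score, can be sketched as below. The moving-average forecast, the common-neighbors score, the equal weighting, and the toy data are assumptions chosen for brevity; the abstract does not state which time series models or static algorithms the paper uses.

```python
def moving_average_forecast(counts, window=3):
    """Forecast next-period link occurrences as the mean of the last `window` counts."""
    recent = counts[-window:]
    return sum(recent) / len(recent)

def common_neighbors(adjacency, u, v):
    """Static-graph score: number of neighbors shared by u and v."""
    return len(adjacency.get(u, set()) & adjacency.get(v, set()))

def hybrid_score(counts, adjacency, u, v, weight=0.5):
    """Weighted blend of the temporal forecast and the static structural score."""
    temporal = moving_average_forecast(counts)
    static = common_neighbors(adjacency, u, v)
    return weight * temporal + (1 - weight) * static

# Hypothetical per-period email counts for two pairs of people.
history = {
    ("ann", "bob"): [0, 1, 2, 3, 4],   # contact is increasing
    ("ann", "cat"): [4, 3, 2, 1, 0],   # contact is fading
}
# Hypothetical aggregated (static) contact graph.
adjacency = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"ann", "bob", "cat"},
}
score_bob = hybrid_score(history[("ann", "bob")], adjacency, "ann", "bob")
score_cat = hybrid_score(history[("ann", "cat")], adjacency, "ann", "cat")
```

Both pairs share one common neighbor, so the static score cannot separate them; the temporal term is what ranks the increasing link above the fading one. In practice the two scores would need to be put on a comparable scale before blending.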
Structural Link Analysis from User Profiles and Friends Networks: A Feature Construction Approach
Cited by 8 (4 self)
We consider the problems of predicting, classifying, and annotating friends relations in friends networks, based upon network structure and user profile data. First, we document a data model for the blog service LiveJournal, and define a set of machine learning problems such as predicting existing links and estimating inter-pair distance. Next, we explain how the problem of classifying a user pair in a social network, as directly connected or not, poses the problem of selecting and constructing relevant features. We document feature analyzers for attributes that depend only on graph attributes, those that depend on individual user demographics and set-valued attributes (e.g., interests, communities, and educational institutions), and those that depend on candidate user pairs. We then extend our data model using whole-network attributes and report machine learning experiments on learning the concept of a connected pair of friends from LiveJournal data. Finally, we develop a theory of dependent types for deriving causal explanations and discuss how this can be used to scale statistical relational learning up to our full corpus, a recent crawl of over a million records from
Parallel Spectral Clustering
Cited by 6 (2 self)
The spectral clustering algorithm has been shown to be more effective in finding clusters than most traditional algorithms. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the dataset is large. To perform clustering on large datasets, we propose to parallelize both memory use and computation on distributed computers. Through an empirical study on a large document dataset of 193,844 data instances and a large photo dataset of 637,137, we demonstrate that our parallel algorithm can effectively alleviate the scalability problem. Key words: Parallel spectral clustering, distributed computing
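To make the scalability problem in this abstract concrete, here is a minimal serial sketch of spectral clustering for the two-cluster case, using NumPy. It is not the paper's parallel algorithm; the Gaussian affinity, the `sigma` value, the sign-based split on the second eigenvector, and the toy points are assumptions for illustration. The comments mark where the memory and compute costs arise.

```python
import numpy as np

def spectral_bipartition(points, sigma=2.0):
    """Split points into two clusters using the normalized graph Laplacian."""
    x = np.asarray(points, dtype=float)
    # Dense n x n affinity matrix: this is the O(n^2) memory cost.
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    affinity = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(x)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    # Eigendecomposition: the main compute cost. eigh returns eigenvalues
    # in ascending order, so column 1 is the second-smallest eigenvector.
    _, vectors = np.linalg.eigh(laplacian)
    second = vectors[:, 1]
    # Assumed two-cluster rule: split on the sign of that eigenvector.
    return (second > 0).astype(int)

# Hypothetical toy data: two small groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (4, 4), (4, 5), (5, 4)]
labels = spectral_bipartition(points)
```

For n data instances the affinity matrix alone holds n squared entries, which is why datasets of the size quoted in the abstract call for distributing both storage and computation.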

