Find Me If You Can: Improving Geographical Prediction with Social and Spatial Proximity
Cited by 17 (1 self)
Geography and social relationships are inextricably intertwined; the people we interact with on a daily basis almost always live near us. As people spend more time online, data regarding these two dimensions – geography and social relationships – are becoming increasingly precise, allowing us to build reliable models to describe their interaction. These models have important implications in the design of location-based services, security intrusion detection, and social media supporting local communities. Using user-supplied address data and the network of associations between members of the Facebook social network, we can directly observe and measure the relationship between geography and friendship. Using these measurements, we introduce an algorithm that predicts the location of an individual from a sparse set of located users with performance that exceeds IP-based geolocation. This algorithm is efficient and scalable, and could be run on hundreds of millions of users.
Supervised Random Walks: Predicting and Recommending Links in Social Networks
Cited by 15 (0 self)
Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future or which existing interactions we are missing. Although this problem has been extensively studied, the challenge of how to effectively combine the information from the network structure with rich node and edge attribute data remains largely open. We develop an algorithm based on Supervised Random Walks that naturally combines the information from the network structure with node and edge level attributes. We achieve this by using these attributes to guide a random walk on the graph. We formulate a supervised learning task where the goal is to learn a function that assigns strengths to edges in the network such that a random walker is more likely to visit the nodes to which new links will be created in the future. We develop an efficient training algorithm to directly learn the edge strength estimation function. Our experiments on the Facebook social graph and large collaboration networks show that our approach outperforms state-of-the-art unsupervised approaches as well as approaches that are based on feature extraction.
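The idea of letting edge attributes guide a random walk can be sketched as follows. This is a minimal illustration and not the authors' implementation: the logistic strength function, the restart probability `alpha`, and the names `edge_strength` and `random_walk_with_restart` are assumptions made for the example, and the supervised step that learns `weights` is omitted.

```python
import math

def edge_strength(features, weights):
    """Map an edge feature vector to a strength in (0, 1).
    The logistic form is an assumption made for this sketch."""
    z = sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))

def random_walk_with_restart(edges, weights, source, alpha=0.15, iters=100):
    """edges: dict mapping a directed pair (u, v) to its feature list.
    Returns a dict node -> visiting probability for a walk that restarts
    at `source` with probability `alpha` at every step."""
    nodes = sorted({n for edge in edges for n in edge})
    # Strength of every outgoing edge, grouped by origin node.
    out = {n: {} for n in nodes}
    for (u, v), feats in edges.items():
        out[u][v] = edge_strength(feats, weights)
    p = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            total = sum(out[u].values())
            if total == 0:
                # Node without outgoing edges: return all mass to the source.
                nxt[source] += p[u]
                continue
            nxt[source] += alpha * p[u]
            for v, s in out[u].items():
                # Move in proportion to the normalized edge strength.
                nxt[v] += (1 - alpha) * p[u] * s / total
        p = nxt
    return p
```

Nodes with higher visiting probability would be ranked as more likely future links of `source`; in the supervised setting, `weights` would be tuned so that nodes that actually receive new links score higher.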
Analyzing Patterns of User Content Generation in Online Social Networks
Cited by 10 (1 self)
Various online social networks (OSNs) have been developed rapidly on the Internet. Researchers have analyzed different properties of such OSNs, mainly focusing on the formation and evolution of the networks as well as the information propagation over the networks. In knowledge-sharing OSNs, such as blogs and question answering systems, issues on how users participate in the network and how users “generate/contribute” knowledge are vital to the sustained and healthy growth of the networks. However, related discussions have not been reported in the research literature. In this work, we empirically study workloads from three popular knowledge-sharing OSNs, including a blog system, a social bookmark sharing network, and a question answering social network to examine these properties. Our analysis consistently shows that (1) users’ posting behavior in these networks exhibits strong daily and weekly patterns, but the user active time in these OSNs does not follow exponential distributions; (2) the user posting behavior in these OSNs follows stretched exponential distributions instead of power-law distributions, indicating the influence of a small number of core users cannot dominate the network; (3) the distributions of user contributions on high-quality and effort-consuming contents in these OSNs have smaller stretch factors for the stretched exponential distribution. Our study provides insights into user activity patterns and lays out an analytical foundation for further understanding various properties of these OSNs.
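For reference, the two distribution families contrasted in finding (2) can be written in their standard textbook forms. The symbols $x_0$ (scale), $c$ (stretch factor), and $\alpha$ (power-law exponent) are notation assumed here, not taken from the abstract.

```latex
% Stretched exponential: complementary CDF with scale x_0 and stretch factor c
P(X \ge x) = \exp\!\left[-\left(\frac{x}{x_0}\right)^{c}\right], \qquad 0 < c \le 1

% Power law: complementary CDF with exponent \alpha
P(X \ge x) \propto x^{-\alpha}, \qquad \alpha > 0
```

With $c = 1$ the stretched exponential reduces to an ordinary exponential; as $c$ decreases the tail becomes heavier, although for any fixed $c > 0$ it still decays faster than a power law.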
Co-evolution of social and affiliation networks
- In 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2009
Cited by 9 (0 self)
In our work, we address the problem of modeling social network generation which explains both link and group formation. Recent studies on social network evolution propose generative models which capture the statistical properties of real-world networks related only to node-to-node link formation. We propose a novel model which captures the coevolution of social and affiliation networks. We provide surprising insights into group formation based on observations in several real-world networks, showing that users often join groups for reasons other than their friends. Our experiments show that the model is able to capture both the newly observed and previously studied network properties. This work is the first to propose a generative model which captures the statistical properties of these complex networks. The proposed model facilitates controlled experiments which study the effect of actors’ behavior on the network evolution, and it allows the generation of realistic synthetic datasets.
A.: Mining graph evolution rules
- In: ECML/PKDD, 2009
Cited by 9 (1 self)
In this paper we introduce graph-evolution rules, a novel type of frequency-based pattern that describes the evolution of large networks over time, at a local level. Given a sequence of snapshots of an evolving graph, we aim at discovering rules describing the local changes occurring in it. Adopting a definition of support based on minimum image we study the problem of extracting patterns whose frequency is larger than a minimum support threshold. Then, similar to the classical association rules framework, we derive graph-evolution rules from frequent patterns that satisfy a given minimum confidence constraint. We discuss merits and limits of alternative definitions of support and confidence, justifying the chosen framework. To evaluate our approach we devise GERM (Graph Evolution Rule Miner), an algorithm to mine all graph-evolution rules whose support and confidence are greater than given thresholds. The algorithm is applied to analyze four large real-world networks (i.e., two social networks, and two co-authorship networks from bibliographic data), using different time granularities. Our extensive experimentation confirms the feasibility and utility of the presented approach. It further shows that different kinds of networks exhibit different evolution rules, suggesting the usage of these local patterns to globally discriminate different kinds of networks.
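The support and confidence thresholding described above can be illustrated with a short sketch. It assumes that pattern supports have already been computed (the minimum-image counting itself is not shown) and that confidence is the classical ratio of head support to body support; the function name `derive_rules` and the string pattern labels are hypothetical and do not come from GERM.

```python
def derive_rules(support, candidates, min_support, min_confidence):
    """Keep candidate rules body -> head that pass both thresholds.

    support: dict mapping a pattern label to its precomputed support.
    candidates: iterable of (body, head) pairs, where head extends body.
    Returns a list of (body, head, head_support, confidence) tuples.
    """
    rules = []
    for body, head in candidates:
        sup_head = support.get(head, 0)
        sup_body = support.get(body, 0)
        # Skip infrequent heads and bodies that never occur.
        if sup_head < min_support or sup_body == 0:
            continue
        confidence = sup_head / sup_body
        if confidence >= min_confidence:
            rules.append((body, head, sup_head, confidence))
    return rules
```

For example, with supports `{"A-B": 10, "A-B-C": 6, "A-B-D": 2}`, a minimum support of 3 and a minimum confidence of 0.5, only the rule from `"A-B"` to `"A-B-C"` survives.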
A Particle-and-Density Based Evolutionary Clustering Method for Dynamic Networks
Cited by 8 (3 self)
Recently, dynamic networks have been attracting increasing interest due to their high potential in capturing natural and social phenomena over time. Discovery of evolutionary communities in dynamic networks has become a critical task. The previous evolutionary clustering methods usually adopt the temporal smoothness framework, which has a desirable feature of controlling the balance between temporal noise and true concept drift of communities. They, however, have some major drawbacks: (1) assuming only a fixed number of communities over time; and (2) not allowing arbitrary start/stop of community over time. The forming of new communities and dissolving of existing communities are very common phenomena in real dynamic networks. In this paper, we propose a new particle-and-density based evolutionary clustering method that efficiently discovers a variable number of communities of arbitrary forming and dissolving. We first model a dynamic network as a collection of many particles called nano-communities, and a community as a densely connected subset of particles, called a quasi l-clique-by-clique (in short, l-KK). Each particle contains a small amount of information about the evolution of data or patterns, and the quasi l-KKs inherent in a given dynamic network provide us with guidance on how to find a variable number of communities of arbitrary forming and dissolving. We propose a density-based clustering method that efficiently finds temporally smoothed local clusters of high quality by using a cost embedding technique and optimal modularity. We also propose a mapping method based on information theory that makes sequences of smoothed local clusters as close as possible to data-inherent quasi l-KKs. The result of the mapping method allows us to easily identify the stage of each community among the three stages: evolving, forming, and dissolving.
Experimental studies, using various data sets, demonstrate that our method improves the clustering accuracy and, at the same time, the time performance by an order of magnitude compared with the current state-of-the-art method.
Dynamics of Large Networks
- 2008
Cited by 8 (0 self)
A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found very interesting and counterintuitive patterns for time evolving networks, which change some of the basic assumptions that were made in the past. We then develop models that explain processes which govern the network evolution, fit such models to real networks, and use them to generate realistic graphs or give formal explanations about their properties. In addition, our work has a wide range of applications: it can help us spot anomalous graphs and outliers, forecast future graph structure and run simulations of network evolution. Another important aspect of our research is the study of “local” patterns and structures of propagation in networks. We aim to identify building blocks of the networks and find the patterns of influence that these blocks have on information or virus propagation over the network. Our recent work included the study of the spread of influence in a large person-to-person
Folks in Folksonomies: Social Link Prediction from Shared Metadata
Cited by 8 (0 self)
Web 2.0 applications have attracted a considerable amount of attention because their open-ended nature allows users to create lightweight semantic scaffolding to organize and share content. To date, the interplay of the social and semantic components of social media has been only partially explored. Here we focus on Flickr and Last.fm, two social media systems in which we can relate the tagging activity of the users with an explicit representation of their social network. We show that a substantial level of local lexical and topical alignment is observable among users who lie close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local alignment between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with
Hot today, gone tomorrow: on the migration of MySpace users
- In WOSN ’09: Proceedings of the 2nd ACM workshop on Online social networks, 2009
Cited by 5 (1 self)
While some empirical studies on Online Social Networks (OSNs) have examined the growth of these systems, little is known about the patterns of decline in user population or user activity (in terms of visiting their OSN account) in large OSNs, mainly because capturing the required information is challenging. In this paper, we examine the evolution of user population and user activity in a popular OSN, namely MySpace. Leveraging more than 360K randomly sampled profiles, we characterize both the pattern of departure and the level of activity among MySpace users. Our main findings can be summarized as follows: (i) A significant fraction of accounts have been deleted and a large fraction of valid accounts have not been visited for more than three months. (ii) One third of public accounts are owned by users who abandon their accounts shortly after creation (i.e., tourists). We leverage this information to estimate the account creation time of other users from their user IDs. (iii) We demonstrate that the growth of allocated user IDs in MySpace was exponential, followed by a sudden and significant slow-down in April 2008 due to an increase in the popularity of Facebook. If such up- and down-turns are symptomatic of OSNs, they raise the obvious question: What are the main forces that enable some systems to compete and strive in the Internet’s OSN eco-system, while others decline and ultimately die out?
Exploiting Place Features in Link Prediction on Location-based Social Networks
Cited by 4 (3 self)
Link prediction systems have been largely adopted to recommend new friends in online social networks using data about social interactions. With the soaring adoption of location-based social services it becomes possible to take advantage of an additional source of information: the places people visit. In this paper we study the problem of designing a link prediction system for online location-based social networks. We have gathered extensive data about one of these services, Gowalla, with periodic snapshots to capture its temporal evolution. We study the link prediction space, finding that about 30% of new links are added among “place-friends”, i.e., among users who visit the same places. We show how this prediction space can be made 15 times smaller, while still 66% of future connections can be discovered. Thus, we define new prediction features based on the properties of the places visited by users which are able to discriminate potential future links among them. Building on these findings, we describe a supervised learning framework which exploits these prediction features to predict new links among friends-of-friends and place-friends. Our evaluation shows how the inclusion of information about places and related user activity offers high link prediction performance. These results open new directions for real-world link recommendation systems on location-based social networks.
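The notion of “place-friends” as a restricted prediction space can be sketched in a few lines. This is an illustrative example under stated assumptions, not the paper's pipeline: the input format (a list of `(user, place)` check-ins and a set of existing friendships stored as `frozenset` pairs) and the function name `place_friend_pairs` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def place_friend_pairs(checkins, friends):
    """Return candidate links among users who visited the same place.

    checkins: iterable of (user, place) tuples.
    friends: set of frozenset({u, v}) pairs that are already connected.
    """
    # Group users by the places they checked in at.
    visitors = defaultdict(set)
    for user, place in checkins:
        visitors[place].add(user)
    pairs = set()
    for users in visitors.values():
        for u, v in combinations(sorted(users), 2):
            pair = frozenset((u, v))
            # Existing friendships are not prediction candidates.
            if pair not in friends:
                pairs.add(pair)
    return pairs
```

Only pairs returned here (plus, in the setting described above, friends-of-friends) would be scored by a classifier, which is what keeps the search space far smaller than all possible user pairs.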

