Results 1 - 10 of 36
De-anonymizing social networks, 2009
Cited by 57 (2 self)
Operators of online social networks are increasingly sharing potentially sensitive information about users and their relationships with advertisers, application developers, and data-mining researchers. Privacy is typically protected by anonymization, i.e., removing names, addresses, etc. We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate. Our de-anonymization algorithm is based purely on the network topology, does not require creation of a large number of dummy “sybil” nodes, is robust to noise and all existing defenses, and works even when the overlap between the target network and the adversary’s auxiliary information is small.
Learning Influence Probabilities In Social Networks
Cited by 29 (6 self)
Recently, there has been tremendous interest in the phenomenon of influence propagation in social networks. The studies in this area assume they have as input to their problems a social graph with edges labeled with probabilities of influence between users. However, the question of where these probabilities come from or how they can be computed from real social network data has been largely ignored until now. Thus it is interesting to ask whether from a social graph and a log of actions by its users, one can build models of influence. This is the main problem attacked in this paper. In addition to proposing models and algorithms for learning the model parameters and for testing the learned models to make predictions, we also develop techniques for predicting the time by which a user may be expected to perform an action. We validate our ideas and techniques using the Flickr data set consisting of a social graph with 1.3M nodes, 40M edges, and an action log consisting of 35M tuples referring to 300K distinct actions. Beyond showing that there is genuine influence happening in a real social network, we show that our techniques have excellent prediction performance.
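To make the idea of learning influence from a social graph plus an action log concrete, here is a minimal sketch of one simple frequency-based estimate. It is not taken from the paper and is not claimed to be one of its models: the function name `estimate_influence`, the tuple layout `(user, action, time)`, and the rule "the influence of v on u is the fraction of v's actions that u performed later" are all assumptions made for illustration.

```python
from collections import defaultdict


def estimate_influence(edges, action_log):
    """Estimate an influence probability for each directed edge (v, u).

    edges:      iterable of (v, u) pairs, meaning v may influence u.
    action_log: iterable of (user, action, time) tuples.
    Assumed rule (illustrative only): p(v, u) is the number of actions
    v performed before u did, divided by the number of actions v performed.
    """
    first_time = defaultdict(dict)   # action -> {user: earliest time}
    performed = defaultdict(int)     # user -> number of distinct actions
    for user, action, t in action_log:
        prev = first_time[action].get(user)
        if prev is None:
            performed[user] += 1
        if prev is None or t < prev:
            first_time[action][user] = t

    out = defaultdict(set)           # v -> set of users v may influence
    for v, u in edges:
        out[v].add(u)

    propagated = defaultdict(int)    # (v, u) -> actions that went from v to u
    for times in first_time.values():
        for v, tv in times.items():
            for u in out.get(v, ()):
                tu = times.get(u)
                if tu is not None and tu > tv:
                    propagated[(v, u)] += 1

    return {
        (v, u): propagated[(v, u)] / performed[v]
        for v, targets in out.items()
        for u in targets
        if performed[v] > 0
    }
```

Calling `estimate_influence(edges, log)` returns a dictionary keyed by edge. Edges whose source user never appears in the log are omitted rather than assigned a probability.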
Social influence analysis in large-scale networks
- In Proceedings of the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2009
Cited by 20 (6 self)
In large social networks, nodes (users, entities) are influenced by others for various reasons. For example, colleagues have strong influence on one’s work, while friends have strong influence on one’s daily life. How to differentiate the social influences from different angles (topics)? How to quantify the strength of those social influences? How to estimate the model on real large networks? To address these fundamental questions, we propose Topical Affinity Propagation (TAP) to model the topic-level social influence on large networks. In particular, TAP can take results of any topic modeling and the existing network structure to perform topic-level influence propagation. With the help of the influence analysis, we present several important applications on real data sets such as 1) what are the representative nodes on a given topic? 2) how to identify the social influences of neighboring nodes on a particular node? To scale to real large networks, TAP is designed with efficient distributed learning algorithms that are implemented and tested under the Map-Reduce framework. We further present the common characteristics of distributed learning algorithms for Map-Reduce. Finally, we demonstrate the effectiveness and efficiency of TAP on real large data sets.
Modeling relationship strength in online social networks
- In Proc. WWW '10
Cited by 16 (0 self)
Previous work analyzing social networks has mainly focused on binary friendship relations. However, in online social networks the low cost of link formation can lead to networks with heterogeneous relationship strengths (e.g., acquaintances and best friends mixed together). In this case, the binary friendship indicator provides only a coarse representation of relationship information. In this work, we develop an unsupervised model to estimate relationship strength from interaction activity (e.g., communication, tagging) and user similarity. More specifically, we formulate a link-based latent variable model, along with a coordinate ascent optimization procedure for the inference. We evaluate our approach on real-world data from Facebook, showing that the estimated link weights result in higher autocorrelation and lead to improved classification accuracy.
Connections between the Lines: Augmenting Social Networks with Text
Cited by 13 (0 self)
Network data is ubiquitous, encoding collections of relationships between entities such as people, places, genes, or corporations. While many resources for networks of interesting entities are emerging, most of these can only annotate connections in a limited fashion. Although relationships between entities are rich, it is impractical to manually devise complete characterizations of these relationships for every pair of entities on large, real-world corpora. In this paper we present a novel probabilistic topic model to analyze text corpora and infer descriptions of its entities and of relationships between those entities. We develop variational methods for performing approximate inference on our model and demonstrate that our model can be practically deployed on large corpora such as Wikipedia. We show qualitatively and quantitatively that our model can construct and annotate graphs of relationships and make useful predictions.
Differences in the Mechanics of Information Diffusion Across Topics: Idioms, Political Hashtags, and Complex Contagion on Twitter
Cited by 13 (1 self)
There is a widespread intuitive sense that different kinds of information spread differently on-line, but it has been difficult to evaluate this question quantitatively since it requires a setting where many different kinds of information spread in a shared environment. Here we study this issue on Twitter, analyzing the ways in which tokens known as hashtags spread on a network defined by the interactions among Twitter users. We find significant variation in the ways that widely-used hashtags on different topics spread. Our results show that this variation is not attributable simply to differences in “stickiness,” the probability of adoption based on one or more exposures, but also to a quantity that could be viewed as a kind of “persistence”: the relative extent to which repeated exposures to a hashtag continue to have significant marginal effects. We find that hashtags on politically controversial topics are particularly persistent, with repeated exposures continuing to have unusually large marginal effects on adoption; this provides, to our knowledge, the first large-scale validation of the “complex contagion” principle from sociology, which posits that repeated exposures to an idea are particularly crucial when the idea is in some way controversial or contentious. Among other findings, we discover that hashtags representing the natural analogues of Twitter idioms and neologisms are particularly non-persistent, with the effect of multiple exposures decaying rapidly relative to the first exposure. We also study the subgraph structure of the initial adopters for different widely-adopted hashtags, again finding structural differences across topics. We develop simulation-based and generative models to analyze how the adoption dynamics interact with the network structure of the early adopters on which a hashtag spreads.
Randomization Tests for Distinguishing Social Influence and Homophily Effects
Cited by 10 (0 self)
Relational autocorrelation is ubiquitous in relational domains. This observed correlation between class labels of linked instances in a network (e.g., two friends are more likely to share political beliefs than two randomly selected people) can be due to the effects of two different social processes. If social influence effects are present, instances are likely to change their attributes to conform to their neighbor values. If homophily effects are present, instances are likely to link to other individuals with similar attribute values. Both these effects will result in autocorrelated attribute values. When analyzing static relational networks it is impossible to determine how much of the observed correlation is due to each of these factors. However, the recent surge of interest in social networks has increased the availability of dynamic network data. In this paper, we present a randomization technique for temporal network data where the attributes and links change over time. Given data from two time steps, we measure the gain in correlation and assess whether a significant portion of this gain is due to influence and/or homophily. We demonstrate the efficacy of our method on semi-synthetic data and then apply the method to a real-world social networks dataset, showing the impact of both influence and homophily effects.
Scalable Influence Maximization in Social Networks under the Linear Threshold Model
Cited by 7 (2 self)
Influence maximization is the problem of finding a small set of most influential nodes in a social network so that their aggregated influence in the network is maximized. In this paper, we study influence maximization in the linear threshold model, one of the important models formalizing the behavior of influence propagation in social networks. We first show that computing exact influence in general networks in the linear threshold model is #P-hard, which closes an open problem left in the seminal work on influence maximization by Kempe, Kleinberg, and Tardos, 2003. In contrast, we show that computing influence in directed acyclic graphs (DAGs) can be done in time linear in the size of the graphs. Based on the fast computation in DAGs, we propose the first scalable influence maximization algorithm tailored for the linear threshold model. We conduct extensive simulations to show that our algorithm is scalable to networks with millions of nodes and edges, is orders of magnitude faster than the greedy approximation algorithm proposed by Kempe et al. and its optimized versions, and performs consistently among the best algorithms, while other heuristic algorithms not designed specifically for the linear threshold model have unstable performance on different real-world networks. Keywords: influence maximization; social networks; linear threshold model.
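The linear-time claim for DAGs can be illustrated with a short sketch. It assumes the standard property of the linear threshold model that, in a DAG, a non-seed node's activation probability equals the weighted sum of its in-neighbours' activation probabilities, with incoming weights summing to at most 1. The function name `lt_influence_dag` and the edge-dictionary input format are hypothetical, and this is not the paper's full influence maximization algorithm.

```python
from collections import defaultdict, deque


def lt_influence_dag(edges, seeds):
    """Expected number of active nodes in a DAG under the linear threshold model.

    edges: dict mapping (u, v) -> influence weight of u on v
           (assumed: incoming weights of each node sum to at most 1).
    seeds: set of initially active nodes.
    Returns (total expected influence, per-node activation probabilities).
    """
    out = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set(seeds)
    for (u, v), w in edges.items():
        out[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))

    # Seeds are active with probability 1; every other node starts at 0.
    ap = {n: (1.0 if n in seeds else 0.0) for n in nodes}

    # Kahn's algorithm: each node and edge is handled once, so the cost
    # is linear in the size of the graph.
    queue = deque(n for n in nodes if indeg[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, w in out[u]:
            if v not in seeds:
                ap[v] += w * ap[u]
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    if visited != len(nodes):
        raise ValueError("input graph contains a cycle")
    return sum(ap.values()), ap
```

Because nodes are processed in topological order, every in-neighbour's probability is final before it is propagated, which is why a single pass suffices.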
Effects of Feedback and Peer Pressure on Contributions to Enterprise Social Media
- Proceedings of the 2009 International conference on Supporting Group Work
Cited by 4 (0 self)
Increasingly, large organizations are experimenting with internal social media (e.g., blogs, forums) as a platform for widespread distributed collaboration. Contributions to their counterparts outside the organization’s firewall are driven by attention from strangers, in addition to sharing among friends. However, employees in a workplace under time pressures may be reluctant to participate, and the audience for their contributions is comparatively smaller. Participation rates also vary widely from group to group. So what influences people to contribute in this environment? In this paper, we present the results of a year-long empirical study of internal social media participation at a large technology company, and analyze the impact that attention, feedback, and managers’ and coworkers’ participation have on employees’ behavior. We find feedback in the form of posted comments is highly correlated with a user’s subsequent participation. Recent manager and coworker activity relates to users initiating or resuming participation in social media. These findings extend, to an aggregate level, the results from prior interviews about blogging at the company and offer design and policy implications for organizations seeking to encourage social media adoption.
Social Action Tracking via Noise Tolerant Time-varying Factor Graphs
Cited by 2 (0 self)
Users’ behaviors (actions) in a social network are influenced by various factors such as personal interests, social influence, and global trends. However, few publications systematically study how social actions evolve in a dynamic social network and to what extent different factors affect the user actions. In this paper, we propose a Noise Tolerant Time-varying Factor Graph Model (NTT-FGM) for modeling and predicting social actions. NTT-FGM simultaneously models social network structure, user attributes and user action history for better prediction of the users’ future actions. More specifically, a user’s action at time t is generated by her latent state at t, which is influenced by her attributes, her own latent state at time t−1 and her neighbors’ states at time t and t−1. Based on this intuition, we formalize the social action tracking problem using the NTT-FGM model; then present an efficient algorithm to learn the model, by combining the ideas from both continuous linear system and Markov random field. Finally, we present a case study of our model on predicting future social actions. We validate the model on three different types of real-world data sets. Qualitatively, our model can discover interesting patterns of the social dynamics. Quantitatively, experimental results show that the proposed method outperforms several baseline methods for social action prediction.

