Results 1  10
of
28
Supervised Random Walks: Predicting and Recommending Links in Social Networks
"... Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future or which existing interactions are we missing. Althoug ..."
Abstract

Cited by 62 (0 self)
 Add to MetaCart
Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future or which existing interactions are we missing. Although this problem has been extensively studied, the challenge of how to effectively combine the information from the network structure with rich node and edge attribute data remains largely open. We develop an algorithm based on Supervised Random Walks that naturally combines the information from the network structure with node and edge level attributes. We achieve this by using these attributes to guide a random walk on the graph. We formulate a supervised learning task where the goal is to learn a function that assigns strengths to edges in the network such that a random walker is more likely to visit the nodes to which new links will be created in the future. We develop an efficient training algorithm to directly learn the edge strength estimation function. Our experiments on the Facebook social graph and large collaboration networks show that our approach outperforms stateoftheart unsupervised approaches as well as approaches that are based on feature extraction.
Inferring Networks of Diffusion and Influence
, 2012
"... Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected with a virus or publish the information, observing individual transmissions (who infects whom, or who influences whom) is typicall ..."
Abstract

Cited by 35 (1 self)
 Add to MetaCart
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected with a virus or publish the information, observing individual transmissions (who infects whom, or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NPhard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and finds provably nearoptimal networks. We demonstrate the effectiveness of our approach by tracing information diffusion in a set of 170 million blogs and news articles over a one year period to infer how information flows through the online media space. We find that the diffusion network of news for the top 1,000 media sites and blogs tends to have a coreperiphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence with more general news media sites acting as connectors between them.
Uncovering the temporal dynamics of diffusion networks
 in Proc. of the 28th Int. Conf. on Machine Learning (ICML’11
, 2011
"... Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected – but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferri ..."
Abstract

Cited by 17 (2 self)
 Add to MetaCart
Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected – but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferring the underlying dynamics is of outstanding interest since it enables forecasting, influencing and retarding infections, broadly construed. To this end, we model diffusion processes as discrete networks of continuous temporal processes occurring at different rates. Given cascade data – observed infection times of nodes – we infer the edges of the global diffusion network and estimate the transmission rates of each edge that best explain the observed data. The optimization problem is convex. The model naturally (without heuristics) imposes sparse solutions and requires no parameter tuning. The problem decouples into a collection of independent smaller problems, thus scaling easily to networks on the order of hundreds of thousands of nodes. Experiments on real and synthetic data show that our algorithm both recovers the edges of diffusion networks and accurately estimates their transmission rates from cascade data. 1.
The Network Completion Problem: Inferring Missing Nodes and Edges in Networks
"... While the social and information networks have become ubiquitous, the challenge ofcollecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the uno ..."
Abstract

Cited by 13 (3 self)
 Add to MetaCart
While the social and information networks have become ubiquitous, the challenge ofcollecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, reestimate the parameters and so on. We combine the EM algorithm with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parametersas well as the inference about missing nodes and edges of the network. Experiments on synthetic and several realworld networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical linkprediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes. 1
Composite Social Network for Predicting Mobile Apps Installation
, 2011
"... We have carefully instrumented a large portion of the population living in a university graduate dormitory by giving participants Android smart phones running our sensing software. In this paper, we propose the novel problem of predicting mobile application (known as “apps”) installation using socia ..."
Abstract

Cited by 11 (4 self)
 Add to MetaCart
We have carefully instrumented a large portion of the population living in a university graduate dormitory by giving participants Android smart phones running our sensing software. In this paper, we propose the novel problem of predicting mobile application (known as “apps”) installation using social networks and explain its challenge. Modern smart phones, like the ones used in our study, are able to collect different social networks using builtin sensors. (e.g. Bluetooth proximity network, call log network, etc) While this information is accessible to app market makers such as the iPhone AppStore, it has not yet been studied how app market makers can use these information for marketing research and strategy development. We develop a simple computational model to better predict app installation by using a composite network computed from the different networks sensed by phones. Our model also captures individual variance and exogenous factors in app adoption. We show the importance of considering all these factors in predicting app installations, and we observe the surprising result that app installation is indeed predictable. We also show that our model achieves the best results compared with generic approaches.
Topology Discovery of Sparse Random Graphs With Few Participants ∗
, 2011
"... We considerthe taskoftopologydiscoveryofsparserandomgraphsusing endtoendrandom measurements(e.g., delay)between a subset ofnodes, referredto as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two ro ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
We considerthe taskoftopologydiscoveryofsparserandomgraphsusing endtoendrandom measurements(e.g., delay)between a subset ofnodes, referredto as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain endtoend measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario (a), our proposed algorithm results in a sublinear editdistance guarantee using a sublinear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sublinear number of uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of endtoend information along all the paths between the participants.
Featureenhanced probabilistic models for diffusion network inference
 In European conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD’12
, 2012
"... Abstract. Cascading processes, such as disease contagion, viral marketing, and information diffusion, are a pervasive phenomenon in many types of networks. The problem of devising intervention strategies to facilitate or inhibit such processes has recently received considerable attention. However, a ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
Abstract. Cascading processes, such as disease contagion, viral marketing, and information diffusion, are a pervasive phenomenon in many types of networks. The problem of devising intervention strategies to facilitate or inhibit such processes has recently received considerable attention. However, a major challenge is that the underlying network is often unknown. In this paper, we revisit the problem of inferring latent network structure given observations from a diffusion process, such as the spread of trending topics in social media. We define a family of novel probabilistic models that can explain recurrent cascading behavior, and take into account not only the time differences between events but also a richer set of additional features. We show that MAP inference is tractable and can therefore scale to very large realworld networks. Further, we demonstrate the effectiveness of our approach by inferring the underlying network structure of a subset of the popular Twitter following network by analyzing the topics of a large number of messages posted by users over a 10month period. Experimental results show that our models accurately recover the links of the Twitter network, and significantly improve the performance over previous models based entirely on time. 1
Topology Discovery of Sparse Random Graphs With Few Participants
"... We consider the task of topology discovery of sparse random graphs using endtoend random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discover ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
We consider the task of topology discovery of sparse random graphs using endtoend random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain endtoend measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario(a), ourproposedalgorithm resultsinasublineareditdistance guarantee using a sublinear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sublinear numberof uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of endtoend information along all the paths between the participants.
Statistical Inference for ValuedEdge Networks: The Generalized Exponential Random Graph Model
, 2011
"... Across the sciences, the statistical analysis of networks is central to the production of knowledge on relational phenomena. Because of their ability to model the structural generation of networks based on both endogenous and exogenous factors, exponential random graph models are a ubiquitous means ..."
Abstract
 Add to MetaCart
Across the sciences, the statistical analysis of networks is central to the production of knowledge on relational phenomena. Because of their ability to model the structural generation of networks based on both endogenous and exogenous factors, exponential random graph models are a ubiquitous means of analysis. However, they are limited by an inability to model networks with valued edges. We address this problem by introducing a class of generalized exponential random graph models capable of modeling networks whose edges have continuous values (bounded or unbounded), thus greatly expanding the scope of networks applied researchers can subject to statistical analysis.
Machine Learning for Ecological Science and Environmental Policy
"... � Processes governing the function and structure of ecosystems � Flows of energy and nutrients ..."
Abstract
 Add to MetaCart
� Processes governing the function and structure of ecosystems � Flows of energy and nutrients