Results 1  10
of
16
Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations
, 2005
"... How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include hea ..."
Abstract

Cited by 309 (39 self)
 Add to MetaCart
How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in and outdegree distributions, communities, smallworld phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing superlinearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) orO(log(log n)). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a “forest fire” spreading process, that has a simple, intuitive justification, requires very few parameters (like the “flammability” of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.
Graph evolution: Densification and shrinking diameters
 ACM TKDD
, 2007
"... How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include hea ..."
Abstract

Cited by 127 (13 self)
 Add to MetaCart
How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in and outdegree distributions, communities, smallworld phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing superlinearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) or O(log(log n)). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a “forest fire” spreading process, that has a simple, intuitive justification, requires very few parameters (like the “flammability ” of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study. We also notice that the “forest fire” model exhibits a sharp transition between sparse graphs and graphs that are densifying. Graphs with decreasing distance between the nodes are generated around this transition point. Last, we analyze the connection between the temporal evolution of the degree distribution and densification of a graph. We find that the two are fundamentally related. We also observe that real networks exhibit this type of r
Supervised Random Walks: Predicting and Recommending Links in Social Networks
"... Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future or which existing interactions are we missing. Althoug ..."
Abstract

Cited by 61 (0 self)
 Add to MetaCart
Predicting the occurrence of links is a fundamental problem in networks. In the link prediction problem we are given a snapshot of a network and would like to infer which interactions among existing members are likely to occur in the near future or which existing interactions are we missing. Although this problem has been extensively studied, the challenge of how to effectively combine the information from the network structure with rich node and edge attribute data remains largely open. We develop an algorithm based on Supervised Random Walks that naturally combines the information from the network structure with node and edge level attributes. We achieve this by using these attributes to guide a random walk on the graph. We formulate a supervised learning task where the goal is to learn a function that assigns strengths to edges in the network such that a random walker is more likely to visit the nodes to which new links will be created in the future. We develop an efficient training algorithm to directly learn the edge strength estimation function. Our experiments on the Facebook social graph and large collaboration networks show that our approach outperforms stateoftheart unsupervised approaches as well as approaches that are based on feature extraction.
Kronecker Graphs: An Approach to Modeling Networks
 JOURNAL OF MACHINE LEARNING RESEARCH 11 (2010) 9851042
, 2010
"... How can we generate realistic networks? In addition, how can we do so with a mathematically tractable model that allows for rigorous analysis of network properties? Real networks exhibit a long list of surprising properties: Heavy tails for the in and outdegree distribution, heavy tails for the ei ..."
Abstract

Cited by 54 (3 self)
 Add to MetaCart
How can we generate realistic networks? In addition, how can we do so with a mathematically tractable model that allows for rigorous analysis of network properties? Real networks exhibit a long list of surprising properties: Heavy tails for the in and outdegree distribution, heavy tails for the eigenvalues and eigenvectors, small diameters, and densification and shrinking diameters over time. Current network models and generators either fail to match several of the above properties, are complicated to analyze mathematically, or both. Here we propose a generative model for networks that is both mathematically tractable and can generate networks that have all the above mentioned structural properties. Our main idea here is to use a nonstandard matrix operation, the Kronecker product, to generate graphs which we refer to as “Kronecker graphs”. First, we show that Kronecker graphs naturally obey common network properties. In fact, we rigorously prove that they do so. We also provide empirical evidence showing that Kronecker graphs can effectively model the structure of real networks. We then present KRONFIT, a fast and scalable algorithm for fitting the Kronecker graph generation model to large real networks. A naive approach to fitting would take superexponential
Dynamics of Large Networks
, 2008
"... A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found very interesting and counterintuitive patterns for time evolving networks, which change some of the basic assumptions that were made in the past. We then develop models ..."
Abstract

Cited by 19 (0 self)
 Add to MetaCart
A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found very interesting and counterintuitive patterns for time evolving networks, which change some of the basic assumptions that were made in the past. We then develop models that explain processes which govern the network evolution, fit such models to real networks, and use them to generate realistic graphs or give formal explanations about their properties. In addition, our work has a wide range of applications: it can help us spot anomalous graphs and outliers, forecast future graph structure and run simulations of network evolution. Another important aspect of our research is the study of “local ” patterns and structures of propagation in networks. We aim to identify building blocks of the networks and find the patterns of influence that these blocks have on information or virus propagation over the network. Our recent work included the study of the spread of influence in a large persontoperson
Sampling community structure
 In WWW
, 2010
"... We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgraphs representative of community structure in the original network. These generated subgraphs may be viewed as stratified ..."
Abstract

Cited by 12 (1 self)
 Add to MetaCart
We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgraphs representative of community structure in the original network. These generated subgraphs may be viewed as stratified samples in that they consist of members from most or all communities in the network. Using samples produced by our method, we show that the problem of community detection may be recast into a case of statistical relational learning. We empirically evaluate our approach against several realworld datasets and demonstrate that our sampling method can effectively be used to infer and approximate community affiliation in the larger network.
Estimating the number of citations using author reputation
, 2007
"... Abstract. We study the problem of predicting the popularity of items in a dynamic environment in which authors post continuously new items and provide feedback on existing items. This problem can be applied to predict popularity of blog posts, rank photographs in a photosharing system, or predict t ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
Abstract. We study the problem of predicting the popularity of items in a dynamic environment in which authors post continuously new items and provide feedback on existing items. This problem can be applied to predict popularity of blog posts, rank photographs in a photosharing system, or predict the citations of a scientific article using author information and monitoring the items of interest for a short period of time after their creation. As a case study, we show how to estimate the number of citations for an academic paper using information about past articles written by the same author(s) of the paper. If we use only the citation information over a short period of time, we obtain a predicted value that has a correlation of r = 0.57 with the actual value. This is our baseline prediction. Our bestperforming system can improve that prediction by adding features extracted from the past publishing history of its authors, increasing the correlation between the actual and the predicted values to r = 0.81. 1
Expansion and Search in Networks
, 1009
"... Borrowing from concepts in expander graphs, we study the expansion properties of realworld, complex networks (e.g. social networks, unstructuredpeertopeeror P2P networks) and the extent to which these properties can be exploited to understand and address the problem of decentralized search. We fi ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
Borrowing from concepts in expander graphs, we study the expansion properties of realworld, complex networks (e.g. social networks, unstructuredpeertopeeror P2P networks) and the extent to which these properties can be exploited to understand and address the problem of decentralized search. We first produce samples that concisely capture the overall expansion properties of an entire network, which we collectively refer to as the expansion signature. Using these signatures, we find a correspondence between the magnitude of maximum expansion and the extent to which a network can be efficiently searched. We further find evidence that standard graphtheoretic measures, such as average path length, fail to fully explain the level of“searchability”or ease of information diffusion and dissemination in a network. Finally, we demonstrate that this high expansion can be leveraged to facilitate decentralized search in networks and show that an expansionbased search strategy outperforms typical search methods.
Graphs over time: Densification and shrinking diameters
 ACM Transactions on Knowledge Discovery from Data (ACM TKDD
"... How do real graphs evolve over time? What are “normal ” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include he ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
How do real graphs evolve over time? What are “normal ” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in and outdegree distributions, communities, smallworld phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing superlinearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) or O(log(log n)). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a “forest fire ” spreading process, that has a simple, intuitive justification, requires very few parameters (like the “flammability ” of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study. We also notice that the “forest fire ” model exhibits a sharp transition between sparse graphs and graphs that are densifying. Graphs with decreasing distance between the nodes are generated around this transition point. Last, we analyze the connection between the temporal evolution of the degree distribution and densification of a graph. We find that the two are fundamentally related. We also observe that real networks exhibit this type of relation between densification and the degree distribution. 1 2 J. Leskovec et al.
Learning and the structure of citation networks
, 2012
"... The distribution of citations received by scientific publications can be approximated by a power law, a finding that has been explained by “cumulative advantage”. This paper argues that socially embedded learning is a plausible mechanism behind this cumulative advantage. A model assuming that scient ..."
Abstract
 Add to MetaCart
The distribution of citations received by scientific publications can be approximated by a power law, a finding that has been explained by “cumulative advantage”. This paper argues that socially embedded learning is a plausible mechanism behind this cumulative advantage. A model assuming that scientists face a time tradeoff between learning and writing papers, that they learn the papers known by their peers, and that they cite papers they know, generates a power law distribution of popularity, and a shifted power law for the distribution of citations received. The two distributions flatten if there is relatively more learning. The predicted exponent for the distribution of citations is independent of the average in(or out) degree, contrary to an untested prediction of the reference model (Price, 1976). Using publicly available citation networks, an estimate of the share of time devoted to learning (against producing) is given around two thirds.