Results 1  10
of
26
Statistical properties of community structure in large social and information networks
"... A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structur ..."
Abstract

Cited by 120 (10 self)
 Add to MetaCart
A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best ” possible community—according to the conductance measure—over a wide range of size scales, and we study over 70 large sparse realworld networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large realworld networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in ” with the rest of the network and thus become less “communitylike.” This behavior is not explained, even at a qualitative level, by any of the commonlyused network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are wellembeddable in a lowdimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
Community structure in large networks: Natural cluster sizes and the absence of large welldefined clusters
, 2008
"... A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins wit ..."
Abstract

Cited by 79 (6 self)
 Add to MetaCart
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempt to interpret these sets as a “real ” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best ” possible community—according to the conductance measure—over a wide range of size scales. We study over 100 large realworld networks, ranging from traditional and online social networks, to technological and information networks and
Dynamics of Large Networks
, 2008
"... A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found very interesting and counterintuitive patterns for time evolving networks, which change some of the basic assumptions that were made in the past. We then develop models ..."
Abstract

Cited by 18 (0 self)
 Add to MetaCart
A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found very interesting and counterintuitive patterns for time evolving networks, which change some of the basic assumptions that were made in the past. We then develop models that explain processes which govern the network evolution, fit such models to real networks, and use them to generate realistic graphs or give formal explanations about their properties. In addition, our work has a wide range of applications: it can help us spot anomalous graphs and outliers, forecast future graph structure and run simulations of network evolution. Another important aspect of our research is the study of “local ” patterns and structures of propagation in networks. We aim to identify building blocks of the networks and find the patterns of influence that these blocks have on information or virus propagation over the network. Our recent work included the study of the spread of influence in a large persontoperson
Editorial: The future of power law research
 Internet Mathematics
"... Abstract. I argue that power law research must move from focusing on observation, interpretation, and modeling of power law behavior to instead considering the challenging problems of validation of models and control of systems. 1. The Problem with Power Law Research To begin, I would like to recall ..."
Abstract

Cited by 16 (1 self)
 Add to MetaCart
Abstract. I argue that power law research must move from focusing on observation, interpretation, and modeling of power law behavior to instead considering the challenging problems of validation of models and control of systems. 1. The Problem with Power Law Research To begin, I would like to recall a humorous insight from the paper of Fabrikant, Koutsoupias, and Papadimitriou [Fabrikant et al. 01], consisting of this quote and the following footnote. “Power laws... have been termed ‘the signature of human activity’... ” 1 The study of power laws, especially in networks, has clearly exploded over the last decade, with seemingly innumerable papers and even popular books, such as Barabási’s Linked [Barabási 02] and Watts ’ Six Degrees [Watts 03]. Power laws are, indeed, everywhere. Despite this remarkable success, I believe that research into power laws in computer networks (and networks more generally) suffers from glaring deficiencies that need to be addressed by the community. Coping with these deficiencies should lead to another great burst of exciting and compelling research. To explain the problem, I would like to make an analogy to the area of string theory. String theory is incredibly rich and beautiful mathematically, with a simple and compelling basic starting assumption: the universe’s building blocks do not really correspond to (zerodimensional) points, but to small 1 “They are certainly the product of one particular kind of human activity: looking for power laws... ” [Fabrikant et al. 01]
Random dot product graph models for social network
 OF LECTURE NOTES IN COMPUTER SCIENCE
, 2007
"... Inspired by the recent interest in combining geometry with random graph models, we explore in this paper two generalizations of the random dot product graph model proposed by Kraetzl, Nickel and Scheinerman, and Tucker [1, 2]. In particular we consider the properties of clustering, diameter and deg ..."
Abstract

Cited by 10 (1 self)
 Add to MetaCart
Inspired by the recent interest in combining geometry with random graph models, we explore in this paper two generalizations of the random dot product graph model proposed by Kraetzl, Nickel and Scheinerman, and Tucker [1, 2]. In particular we consider the properties of clustering, diameter and degree distribution with respect to these models. Additionally we explore the conductance of these models and show that in a geometric sense, the conductance is constant.
A spatial web graph model with local influence regions
 Internet Mathematics
"... Abstract. We present a new stochastic model for complex networks, based on a spatial embedding of the nodes, called the Spatial Preferred Attachment (SPA) model. In the SPA model, nodes have influence regions of varying size, and new nodes may only link to a node if they fall within its influence re ..."
Abstract

Cited by 8 (6 self)
 Add to MetaCart
Abstract. We present a new stochastic model for complex networks, based on a spatial embedding of the nodes, called the Spatial Preferred Attachment (SPA) model. In the SPA model, nodes have influence regions of varying size, and new nodes may only link to a node if they fall within its influence region. The spatial embedding of the nodes models the background knowledge or identity of the node, which will influence its link environment. In our model, nodes can determine their link environment based only on local knowledge of the network. We prove that our model gives a power law indegree distribution, with exponent in [2, ∞) depending on the parameters, and with concentration for a wide range of indegree values. We show that the model allows for edges that span a large distance in the underlying space, modelling a feature often observed in realworld complex networks. 1.
A Constructing and Sampling Graphs with a Prescribed Joint Degree Distribution
"... One of the most influential recent results in network analysis is that many natural networks exhibit a powerlaw or lognormal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have th ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
One of the most influential recent results in network analysis is that many natural networks exhibit a powerlaw or lognormal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real life networks due to their differences on other important metrics like conductance. We believe this is, in part, because many of these realworld networks have very different joint degree distributions, i.e. the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest understanding the relationship between network structure and the joint degree distribution of graphs is an interesting avenue of further research. An important tool for such studies are algorithms that can generate random instances of graphs with the same joint degree distribution. This is the main topic of this paper and we study the problem from both a theoretical and practical perspective. We provide an algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via end point switches. We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge. These experiments show that our Markov Chain mixes quickly on these real graphs, allowing for utilization of our techniques in practice.
Sampling Graphs with a Prescribed Joint Degree Distribution Using Markov Chains
"... One of the most influential results in network analysis is that many natural networks exhibit a powerlaw or lognormal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
One of the most influential results in network analysis is that many natural networks exhibit a powerlaw or lognormal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real life networks due to their differences on other important metrics like conductance. We believe this is, in part, because many of these realworld networks have very different joint degree distributions, i.e. the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that the joint degree distribution of graphs is an interesting avenue of study for further research into network structure. We provide a simple greedy algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via endpoint switches. We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge.