Statistical properties of community structure in large social and information networks
Cited by 65 (6 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
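To make the conductance measure in the abstract above concrete, here is a minimal sketch. It assumes the standard definition (the number of edges leaving a node set, divided by the smaller of the total degree inside the set and the total degree outside it). The function name, the adjacency-dictionary representation, and the toy graph are illustrative choices, not taken from the paper.

```python
def conductance(adj, nodes):
    """Conductance of `nodes` in an undirected graph.

    `adj` maps each node to the set of its neighbors (an assumed
    representation, chosen for brevity).
    """
    inside = set(nodes)
    # Edges with exactly one endpoint inside the set.
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    # Volume = sum of degrees on each side of the cut.
    vol_in = sum(len(adj[u]) for u in inside)
    vol_out = sum(len(neighbors) for neighbors in adj.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


# Toy graph: two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(adj, {0, 1, 2}))  # 1 cut edge / volume 7
```

Lower values indicate a more community-like set. A community profile in the sense of the abstract would record, for each size, the lowest conductance found among sets of that size, which in practice requires approximation algorithms rather than this brute-force scoring.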
Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
- CoRR
Cited by 34 (3 self)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as “real” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales. We study over 100 large real-world networks, ranging from traditional and on-line social networks, to technological and information networks and
Editorial: The Future of Power Law Research
- Internet Mathematics
, 2006
Cited by 10 (1 self)
Abstract. I argue that power law research must move from focusing on observation, interpretation, and modeling of power law behavior to instead considering the challenging problems of validation of models and control of systems.
1. The Problem with Power Law Research
To begin, I would like to recall a humorous insight from the paper of Fabrikant, Koutsoupias, and Papadimitriou [Fabrikant et al. 01], consisting of this quote and the following footnote. “Power laws... have been termed ‘the signature of human activity’...” [footnote 1] The study of power laws, especially in networks, has clearly exploded over the last decade, with seemingly innumerable papers and even popular books, such as Barabási’s Linked [Barabási 02] and Watts’ Six Degrees [Watts 03]. Power laws are, indeed, everywhere. Despite this remarkable success, I believe that research into power laws in computer networks (and networks more generally) suffers from glaring deficiencies that need to be addressed by the community. Coping with these deficiencies should lead to another great burst of exciting and compelling research. To explain the problem, I would like to make an analogy to the area of string theory. String theory is incredibly rich and beautiful mathematically, with a simple and compelling basic starting assumption: the universe’s building blocks do not really correspond to (zero-dimensional) points, but to small
Footnote 1: “They are certainly the product of one particular kind of human activity: looking for power laws...” [Fabrikant et al. 01]
© A K Peters, Ltd. 1542-7951/05 $0.50 per page
Dynamics of Large Networks
, 2008
Cited by 8 (0 self)
A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found counterintuitive patterns in time-evolving networks, which change some of the basic assumptions that were made in the past. We then develop models that explain the processes which govern network evolution, fit such models to real networks, and use them to generate realistic graphs or give formal explanations of their properties. In addition, our work has a wide range of applications: it can help us spot anomalous graphs and outliers, forecast future graph structure, and run simulations of network evolution. Another important aspect of our research is the study of “local” patterns and structures of propagation in networks. We aim to identify building blocks of the networks and find the patterns of influence that these blocks have on information or virus propagation over the network. Our recent work included the study of the spread of influence in a large person-to-person
Random dot product graph models for social networks
- in Algorithms and Models for the Web-Graph, Lecture Notes in Computer Science
, 2007
Cited by 7 (1 self)
Abstract. Inspired by the recent interest in combining geometry with random graph models, we explore in this paper two generalizations of the random dot product graph model proposed by Kraetzl, Nickel and Scheinerman, and Tucker [1, 2]. In particular we consider the properties of clustering, diameter and degree distribution with respect to these models. Additionally we explore the conductance of these models and show that in a geometric sense, the conductance is constant.
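For readers unfamiliar with the base model, the following minimal sketch shows one common reading of a random dot product graph: each node carries a vector, and each pair of nodes is joined independently with probability equal to the dot product of their vectors. The function name, the list-of-vectors input, and the clamping of probabilities to [0, 1] are assumptions made for illustration. The generalizations studied in the paper are not reproduced here.

```python
import random


def random_dot_product_graph(vectors, rng=None):
    """Sample an undirected graph on len(vectors) nodes.

    Nodes i and j are joined independently with probability
    dot(vectors[i], vectors[j]).
    """
    rng = rng or random.Random()
    edges = set()
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            p = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            # Assumption: clamp so that arbitrary input vectors still
            # yield a valid probability.
            p = min(1.0, max(0.0, p))
            if rng.random() < p:
                edges.add((i, j))
    return edges


# Hypothetical 2-dimensional node vectors.
vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
print(random_dot_product_graph(vectors, random.Random(42)))
```

Nodes whose vectors point in similar directions are more likely to be linked, which is how the model ties graph structure to an underlying geometry.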
A spatial web graph model with local influence regions
- Internet Mathematics
Cited by 5 (4 self)
Abstract. We present a new stochastic model for complex networks, based on a spatial embedding of the nodes, called the Spatial Preferred Attachment (SPA) model. In the SPA model, nodes have influence regions of varying size, and new nodes may only link to a node if they fall within its influence region. The spatial embedding of the nodes models the background knowledge or identity of the node, which will influence its link environment. In our model, nodes can determine their link environment based only on local knowledge of the network. We prove that our model gives a power law in-degree distribution, with exponent in [2, ∞) depending on the parameters, and with concentration for a wide range of in-degree values. We show that the model allows for edges that span a large distance in the underlying space, modelling a feature often observed in real-world complex networks.
INFINITE RANDOM GEOMETRIC GRAPHS
Cited by 1 (1 self)
Abstract. We introduce a new class of countably infinite random geometric graphs, whose vertices V are points in a metric space, and vertices are adjacent independently with probability p ∈ (0, 1) if the metric distance between the vertices is below a given threshold. If V is a countable dense set in R^n equipped with the metric derived from the L∞-norm, then it is shown that with probability 1 such infinite random geometric graphs have a unique isomorphism type. The isomorphism type, which we call GR_n, is characterized by a geometric analogue of the existentially closed adjacency property, and we give a deterministic construction of GR_n. In contrast, we show that infinite random geometric graphs in R^2 with the Euclidean metric are not necessarily isomorphic.
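The adjacency rule described in this abstract can be illustrated on a finite point set. The sketch below is only an illustration under that assumption: the paper concerns countably infinite vertex sets, and its isomorphism results do not apply to finite samples. The function names and the sample points are hypothetical.

```python
import random


def linf_distance(x, y):
    """Distance derived from the L-infinity norm: largest coordinate gap."""
    return max(abs(a - b) for a, b in zip(x, y))


def random_geometric_graph(points, threshold, p, rng=None):
    """Join each pair of points closer than `threshold` with probability p.

    Pairs at distance >= threshold are never joined.
    """
    rng = rng or random.Random()
    edges = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            close = linf_distance(points[i], points[j]) < threshold
            if close and rng.random() < p:
                edges.add((i, j))
    return edges


# Hypothetical points in the plane; only the first two are within distance 1.
points = [(0.0, 0.0), (0.5, 0.9), (3.0, 3.0)]
print(random_geometric_graph(points, threshold=1.0, p=0.5, rng=random.Random(7)))
```

Swapping `linf_distance` for a Euclidean distance function gives the other metric the abstract contrasts with, without changing the rest of the sketch.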
DIRECTED RANDOM DOT PRODUCT GRAPHS
Cited by 1 (1 self)
Abstract. In this paper we consider three models for random graphs that utilize the inner product as their fundamental object. We analyze the behavior of these models with respect to clustering, the small world property, and degree distribution. These models are motivated by the random dot product graphs developed by Kraetzl, Nickel and Scheinerman. We extend their results to fully parameterize the conditions under which clustering occurs, characterize the diameter of graphs generated by these models, and describe the behavior of the degree distribution.
With the ubiquity and importance of the Internet and genetic information in medicine and biology, the study of complex networks relating to the Internet and genetics continues to be an important and vital area of study. This is especially true for networks such as the physical layer of the Internet, the link structure of the world wide web, and protein-protein and protein-gene interaction networks. Because of the size of these networks [3] and the difficulty of determining complete link information [2, 19] a significant amount of research has gone into finding

