## Statistical properties of community structure in large social and information networks

### Download Links

- [www.cs.cmu.edu]
- [cs.stanford.edu]
- [cs-www.cs.yale.edu]
- [www-2.cs.cmu.edu]
- [www2008.org]
- [wwwconference.org]
- DBLP

### Other Repositories/Bibliography

Citations: 137 (10 self)

### BibTeX

@TECHREPORT{Lang_statisticalproperties,
    author = {Kevin J. Lang and Anirban Dasgupta and Michael W. Mahoney},
    title = {Statistical properties of community structure in large social and information networks},
    institution = {},
    year = {}
}

### Abstract

A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community, according to the conductance measure, over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
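
As a rough illustration of the two quantities named in the abstract, the sketch below computes the conductance of a node set and a brute-force network community profile for a toy graph. The function names, the toy graph, and the exhaustive search over all subsets are assumptions made for illustration only; the abstract does not say how the minimum is computed, and exhaustive search is feasible only for very small graphs.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, nodes):
    """Conductance of the cut separating `nodes` from the rest of the graph.

    Defined here as (edges leaving the set) / min(volume inside, volume outside),
    where volume is the sum of node degrees.
    """
    inside = set(nodes)
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    vol_in = sum(len(adj[u]) for u in inside)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom > 0 else float("inf")


def ncp_brute_force(adj):
    """Lowest conductance over all node sets of each size k, for k up to n/2.

    Exhaustive search: exponential in the number of nodes, illustration only.
    """
    n = len(adj)
    return {
        k: min(conductance(adj, s) for s in combinations(adj, k))
        for k in range(1, n // 2 + 1)
    }


# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
toy = build_graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
profile = ncp_brute_force(toy)
for size, phi in profile.items():
    print(size, round(phi, 4))
```

On this toy graph the lowest-conductance set is one of the two triangles, since only the bridge edge crosses that cut. Plotting the profile against set size on log-log axes would give the kind of plot the abstract describes, under the assumptions stated above.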

### Citations

2800 | Normalized Cuts and Image Segmentation
- Shi, Malik
- 2000
Citation Context ...s (1)–(4), we will follow the usual path in this paper. For point (3), we choose a natural and widely adopted notion of community goodness called conductance, also known as the normalized cut metric =-=[6, 31, 16]-=-. Since there exists a rich suite of both theoretical and practical algorithms to optimize this quantity [32, 20, 4, 17, 37, 10], we can for point (4) compare and contrast several methods to approximat... |

2496 | Emergence of scaling in random networks
- Barabási, Albert
- 1999
Citation Context ...ions, at even a qualitative level. In Figure 7, we summarize these results. Figure 7(a) shows the NCP plot for a 10,000 node network generated according to the original preferential attachment model =-=[1]-=-, where at each time step a node joins the graph and connects to m = 2 existing nodes. Note that the NCP plot is very shallow and flat (even more than the corresponding rewired graph), and thus the ne... |

2198 | Collective dynamics of ‘small-world’ networks
- Watts, Strogatz
- 1998
Citation Context ...networks that “live” in a low-dimensional structure, e.g., on a manifold or the surface of the earth. For example, Figure 2(b) shows the NCP plot for the power grid network of the Western States Power Grid =-=[34]-=-, and Figure 2(c) shows the NCP plot for a road network of California. Finally, in contrast, Figure 2(d) shows NCP plots for a $G_{nm}$ graph with 100,000 nodes and average degrees of 4, 6, and 8, i.e., ... |

1957 | Random Graphs
- Bollobás
- 2001
Citation Context ...graphs 41 here. (The other interesting special case, in which all the expected degrees $w_i$ are equal to $np$, for some $p \in [0, 1]$, corresponds to the classical Gilbert-Erdős-Rényi $G_{np}$ random graph model =-=[24]-=-.) Given the number of nodes $n$, the power-law exponent $\beta$, and the parameters $w$ and $w_{\max}$, Chung and Lu [41] give the degree sequence for a power-law graph: $w_i = c\, i^{-1/(\beta-1)}$ for $i$ s.t. $i_0 \le i < n + i_0$, ... |

1808 | A global geometric framework for nonlinear dimensionality reduction
- Tenenbaum, Silva, et al.
- 2000
Citation Context ... by design there is an underlying geometry (e.g., power grid and road networks [153], simple meshes, low-dimensional manifolds including graphs corresponding to the well-studied “swiss roll” data set =-=[150]-=-, a geometric preferential attachment model [70, 71], etc.), several networks that are very good expanders, and many simulated networks generated by commonly used network generation models (e.g., prefe... |

1711 | The structure and function of complex networks
- Newman
- 2003
Citation Context ...real-world social and information networks. Finally, we also compare results with analytical and/or simulational results on a wide range of commonly and not-so-commonly used network generation models =-=[124, 25, 9, 101, 135, 111, 70, 71]-=-. 1.2 Summary of our results Main Empirical Findings: Taken as a whole, the results we will present in this paper suggest a rather detailed and somewhat counterintuitive picture of the community struc... |

1441 | Statistical mechanics of complex networks
- Albert, Barabási
Citation Context ...munity identification [125, 52], data clustering [90], graph and spectral clustering [75, 151, 143], graph and heavy-tailed data analysis [126, 29, 49], surveys on various aspects of complex networks =-=[10, 55, 124, 25, 51, 114, 23]-=-, the monographs on spectral graph theory and complex networks [33, 41], and the book on social network analysis [152]. See Section 7 for a more detailed discussion of the relationship of our work wit... |

1435 | Data clustering: A review
- Jain, Murty, et al.
- 1999
Citation Context ...orithms that we will use. There exist a large number of reviews on topics related to those discussed in this paper. For example, see the reviews on community identification [125, 52], data clustering =-=[90]-=-, graph and spectral clustering [75, 151, 143], graph and heavy-tailed data analysis [126, 29, 49], surveys on various aspects of complex networks [10, 55, 124, 25, 51, 114, 23], the monographs on spe... |

1356 | On Power-Law Relationships of the Internet Topology
- Faloutsos, Faloutsos, et al.
- 1999
Citation Context ... edges are added via a preferential-attachment or rich-gets-richer mechanism [124, 25]. Much of this work aims at reproducing properties of real-world graphs such as heavy-tailed degree distributions =-=[11, 27, 61]-=-. In these preferential attachment models, one typically connects each new node to the existing network by adding exactly m edges to existing nodes with a nonuniform probability that depends on the cu... |

1129 | An efficient heuristic procedure for partitioning graphs
- Kernighan, Lin
- 1970
Citation Context ...further subdivide the new groups until the desired number of clusters groups is achieved. This may be combined with local improvement methods like the Kernighan-Lin and Fiduccia-Mattheyses procedures =-=[96, 64]-=-, which are fast and can climb out of some local minima. The latter was combined with a multi-resolution framework to create Metis [94, 95], a very fast program intended to split mesh-like graphs into... |

1056 | Spectral Graph Theory
- Chung
- 1997
Citation Context ...s (1)–(4), we will follow the usual path in this paper. For point (3), we choose a natural and widely adopted notion of community goodness called conductance, also known as the normalized cut metric =-=[6, 31, 16]-=-. Since there exists a rich suite of both theoretical and practical algorithms to optimize this quantity [32, 20, 4, 17, 37, 10], we can for point (4) compare and contrast several methods to approximat... |

874 | A fast and high quality multilevel scheme for partitioning irregular graphs
- Karypis, Kumar
- 1998
Citation Context ...on of community goodness called conductance, also known as the normalized cut metric [6, 31, 16]. Since there exists a rich suite of both theoretical and practical algorithms to optimize this quantity =-=[32, 20, 4, 17, 37, 10]-=-, we can for point (4) compare and contrast several methods to approximately optimize it. However, it is in point (5) that we deviate from previous work. Instead of focusing on individual groups of no... |

860 | Community structure in social and biological networks
- Girvan, Newman
- 2002
Citation Context ...works. 2.2 Clusters and communities in networks Hierarchical clustering is a common approach to community identification in the social sciences [152], but it has also found application more generally =-=[79, 89]-=-. In this procedure, one first defines a distance metric between pairs of nodes and then produces a tree (in either a bottom-up or a top-down manner) describing how nodes group into communities and ho... |

857 | Finding and evaluating community structure in networks
- Newman, Girvan
Citation Context ...ction [75, 151, 143]. There are many other density-based measures that have been used to partition a graph into a set of communities [75, 151, 143]. One that deserves particular mention is modularity =-=[129, 128]-=-. For a (6) 1 Throughout this paper we consistently use shorthand phrases like “this piece has good conductance” to mean “this piece is separated from the rest of the graph by a low-conductance cut.” ... |

697 | Social network analysis
- Wasserman, Faust
- 1994
Citation Context ... 29, 49], surveys on various aspects of complex networks [10, 55, 124, 25, 51, 114, 23], the monographs on spectral graph theory and complex networks [33, 41], and the book on social network analysis =-=[152]-=-. See Section 7 for a more detailed discussion of the relationship of our work with some of this prior work. 2.1 Social and information network datasets we analyze We have examined a large number of r... |

560 | A new approach to the maximum flow problem
- Goldberg, Tarjan
- 1988
Citation Context ...blically available code that scales to the sizes we need. Ordinary max flow is a very thoroughly studied problem. Currently, the best theoretical time bounds are [81], the most practical algorithm is =-=[82]-=-, while the best implementation is hi_pr by [32]. Since Metis+MQI using the hi_pr code is very fast and scalable, while the method empirically seems to usually find the lowest or nearly lowest conduct... |

540 | Detecting community structure in networks
- Newman
Citation Context ... Empirically we observe that local minima in the NCP plot correspond to sets of nodes that are plausible communities. Consider, e.g., Zachary’s karate club [35], an extensively analyzed social network =-=[24, 26]-=-. Figure 3(a) depicts the karate club network, and Figure 3(b) shows its NCP plot. Note that Cut B, which separates the graph roughly in half, has better conductance value than Cut A (note also commun... |

515 | Multilevel k-way partitioning scheme for irregular graphs
- Karypis, Kumar
- 1998
Citation Context ...inatorial quantity; and it has a very natural interpretation in terms of random walkers on the interaction graph. Moreover, since there exists a rich suite of both theoretical and practical algorithms =-=[86, 146, 106, 107, 17, 94, 95, 159, 53]-=-, we can for point (4) compare and contrast several methods to approximately optimize it. However, it is in point (5) that we deviate from previous work. Instead of focusing on individual groups of no... |

466 | A multi-level algorithm for partitioning graphs
- Hendrickson, Leland
- 1995
Citation Context ...inatorial quantity; and it has a very natural interpretation in terms of random walkers on the interaction graph. Moreover, since there exists a rich suite of both theoretical and practical algorithms =-=[86, 146, 106, 107, 17, 94, 95, 159, 53]-=-, we can for point (4) compare and contrast several methods to approximately optimize it. However, it is in point (5) that we deviate from previous work. Instead of focusing on individual groups of no... |

451 | A Linear-Time Heuristic for Improving Network Partitions
- Fiduccia, Mattheyses
- 1982
Citation Context ...further subdivide the new groups until the desired number of clusters groups is achieved. This may be combined with local improvement methods like the Kernighan-Lin and Fiduccia-Mattheyses procedures =-=[96, 64]-=-, which are fast and can climb out of some local minima. The latter was combined with a multi-resolution framework to create Metis [94, 95], a very fast program intended to split mesh-like graphs into... |

438 | Algebraic connectivity of graphs
- Fiedler
- 1973
Citation Context ...ere is the spectral method, which uses an eigenvector of the graph’s Laplacian matrix to find a cut whose conductance is no bigger than $\phi$ if the graph actually contains a cut with conductance $O(\phi^2)$ =-=[31, 54, 65, 120, 33]-=-. The spectral method also produces lower bounds which can show that the solution for a given graph is closer to optimal than promised by the worst-case guarantee. Second, there is an algorithm that u... |

420 | Finding community structure in very large networks
- Clauset, Newman, et al.
Citation Context ... in which nodes are authors and edges connect authors co-authoring at least one paper. Here, publication venues (e.g., journals, conferences) play the role of “ground truth” communities. • AmazonProd =-=[8]-=- is a network linking products often purchased together at amazon.com. Each item belongs to one or more hierarchically organized categories, and products from the same category define a group which is... |

379 | Inferring Web Communities from Link Topology
- Gibson, Kleinberg, et al.
- 1998
Citation Context ...there exists work which views communities from a very different perspective. For example, Kumar et al. [102] view communities as a dense bipartite subgraph of the Web; Gibson, Kleinberg, and Raghavan =-=[78]-=- view communities as consisting of a core of central authoritative pages linked together by hub pages; Hopcroft et al. [88, 89] are interested in the temporal evolution of communities that are robust ... |

357 | A Random Graph Model for Massive Graphs
- Aiello, Chung, et al.
- 2000
Citation Context ...is model is different than the so-called “configuration model” in which the degree distribution is exactly specified and which was studied by Molloy and Reed [121, 122] and also Aiello, Chung, and Lu =-=[7, 8]-=-. This model is also different than generative models such as preferential attachment models [9, 124, 25] or models based on optimization [56, 57, 60], although common to all of these generative model... |

342 | Lethality and centrality in protein networks
- Jeong, Mason, et al.
- 2001
Citation Context ...tworks Bio-Proteins 4,626 14,801 0.72 0.91 6.40 24.25 0.12 12 4.24 Yeast protein interaction network [50] Bio-Yeast 1,458 1,948 0.37 0.51 2.67 7.13 0.14 19 6.89 Yeast protein interaction network data =-=[91]-=- Bio-YeastP0.001 353 1,517 0.73 0.93 8.59 20.18 0.57 11 4.33 Yeast protein-protein interaction map [132] Bio-YeastP0.01 1,266 8,511 0.79 0.97 13.45 47.73 0.44 12 3.87 Yeast protein-protein interaction... |

337 | Graphs over time: densification laws, shrinking diameters and possible explanations
- Leskovec, Kleinberg, et al.
- 2005
Citation Context ... best cuts at large size scales are very shallow, and there is a relatively abrupt transition in between. This is a consequence of the extreme sparsity of the data. • A “forest fire” generative model =-=[21]-=-, in which edges are added in a manner that imitates a fire-spreading process, reproduces not only the deep cuts at small size scales and the absence of deep cuts at large size scales but other proper... |

329 | A critical point for random graphs with a given degree sequence, Random Structures and Algorithms
- Molloy, Reed
- 1995
Citation Context ...ph generated in this manner. (Note that this model is different than the so-called “configuration model” in which the degree distribution is exactly specified and which was studied by Molloy and Reed =-=[121, 122]-=- and also Aiello, Chung, and Lu [7, 8]. This model is also different than generative models such as preferential attachment models [9, 124, 25] or models based on optimization [56, 57, 60], although c... |

325 | A tutorial on spectral clustering
- Luxburg
- 2007
Citation Context ...t a large number of reviews on topics related to those discussed in this paper. For example, see the reviews on community identification [125, 52], data clustering [90], graph and spectral clustering =-=[75, 151, 143]-=-, graph and heavy-tailed data analysis [126, 29, 49], surveys on various aspects of complex networks [10, 55, 124, 25, 51, 114, 23], the monographs on spectral graph theory and complex networks [33, 4... |

321 | Group formation in large social networks: membership, growth, and evolution
- Backstrom, Huttenlocher, et al.
- 2006
Citation Context ...30,507,070 0.47 0.88 8.78 351.66 0.23 23 5.43 Social network of professional contacts LiveJournal01 3,766,521 30,629,297 0.78 0.97 16.26 111.24 0.36 23 5.55 Friendship network of a blogging community =-=[20]-=- LiveJournal11 4,145,160 34,469,135 0.77 0.97 16.63 122.44 0.36 23 5.61 Friendship network of a blogging community [20] LiveJournal12 4,843,953 42,845,684 0.76 0.97 17.69 170.66 0.35 20 5.53 Friendshi... |

319 | Evolution of networks
- Dorogovtsev, Mendes
- 2002
Citation Context ...munity identification [125, 52], data clustering [90], graph and spectral clustering [75, 151, 143], graph and heavy-tailed data analysis [126, 29, 49], surveys on various aspects of complex networks =-=[10, 55, 124, 25, 51, 114, 23]-=-, the monographs on spectral graph theory and complex networks [33, 41], and the book on social network analysis [152]. See Section 7 for a more detailed discussion of the relationship of our work wit... |

316 | Trawling the web for emerging cyber-communities
- Kumar, Raghavan, et al.
- 1999
Citation Context ... for community detection include [48, 155, 46, 21, 134]. In addition to this work we have cited, there exists work which views communities from a very different perspective. For example, Kumar et al. =-=[102]-=- view communities as a dense bipartite subgraph of the Web; Gibson, Kleinberg, and Raghavan [78] view communities as consisting of a core of central authoritative pages linked together by hub pages; H... |

311 | Mapping the Gnutella network: Properties of large-scale peer-to-peer systems and implications for system design
- Ripeanu, Foster, et al.
- 2002
Citation Context ...-papers) networks Atp-DBLP 615,678 944,456 DBLP [21] AtM-Imdb 2,076,978 5,847,693 Actors-to-movies • Internet networks AsSkitter 1,719,037 12,814,089 Autonom. sys. Gnutella 62,561 147,878 P2P network =-=[29]-=- Table 1: Some of the network datasets we studied. 2. BACKGROUND AND OVERVIEW In this section, we will provide background on our data and methods. There exist a large number of reviews on topics relat... |

305 | Uncovering the overlapping community structure of complex networks in nature and society
- Palla, Derényi, et al.
Citation Context ...roft et al. [88, 89] are interested in the temporal evolution of communities that are robust when the input data to clustering algorithms that identify them are moderately perturbed; and Palla et al. =-=[131]-=- view communities as a chain of adjacent cliques and focus on the extent to which they are nested and overlap. The implications of our results for this body of wo... |

302 | Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms
- Leighton, Rao
- 1999
Citation Context ...on of community goodness called conductance, also known as the normalized cut metric [6, 31, 16]. Since there exists a rich suite of both theoretical and practical algorithms to optimize this quantity =-=[32, 20, 4, 17, 37, 10]-=-, we can for point (4) compare and contrast several methods to approximately optimize it. However, it is in point (5) that we deviate from previous work. Instead of focusing on individual groups of no... |

292 | Hierarchical organization of modularity in metabolic networks
- Ravasz, Somera, et al.
- 2002

280 | The dynamics of viral marketing
- Leskovec, Adamic, et al.
- 2007
Citation Context ...3,315 3,505,519 0.94 0.99 14.81 52.70 0.41 19 5.66 Amazon products (all 4 graphs merged) [48] AmazonAllProd 524,371 1,491,793 0.80 0.91 5.69 11.75 0.35 42 11.18 Products (all products, source+target) =-=[108]-=- AmazonSrcProd 334,863 925,872 0.84 0.91 5.53 11.53 0.43 47 12.11 Products (only source products) [108] Table 3: Network datasets we analyzed. Statistics of networks we consider: number of nodes N; nu... |

277 | Structure and evolution of online social networks
- Kumar, Novak, et al.
Citation Context ...ons 75,877 405,739 0.48 0.90 10.69 183.88 0.26 15 4.27 Who-trusts-whom network from epinions.com [139] Flickr 404,733 2,110,078 0.33 0.86 10.43 442.75 0.40 18 5.42 Flickr photo sharing social network =-=[100]-=- LinkedIn 6,946,668 30,507,070 0.47 0.88 8.78 351.66 0.23 23 5.43 Social network of professional contacts LiveJournal01 3,766,521 30,629,297 0.78 0.97 16.26 111.24 0.36 23 5.55 Friendship network of a... |

270 | Graph-theoretical methods for detecting and describing gestalt clusters
- Zahn
- 1971
Citation Context ...to the outside. Although numerous measures have been proposed for how community-like is a set of nodes, it is commonly noted (e.g., see [31] and [16]) that conductance captures the “gestalt” notion of clustering =-=[36]-=-, and so it has been widely used for graph clustering and community detection [13, 30]. 3. NETWORK COMMUNITY PROFILE PLOT In this section, we discuss the network community profile plot (NCP plot), whi... |

268 | On clusterings: Good, bad and spectral
- Kannan, Vempala, et al.
- 2004
Citation Context ...s (1)–(4), we will follow the usual path in this paper. For point (3), we choose a natural and widely adopted notion of community goodness called conductance, also known as the normalized cut metric =-=[6, 31, 16]-=-. Since there exists a rich suite of both theoretical and practical algorithms to optimize this quantity [32, 20, 4, 17, 37, 10], we can for point (4) compare and contrast several methods to approximat... |

267 | Finding community structure in networks using the eigenvalues of matrices
- Newman
Citation Context ...ly linked among themselves and there are few edges between nodes of different communities. In a similar manner, Figure 3(c) depicts Newman’s network of 379 scientists who conduct research on networks =-=[25]-=-. In this latter case, we see a hierarchical structure, in which the community defined by Cut C is included in a larger community that has better conductance value. 3.3 Community profile plots of larg... |

251 | Efficient identification of web communities
- Flake, Lawrence, et al.
- 2000
Citation Context ...nce of the communities and the relative weight of inter-community edges. Flake, Tarjan, and Tsioutsiouliklis [68] introduce a similar bicriterion that is based on network flow ideas, and Flake et al. =-=[66, 67]-=- defined a community as a set of nodes that has more intra-edges than inter-edges. Similar edge-counting ideas were used by Radicchi et al. [133] to define and apply the notions of a strong community ... |

246 | Graph structure in the web
- Broder, Kumar, et al.
- 2000
Citation Context ... edges are added via a preferential-attachment or rich-gets-richer mechanism [124, 25]. Much of this work aims at reproducing properties of real-world graphs such as heavy-tailed degree distributions =-=[11, 27, 61]-=-. In these preferential attachment models, one typically connects each new node to the existing network by adding exactly m edges to existing nodes with a nonuniform probability that depends on the cu... |

244 | Expander flows, geometric embeddings, and graph partitionings
- Arora, Rao, et al.
- 2004
Citation Context ...inatorial quantity; and it has a very natural interpretation in terms of random walkers on the interaction graph. Moreover, since there exists a rich suite of both theoretical and practical algorithms =-=[86, 146, 106, 107, 17, 94, 95, 159, 53]-=-, we can for point (4) compare and contrast several methods to approximately optimize it. However, it is in point (5) that we deviate from previous work. Instead of focusing on individual groups of no... |

243 | Stochastic models for the Web Graph
- Kumar
- 2001
Citation Context ...than the corresponding rewired graph), and thus the network that is generated is very expander-like at all size scales. In a different type of generative model edges are added via a copying mechanism =-=[18]-=-. Figure 7(b) shows the results for a network with 50,000 nodes, generated with m = 2 and β = 0.05. Although the copying model aims to produce communities by linking a new node to neighbors of an existing ... |

241 | An Approximate Max-Flow Min-cut Theorem for Uniform Multicommodity Flow Problems with Applications to Approximation Algorithms
- Leighton, Rao
- 1988

231 | A lower bound for the smallest eigenvalue of the Laplacian, Problems in analysis (Papers dedicated to Salomon Bochner
- Cheeger
- 1969
Citation Context ...ere is the spectral method, which uses an eigenvector of the graph’s Laplacian matrix to find a cut whose conductance is no bigger than $\phi$ if the graph actually contains a cut with conductance $O(\phi^2)$ =-=[31, 54, 65, 120, 33]-=-. The spectral method also produces lower bounds which can show that the solution for a given graph is closer to optimal than promised by the worst-case guarantee. Second, there is an algorithm that u... |

224 | Complex networks: Structure and dynamics
- Boccaletti, Latora, et al.
- 2006
Citation Context ...munity identification [125, 52], data clustering [90], graph and spectral clustering [75, 151, 143], graph and heavy-tailed data analysis [126, 29, 49], surveys on various aspects of complex networks =-=[10, 55, 124, 25, 51, 114, 23]-=-, the monographs on spectral graph theory and complex networks [33, 41], and the book on social network analysis [152]. See Section 7 for a more detailed discussion of the relationship of our work wit... |

208 | Trust management for the semantic web
- Richardson, Agrawal, et al.
Citation Context ...k: Social Networks & Web 2.0 - Discovery and Evolution of Communities • Social nets Nodes Edges Description LiveJournal 4,843,953 42,845,684 Blog friendships [5] Epinions 75,877 405,739 Trust network =-=[28]-=- CA-DBLP 317,080 1,049,866 Co-authorship [5] • Information (citation) networks Cit-hep-th 27,400 352,021 Arxiv hep-th [14] AmazonProd 524,371 1,491,793 Amazon products [8] • Web graphs Web-google 855,... |

207 | The average distance in a random graph with given expected degrees
- Chung, Lu
Citation Context ...s a baseline for understanding the community properties we have observed in our real-world networks. We will work with the random graph model with given expected degrees, as described by Chung and Lu =-=[41, 39, 43, 38, 40, 44, 45, 42]-=-. Let $n$, the number of nodes in the graph, and a vector $w = (w_1, \ldots, w_n)$, which will be the expected degree sequence vector (where we will assume that $\max_i w_i^2 < \sum_k w_k$), be given. Then, in this... |
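
The excerpt above breaks off before stating how edges are placed. The sketch below assumes the standard formulation of the expected-degree model, in which each pair of nodes i, j is joined independently with probability w_i w_j / Σ_k w_k; that rule is not quoted from the excerpt. The function name `chung_lu_edges` is hypothetical, and self-loops are skipped as a simplification.

```python
import random


def chung_lu_edges(weights, seed=None):
    """Sample an undirected graph with given expected degrees (illustrative sketch).

    Each pair i < j is joined independently with probability
    weights[i] * weights[j] / sum(weights). Self-loops are omitted here,
    which is a simplification of the model.
    """
    total = sum(weights)
    # The condition max_i w_i^2 < sum_k w_k keeps every probability below 1.
    if max(weights) ** 2 >= total:
        raise ValueError("need max_i w_i^2 < sum_k w_k")
    rng = random.Random(seed)
    n = len(weights)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < weights[i] * weights[j] / total:
                edges.append((i, j))
    return edges


# Hypothetical usage: 50 nodes, each with expected degree 2.
sample = chung_lu_edges([2.0] * 50, seed=1)
print(len(sample), "edges sampled")
```

The quadratic loop over all pairs is chosen for clarity; it would be too slow for networks of the sizes discussed in the excerpts.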

194 | Expander graphs and their applications
- Hoory, Linial, et al.
- 2006
Citation Context ...The NCP plot is roughly flat, which we also observed in Figure 2(a) for a clique, which is to be expected since the minimum conductance cut in the entire graph cannot be too small for a good expander =-=[15]-=-. Interestingly, a steadily decreasing downward NCP plot is also seen for small social networks that have been extensively studied for validating community detection algorithms. Two examples are shown... |