Statistical properties of community structure in large social and information networks
Cited by 120 (10 self)
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in” with the rest of the network and thus become less “community-like.” This behavior is not explained, even at a qualitative level, by any of the commonly used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
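The abstract above refers to the conductance measure and to a community profile over size scales. The following is a minimal sketch of both ideas, assuming the standard definition of conductance (edges leaving the set divided by the smaller of the two edge volumes) and an undirected graph stored as a dictionary of adjacency lists. The function names `conductance` and `community_profile` are illustrative, and the brute-force search over all node subsets is only feasible for toy graphs; the paper itself relies on approximation algorithms, which this sketch does not reproduce.

```python
from itertools import combinations


def conductance(adj, nodes):
    """Conductance of a node set S in an undirected graph.

    adj maps each node to a list of its neighbours (each edge listed in
    both directions). Assumed definition:
        cut(S) / min(vol(S), vol(V \\ S))
    where vol is the sum of degrees.
    """
    s = set(nodes)
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)
    vol_rest = sum(len(adj[u]) for u in adj if u not in s)
    denom = min(vol_s, vol_rest)
    if denom == 0:
        raise ValueError("conductance is undefined for an empty side")
    return cut / denom


def community_profile(adj):
    """Brute-force profile: for each size k, the lowest conductance
    over all node sets of exactly k nodes (k up to half the graph)."""
    nodes = sorted(adj)
    profile = {}
    for k in range(1, len(nodes) // 2 + 1):
        profile[k] = min(conductance(adj, c) for c in combinations(nodes, k))
    return profile


# Toy graph: two triangles joined by the single edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(community_profile(adj))
```

On this toy graph the best set of size 3 is one of the triangles, with conductance 1/7, while every single node has conductance 1.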
Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
, 2008
Cited by 78 (6 self)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as “real” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best” possible community (according to the conductance measure) over a wide range of size scales. We study over 100 large real-world networks, ranging from traditional and online social networks, to technological and information networks and
Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes
 In Proceedings of the 22nd Annual IEEE Conference on Computational Complexity
, 2007
Cited by 77 (7 self)
We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of right-hand vertices are polynomially close to optimal, whereas the previous constructions of Ta-Shma, Umans, and Zuckerman (STOC ’01) required at least one of these to be quasi-polynomial in the optimal. Our expanders have a short and self-contained description and analysis, based on the ideas underlying the recent list-decodable error-correcting codes of Parvaresh and Vardy (FOCS ’05). Our expanders can be interpreted as near-optimal “randomness condensers,” that reduce the task of extracting randomness from sources of arbitrary min-entropy rate to extracting randomness from sources of min-entropy rate arbitrarily close to 1, which is a much easier task. Using this connection, we obtain a new construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al. (STOC ’03) and improving upon it when the error parameter is small (e.g. 1/poly(n)).
A sample of samplers - a computational perspective on sampling (survey)
 In FOCS
, 1997
Cited by 70 (7 self)
We consider the problem of estimating the average of a huge set of values. That is, given oracle access to an arbitrary function $f : \{0,1\}^n \to [0,1]$, we wish to estimate $2^{-n} \sum_{x \in \{0,1\}^n} f(x)$ up to an additive error of $\epsilon$. We are allowed to employ a randomized algorithm that may err with probability at most $\delta$. We survey known algorithms for this problem and focus on the ideas underlying their construction. In particular, we present an algorithm that makes $O(\epsilon^{-2} \cdot \log(1/\delta))$ queries and uses $n + O(\log(1/\epsilon)) + O(\log(1/\delta))$ coin tosses, both complexities being very close to the corresponding lower bounds.
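For orientation, here is a minimal sketch of the simplest sampler for this problem: draw independent uniform points and average the oracle's answers. It is not the randomness-efficient algorithm the survey highlights. It matches the stated query count, since Hoeffding's inequality gives a sample size of ln(2/δ) / (2ε²), but it spends n fresh coin tosses per query. The names `naive_sample_count` and `estimate_average`, and the encoding of an n-bit string as a Python integer, are assumptions made for illustration.

```python
import math
import random


def naive_sample_count(eps, delta):
    """Number of independent samples that suffices, by Hoeffding's
    inequality, for additive error eps with failure probability delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))


def estimate_average(f, n, eps, delta, rng=None):
    """Estimate the average of f over all n-bit strings.

    f takes an integer in [0, 2**n) (an n-bit string) and returns a
    value in [0, 1]. Uses m * n random bits, far more than the
    n + O(log(1/eps)) + O(log(1/delta)) quoted in the abstract.
    """
    rng = rng or random.Random()
    m = naive_sample_count(eps, delta)
    total = 0.0
    for _ in range(m):
        x = rng.getrandbits(n)  # one uniform n-bit query point
        total += f(x)
    return total / m


# Example oracle: the lowest bit of x, whose true average is 1/2.
estimate = estimate_average(lambda x: x & 1, n=20, eps=0.1, delta=0.05)
print(naive_sample_count(0.1, 0.05), estimate)
```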
Linear degree extractors and the inapproximability of max clique and chromatic number
 THEORY OF COMPUTING
, 2007
Cited by 42 (0 self)
... that for all $\varepsilon > 0$, approximating MAX CLIQUE and CHROMATIC NUMBER to within $n^{1-\varepsilon}$ are NP-hard. We further derandomize results of Khot (FOCS ’01) and show that for some $\gamma > 0$, no quasi-polynomial time algorithm approximates MAX CLIQUE or CHROMATIC NUMBER to within $n / 2^{(\log n)^{1-\gamma}}$, unless $\widetilde{\mathrm{NP}} = \widetilde{\mathrm{P}}$. The key to these results is a new construction of dispersers, which are related to randomness extractors. A randomness extractor is an algorithm which extracts randomness from a low-quality random source, using some additional truly random bits. We construct new extractors which require only $\log_2 n + O(1)$ additional random bits for sources with constant entropy rate, and have constant error. Our dispersers use an arbitrarily small constant
Optimal and scalable distribution of content updates over a mobile social network
 In Proc. IEEE INFOCOM
, 2009
"... Number: CRPRL2008080001 ..."
Twice-Ramanujan sparsifiers
 In Proc. 41st STOC
, 2009
Cited by 34 (7 self)
We prove that for every $d > 1$ and every undirected, weighted graph $G = (V, E)$, there exists a weighted graph $H$ with at most $\lceil d|V| \rceil$ edges such that for every $x \in \mathbb{R}^V$, $1 \le \frac{x^T L_H x}{x^T L_G x} \le \frac{d + 1 + 2\sqrt{d}}{d + 1 - 2\sqrt{d}}$, where $L_G$ and $L_H$ are the Laplacian matrices of $G$ and $H$, respectively.
Naïve Learning in Social Networks and the Wisdom of Crowds
, 2010
Cited by 31 (0 self)
We study learning in a setting where agents receive independent noisy signals about the true value of a variable and then communicate in a network. They naïvely update beliefs by repeatedly taking weighted averages of neighbors’ opinions. We show that all opinions in a large society converge to the truth if and only if the influence of the most influential agent vanishes as the society grows. We also identify obstructions to this, including prominent groups, and provide structural conditions on the network ensuring efficient learning. Whether agents converge to the truth is unrelated to how quickly consensus is approached. (JEL D83, D85, Z13)
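The update rule described in this abstract, repeatedly replacing each belief with a weighted average of neighbors' opinions, can be sketched in a few lines. The sketch assumes the weights are given as a row-stochastic matrix (each row sums to 1); the names `naive_update` and `run_updates` and the three-agent example are illustrative and do not come from the paper.

```python
def naive_update(weights, beliefs):
    """One round: each agent's new belief is the weighted average of
    current beliefs, using that agent's row of the weight matrix."""
    return [sum(w * b for w, b in zip(row, beliefs)) for row in weights]


def run_updates(weights, beliefs, rounds):
    """Apply the averaging rule for a fixed number of rounds."""
    for _ in range(rounds):
        beliefs = naive_update(weights, beliefs)
    return beliefs


# Three agents on a line; the middle agent listens to both ends.
weights = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
initial = [1.0, 0.0, 0.0]

print(naive_update(weights, initial))     # beliefs after one round
print(run_updates(weights, initial, 50))  # beliefs close to consensus
```

In this example the beliefs approach a common value of 0.25, which weights the initial opinions by each agent's long-run influence (1/4, 1/2, 1/4). The consensus value depends on influence rather than on how accurate any signal is, which is why a dominant agent can pull the group away from the truth.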
Quasirandom Rumor Spreading
 In Proc. of SODA’08
, 2008
Cited by 24 (10 self)
We propose and analyse a quasirandom analogue to the classical push model for disseminating information in networks (“randomized rumor spreading”). In the classical model, in each round each informed node chooses a neighbor at random and informs it. Results of Frieze and Grimmett (Discrete Appl. Math. 1985) show that this simple protocol succeeds in spreading a rumor from one node of a complete graph to all others within O(log n) rounds. When the network is a hypercube or a random graph G(n, p) with p ≥ (1+ε)(log n)/n, O(log n) rounds also suffice (Feige, Peleg, Raghavan, and Upfal, Random Struct. Algorithms 1990). In the quasirandom model, we assume that each node has a (cyclic) list of its neighbors. Once informed, it starts at a random position of the list, but from then on informs its neighbors in the order of the list. Surprisingly, irrespective of the orders of the lists, the above-mentioned bounds still hold. In addition, we also show an O(log n) bound for sparsely connected random graphs G(n, p) with p = (log n + f(n))/n, where f(n) → ∞ and f(n) = O(log log n). Here, the classical model needs Θ(log² n) rounds. Hence the quasirandom model achieves similar or better broadcasting times with a greatly reduced use of random bits.
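The two protocols compared in this abstract can be simulated side by side. The sketch below is a simplified reading of the description above: in each round every informed node informs one neighbor, chosen uniformly at random in the classical model, or taken from the node's cyclic neighbor list in the quasirandom model after a random starting position. The function name `push_rounds`, its parameters, and the complete-graph example are assumptions for illustration. The graph must be connected, otherwise the loop never terminates.

```python
import random


def push_rounds(adj, start, rng, quasirandom=False):
    """Rounds until every node is informed under the push protocol.

    adj maps each node to its (cyclic) neighbour list. Nodes informed
    in a round start informing others in the following round.
    """
    informed = {start}
    pointer = {}  # quasirandom only: next list position per informed node
    if quasirandom:
        pointer[start] = rng.randrange(len(adj[start]))
    rounds = 0
    while len(informed) < len(adj):
        contacted = set()
        for u in informed:
            nbrs = adj[u]
            if quasirandom:
                v = nbrs[pointer[u]]
                pointer[u] = (pointer[u] + 1) % len(nbrs)
            else:
                v = rng.choice(nbrs)
            contacted.add(v)
        for v in contacted - informed:
            if quasirandom:
                # The only random choice a node ever makes in this model.
                pointer[v] = rng.randrange(len(adj[v]))
        informed |= contacted
        rounds += 1
    return rounds


# Complete graph on 64 nodes.
n = 64
complete = {u: [v for v in range(n) if v != u] for u in range(n)}
rng = random.Random(0)
print("classical:  ", push_rounds(complete, 0, rng))
print("quasirandom:", push_rounds(complete, 0, rng, quasirandom=True))
```

The informed set can at most double per round, so neither model can finish in fewer than log2(n) rounds. The quasirandom variant uses one random choice per node over the whole run, while the classical one uses a fresh random choice per informed node per round.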