Results 1  10
of
354
Statistical properties of community structure in large social and information networks
"... A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structur ..."
Abstract

Cited by 242 (14 self)
 Add to MetaCart
(Show Context)
A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the “best ” possible community—according to the conductance measure—over a wide range of size scales, and we study over 70 large sparse realworld networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large realworld networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually “blend in ” with the rest of the network and thus become less “communitylike.” This behavior is not explained, even at a qualitative level, by any of the commonlyused network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are wellembeddable in a lowdimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative “forest fire” burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
Community structure in large networks: Natural cluster sizes and the absence of large welldefined clusters
, 2008
"... A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins wit ..."
Abstract

Cited by 198 (17 self)
 Add to MetaCart
(Show Context)
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempt to interpret these sets as a “real ” communities, we employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the “best ” possible community—according to the conductance measure—over a wide range of size scales. We study over 100 large realworld networks, ranging from traditional and online social networks, to technological and information networks and
Unbalanced expanders and randomness extractors from parvareshvardy codes
 In Proceedings of the 22nd Annual IEEE Conference on Computational Complexity
, 2007
"... We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of righthand vertices are polynomially close to optimal, whereas the previous ..."
Abstract

Cited by 126 (7 self)
 Add to MetaCart
(Show Context)
We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of righthand vertices are polynomially close to optimal, whereas the previous constructions of TaShma, Umans, and Zuckerman (STOC ‘01) required at least one of these to be quasipolynomial in the optimal. Our expanders have a short and selfcontained description and analysis, based on the ideas underlying the recent listdecodable errorcorrecting codes of Parvaresh and Vardy (FOCS ‘05). Our expanders can be interpreted as nearoptimal “randomness condensers, ” that reduce the task of extracting randomness from sources of arbitrary minentropy rate to extracting randomness from sources of minentropy rate arbitrarily close to 1, which is a much easier task. Using this connection, we obtain a new construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al. (STOC ‘03) and improving upon it when the error parameter is small (e.g. 1/poly(n)).
Naïve Learning in Social Networks and the Wisdom of Crowds
, 2010
"... We study learning in a setting where agents receive independent noisy signals about the true value of a variable and then communicate in a network. They naïvely update beliefs by repeatedly taking weighted averages of neighbors’ opinions. We show that all opinions in a large society converge to the ..."
Abstract

Cited by 97 (1 self)
 Add to MetaCart
We study learning in a setting where agents receive independent noisy signals about the true value of a variable and then communicate in a network. They naïvely update beliefs by repeatedly taking weighted averages of neighbors’ opinions. We show that all opinions in a large society converge to the truth if and only if the influence of the most influential agent vanishes as the society grows. We also identify obstructions to this, including prominent groups, and provide structural conditions on the network ensuring efficient learning. Whether agents converge to the truth is unrelated to how quickly consensus is approached. (JEL D83, D85, Z13)
TwiceRamanujan sparsifiers
 IN PROC. 41ST STOC
, 2009
"... We prove that for every d> 1 and every undirected, weighted graph G = (V, E), there exists a weighted graph H with at most ⌈d V  ⌉ edges such that for every x ∈ IR V, 1 ≤ xT LHx x T LGx ≤ d + 1 + 2 √ d d + 1 − 2 √ d, where LG and LH are the Laplacian matrices of G and H, respectively. ..."
Abstract

Cited by 89 (12 self)
 Add to MetaCart
We prove that for every d> 1 and every undirected, weighted graph G = (V, E), there exists a weighted graph H with at most ⌈d V  ⌉ edges such that for every x ∈ IR V, 1 ≤ xT LHx x T LGx ≤ d + 1 + 2 √ d d + 1 − 2 √ d, where LG and LH are the Laplacian matrices of G and H, respectively.
A sample of samplers  a computational perspective on sampling (survey
 In FOCS
, 1997
"... Abstract. We consider the problem of estimating the average of a huge set of values. That is, given oracle access to an arbitrary function f: {0, 1} n P −n → [0, 1], we wish to estimate 2 x∈{0,1} n f(x) upto an additive error of ǫ. We are allowed to employ a randomized algorithm that may err with pr ..."
Abstract

Cited by 79 (8 self)
 Add to MetaCart
Abstract. We consider the problem of estimating the average of a huge set of values. That is, given oracle access to an arbitrary function f: {0, 1} n P −n → [0, 1], we wish to estimate 2 x∈{0,1} n f(x) upto an additive error of ǫ. We are allowed to employ a randomized algorithm that may err with probability at most δ. We survey known algorithms for this problem and focus on the ideas underlying their construction. In particular, we present an algorithm that makes O(ǫ −2 · log(1/δ)) queries and uses n + O(log(1/ǫ)) + O(log(1/δ)) coin tosses, both complexities being very close to the corresponding lower bounds.
Optimal and scalable distribution of content updates over a mobile social network
 In Proc. IEEE INFOCOM
, 2009
"... Number: CRPRL2008080001 ..."
Linear degree extractors and the inapproximability of max clique and chromatic number
 THEORY OF COMPUTING
, 2007
"... ... that for all ε> 0, approximating MAX CLIQUE and CHROMATIC NUMBER to within n1−ε are NPhard. We further derandomize results of Khot (FOCS ’01) and show that for some γ> 0, no quasipolynomial time algorithm approximates MAX CLIQUE or CHROMATIC NUMBER to within n/2 (logn)1−γ, unless N˜P = ˜ ..."
Abstract

Cited by 68 (2 self)
 Add to MetaCart
... that for all ε> 0, approximating MAX CLIQUE and CHROMATIC NUMBER to within n1−ε are NPhard. We further derandomize results of Khot (FOCS ’01) and show that for some γ> 0, no quasipolynomial time algorithm approximates MAX CLIQUE or CHROMATIC NUMBER to within n/2 (logn)1−γ, unless N˜P = ˜P. The key to these results is a new construction of dispersers, which are related to randomness extractors. A randomness extractor is an algorithm which extracts randomness from a lowquality random source, using some additional truly random bits. We construct new extractors which require only log2 n + O(1) additional random bits for sources with constant entropy rate, and have constant error. Our dispersers use an arbitrarily small constant
Arithmetic Circuits: a survey of recent results and open questions
"... A large class of problems in symbolic computation can be expressed as the task of computing some polynomials; and arithmetic circuits form the most standard model for studying the complexity of such computations. This algebraic model of computation attracted a large amount of research in the last fi ..."
Abstract

Cited by 65 (5 self)
 Add to MetaCart
A large class of problems in symbolic computation can be expressed as the task of computing some polynomials; and arithmetic circuits form the most standard model for studying the complexity of such computations. This algebraic model of computation attracted a large amount of research in the last five decades, partially due to its simplicity and elegance. Being a more structured model than Boolean circuits, one could hope that the fundamental problems of theoretical computer science, such as separating P from NP, will be easier to solve for arithmetic circuits. However, in spite of the appearing simplicity and the vast amount of mathematical tools available, no major breakthrough has been seen. In fact, all the fundamental questions are still open for this model as well. Nevertheless, there has been a lot of progress in the area and beautiful results have been found, some in the last few years. As examples we mention the connection between polynomial identity testing and lower bounds of Kabanets and Impagliazzo, the lower bounds of Raz for multilinear formulas, and two new approaches for proving lower bounds: Geometric Complexity Theory and Elusive Functions. The goal of this monograph is to survey the field of arithmetic circuit complexity, focusing mainly on what we find to be the most interesting and accessible research directions. We aim to cover the main results and techniques, with an emphasis on works from the last two decades. In particular, we