Graph mining: laws, generators, and algorithms
 ACM COMPUT SURV (CSUR)
, 2006
Cited by 132 (7 self)
How does the Web look? How could we tell an abnormal social network from a normal one? These and similar questions are important in many fields where the data can intuitively be cast as a graph; examples range from computer networks to sociology to biology and many more. Indeed, any M:N relation in database terminology can be represented as a graph. A lot of these questions boil down to the following: “How can we generate synthetic but realistic graphs?” To answer this, we must first understand what patterns are common in real-world graphs and can thus be considered a mark of normality/realism. This survey gives an overview of the incredible variety of work that has been done on these problems. One of our main contributions is the integration of points of view from physics, mathematics, sociology, and computer science. Further, we briefly describe recent advances on some related and interesting graph problems.
Lower Bounds for Polynomial Calculus: Non-Binomial Case
, 2001
Cited by 45 (9 self)
We generalize recent linear lower bounds for Polynomial Calculus based on binomial ideals. We produce a general hardness criterion (that we call immunity) which is satisfied by a random function and prove linear lower bounds on the degree of PC refutations for a wide class of tautologies based on immune functions. As some applications of our techniques, we introduce mod p Tseitin tautologies in the Boolean case (i.e., in the presence of the axioms x_i^2 = x_i), prove that they are hard for PC over fields with characteristic different from p, and generalize them to Flow tautologies, which are based on the MAJORITY function and are proved to be hard over any field. We also show the Ω(n) lower bound for random k-CNFs over fields of characteristic 2.
Pseudorandom Generators in Propositional Proof Complexity
 ELECTRONIC COLLOQUIUM ON COMPUTATIONAL COMPLEXITY, REP. NO.23
, 2000
Cited by 41 (6 self)
We call a pseudorandom generator G_n : {0, 1}^n → {0, 1}^m hard for a propositional proof system P if P cannot efficiently prove the (properly encoded) statement G_n(x_1, ..., x_n) ≠ b for any string b ∈ {0, 1}^m. We consider a variety of "combinatorial" pseudorandom generators inspired by the Nisan-Wigderson generator on the one hand, and by the construction of Tseitin tautologies on the other. We prove that under certain circumstances these generators are hard for such proof systems as Resolution, Polynomial Calculus and Polynomial Calculus with Resolution (PCR).
The Largest Eigenvalue of Sparse Random Graphs
 Combinatorics, Probability and Computing
, 2003
Cited by 35 (1 self)
We prove that for all values of the edge probability p(n) the largest eigenvalue of a random graph G(n, p) satisfies almost surely: λ_1(G) = (1 + o(1)) max{√Δ, np}, where Δ is the maximum degree of G, and the o(1) term tends to zero as max{√Δ, np} tends to infinity.
Graph Products, Fourier Analysis and Spectral Techniques
, 2003
Cited by 31 (10 self)
We consider powers of regular graphs defined by the weak graph product and give a characterization of maximum-size independent sets for a wide family of base graphs which includes, among others, complete graphs, line graphs of regular graphs which contain a perfect matching, and Kneser graphs. In many cases this also characterizes the optimal colorings of these products. We show that the independent sets induced by the base graph are the only maximum-size independent sets. Furthermore, we give a qualitative stability statement: any independent set of size close to the maximum is close to some independent set of maximum size. Our approach is based on Fourier analysis on Abelian groups and on spectral techniques. To this end we develop some basic lemmas regarding the Fourier transform of functions on {0, ..., r − 1}^n, generalizing some useful results from the {0, 1}^n case.
Approximating the independence number and the chromatic number in expected polynomial time
, 2001
Recognizing more unsatisfiable random k-SAT instances efficiently
, 2001
Cited by 14 (0 self)
It is known that random k-SAT instances with at least cn clauses, where c = c_k is a suitable constant, are unsatisfiable (with high probability). We consider the problem of efficiently certifying the unsatisfiability of such formulas. A backtracking-based algorithm of Beame et al. shows that k-SAT instances with at least n clauses can be certified unsatisfiable in polynomial time. We employ spectral methods to improve on this bound: For even k ≥ 4 we present a polynomial-time algorithm which certifies random k-SAT instances with at least clauses as unsatisfiable (with high probability). For odd k we focus on 3-SAT instances and obtain an efficient algorithm for formulas with at least n clauses, where ε > 0 is an arbitrary constant.
On Trapping Sets and Guaranteed Error Correction Capability of LDPC Codes and GLDPC Codes
, 2008
Cited by 8 (7 self)
The relation between the girth and the guaranteed error correction capability of γ-left-regular LDPC codes when decoded using the bit flipping (serial and parallel) algorithms is investigated. A lower bound on the size of variable node sets which expand by a factor of at least 3γ/4 is found based on the Moore bound. An upper bound on the guaranteed error correction capability is established by studying the sizes of smallest possible trapping sets. The results are extended to generalized LDPC codes. It is shown that generalized LDPC codes can correct a linear fraction of errors under the parallel bit flipping algorithm when the underlying Tanner graph is a good expander. It is also shown that the bound cannot be improved when γ is even by studying a class of trapping sets. A lower bound on the size of variable node sets which have the required expansion is established.
Community detection in sparse networks via Grothendieck’s inequality
, 2015
Cited by 8 (2 self)
We present a simple and flexible method to prove consistency of semidefinite optimization problems on random graphs. The method is based on Grothendieck’s inequality. Unlike the previous uses of this inequality that lead to constant relative accuracy, we achieve any given relative accuracy by leveraging randomness. We illustrate the method with the problem of community detection in sparse networks, those with bounded average degrees. We demonstrate that even in this regime, various simple and natural semidefinite programs can be used to recover the community structure up to an arbitrarily small fraction of misclassified vertices. The method is general; it can be applied to a variety of stochastic models of networks and semidefinite programs.
Community Detection in Sparse Random Networks
, 2013
Cited by 7 (0 self)
We consider the problem of detecting a tight community in a sparse random network. This is formalized as testing for the existence of a dense random subgraph in a random graph. Under the null hypothesis, the graph is a realization of an Erdős-Rényi graph on N vertices and with connection probability p0; under the alternative, there is an unknown subgraph on n vertices where the connection probability is p1 > p0. In (Arias-Castro and Verzelen, 2012), we focused on the asymptotically dense regime where p0 is large enough that log(1 ∨ (np0)^(−1)) = o(log(N/n)). We consider here the asymptotically sparse regime where p0 is small enough that log(N/n) = O(log(1 ∨ (np0)^(−1))). As before, we derive information-theoretic lower bounds, and also establish the performance of various tests. Compared to our previous work (Arias-Castro and Verzelen, 2012), the arguments for the lower bounds are based on the same technology, but are substantially more technical in the details; also, the methods we study are different: besides a variant of the scan statistic, we study other statistics such as the size of the largest connected component, the number of triangles, the eigengap of the adjacency matrix, etc. Our detection bounds are sharp, except in the Poisson regime where we were not able to fully characterize the constant arising in the bound.