Regularity lemmas and combinatorial algorithms
- In Proc. FOCS
- Cited by 19 (3 self)
Abstract. We present new combinatorial algorithms for Boolean matrix multiplication (BMM) and for preprocessing a graph to answer independent set queries. We give the first asymptotic improvements on combinatorial algorithms for dense BMM in many years, improving on the "Four Russians" $O(n^3/(w \log n))$ bound for machine models with word size $w$. (For a pointer machine, we can set $w = \log n$.) The algorithms utilize notions from Regularity Lemmas for graphs in a novel way. We give two randomized combinatorial algorithms for BMM. The first algorithm is essentially a reduction from BMM to the Triangle Removal Lemma. The best known bounds for the Triangle Removal Lemma only imply an $O((n^3 \log \beta)/(\beta w \log n))$ time algorithm for BMM, where $\beta = (\log^* n)^{\delta}$ for some $\delta > 0$, but improvements on the Triangle Removal Lemma would yield corresponding runtime improvements. The second algorithm applies the Weak Regularity Lemma of Frieze and Kannan along with several information compression ideas, running in $O(n^3 (\log\log n)^2/(\log n)^{9/4})$ time with probability exponentially close to 1. When $w \geq \log n$, it can be implemented in $O(n^3 (\log\log n)^2/(w (\log n)^{7/6}))$ time. Our results immediately imply improved combinatorial methods for CFG parsing, detecting triangle-freeness, and transitive closure. Using Weak Regularity, we also give an algorithm for answering queries of the form "is $S \subseteq V$ an independent set?" in a graph. Improving on prior work, we show how to randomly preprocess a graph in $O(n^{2+\varepsilon})$ time (for all $\varepsilon > 0$) so that with high probability, all subsequent batches of $\log n$ independent set queries can be answered deterministically in $O(n^2 (\log\log n)^2/(\log n)^{5/4})$ time. When $w \geq \log n$, $w$ queries can be answered in $O(n^2 (\log\log n)^2/(\log n)^{7/6})$ time. In addition to its nice applications, this problem is interesting in that it is not known how to do better than $O(n^2)$ using "algebraic" methods.
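To make the role of the word size $w$ concrete, here is a minimal sketch of word-parallel Boolean matrix multiplication. It is neither the regularity-based algorithm of the paper nor the Four Russians method; it only illustrates how packing a matrix row into machine words lets one OR operation handle $w$ entries at once. Python integers stand in for bit vectors, and the name `bmm_bitset` is hypothetical.

```python
def bmm_bitset(a, b):
    """Boolean product of two n x n 0/1 matrices given as lists of lists.

    Illustrative sketch only (hypothetical helper, not from the paper).
    """
    n = len(a)
    # Pack each row of b into one integer: bit j is set iff b[k][j] == 1.
    b_rows = [sum(1 << j for j, bit in enumerate(row) if bit) for row in b]
    result = []
    for i in range(n):
        acc = 0
        for k in range(n):
            if a[i][k]:
                # OR in row k of b; on a word RAM this touches n/w words.
                acc |= b_rows[k]
        # Unpack the accumulated bit vector back into a 0/1 row.
        result.append([(acc >> j) & 1 for j in range(n)])
    return result
```

With rows packed this way the inner loop performs about $n^2$ row-ORs of $n/w$ words each, which is the $O(n^3/w)$ baseline that the bounds in the abstract improve on by further logarithmic factors.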
Triangle sparsifiers
- Journal of Graph Algorithms and Applications
- Cited by 17 (5 self)
In this work, we introduce the notion of triangle sparsifiers, i.e., sparse graphs that are approximately the same as the original graph with respect to the triangle count. This results in a practical triangle counting method with strong theoretical guarantees. For instance, for unweighted graphs we show a randomized algorithm for approximately counting the number of triangles in a graph $G$, which proceeds as follows: keep each edge independently with probability $p$, enumerate the triangles in the sparsified graph $G'$, and return the number of triangles found in $G'$ multiplied by $p^{-3}$. We prove that under mild assumptions on $G$ and $p$ our algorithm returns a good approximation for the number of triangles with high probability. Specifically, we show that if $p \geq \max\left(\frac{\mathrm{polylog}(n)\,\Delta}{t}, \frac{\mathrm{polylog}(n)}{t^{1/3}}\right)$, where $n$, $t$, $\Delta$, and $T$ denote the number of vertices in $G$, the number of triangles in $G$, the maximum number of triangles an edge of $G$ is contained in, and our triangle count estimate, respectively, then $T$ is strongly concentrated around $t$: $\Pr[|T - t| \geq \epsilon t] \leq n^{-K}$. We illustrate the efficiency of our algorithm on various large real-world datasets where we obtain significant speedups. Finally, we investigate cut and spectral sparsifiers with respect to triangle counting and show that they are not optimal.
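The sampling procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is an undirected simple graph given as a list of distinct edges, the exact counter is a plain neighbor-intersection routine chosen for brevity, and the names `count_triangles` and `estimate_triangles` are hypothetical rather than taken from the paper.

```python
import random
from itertools import combinations

def count_triangles(edges):
    """Exact triangle count; edges must be distinct undirected pairs (u, v)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Every triangle is found once per each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def estimate_triangles(edges, p, seed=None):
    """Keep each edge with probability p, count, and rescale by p**-3."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    return count_triangles(kept) / p**3

# Complete graph on 5 vertices: C(5, 3) = 10 triangles.
k5 = list(combinations(range(5), 2))
exact = count_triangles(k5)
```

A triangle survives sparsification only if all three of its edges are kept, which happens with probability $p^3$; this is why the count in $G'$ is multiplied by $p^{-3}$. On a graph as small as `k5` the estimate is very noisy for $p < 1$, in line with the abstract's requirement that $p$ not be too small relative to $t$ and $\Delta$.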
Counting Triangles in Massive Graphs with MapReduce
- 2013
- Cited by 12 (4 self)
Graphs and networks are used to model interactions in a variety of contexts. There is a growing need to quickly assess the characteristics of a graph in order to understand its underlying structure. Some of the most useful metrics are triangle-based and give a measure of the connectedness of mutual friends. This is often summarized in terms of clustering coefficients, which measure the likelihood that two neighbors of a node are themselves connected. Computing these measures exactly for large-scale networks is prohibitively expensive in both memory and time. However, a recent wedge sampling algorithm has proved successful in efficiently and accurately estimating clustering coefficients. In this paper, we describe how to implement this approach in MapReduce to deal with extremely massive graphs. We show results on publicly available networks, the largest of which has 132M nodes and 4.7B edges, as well as artificially generated networks (using the Graph500 benchmark), the largest of which has 240M nodes and 8.5B edges. We can estimate the clustering coefficient by degree bin (e.g., we use exponential binning) and the number of triangles per bin, as well as the global clustering coefficient and total number of triangles, in an average of 0.33 sec. per million edges plus overhead (approximately 225 sec. total for our configuration). The technique can also be used to study triangle statistics such as the ratio of the highest and lowest degree, and we highlight differences between social and non-social networks. To the best of our knowledge, these are the largest triangle-based graph computations published to date.
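The abstract does not spell out the wedge sampling procedure, so the following is a single-machine sketch of the idea as it is commonly described, not the MapReduce implementation of the paper. A wedge is a path of length two; the sketch samples wedges uniformly at random and reports the fraction that are closed into triangles. The function name and the adjacency-dictionary input format are assumptions made for illustration.

```python
import random

def wedge_sample_clustering(adj, num_samples, seed=None):
    """Estimate the global clustering coefficient and the triangle count.

    adj maps each vertex to the set of its neighbors (undirected, simple).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    centers = [v for v in adj if len(adj[v]) >= 2]
    if not centers:
        raise ValueError("graph has no wedges")
    # A vertex of degree d is the center of d * (d - 1) / 2 wedges.
    weights = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    total_wedges = sum(weights)
    closed = 0
    # Choosing centers proportionally to their wedge counts, then a uniform
    # pair of neighbors, yields a uniformly random wedge.
    for v in rng.choices(centers, weights=weights, k=num_samples):
        a, b = rng.sample(list(adj[v]), 2)
        if b in adj[a]:
            closed += 1
    fraction = closed / num_samples
    # Each triangle contains exactly three closed wedges.
    return fraction, fraction * total_wedges / 3
```

The first return value estimates the global clustering coefficient; the second rescales it by the total wedge count to estimate the number of triangles.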
Popular conjectures imply strong lower bounds for dynamic problems
- CoRR
- Cited by 11 (3 self)
Abstract. We consider several well-studied problems in dynamic algorithms and prove that sufficient progress on any of them would imply a breakthrough on one of five major open problems in the theory of algorithms: 1) Is the 3SUM problem on $n$ numbers in $O(n^{2-\varepsilon})$ time for some $\varepsilon > 0$? 2) Can one determine the satisfiability of a CNF formula on $n$ variables and $\mathrm{poly}(n)$ clauses in $O((2-\varepsilon)^n \, \mathrm{poly}(n))$ time for some $\varepsilon > 0$? 3) Is the All Pairs Shortest Paths problem for graphs on $n$ vertices in $O(n^{3-\varepsilon})$ time for some $\varepsilon > 0$? 4) Is there a linear time algorithm that detects whether a given graph contains a triangle? 5) Is there an $O(n^{3-\varepsilon})$ time combinatorial algorithm for $n \times n$ Boolean matrix multiplication? The problems we consider include dynamic versions of bipartite perfect matching, bipartite maximum weight matching, single source reachability, single source shortest paths, strong connectivity, subgraph connectivity, diameter approximation and some nongraph problems such as Pagh's problem defined in a recent paper by Pǎtraşcu [STOC 2010]. Index Terms: dynamic algorithms; all pairs shortest paths; 3SUM; lower bounds.
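For context on the first open problem, the sketch below shows the standard quadratic-time approach to 3SUM (sort, then scan with two pointers). It is included only to make concrete the $O(n^2)$ baseline that an $O(n^{2-\varepsilon})$ algorithm would have to beat; it is not taken from the paper, and the variant shown (three entries at distinct positions summing to zero) is one common formulation assumed here.

```python
def three_sum(nums):
    """Return a triple from nums (distinct positions) summing to 0, or None."""
    s = sorted(nums)
    n = len(s)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        # Two-pointer scan over the sorted suffix: O(n) per choice of i.
        while lo < hi:
            total = s[i] + s[lo] + s[hi]
            if total == 0:
                return (s[i], s[lo], s[hi])
            if total < 0:
                lo += 1  # sum too small: move to a larger middle element
            else:
                hi -= 1  # sum too large: move to a smaller last element
    return None
```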
The input/output complexity of triangle enumeration
- In PODS'14, 2014
Listing triangles
- In Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014
- Cited by 9 (0 self)
Abstract. We present new algorithms for listing triangles in dense and sparse graphs. The running time of our algorithm for dense graphs is $\tilde{O}(n^{\omega} + n^{3(\omega-1)/(5-\omega)} t^{2(3-\omega)/(5-\omega)})$, and the running time of the algorithm for sparse graphs is $\tilde{O}(m^{2\omega/(\omega+1)} + m^{3(\omega-1)/(\omega+1)} t^{(3-\omega)/(\omega+1)})$, where $n$ is the number of vertices, $m$ is the number of edges, $t$ is the number of triangles to be listed, and $\omega < 2.373$ is the exponent of fast matrix multiplication. With the current bound on $\omega$, the running times of our algorithms are $\tilde{O}(n^{2.373} + n^{1.568} t^{0.478})$ and $\tilde{O}(m^{1.408} + m^{1.222} t^{0.186})$, respectively. We first obtain randomized algorithms with the desired running times and then derandomize them using sparse recovery techniques. If $\omega = 2$, the running times of the algorithms become $\tilde{O}(n^2 + n t^{2/3})$ and $\tilde{O}(m^{4/3} + m t^{1/3})$, respectively. In particular, if $\omega = 2$, our algorithm lists $m$ triangles in $\tilde{O}(m^{4/3})$ time. Pǎtraşcu (STOC 2010) showed that $\Omega(m^{4/3-o(1)})$ time is required for listing $m$ triangles, unless there exist subquadratic algorithms for 3SUM. We show that unless one can solve quadratic equation systems over a finite field significantly faster than the brute force algorithm, our triangle listing runtime bounds are tight assuming $\omega = 2$, also for graphs with more triangles.
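The numeric exponents quoted in the abstract follow from substituting a value of $\omega$ into the general bounds. The short script below is a worked check of that arithmetic; the function names are hypothetical, and rounding up to three decimals is our reading of how the abstract's figures were obtained (natural for upper bounds), not something the abstract states.

```python
import math
from fractions import Fraction

def listing_exponents(omega):
    """Exponents of the non-leading terms in the two running times.

    Returns ((n_exp, t_exp), (m_exp_first, m_exp_second, t_exp)) for the
    dense bound n^omega + n^n_exp * t^t_exp and the sparse bound
    m^m_exp_first + m^m_exp_second * t^t_exp.
    """
    dense = (3 * (omega - 1) / (5 - omega), 2 * (3 - omega) / (5 - omega))
    sparse = (
        2 * omega / (omega + 1),
        3 * (omega - 1) / (omega + 1),
        (3 - omega) / (omega + 1),
    )
    return dense, sparse

def round_up(x, digits=3):
    """Round x up to the given number of decimals (assumed convention)."""
    scale = 10**digits
    return math.ceil(x * scale) / scale

dense, sparse = listing_exponents(2.373)
current = ([round_up(x) for x in dense], [round_up(x) for x in sparse])

# With omega = 2 the exponents are exact rationals.
ideal = listing_exponents(Fraction(2))
```

Here `current` reproduces the exponents 1.568 and 0.478 for the dense case and 1.408, 1.222 and 0.186 for the sparse case, while `ideal` gives 1 and 2/3 (dense) and 4/3, 1 and 1/3 (sparse), matching the $\omega = 2$ bounds in the abstract.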
Consequences of Faster Alignment of Sequences
- Cited by 7 (3 self)
Abstract. The Local Alignment problem is a classical problem with applications in biology. Given two input strings and a scoring function on pairs of letters, one is asked to find the substrings of the two input strings that are most similar under the scoring function. The best algorithms for Local Alignment run in time that is roughly quadratic in the string length. It is a big open problem whether substantially subquadratic algorithms exist. In this paper we show that for all $\varepsilon > 0$, an $O(n^{2-\varepsilon})$ time algorithm for Local Alignment on strings of length $n$ would imply breakthroughs on three longstanding open problems: it would imply that for some $\delta > 0$, 3SUM on $n$ numbers is in $O(n^{2-\delta})$ time, CNF-SAT on $n$ variables is in $O((2-\delta)^n)$ time, and Max Weight 4-Clique is in $O(n^{4-\delta})$ time. Our result for CNF-SAT also applies to the easier problem of finding the longest common substring of binary strings with don't cares. We also give strong conditional lower bounds for the more general Multiple Local Alignment problem on $k$ strings, under both $k$-wise and SP scoring, and for other string similarity problems such as Global Alignment with gap penalties and normalized Longest Common Subsequence.
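To illustrate the quadratic baseline mentioned in the abstract, here is a sketch for one simple variant of Local Alignment. The variant is an assumption, since the abstract does not fix the details: no gaps are allowed, the two substrings have equal length, and the score of an alignment is the sum of the letter-pair scores. Under that assumption the best pair of substrings lies on a single diagonal of the comparison grid, so a maximum-sum scan along each diagonal suffices. The names `local_alignment` and `match_score` are hypothetical.

```python
def match_score(a, b):
    """Example scoring function (assumed): +1 for a match, -1 otherwise."""
    return 1 if a == b else -1

def local_alignment(s, t, score=match_score):
    """Best gapless local alignment as (score, start_in_s, start_in_t, length).

    The empty alignment, with score 0, is returned if nothing scores higher.
    """
    best = (0, 0, 0, 0)
    # Diagonal d contains the cell pairs (i, j) with i - j == d.
    for d in range(-(len(t) - 1), len(s)):
        i, j = max(d, 0), max(-d, 0)
        run, start = 0, (i, j)
        while i < len(s) and j < len(t):
            if run <= 0:
                # A non-positive prefix never helps: restart the run here.
                run, start = 0, (i, j)
            run += score(s[i], t[j])
            if run > best[0]:
                best = (run, start[0], start[1], i - start[0] + 1)
            i += 1
            j += 1
    return best
```

Every cell of the grid is visited once, so the running time is proportional to the product of the two string lengths, which is the quadratic behavior the paper's conditional lower bounds address.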
Improved Distance Sensitivity Oracles via Fast Single-Source Replacement Paths
- Cited by 6 (2 self)
Abstract. A distance sensitivity oracle is a data structure which, given two nodes $s$ and $t$ in a directed edge-weighted graph $G$ and an edge $e$, returns the shortest length of an $s$-$t$ path not containing $e$, a so-called replacement path for the triple $(s, t, e)$. Such oracles are used to quickly recover from edge failures. In this paper we consider the case of integer weights in the interval $[-M, M]$, and present the first distance sensitivity oracle that achieves simultaneously subcubic preprocessing time and sublinear query time. More precisely, for a given parameter $\alpha \in [0, 1]$, our oracle has preprocessing time $\tilde{O}(M n^{\omega + 1/2} + M n^{\omega + \alpha(4-\omega)})$ and query time $\tilde{O}(n^{1-\alpha})$. Here $\omega < 2.373$ denotes the matrix multiplication exponent. For comparison, the previous best oracle for small integer weights has $\tilde{O}(M n^{\omega + 1 - \alpha})$ preprocessing time and (superlinear) $\tilde{O}(n^{1+\alpha})$ query time [Weimann, Yuster-FOCS'10]. The main novelty in our approach is an algorithm to compute all the replacement paths from a given source $s$, an interesting problem on its own. We can solve the latter single-source replacement paths problem in $\tilde{O}(\mathrm{APSP}(n, M))$ time, where $\mathrm{APSP}(n, M) < \tilde{O}(M^{0.681} n^{2.575})$ [Zwick-JACM'02] is the runtime for computing all-pairs shortest paths in a graph with $n$ vertices and integer edge weights in $[-M, M]$. For positive weights the runtime of our algorithm reduces to $\tilde{O}(M n^{\omega})$. This matches the best known runtime for the simpler replacement paths problem in which both the source $s$ and the target $t$ are fixed [Vassilevska-SODA'11]. Keywords: replacement paths; distance sensitivity oracles; shortest paths.