Results 1 - 10 of 53
Main-memory triangle computations for very large (sparse (power-law)) graphs
- Theor. Comput. Sci.
"... Finding, counting and/or listing triangles (three vertices with three edges) in massive graphs are natural fundamental problems, which received recently much attention because of their importance in complex network analysis. We provide here a detailed survey of proposed main-memory solutions to thes ..."
Cited by 44 (0 self)
Abstract: Finding, counting and/or listing triangles (three vertices with three edges) in massive graphs are natural fundamental problems which have recently received much attention because of their importance in complex network analysis. We provide a detailed survey of proposed main-memory solutions to these problems, in a unified way. We note that previous authors paid surprisingly little attention to the space complexity of main-memory solutions, despite its fundamental and practical interest. We therefore detail the space complexities of known algorithms and discuss their implications. We also present new algorithms which are time optimal for triangle listing and beat previous algorithms on space needs. They have the additional advantage of performing better on power-law graphs, which we also detail. We finally show with an experimental study that these two algorithms perform very well in practice, allowing us to handle cases which were previously out of reach.
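The listing algorithms this survey covers share a common pattern: fix a total order on the vertices, orient each edge toward its higher-ranked endpoint, and intersect forward neighbourhoods. A minimal Python sketch of that pattern (the degree-based order below is one standard choice, not necessarily the survey's exact compact-forward variant):

```python
def list_triangles(adj):
    """Yield each triangle exactly once.

    adj: dict mapping each vertex to the set of its neighbours.
    Edges are oriented from lower- to higher-ranked endpoints
    (rank = (degree, vertex id)), so every triangle is reported
    from its lowest-ranked vertex only.
    """
    rank = {v: (len(adj[v]), v) for v in adj}
    fwd = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    for u in adj:
        for v in fwd[u]:
            for w in fwd[u] & fwd[v]:  # common forward neighbours close a triangle
                yield (u, v, w)

# Example: this graph contains exactly the triangles {1,2,3} and {1,3,4}.
g = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
print(sorted(list_triangles(g)))  # [(2, 1, 3), (4, 1, 3)]
```

Ordering by degree is what keeps the forward neighbourhoods small on skewed (e.g., power-law) degree distributions, which is why such orderings matter in this literature.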
All-pairs shortest paths with real weights in O(n³ / log n) time
- Proc. of the 9th WADS, Lecture Notes in Computer Science 3608, 2005
"... We describe an O(n³ / log n) ..."
On dynamic shortest paths problems
- 2004
"... We obtain the following results related to dynamic versions of the shortest-paths problem: (i) Reductions that show that the incremental and decremental single-source shortest-paths problems, for weighted directed or undirected graphs, are, in a strong sense, at least as hard as the static all-pairs ..."
Cited by 41 (2 self)
Abstract: We obtain the following results related to dynamic versions of the shortest-paths problem: (i) Reductions showing that the incremental and decremental single-source shortest-paths problems, for weighted directed or undirected graphs, are, in a strong sense, at least as hard as the static all-pairs shortest-paths problem. We also obtain slightly weaker results for the corresponding unweighted problems. (ii) A randomized fully-dynamic algorithm for the all-pairs shortest-paths problem in directed unweighted graphs with an amortized update time of Õ(m√n) and a worst-case query time of O(n^{3/4}). (iii) A deterministic O(n² log n) time algorithm for constructing a (log n)-spanner with O(n) edges for any weighted undirected graph on n vertices. The algorithm uses a simple procedure for incrementally maintaining a single-source shortest-paths tree up to a given distance.
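Result (iii) rests on maintaining a single-source shortest-paths tree only up to a given distance. As a point of reference, here is the static analogue of that primitive, a Dijkstra search truncated at distance d (a sketch only; the paper's contribution is maintaining this structure incrementally, which this snippet does not attempt):

```python
import heapq

def sssp_up_to(adj, s, d):
    """Shortest-path distances from s, restricted to vertices within distance d.

    adj: dict mapping u to a list of (v, weight) pairs, weights >= 0.
    """
    dist = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        du, u = heapq.heappop(pq)
        if du > dist[u]:
            continue  # stale queue entry
        for v, w in adj.get(u, ()):
            nd = du + w
            if nd <= d and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist
```

Truncating at distance d bounds the explored region, which is what makes the bounded tree cheap enough to be a building block for spanner construction.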
On the Representation and Multiplication of Hypersparse Matrices
- 2008
"... Multicore processors are marking the beginning of a new era of computing where massive parallelism is available and necessary. Slightly slower but easy to parallelize kernels are becoming more valuable than sequentially faster kernels that are unscalable when parallelized. In this paper, we focus on ..."
Cited by 23 (11 self)
Abstract: Multicore processors are marking the beginning of a new era of computing where massive parallelism is available and necessary. Slightly slower but easy-to-parallelize kernels are becoming more valuable than sequentially faster kernels that do not scale when parallelized. In this paper, we focus on the multiplication of sparse matrices (SpGEMM). We first present the issues with existing sparse matrix representations and multiplication algorithms that make them unscalable to thousands of processors. Then, we develop and analyze two new algorithms that overcome these limitations. We consider our algorithms first as the sequential kernel of a scalable parallel sparse matrix multiplication algorithm, and second as part of a polyalgorithm for SpGEMM that would execute different kernels depending on the sparsity of the input matrices. Such a sequential kernel requires a new data structure that exploits the hypersparsity of the individual submatrices owned by a single processor after 2D partitioning. We experimentally evaluate the performance and characteristics of our algorithms and show that they scale significantly better than existing kernels.
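The issue with standard compressed sparse column (CSC) storage for hypersparse submatrices (nnz ≪ n) is that the O(n)-sized column-pointer array dominates the space. One remedy is to keep pointers only for non-empty columns. A minimal sketch of such a doubly compressed layout (illustrative of the idea; the paper's exact data structure may differ in details):

```python
def build_dcsc(triples):
    """Build a doubly compressed sparse column representation.

    triples: iterable of (row, col, val).
    Returns (JC, CP, IR, NUM): JC lists the non-empty columns, and
    CP[i]:CP[i+1] delimits column JC[i] inside the row-index array IR
    and the value array NUM. Space is O(nnz), independent of n.
    """
    jc, cp, ir, num = [], [], [], []
    for r, c, v in sorted(triples, key=lambda t: (t[1], t[0])):
        if not jc or jc[-1] != c:
            jc.append(c)          # first entry seen in column c
            cp.append(len(ir))
        ir.append(r)
        num.append(v)
    cp.append(len(ir))
    return jc, cp, ir, num

# A 10^6 x 10^6 matrix with 3 non-zeros costs a handful of machine words.
print(build_dcsc([(5, 999999, 1.0), (0, 2, 2.0), (7, 2, 3.0)]))
# ([2, 999999], [0, 2, 3], [0, 7, 5], [2.0, 3.0, 1.0])
```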
Clustering Social Networks
"... Social networks are ubiquitous. The discovery of close-knit clusters in these networks is of fundamental and practical interest. Existing clustering criteria are limited in that clusters typically do not overlap, all vertices are clustered and/or external sparsity is ignored. We introduce a new crit ..."
Cited by 19 (0 self)
Abstract: Social networks are ubiquitous. The discovery of close-knit clusters in these networks is of fundamental and practical interest. Existing clustering criteria are limited in that clusters typically do not overlap, all vertices must be clustered, and/or external sparsity is ignored. We introduce a new criterion that overcomes these limitations by combining internal density with external sparsity in a natural way. An algorithm is given that provably finds the clusters, provided there is a sufficiently large gap between internal density and external sparsity. Experiments on real social networks illustrate the effectiveness of the algorithm.
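One natural way to combine internal density with external sparsity is per-vertex thresholds: every member must have many neighbours inside the cluster, and every non-member few. A hypothetical checker along those lines (the names alpha/beta and the exact inequalities are my illustration, not necessarily the paper's definition):

```python
def is_dense_sparse_cluster(adj, cluster, alpha, beta):
    """Check a cluster candidate for internal density and external sparsity.

    adj: dict vertex -> set of neighbours; cluster: iterable of vertices.
    Every member needs >= beta * |C| neighbours inside C; every
    non-member may have at most alpha * |C| neighbours inside C.
    """
    C = set(cluster)
    internally_dense = all(len(adj[v] & C) >= beta * len(C) for v in C)
    externally_sparse = all(len(adj[u] & C) <= alpha * len(C)
                            for u in adj if u not in C)
    return internally_dense and externally_sparse
```

Because a criterion of this shape is checked per candidate set, clusters satisfying it may overlap and need not cover every vertex, which is exactly the flexibility the abstract contrasts with earlier criteria.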
Compressed Matrix Multiplication
"... Motivated by the problems of computing sample covariance matrices, and of transforming a collection of vectors to a basis where they are sparse, we present a simple algorithm that computes an approximation of the product of two n-by-n real matrices A and B. Let ||AB||F denote the Frobenius norm of A ..."
Cited by 16 (4 self)
Abstract: Motivated by the problems of computing sample covariance matrices, and of transforming a collection of vectors to a basis where they are sparse, we present a simple algorithm that computes an approximation of the product of two n-by-n real matrices A and B. Let ‖AB‖_F denote the Frobenius norm of AB, and let b be a parameter determining the time/accuracy trade-off. Given 2-wise independent hash functions h_1, h_2: [n] → [b] and sign functions s_1, s_2: [n] → {−1, +1}, the algorithm works by first "compressing" the matrix product into the polynomial

p(x) = \sum_{k=1}^{n} \Bigl( \sum_{i=1}^{n} A_{ik}\, s_1(i)\, x^{h_1(i)} \Bigr) \Bigl( \sum_{j=1}^{n} B_{kj}\, s_2(j)\, x^{h_2(j)} \Bigr).

Using FFT for polynomial multiplication, we can compute c_0, …, c_{b−1} such that \sum_i c_i x^i = (p(x) mod x^b) + (p(x) div x^b) in time Õ(n² + nb). An unbiased estimator of (AB)_{ij} with variance at most ‖AB‖_F² / b can then be computed as C_{ij} = s_1(i) s_2(j) c_{(h_1(i)+h_2(j)) mod b}. Our approach also leads to an algorithm for computing AB exactly, w.h.p., in time Õ(N + nb) in the case where A and B have at most N non-zero entries and AB has at most b non-zero entries. Also, we use error-correcting codes in a novel way to recover significant entries of AB in near-linear time.
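A minimal numpy sketch of the compression and estimation steps described above (fully random hashes stand in for the 2-wise independent ones, and each of the n polynomial products is FFT'd separately, so this is the plain Õ(nb)-style path without the paper's further optimizations):

```python
import numpy as np

rng = np.random.default_rng(1)

def compress_product(A, B, b):
    """Compress AB into b counters c_0..c_{b-1}, returning hashes and signs too."""
    n = A.shape[0]
    h1, h2 = rng.integers(0, b, n), rng.integers(0, b, n)
    s1, s2 = rng.choice([-1.0, 1.0], n), rng.choice([-1.0, 1.0], n)
    acc = np.zeros(2 * b, dtype=complex)       # sum over k, frequency domain
    for k in range(n):
        pa = np.zeros(2 * b)                   # pa(x) = sum_i A[i,k] s1(i) x^h1(i)
        pb = np.zeros(2 * b)                   # pb(x) = sum_j B[k,j] s2(j) x^h2(j)
        np.add.at(pa, h1, s1 * A[:, k])
        np.add.at(pb, h2, s2 * B[k, :])
        acc += np.fft.fft(pa) * np.fft.fft(pb) # polynomial product via FFT
    p = np.fft.ifft(acc).real                  # coefficients of p(x), degree < 2b
    c = p[:b] + p[b:]                          # (p mod x^b) + (p div x^b)
    return c, h1, h2, s1, s2

def estimate(c, h1, h2, s1, s2, i, j):
    """Unbiased estimator of (AB)_{ij}, variance at most ||AB||_F^2 / b."""
    return s1[i] * s2[j] * c[(h1[i] + h2[j]) % len(c)]

A = rng.standard_normal((64, 64)); B = rng.standard_normal((64, 64))
c, *hs = compress_product(A, B, b=4096)
print(estimate(c, *hs, 3, 7), (A @ B)[3, 7])   # estimate vs. exact entry
```

Larger b means more counters, fewer collisions, and lower variance, which is the time/accuracy trade-off the abstract refers to.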
Faster Join-Projects and Sparse Matrix Multiplications
- 2009
"... Computing an equi-join followed by a duplicate eliminating projection is conventionally done by performing the two operations in serial. If some join attribute is projected away the intermediate result may be much larger than both the input and the output, and the computation could therefore potenti ..."
Cited by 15 (7 self)
Abstract: Computing an equi-join followed by a duplicate-eliminating projection is conventionally done by performing the two operations in serial. If some join attribute is projected away, the intermediate result may be much larger than both the input and the output, and the computation could therefore potentially be performed faster by a direct procedure that does not produce such a large intermediate result. We present a new algorithm that has smaller intermediate results on worst-case inputs, and in particular is more efficient in both the RAM and I/O models. It is easy to see that a join-project where the join attributes are projected away is equivalent to boolean matrix multiplication. Our results can therefore also be interpreted as improved sparse, output-sensitive matrix multiplication.
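The equivalence is easy to see concretely: view R(a, b) as a boolean matrix with rows indexed by a and columns by b, and S(b, c) likewise; then π_{a,c}(R ⋈ S) is exactly the non-zero pattern of their product. A sketch of the conventional serial method, whose intermediate (a, b, c) triples are precisely the blow-up the paper targets:

```python
def join_project(R, S):
    """pi_{a,c}(R(a,b) JOIN S(b,c)) with duplicate elimination.

    R, S: iterables of pairs. This enumerates every matching (a, b, c)
    triple, i.e. the potentially large intermediate result.
    """
    S_by_b = {}
    for b, c in S:
        S_by_b.setdefault(b, set()).add(c)
    return {(a, c) for a, b in R for c in S_by_b.get(b, ())}

R = [(1, 'x'), (1, 'y'), (2, 'x')]
S = [('x', 9), ('y', 9)]
print(join_project(R, S))  # {(1, 9), (2, 9)}: 3 triples collapse to 2 outputs
```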
Faster algorithms for rectangular matrix multiplication
- In FOCS, 2012
"... ar ..."
(Show Context)
Better Size Estimation for Sparse Matrix Products
"... Abstract. We consider the problem of doing fast and reliable estimation of the number of non-zero entries in a sparse boolean matrix product. Let n denote the total number of non-zero entries in the input matrices. We show how to compute a 1 ± ε approximation (with small probability of error) in exp ..."
Cited by 14 (5 self)
Abstract: We consider the problem of doing fast and reliable estimation of the number of non-zero entries in a sparse boolean matrix product. Let n denote the total number of non-zero entries in the input matrices. We show how to compute a 1 ± ε approximation (with small probability of error) in expected time O(n) for any ε > 4/n^{1/4}. The previously best estimation algorithm, due to Cohen (JCSS 1997), uses time O(n/ε²). We also present a variant using O(sort(n)) I/Os in expectation in the cache-oblivious model. Finally, we describe how sampling can be used to maintain (independent) sketches of matrices that allow the estimation to be performed in time o(n) if z, the number of non-zero entries in the product, is sufficiently large. This gives a simpler alternative to the sketching technique of Ganguly et al. (PODS 2005), and matches a space lower bound shown in that paper.
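The underlying primitive is distinct-element estimation over the set of output coordinates. A simplified k-minimum-values illustration of that idea (this version still enumerates candidate pairs and uses a toy hash, which is exactly the cost the paper's O(n)-time algorithm avoids):

```python
import random

def kmv_estimate(items, k=256):
    """Estimate the number of distinct items from the k smallest hash values."""
    salt = random.getrandbits(64)
    M = float(2**61 - 1)
    vals = sorted({(hash(x) ^ salt) % (2**61 - 1) / M for x in items})
    if len(vals) < k:
        return float(len(vals))   # fewer than k distinct values: count is exact
    return (k - 1) / vals[k - 1]  # classic k-minimum-values estimator

def product_size(A_pairs, B_pairs, k=256):
    """Estimate non-zeros of the boolean product, A given as (i, r), B as (r, j)."""
    B_by_r = {}
    for r, j in B_pairs:
        B_by_r.setdefault(r, []).append(j)
    pairs = ((i, j) for i, r in A_pairs for j in B_by_r.get(r, ()))
    return kmv_estimate(pairs, k)
```

Such estimates are useful, for instance, to preallocate output storage or to choose between multiplication kernels before running the actual product.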
Colored Intersection Searching via Sparse Rectangular Matrix Multiplication
- 2006
"... In a Batched Colored Intersection Searching Problem (CI), one is given a set of n geometric objects (of a certain class). Each object is colored by one of c colors, and the goal is to report all pairs of colors (c1, c2) such that there are two objects, one colored c1 and one colored c2, that interse ..."
Cited by 12 (2 self)
Abstract: In a Batched Colored Intersection Searching Problem (CI), one is given a set of n geometric objects (of a certain class). Each object is colored by one of c colors, and the goal is to report all pairs of colors (c1, c2) such that there are two objects, one colored c1 and one colored c2, that intersect each other. We also consider the bipartite version of the problem, where we are interested in intersections between objects of one class and objects of another class (e.g., points and halfspaces). In a Sparse ...
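In matrix terms, the bipartite version asks for the non-zero pattern of C1ᵀ · I · C2, where C1 and C2 are object-to-color incidence matrices and I is the boolean object-intersection matrix. A naive sketch of that reduction (it enumerates every intersecting pair, whereas the paper's point is to report the color pairs faster via sparse rectangular matrix multiplication):

```python
def color_pairs(inter, col1, col2):
    """All (c1, c2) such that some c1-colored object of the first class
    intersects some c2-colored object of the second class.

    inter: dict mapping each first-class object to the set of second-class
    objects it intersects; col1, col2: dicts mapping object -> color.
    """
    return {(col1[o1], col2[o2]) for o1, os in inter.items() for o2 in os}
```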