Results 1-10 of 23
A scalable distributed parallel breadth-first search algorithm on BlueGene/L
 In SC ’05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing
, 2005
Cited by 39 (2 self)
Many emerging large-scale data science applications require searching large graphs distributed across multiple memories and processors. This paper presents a distributed breadth-first search (BFS) scheme that scales for random graphs with up to three billion vertices and 30 billion edges. Scalability was tested on IBM BlueGene/L with 32,768 nodes at the Lawrence Livermore National Laboratory. Scalability was obtained through a series of optimizations, in particular, those that ensure scalable use of memory. We use 2D (edge) partitioning of the graph instead of conventional 1D (vertex) partitioning to reduce communication overhead. For Poisson random graphs, we show that the expected size of the messages is scalable for both 2D and 1D partitionings. Finally, we have developed efficient collective communication functions for the 3D torus architecture of BlueGene/L that also take advantage of the structure in the problem. The performance and characteristics of the algorithm are measured and reported.
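The difference between 1D (vertex) and 2D (edge) partitioning can be illustrated with a minimal sketch. It assumes a simple contiguous block layout over a `rows` x `cols` processor grid; the function names are hypothetical and the paper's actual layout may differ.

```python
def owner_1d(u, n, p):
    """Rank owning vertex u (and all of its edges) under 1D block partitioning."""
    block = -(-n // p)  # ceil(n / p) vertices per rank
    return u // block


def owner_2d(u, v, n, rows, cols):
    """Grid coordinates of the rank owning edge (u, v) under 2D block partitioning.

    Assumption: the adjacency matrix is cut into rows x cols contiguous blocks.
    """
    row_block = -(-n // rows)
    col_block = -(-n // cols)
    return (u // row_block, v // col_block)


def ranks_holding_edges_of(u, neighbors, n, rows, cols):
    """Set of grid ranks that store at least one edge leaving vertex u."""
    return {owner_2d(u, v, n, rows, cols) for v in neighbors}
```

In this layout every edge leaving a vertex lands in a single processor row, so expanding that vertex involves at most `cols` ranks rather than all `rows * cols` of them.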
Distributed LTL Model Checking Based on Negative Cycle Detection
, 2001
Cited by 29 (12 self)
This paper addresses the state explosion problem in automata-based LTL model checking. To deal with large space requirements we turn to a distributed approach. All the known methods for automata-based model checking are based on depth-first traversal of the state space, which is difficult to parallelise because the order in which vertices are visited plays an important role. We propose an entirely different approach that relies on locating cycles with negative length in a directed graph with real-valued edge lengths. Our method allows reasonable distribution, and the experimental results confirm its usefulness for distributed model checking.
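The core primitive named above, locating a cycle of negative length in a directed graph, can be sketched sequentially with Bellman-Ford relaxation. This is only an illustration of the primitive, not the paper's distributed method; the function name and the edge-list representation are assumptions.

```python
def has_negative_cycle(n, edges):
    """Return True if the directed graph contains a cycle of negative total length.

    n     -- number of vertices, labelled 0..n-1
    edges -- list of (u, v, length) triples
    """
    # Starting every vertex at 0 acts like a virtual source linked to all
    # vertices, so cycles in any component are detected.
    dist = [0.0] * n
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False  # distances converged, so no negative cycle exists
    # Any edge that can still be relaxed lies on or behind a negative cycle.
    return any(dist[u] + w < dist[v] for u, v, w in edges)
```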
A shortest path algorithm for real-weighted undirected graphs
 in 13th ACM-SIAM Symp. on Discrete Algs
, 1985
Cited by 12 (3 self)
Abstract. We present a new scheme for computing shortest paths on real-weighted undirected graphs in the fundamental comparison-addition model. In an efficient preprocessing phase our algorithm creates a linear-size structure that facilitates single-source shortest path computations in O(m log α) time, where α = α(m, n) is the very slowly growing inverse-Ackermann function, m the number of edges, and n the number of vertices. As special cases our algorithm implies new bounds on both the all-pairs and single-source shortest paths problems. We solve the all-pairs problem in O(mn log α(m, n)) time and, if the ratio between the maximum and minimum edge lengths is bounded by n^((log n)^O(1)), we can solve the single-source problem in O(m + n log log n) time. Both these results are theoretical improvements over Dijkstra’s algorithm, which was the previous best for real-weighted undirected graphs. Our algorithm takes the hierarchy-based approach invented by Thorup. Key words. single-source shortest paths, all-pairs shortest paths, undirected graphs, Dijkstra’s
I/O-efficient undirected shortest paths
 In Proc. 11th Annual European Symposium on Algorithms, volume 2832 of LNCS
, 2003
Cited by 11 (4 self)
Abstract. We show how to compute single-source shortest paths in undirected graphs with nonnegative edge lengths in O(√(nm/B) log n + MST(n, m)) I/Os, where n is the number of vertices, m is the number of edges, B is the disk block size, and MST(n, m) is the I/O cost of computing a minimum spanning tree. For sparse graphs, the new algorithm performs O((n/√B) log n) I/Os. This result removes our previous algorithm’s dependence on the edge lengths in the graph.
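The sparse-graph bound follows from the general one by substituting m = O(n) into the first term. The short derivation below assumes that the MST(n, m) term does not dominate, as the stated sparse bound implies.

```latex
\sqrt{\frac{nm}{B}}\,\log n
  \;=\; \sqrt{\frac{n \cdot O(n)}{B}}\,\log n
  \;=\; O\!\left(\frac{n}{\sqrt{B}}\,\log n\right)
```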
Buckets strike back: Improved Parallel Shortest-Paths
 Proc. 16th Intl. Par. Distr. Process. Symp. (IPDPS)
, 2002
Cited by 6 (2 self)
We study the average-case complexity of the parallel single-source shortest-path (SSSP) problem, assuming arbitrary directed graphs with n nodes, m edges, and independent random edge weights uniformly distributed in [0, 1]. We provide a new bucket-based parallel SSSP algorithm that runs in T = O(log² n · min_i {2^i · L + |V_i|}) average-case time using O(n + m + T) work on a PRAM, where L denotes the maximum shortest-path weight and |V_i| is the number of graph vertices with in-degree at least 2^i. All previous algorithms either required more time or more work. The minimum performance gain is a logarithmic factor improvement; on certain graph classes, accelerations by factors of more than n^0.4 can be achieved. The algorithm allows adaptation to distributed memory machines, too.
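The general idea of a bucket-based SSSP can be sketched sequentially: tentative distances are grouped into buckets of fixed width, and buckets are settled in increasing order. This is a simplified illustration rather than the parallel algorithm of the paper; the bucket width `delta`, the function name, and the adjacency-dict representation are assumptions.

```python
import math


def bucket_sssp(adj, source, delta):
    """Single-source shortest paths with nonnegative weights, using distance buckets.

    adj   -- dict mapping each vertex to a list of (neighbor, weight) pairs
    delta -- bucket width (assumed tuning parameter); bucket i holds vertices
             whose tentative distance lies in [i * delta, (i + 1) * delta)
    """
    dist = {u: math.inf for u in adj}
    dist[source] = 0.0
    buckets = {0: {source}}
    while buckets:
        i = min(buckets)          # smallest bucket index still present
        bucket = buckets[i]
        while bucket:             # vertices may re-enter the current bucket
            u = bucket.pop()
            for v, w in adj[u]:
                nd = dist[u] + w
                if nd < dist[v]:
                    if dist[v] != math.inf:
                        old = buckets.get(int(dist[v] // delta))
                        if old is not None:
                            old.discard(v)  # move v out of its old bucket
                    dist[v] = nd
                    buckets.setdefault(int(nd // delta), set()).add(v)
        del buckets[i]
    return dist
```

All vertices in one bucket can be relaxed independently of one another, which is what makes the bucket structure attractive for parallel execution.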
A space-efficient parallel algorithm for computing betweenness centrality in distributed memory
 In Proc. Int’l. Conf. on High Performance Computing (HiPC 2010)
, 2010
Cited by 5 (0 self)
Abstract. Betweenness centrality is a measure based on shortest paths that attempts to quantify the relative importance of nodes in a network. As computation of betweenness centrality becomes increasingly important in areas such as social network analysis, networks of interest are becoming too large to fit in the memory of a single processing unit, making parallel execution a necessity. Parallelization over the vertex set of the standard algorithm, with a final reduction of the centrality for each vertex, is straightforward but requires Ω(|V|^2) storage. In this paper we present a new parallelizable algorithm with low spatial complexity that is based on the best known sequential algorithm. Our algorithm requires O(|V| + |E|) storage and enables efficient parallel execution. Our algorithm is especially well suited to distributed memory processing because it can be implemented using coarse-grained parallelism. The presented time bounds for parallel execution of our algorithm on CRCW PRAM and on distributed memory systems both show good asymptotic performance. Experimental results with a distributed memory computer show the practical applicability of our algorithm.
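For orientation, a sequential computation of betweenness centrality on an unweighted graph can be sketched as follows. The sketch uses Brandes' algorithm, on the assumption that this is the sequential algorithm the abstract refers to; it is not the parallel algorithm presented in the paper, and the function name is hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality for an unweighted graph given as {vertex: [neighbors]}.

    Scores count ordered (source, target) pairs; for an undirected graph,
    halve the result to count unordered pairs.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest s-v paths
        dist = {v: -1 for v in adj}   # BFS distance from s, -1 = unvisited
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

The per-source state (`sigma`, `dist`, `preds`, `delta`) is linear in the graph size, which is why running many sources concurrently with private copies of that state is where the storage cost of naive parallelization grows.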
Distributed shortest path for directed graphs with negative edge lengths
, 2001
Cited by 4 (3 self)
AM++: A generalized active message framework
 In
, 2010
Cited by 4 (2 self)
Active messages have proven to be an effective approach for certain communication problems in high performance computing. Many MPI implementations, as well as runtimes for Partitioned Global Address Space languages, use active messages in their low-level transport layers. However, most active message frameworks have low-level programming interfaces that require significant programming effort to use directly in applications and that also prevent optimization opportunities. In this paper we present AM++, a new user-level library for active messages based on generic programming techniques. Our library allows message handlers to be run in an explicit loop that can be optimized and vectorized by the compiler and that can also be executed in parallel on multicore architectures. Runtime optimizations, such as message combining and filtering, are also provided by the library, removing the need to implement that functionality at the application level. Evaluation of AM++ with distributed-memory graph algorithms shows the usability benefits provided by these library features, as well as their performance advantages.
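Message combining and filtering, as mentioned above, can be illustrated with a small language-neutral sketch. It is not the AM++ interface (AM++ is a C++ library); the class, its methods, and the `send` callback are hypothetical names chosen for illustration.

```python
class CoalescingSender:
    """Buffers messages per destination and drops exact duplicates.

    send     -- assumed transport callback: send(dest, list_of_payloads)
    capacity -- number of payloads that triggers a flush for one destination
    """

    def __init__(self, send, capacity):
        self.send = send
        self.capacity = capacity
        self.buffers = {}
        self.seen = set()

    def post(self, dest, payload):
        key = (dest, payload)
        if key in self.seen:   # filtering: skip messages already queued or sent
            return
        self.seen.add(key)
        buf = self.buffers.setdefault(dest, [])
        buf.append(payload)    # combining: batch payloads for the same destination
        if len(buf) >= self.capacity:
            self.flush(dest)

    def flush(self, dest=None):
        """Send buffered payloads for one destination, or for all if dest is None."""
        dests = [dest] if dest is not None else list(self.buffers)
        for d in dests:
            buf = self.buffers.pop(d, [])
            if buf:
                self.send(d, buf)
```

Batching reduces the number of transport-level sends, and filtering avoids resending work a destination has already been asked to do, such as revisiting the same vertex in a graph traversal.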
Single-Source Shortest Paths with the Parallel Boost Graph Library
Cited by 4 (2 self)
The Parallel Boost Graph Library (Parallel BGL) is a library of graph algorithms and data structures for distributed-memory computation on large graphs. Developed with the Generic Programming paradigm, the Parallel BGL is highly customizable, supporting various graph data structures, arbitrary vertex and edge properties, and different communication media. In this paper, we describe the implementation of two parallel variants of Dijkstra’s single-source shortest paths algorithm in the Parallel BGL. We also provide an experimental evaluation of these implementations using synthetic and real-world benchmark graphs from the 9th DIMACS Implementation Challenge.
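As a point of reference for the parallel variants, the sequential Dijkstra algorithm they build on can be sketched as below. This is a plain Python illustration and does not use or reflect the Parallel BGL API; the function name and graph representation are assumptions.

```python
import heapq
import math


def dijkstra(adj, source):
    """Sequential Dijkstra for nonnegative weights.

    adj -- dict mapping each vertex to a list of (neighbor, weight) pairs
    """
    dist = {u: math.inf for u in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry; u was already settled with a shorter distance
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

The strict ordering imposed by the priority queue is the part that a distributed-memory variant has to relax or coordinate across processes.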