Results 1 - 10 of 36
Programming Parallel Algorithms
, 1996
Cited by 193 (9 self)
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
A Comparison of Data-Parallel Algorithms for Connected Components
 In Proc. 6th Ann. Symp. Parallel Algorithms and Architectures (SPAA '94)
, 1994
Cited by 31 (1 self)
This paper presents a pragmatic comparison of three parallel algorithms for finding connected components, together with optimizations on these algorithms. Those being compared are two similar algorithms by Awerbuch and Shiloach [2] and by Shiloach and Vishkin [19] and a randomized contraction algorithm by Blelloch [7], based on algorithms by Reif [18] and Phillips [17]. Major improvements are given for the first two which significantly reduce the superlinear component of their work complexity. An improvement is also given for the randomized algorithm, and this algorithm is shown to be the fastest of those tested. These comparisons are presented with NESL data-parallel code as executed on a Connection Machine 2. This research was sponsored in part by the Defense Advanced Research Projects Agency, CSTO, under the title "The Fox Project: Advanced Development of Systems Software", ARPA Order No. 8313, issued by ESD/AVS under Contract No. F19628-91-C-0168, and in part by the ONR Graduate Fell...
A Fast, Parallel Spanning Tree Algorithm for Symmetric Multiprocessors (SMPs) (Extended Abstract)
, 2004
Cited by 31 (11 self)
Our study in this paper focuses on implementing parallel spanning tree algorithms on SMPs. Spanning tree is an important problem in the sense that it is the building block for many other parallel graph algorithms and also because it is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but often have no known efficient parallel implementations. In this paper we present a new randomized algorithm and implementation with superior performance that for the first time achieves parallel speedup on arbitrary graphs (both regular and irregular topologies) when compared with the best sequential implementation for finding a spanning tree. This new algorithm uses several techniques to give an expected running time that scales linearly with the number p of processors for suitably large inputs (n > p^2). As the spanning tree problem is notoriously hard for any parallel implementation to achieve reasonable speedup, our study may shed new light on implementing PRAM algorithms for shared-memory parallel computers. The main results of this paper are: (1) a new and practical spanning tree algorithm for symmetric multiprocessors that exhibits parallel speedups on graphs with regular and irregular topologies; and (2) an experimental study of parallel spanning tree algorithms that reveals the superior performance of our new approach compared with the previous algorithms. The source code for these algorithms is freely available from our web site hpc.ece.unm.edu.
Parallel Algorithmic Techniques for Combinatorial Computation
 Ann. Rev. Comput. Sci
, 1988
Cited by 29 (3 self)
this paper and supplied many helpful comments. This research was supported in part by NSF grants DCR-8511713, CCR-8605353, and CCR-8814977, and by DARPA contract N00039-84-C-0165.
Fast Connected Components Algorithms For The EREW PRAM
 SIAM J. COMPUT
, 1999
Cited by 26 (3 self)
We present fast and efficient parallel algorithms for finding the connected components of an undirected graph. These algorithms run on the exclusive-read, exclusive-write (EREW) PRAM. On a graph with $n$ vertices and $m$ edges, our randomized algorithm runs in $O(\log n)$ time using $(m + n^{1+\epsilon})/\log n$ EREW processors (for any fixed $\epsilon > 0$). A variant uses $(m + n)/\log n$ processors and runs in $O(\log n \log\log n)$ time. A deterministic version of the algorithm runs in $O(\log^{1.5} n)$ time using ...
Efficient parallel algorithms for chordal graphs
Cited by 26 (0 self)
We give the first efficient parallel algorithms for recognizing chordal graphs, finding a maximum clique and a maximum independent set in a chordal graph, finding an optimal coloring of a chordal graph, finding a breadth-first search tree and a depth-first search tree of a chordal graph, recognizing interval graphs, and testing interval graphs for isomorphism. The key to our results is an efficient parallel algorithm for finding a perfect elimination ordering.
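To make the notion of a perfect elimination ordering concrete: an ordering of the vertices is a perfect elimination ordering if, for every vertex, its neighbors that come later in the ordering form a clique, and a graph is chordal exactly when such an ordering exists. The sketch below is a plain sequential illustration, not the parallel algorithm of the paper. It uses maximum cardinality search (a standard sequential technique, swapped in here for illustration) to propose an ordering and a brute-force check to verify it. The function names and the dictionary-of-sets graph representation are assumptions made for this example.

```python
def is_perfect_elimination_ordering(adj, order):
    """Return True if every vertex's later neighbors (w.r.t. order) form a clique."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [w for w in adj[v] if pos[w] > pos[v]]
        for i, a in enumerate(later):
            for b in later[i + 1:]:
                if b not in adj[a]:
                    return False
    return True


def maximum_cardinality_search(adj):
    """Visit vertices by most already-visited neighbors; return the reverse visit order.

    For a chordal graph the returned order is a perfect elimination ordering.
    This simple version takes quadratic time; it is a sketch, not an optimized one.
    """
    weight = {v: 0 for v in adj}
    unvisited = set(adj)
    visited = []
    while unvisited:
        # sorted() makes tie-breaking deterministic (smallest vertex wins).
        v = max(sorted(unvisited), key=lambda u: weight[u])
        unvisited.remove(v)
        visited.append(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    return list(reversed(visited))


def is_chordal(adj):
    """A graph is chordal iff the ordering proposed by the search is a valid one."""
    return is_perfect_elimination_ordering(adj, maximum_cardinality_search(adj))


# Hypothetical example graphs: a 4-cycle with a chord (chordal) and without (not chordal).
cycle_with_chord = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
plain_cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

With these example graphs, `is_chordal(cycle_with_chord)` holds while `is_chordal(plain_cycle)` does not, since any ordering of a chordless 4-cycle leaves some vertex with two non-adjacent later neighbors.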
Parallel Implementation of Algorithms for Finding Connected Components in Graphs
, 1997
Cited by 25 (1 self)
In this paper, we describe our implementation of several parallel graph algorithms for finding connected components. Our implementation, with virtual processing, is on a 16,384-processor MasPar MP-1 using the language MPL. We present extensive test data on our code. In our previous projects [21, 22, 23], we reported the implementation of an extensible parallel graph algorithms library. We developed general implementation and fine-tuning techniques without expending too much effort on optimizing each individual routine. We also handled the issue of implementing virtual processing. In this paper, we describe several algorithms and fine-tuning techniques that we developed for the problem of finding connected components in parallel; many of the fine-tuning techniques are of general interest, and should be applicable to code for other problems. We present data on the execution time and memory usage of our various implementations.
Parallel Open Ear Decomposition with Applications to Graph Biconnectivity and Triconnectivity
 Synthesis of Parallel Algorithms
, 1992
Cited by 25 (9 self)
This report deals with a parallel algorithmic technique that has proved to be very useful in the design of efficient parallel algorithms for several problems on undirected graphs. We describe this method for searching undirected graphs, called "open ear decomposition", and we relate this decomposition to graph biconnectivity. We present an efficient parallel algorithm for finding this decomposition and we relate it to a sequential algorithm based on depth-first search. We then apply open ear decomposition to obtain an efficient parallel algorithm for testing graph triconnectivity and for finding the triconnected components of a graph.
Connected Components on Distributed Memory Machines
 Parallel Algorithms: 3rd DIMACS Implementation Challenge, October 17-19, 1994, volume 30 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science
, 1994
Cited by 22 (1 self)
The efforts of the theory community to develop efficient PRAM algorithms often receive little attention from application programmers. Although there are PRAM algorithm implementations that perform reasonably on shared memory machines, they often perform poorly on distributed memory machines, where the cost of remote memory accesses is relatively high. We present a hybrid approach to solving the connected components problem, whereby a PRAM algorithm is merged with a sequential algorithm and then optimized to create an efficient distributed memory implementation. The sequential algorithm handles local work on each processor, and the PRAM algorithm handles interactions between processors. Our hybrid algorithm uses the Shiloach-Vishkin CRCW PRAM algorithm on a partition of the graph distributed over the processors and sequential breadth-first search within each local subgraph. The implementation uses the Split-C language developed at Berkeley, which provides a global address space and al...
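The two-phase structure described in this abstract can be illustrated with a small sequential simulation. The sketch below is not the paper's Split-C implementation: it assumes a simple block partition of the vertices, runs breadth-first search inside each block, and then merges blocks with a simplified hook-and-pointer-jump loop over the cut edges that stands in for the Shiloach-Vishkin step. All names (`local_components`, `hybrid_components`, `num_parts`) are hypothetical.

```python
from collections import deque


def local_components(vertices, adj):
    """BFS restricted to one partition; maps each vertex to a local representative."""
    owned = set(vertices)
    label = {}
    for s in vertices:  # vertices are in ascending order, so s is the local minimum
        if s in label:
            continue
        label[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in owned and w not in label:
                    label[w] = s
                    queue.append(w)
    return label


def hybrid_components(n, edges, num_parts):
    """Label each vertex with the smallest vertex of its connected component."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Phase 1 (local, sequential): block partition is an assumption of this sketch.
    owner = [v * num_parts // n for v in range(n)]
    parent = list(range(n))
    for p in range(num_parts):
        part = [v for v in range(n) if owner[v] == p]
        for v, rep in local_components(part, adj).items():
            parent[v] = rep

    # Phase 2 (global): only edges crossing partitions still need work.
    cut_edges = [(u, v) for u, v in edges if owner[u] != owner[v]]
    changed = True
    while changed:
        changed = False
        # Hooking: attach the larger root under the smaller one.
        for u, v in cut_edges:
            pu, pv = parent[u], parent[v]
            if pu != pv:
                hi, lo = max(pu, pv), min(pu, pv)
                if parent[hi] == hi:
                    parent[hi] = lo
                    changed = True
        # Pointer jumping: flatten every tree back into a star.
        for v in range(n):
            while parent[v] != parent[parent[v]]:
                parent[v] = parent[parent[v]]
    return parent
```

In a real distributed setting only the second phase would involve remote memory accesses, which is the motivation the abstract gives for doing as much work as possible locally first.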