Programming Parallel Algorithms
1996
Cited by 164 (7 self)
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
A Comparison of Data-Parallel Algorithms for Connected Components
In Proc. 6th Ann. Symp. Parallel Algorithms and Architectures (SPAA-94), 1994
Cited by 30 (1 self)
This paper presents a pragmatic comparison of three parallel algorithms for finding connected components, together with optimizations on these algorithms. The algorithms compared are two similar algorithms by Awerbuch and Shiloach [2] and by Shiloach and Vishkin [19], and a randomized contraction algorithm by Blelloch [7], based on algorithms by Reif [18] and Phillips [17]. Major improvements are given for the first two, which significantly reduce the super-linear component of their work complexity. An improvement is also given for the randomized algorithm, and this algorithm is shown to be the fastest of those tested. These comparisons are presented with NESL data-parallel code as executed on a Connection Machine 2. This research was sponsored in part by the Defense Advanced Research Projects Agency, CSTO, under the title "The Fox Project: Advanced Development of Systems Software", ARPA Order No. 8313, issued by ESD/AVS under Contract No. F19628-91-C-0168, and in part by the ONR Graduate Fell...
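As an illustration of the randomized contraction idea mentioned above, here is a minimal sequential sketch of one common variant (random-mate contraction). It is not taken from the paper, which uses NESL on a Connection Machine 2; the function name, the coin-flip rule, and the edge-list representation are all assumptions made for this example.

```python
import random

def random_mate_components(n, edges, seed=0):
    """Return a list where label[v] is a representative of v's component.

    Hypothetical helper: a sequential simulation of random-mate contraction.
    """
    rng = random.Random(seed)
    label = list(range(n))                      # every vertex starts as its own representative
    live = [(u, v) for u, v in edges if u != v]  # edges between distinct representatives
    while live:
        # Each representative still touching a live edge flips a coin:
        # True means "parent", False means "child".
        reps = {r for edge in live for r in edge}
        is_parent = {r: rng.random() < 0.5 for r in reps}
        # A child hooks onto one neighbouring parent. Parents never hook,
        # so the hooks form stars of depth one and no chains arise.
        hook = {}
        for u, v in live:
            if not is_parent[u] and is_parent[v]:
                hook[u] = v
            elif not is_parent[v] and is_parent[u]:
                hook[v] = u
        # Contract: relabel vertices and edges, then drop self-loops.
        label = [hook.get(r, r) for r in label]
        live = [(hook.get(u, u), hook.get(v, v)) for u, v in live]
        live = [(u, v) for u, v in live if u != v]
    return label
```

Each round removes a constant expected fraction of the representatives that still have live edges, which is where the logarithmic expected number of rounds comes from in the parallel setting.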
Parallel Algorithmic Techniques for Combinatorial Computation
Ann. Rev. Comput. Sci., 1988
Cited by 29 (3 self)
this paper and supplied many helpful comments. This research was supported in part by NSF grants DCR-85-11713, CCR-86-05353, and CCR-88-14977, and by DARPA contract N00039-84-C-0165.
A Fast, Parallel Spanning Tree Algorithm for Symmetric Multiprocessors (SMPs) (Extended Abstract)
2004
Cited by 27 (11 self)
This paper studies the implementation of parallel spanning tree algorithms on SMPs. Spanning tree is an important problem because it is a building block for many other parallel graph algorithms and because it is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but often have no known efficient parallel implementations. We present a new randomized algorithm and implementation with superior performance that, for the first time, achieves parallel speedup on arbitrary graphs (both regular and irregular topologies) when compared with the best sequential implementation for finding a spanning tree. This new algorithm uses several techniques to give an expected running time that scales linearly with the number p of processors for suitably large inputs (n > p^2). As the spanning tree problem is notoriously hard for any parallel implementation to achieve reasonable speedup, our study may shed new light on implementing PRAM algorithms for shared-memory parallel computers. The main results of this paper are (1) a new and practical spanning tree algorithm for symmetric multiprocessors that exhibits parallel speedups on graphs with regular and irregular topologies; and (2) an experimental study of parallel spanning tree algorithms that reveals the superior performance of our new approach compared with the previous algorithms. The source code for these algorithms is freely available from our web site hpc.ece.unm.edu.
Fast Connected Components Algorithms For The EREW PRAM
SIAM J. Comput., 1999
Cited by 25 (3 self)
We present fast and efficient parallel algorithms for finding the connected components of an undirected graph. These algorithms run on the exclusive-read, exclusive-write (EREW) PRAM. On a graph with n vertices and m edges, our randomized algorithm runs in O(log n) time using (m + n^(1+ε))/log n EREW processors (for any fixed ε > 0). A variant uses (m + n)/log n processors and runs in O(log n log log n) time. A deterministic version of the algorithm runs in O(log^1.5 n) time using ...
Efficient parallel algorithms for chordal graphs
Cited by 23 (0 self)
We give the first efficient parallel algorithms for recognizing chordal graphs, finding a maximum clique and a maximum independent set in a chordal graph, finding an optimal coloring of a chordal graph, finding a breadth-first search tree and a depth-first search tree of a chordal graph, recognizing interval graphs, and testing interval graphs for isomorphism. The key to our results is an efficient parallel algorithm for finding a perfect elimination ordering.
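For readers unfamiliar with the term, a perfect elimination ordering is a vertex ordering in which every vertex, together with its neighbors that come later in the ordering, forms a clique. The following is a small sequential check of that property, not the paper's parallel algorithm for finding such an ordering; the function name and the dict-of-sets graph representation are assumptions made for this sketch.

```python
def is_perfect_elimination_ordering(adj, order):
    """Check whether `order` is a perfect elimination ordering of the graph.

    adj   : dict mapping each vertex to the set of its neighbours (assumed format)
    order : list containing every vertex exactly once
    """
    position = {v: i for i, v in enumerate(order)}
    for v in order:
        # Neighbours of v that appear after v in the ordering.
        later = [u for u in adj[v] if position[u] > position[v]]
        # They must be pairwise adjacent, i.e. form a clique.
        for i, a in enumerate(later):
            for b in later[i + 1:]:
                if b not in adj[a]:
                    return False
    return True
```

A graph is chordal exactly when some ordering passes this check, so a 4-cycle fails for every ordering while a triangle with a pendant vertex passes when the pendant vertex is eliminated first.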
Parallel Implementation of Algorithms for Finding Connected Components in Graphs
1997
Cited by 22 (1 self)
In this paper, we describe our implementation of several parallel graph algorithms for finding connected components. Our implementation, with virtual processing, is on a 16,384-processor MasPar MP-1 using the language MPL. We present extensive test data on our code. In our previous projects [21, 22, 23], we reported the implementation of an extensible parallel graph algorithms library. We developed general implementation and fine-tuning techniques without expending too much effort on optimizing each individual routine. We also handled the issue of implementing virtual processing. In this paper, we describe several algorithms and fine-tuning techniques that we developed for the problem of finding connected components in parallel; many of the fine-tuning techniques are of general interest, and should be applicable to code for other problems. We present data on the execution time and memory usage of our various implementations.
Parallel Open Ear Decomposition with Applications to Graph Biconnectivity and Triconnectivity
Synthesis of Parallel Algorithms, 1992
Cited by 21 (9 self)
This report deals with a parallel algorithmic technique that has proved to be very useful in the design of efficient parallel algorithms for several problems on undirected graphs. We describe this method for searching undirected graphs, called "open ear decomposition", and we relate this decomposition to graph biconnectivity. We present an efficient parallel algorithm for finding this decomposition and we relate it to a sequential algorithm based on depth-first search. We then apply open ear decomposition to obtain an efficient parallel algorithm for testing graph triconnectivity and for finding the triconnected components of a graph.
Finding Minimum Spanning Forests in Logarithmic Time and Linear Work Using Random Sampling
1996
Cited by 18 (0 self)
We describe a randomized CRCW PRAM algorithm that finds a minimum spanning forest of an n-vertex graph in O(log n) time and linear work. This shaves a factor of 2^(log* n) off the best previous running time for a linear-work algorithm. The novelty in our approach is to divide the computation into two phases, the first of which finds only a partial solution. This idea has been used previously in parallel connected components algorithms.
1 Introduction
We describe the first work-optimal minimum spanning forest (MSF) algorithm that runs in O(log n) time. The algorithm uses a random-sampling technique previously used by Karger, Klein, and Tarjan in a sequential linear-time algorithm and by Cole, Klein, and Tarjan in a parallel algorithm. These previous algorithms have the following form. Choose a random subset of edges, and recursively calculate the MSF of the sample graph, the graph consisting of the chosen edges. Use the recursively calculated minimum spanning forest to identify edges ...
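The sampling scheme described above can be sketched sequentially. The abstract is cut off before saying which edges are identified; this sketch assumes, as in the Karger-Klein-Tarjan sequential algorithm, that they are the edges strictly heavier than every edge on the sample-forest path between their endpoints, which by the cycle property cannot belong to the MSF. For brevity it uses Kruskal's algorithm in place of the recursive calls, and all function names are hypothetical.

```python
import random

def kruskal(n, edges):
    """Minimum spanning forest of weighted edges (w, u, v), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest

def max_weight_on_path(adj, src, dst):
    """Heaviest edge weight on the forest path src..dst, or None if disconnected."""
    best = {src: float("-inf")}
    stack = [src]
    while stack:
        x = stack.pop()
        for y, w in adj[x]:
            if y not in best:
                best[y] = max(best[x], w)
                stack.append(y)
    return best.get(dst)

def msf_by_sampling(n, edges, seed=0):
    """MSF via one round of sample, solve, filter, solve (sequential sketch)."""
    rng = random.Random(seed)
    sample = [e for e in edges if rng.random() < 0.5]
    sample_forest = kruskal(n, sample)      # stands in for the recursive call
    adj = [[] for _ in range(n)]
    for w, u, v in sample_forest:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Keep an edge only if the sample forest does not already certify,
    # via the cycle property, that it is too heavy to be in the MSF.
    kept = []
    for w, u, v in edges:
        heaviest = max_weight_on_path(adj, u, v)
        if heaviest is None or w <= heaviest:
            kept.append((w, u, v))
    return kruskal(n, kept)                 # stands in for the second recursive call
```

The path queries here cost linear time each, so this version is only meant to show the structure of the method, not its work bound.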
Connected Components on Distributed Memory Machines
Parallel Algorithms: 3rd DIMACS Implementation Challenge, October 17-19, 1994, volume 30 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 1994
Cited by 17 (1 self)
The efforts of the theory community to develop efficient PRAM algorithms often receive little attention from application programmers. Although there are PRAM algorithm implementations that perform reasonably on shared memory machines, they often perform poorly on distributed memory machines, where the cost of remote memory accesses is relatively high. We present a hybrid approach to solving the connected components problem, whereby a PRAM algorithm is merged with a sequential algorithm and then optimized to create an efficient distributed memory implementation. The sequential algorithm handles local work on each processor, and the PRAM algorithm handles interactions between processors. Our hybrid algorithm uses the Shiloach-Vishkin CRCW PRAM algorithm on a partition of the graph distributed over the processors and sequential breadth-first search within each local subgraph. The implementation uses the Split-C language developed at Berkeley, which provides a global address space and al...
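The two-level structure described above can be illustrated with a single-process simulation. This is only a sketch under stated assumptions: the block partition of vertices, the function name, and the simplified hook-and-shortcut loop (in the style of Shiloach-Vishkin, not the paper's optimized version) are all choices made for this example, and it is written in Python rather than Split-C.

```python
from collections import deque

def hybrid_components(n, edges, p):
    """Label components using local BFS plus a global hook-and-shortcut phase.

    n vertices are split into p contiguous blocks, one per simulated processor.
    """
    def owner(v):
        return v * p // n                   # assumed block partition

    # Split edges into those inside one block and those crossing blocks.
    adj = [[] for _ in range(n)]
    cross = []
    for u, v in edges:
        if owner(u) == owner(v):
            adj[u].append(v)
            adj[v].append(u)
        else:
            cross.append((u, v))

    # Local phase: sequential BFS over local edges only.
    label = [-1] * n
    for s in range(n):
        if label[s] == -1:
            label[s] = s
            queue = deque([s])
            while queue:
                x = queue.popleft()
                for y in adj[x]:
                    if label[y] == -1:
                        label[y] = s
                        queue.append(y)

    # Global phase: merge local components across cross edges.
    parent = list(range(n))                 # only local roots are ever hooked
    changed = True
    while changed:
        changed = False
        for u, v in cross:                  # hooking: larger root onto smaller
            ru, rv = parent[label[u]], parent[label[v]]
            if ru != rv:
                high, low = max(ru, rv), min(ru, rv)
                if parent[high] == high:
                    parent[high] = low
                    changed = True
        for r in range(n):                  # shortcutting (pointer jumping)
            while parent[r] != parent[parent[r]]:
                parent[r] = parent[parent[r]]
    return [parent[label[v]] for v in range(n)]
```

For example, `hybrid_components(8, [(0, 1), (1, 2), (2, 5), (5, 6), (3, 7)], p=2)` resolves the edges (0, 1), (1, 2) and (5, 6) locally and leaves only (2, 5) and (3, 7) for the global phase, which is the point of the hybrid design: remote interactions are limited to edges that cross processor boundaries.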

