Results 1-10 of 71
Programming Parallel Algorithms
, 1996
Cited by 193 (9 self)
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages ...
Applications of parametric searching in geometric optimization
 J. Algorithms
, 1994
"... Sivan Toledo ..."
Biconnectivity Approximations and Graph Carvings
, 1994
Cited by 84 (3 self)
A spanning tree in a graph is the smallest connected spanning subgraph. Given a graph, how does one find the smallest (i.e., least number of edges) 2-connected spanning subgraph (connectivity refers to both edge and vertex connectivity, if not specified)? Unfortunately, the problem is known to be NP-hard. We consider the problem of finding a better approximation to the smallest 2-connected subgraph, by an efficient algorithm. For 2-edge connectivity our algorithm guarantees a solution that is no more than 3/2 times the optimal. For 2-vertex connectivity our algorithm guarantees a solution that is no more than 5/3 times the optimal. The previous best approximation factor is 2 for each of these problems. The new algorithms (and their analyses) depend upon a structure called a carving of a graph, which is of independent interest. We show that approximating the optimal solution to within an additive constant is NP-hard as well. We also consider the case where the graph has edge weigh...
Cache-oblivious priority queue and graph algorithm applications
 In Proc. 34th Annual ACM Symposium on Theory of Computing
, 2002
Cited by 68 (10 self)
In this paper we develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and delete-min operations in $O(\frac{1}{B} \log_{M/B} \frac{N}{B})$ amortized memory transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cache-oblivious data structure, M and B are not used in the description of the structure. The bounds match the bounds of several previously developed external-memory (cache-aware) priority queue data structures, which all rely crucially on knowledge about M and B. Priority queues are a critical component in many of the best known external-memory graph algorithms, and using our cache-oblivious priority queue we develop several cache-oblivious graph algorithms.
Efficient parallel graph algorithms for coarse grained multicomputers and BSP (Extended Abstract)
 in Proc. 24th International Colloquium on Automata, Languages and Programming (ICALP'97)
, 1997
Cited by 59 (23 self)
In this paper, we present deterministic parallel algorithms for the coarse grained multicomputer (CGM) and bulk-synchronous parallel computer (BSP) models which solve the following well-known graph problems: (1) list ranking, (2) Euler tour construction, (3) computing the connected components and spanning forest, (4) lowest common ancestor preprocessing, (5) tree contraction and expression tree evaluation, (6) computing an ear decomposition or open ear decomposition, (7) 2-edge connectivity and biconnectivity (testing and component computation), and (8) chordal graph recognition (finding a perfect elimination ordering). The algorithms for Problems 1-7 require O(log p) communication rounds and linear sequential work per round. Our results for Problems 1 and 2 hold for arbitrary ratios n/p, i.e. they are fully scalable, and for Problems 3-8 it is assumed that $n/p \geq p^{\epsilon}$, $\epsilon > 0$, which is true for all commercially ...
Parallel Ear Decomposition Search (EDS) And ST-Numbering In Graphs
, 1986
Cited by 42 (2 self)
The [LEC67] linear-time serial algorithm for testing planarity of graphs uses the linear-time serial algorithm of [ET76] for st-numbering. This st-numbering algorithm is based on depth-first search (DFS). A known conjecture states that DFS, which is a key technique in designing serial algorithms, is not amenable to polylog-time parallelism using "around linearly" (or even polynomially) many processors. The first contribution of this paper is a general method for searching undirected graphs efficiently in parallel, called ear-decomposition search (EDS). The second contribution demonstrates the applicability of this search method. We present an efficient parallel algorithm for st-numbering in a biconnected graph. The algorithm runs in logarithmic time using a linear number of processors on a concurrent-read concurrent-write (CRCW) PRAM. An efficient parallel algorithm for the problem did not exist before. The problem was not even known to be in NC. 1. Introduction We define the problems ...
Analyzing the Behavior and Performance of Parallel Programs
 Univ. of Wisconsin-Madison, UW CS Tech. Rep
, 1993
Cited by 39 (5 self)
An analytical performance model for parallel programs can provide qualitative insight as well as efficient quantitative evaluation and prediction of parallel program performance. While stochastic models for parallel programs can represent execution time variance due to communication and resource contention delays, a qualitative assessment of previous models shows that the stochastic assumption makes it extremely difficult to compute synchronization costs and overall execution times. This thesis first reevaluates the need for the stochastic assumption by examining the influence of nondeterministic communication and resource contention delays on execution times in parallel programs. An analytical model of program behavior, combined with detailed program measurements, provides compelling evidence that in shared-memory programs on current systems as well as programs with similar granularity on foreseeable future systems, such delays introduce extremely low variance into the execution tim...
The Accelerated Centroid Decomposition Technique For Optimal Parallel Tree Evaluation In Logarithmic Time
, 1986
Cited by 39 (3 self)
A new general parallel algorithmic technique for computations on trees is presented. The new technique performs a reduction of the tree expression evaluation problem to list ranking; then, the list ranking provides a schedule for evaluating the tree operations. The technique needs logarithmic time using an optimal number of processors and has applications to other tree problems. This new technique enables us to systematically order four basic ideas and techniques for parallel algorithms on trees: (1) The list ranking problem. (2) The Euler tour technique on trees. (3) The centroid decomposition technique. (4) The new accelerated centroid decomposition (ACD) technique. 1. Introduction The model of parallel computation used in this paper is the concurrent-read exclusive-write (CREW) parallel random access machine (PRAM). A PRAM employs p synchronous processors all having access to a common memory. A CREW PRAM allows concurrent access by several processors to the same common memo...
A new succinct representation of RMQ-information and improvements in the enhanced suffix array
 PROC. ESCAPE. LNCS
, 2007
Cited by 38 (14 self)
The Range-Minimum-Query-Problem is to preprocess an array of length n in O(n) time such that all subsequent queries asking for the position of a minimal element between two specified indices can be obtained in constant time. This problem was first solved by Berkman and Vishkin [1], and Sadakane [2] gave the first succinct data structure that uses 4n + o(n) bits of additional space. In practice, this method has several drawbacks: it needs O(n log n) bits of intermediate space when constructing the data structure, and it builds on previous results on succinct data structures. We overcome these problems by giving the first algorithm that never uses more than 2n + o(n) bits, and does not rely on rank- and select-queries or other succinct data structures. We stress the importance of this result by simplifying and reducing the space consumption of the Enhanced Suffix Array [3], while retaining its capability of simulating top-down traversals of the suffix tree, used, e.g., to locate all occ positions of a pattern p in a text in optimal O(|p| + occ) time (assuming constant alphabet size). We further prove a lower bound of 2n − o(n) bits, which makes our algorithm asymptotically optimal.
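To make the query interface in the abstract above concrete, here is a minimal sketch of a range-minimum-query structure. It is not the succinct method of the paper: it uses the textbook sparse-table technique, which needs O(n log n) preprocessing time and O(n log n) words of space rather than O(n) time and 2n + o(n) bits, but it answers the same constant-time "position of a minimal element" queries. The function names build_sparse_table and rmq are illustrative assumptions, not taken from the paper.

```python
def build_sparse_table(a):
    """Return table where table[k][i] is the index of a minimum of a[i : i + 2**k]."""
    n = len(a)
    table = [list(range(n))]  # level 0: every element is the minimum of its own block
    k = 1
    while (1 << k) <= n:
        prev = table[k - 1]
        half = 1 << (k - 1)
        row = []
        for i in range(n - (1 << k) + 1):
            # Combine the two half-length blocks that make up a[i : i + 2**k].
            left, right = prev[i], prev[i + half]
            row.append(left if a[left] <= a[right] else right)
        table.append(row)
        k += 1
    return table


def rmq(a, table, i, j):
    """Position of a minimal element in a[i..j] (inclusive), for 0 <= i <= j < len(a)."""
    k = (j - i + 1).bit_length() - 1  # largest k with 2**k <= j - i + 1
    # Two (possibly overlapping) blocks of length 2**k cover the whole range.
    left, right = table[k][i], table[k][j - (1 << k) + 1]
    return left if a[left] <= a[right] else right
```

A query first picks the largest power of two that fits in the range, then compares the stored minima of the two blocks of that length anchored at either end of the range, so each query costs a constant number of table lookups.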