Results 1–10 of 17
A Decomposition of Multi-Dimensional Point Sets with Applications to k-Nearest-Neighbors and n-Body Potential Fields
 J. ACM
, 1992
Abstract

Cited by 242 (4 self)
We define the notion of a well-separated pair decomposition of points in d-dimensional space. We then develop efficient sequential and parallel algorithms for computing such a decomposition. We apply the resulting decomposition to the efficient computation of k-nearest neighbors and n-body potential fields.
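As an illustration of the separation condition the decomposition is built on, here is a minimal Python sketch (the helper names and the crude mean-centered enclosing ball are our own illustrative choices, not the paper's construction): two point sets are s-well-separated if each fits inside a ball of radius r and the two balls are at least s·r apart.

```python
import math

def bounding_ball(points):
    """Crude enclosing ball: center at the coordinate-wise mean,
    radius = max distance from that center (valid, though not minimal)."""
    d = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(d))
    radius = max(math.dist(center, p) for p in points)
    return center, radius

def well_separated(a, b, s):
    """True if point sets a and b are s-well-separated: each fits in a
    ball of radius r = max of the two radii, and the gap between the
    balls is at least s * r."""
    ca, ra = bounding_ball(a)
    cb, rb = bounding_ball(b)
    r = max(ra, rb)
    return math.dist(ca, cb) - 2 * r >= s * r
```

Distant clusters pass the test while nearby ones fail, which is exactly what lets the decomposition approximate all pairwise interactions with few pairs.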
Parallel Algorithms with Optimal Speedup for Bounded Treewidth
 Proceedings 22nd International Colloquium on Automata, Languages and Programming
, 1995
Abstract

Cited by 32 (10 self)
We describe the first parallel algorithm with optimal speedup for constructing minimum-width tree decompositions of graphs of bounded treewidth. On n-vertex input graphs, the algorithm works in O((log n)^2) time using O(n) operations on the EREW PRAM. We also give faster parallel algorithms with optimal speedup for the problem of deciding whether the treewidth of an input graph is bounded by a given constant and for a variety of problems on graphs of bounded treewidth, including all decision problems expressible in monadic second-order logic. On n-vertex input graphs, the algorithms use O(n) operations together with O(log n log* n) time on the EREW PRAM, or O(log n) time on the CRCW PRAM.
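For readers unfamiliar with the object being constructed, the following Python sketch (our own illustrative code, not the paper's algorithm) checks the three defining conditions of a tree decomposition and reports its width:

```python
def decomposition_width(bags):
    """Width of a tree decomposition = size of its largest bag minus 1."""
    return max(len(b) for b in bags) - 1

def is_tree_decomposition(graph_edges, vertices, bags, tree_edges):
    """Check the three defining conditions: (1) every vertex is in some
    bag, (2) every graph edge has both endpoints in some bag, (3) for
    each vertex, the bags containing it induce a connected subtree."""
    if not all(any(v in b for b in bags) for v in vertices):
        return False
    if not all(any(u in b and w in b for b in bags) for (u, w) in graph_edges):
        return False
    for v in vertices:
        nodes = {i for i, b in enumerate(bags) if v in b}
        start = next(iter(nodes))
        seen = {start}
        frontier = [start]
        while frontier:  # BFS over the tree edges restricted to `nodes`
            x = frontier.pop()
            for (a, c) in tree_edges:
                for (p, q) in ((a, c), (c, a)):
                    if p == x and q in nodes and q not in seen:
                        seen.add(q)
                        frontier.append(q)
        if seen != nodes:
            return False
    return True
```

A path 0-1-2 with bags {0,1} and {1,2} joined by one tree edge is a valid decomposition of width 1, i.e. the path has treewidth at most 1.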
Optimal Parallel All-Nearest-Neighbors Using the Well-Separated Pair Decomposition
 In Proc. 34th IEEE Symposium on Foundations of Computer Science
, 1993
Abstract

Cited by 27 (1 self)
We present an optimal parallel algorithm to construct the well-separated pair decomposition of a point set P in R^d. We show how this leads to a deterministic optimal O(log n) time parallel algorithm for finding the k-nearest-neighbors of each point in P, where k is a constant. We discuss several additional applications of the well-separated pair decomposition for which we can derive faster parallel algorithms.

1 Introduction

In [4] we introduced the well-separated pair decomposition of a set P of n points in R^d, and showed how to apply this decomposition to develop efficient parallel algorithms for two problems posed on multidimensional point sets. One of these applications led to the fastest known deterministic parallel algorithm for finding the k-nearest-neighbors of each point in P using O(n) processors. The time required for this algorithm is Θ(log^2 n), which is within a log n factor of optimal. In this paper, we close the gap by developing an optimal O(log n) time...
The Owner Concept for PRAMs
, 1991
Abstract

Cited by 17 (5 self)
We analyze the owner concept for PRAMs. In OROW-PRAMs each memory cell has one distinct processor that is the only one allowed to write into this memory cell and one distinct processor that is the only one allowed to read from it. By symmetric pointer doubling, a new proof technique for OROW-PRAMs, it is shown that list ranking can be done in O(log n) time by an OROW-PRAM and that LOGSPACE ⊆ OROW-TIME(log n). Then we prove that OROW-PRAMs are a fairly robust model and recognize the same class of languages when the model is modified in several ways, and that all kinds of PRAMs intertwine with the NC hierarchy without time loss. Finally it is shown that EREW-PRAMs can be simulated by OREW-PRAMs and ERCW-PRAMs by ORCW-PRAMs.

Introduction

Fortune and Wyllie introduced in...
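The list-ranking problem attacked here with symmetric pointer doubling can be stated compactly. Below is a sequential simulation of the classical pointer-jumping scheme that the paper's owner-respecting variant builds on (our own illustrative code; the symmetric, OROW-compatible technique itself is not reproduced):

```python
def list_rank(succ):
    """Rank (distance to the terminal node) of every node of a linked
    list, by pointer jumping. succ[i] is the successor of node i; the
    terminal points to itself. Each synchronous round doubles the
    distance already accounted for, so O(log n) rounds suffice; the
    rounds are simulated sequentially here."""
    n = len(succ)
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    while True:
        # read the old arrays, write fresh ones: one synchronous round
        new_rank = [rank[i] + (rank[nxt[i]] if nxt[i] != i else 0)
                    for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        if new_nxt == nxt:  # all pointers have reached the terminal
            break
        rank, nxt = new_rank, new_nxt
    return rank
```

For the list 0 → 1 → 2 → 3 (with 3 terminal) the ranks come out as 3, 2, 1, 0 after two jumping rounds.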
The Design and Analysis of BulkSynchronous Parallel Algorithms
, 1998
Abstract

Cited by 10 (1 self)
The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. This thesis presents a systematic approach to the design and analysis of BSP algorithms. We introduce an extension of the BSP model, called BSPRAM, which reconciles shared-memory style programming with efficient exploitation of data locality. The BSPRAM model can be optimally simulated by a BSP computer for a broad range of algorithms possessing certain characteristic properties: obliviousness, slackness, and granularity. We use BSPRAM to design BSP algorithms for problems from three large, partially overlapping domains: combinatorial computation, dense matrix computation, and graph computation. Some of the presented algorithms are adapted from known BSP algorithms (butterfly dag computation, cube dag computation, matrix multiplication). Other algorithms are obtained by application of established non-BSP techniques (sorting, randomised list contraction, Gaussian elimination without pivoting and with column pivoting, algebraic path computation), or use original techniques specific to the BSP model (deterministic list contraction, Gaussian elimination with nested block pivoting, communication-efficient multiplication of Boolean matrices, synchronisation-efficient shortest paths computation). The asymptotic BSP cost of each algorithm is established, along with its BSPRAM characteristics. We conclude by outlining some directions for future research.
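The BSP cost measure in which such analyses are expressed can be sketched as follows (the parameter names w, h, g, l follow the standard BSP convention; the helper function itself is our own illustration):

```python
def bsp_cost(supersteps, g, l):
    """Total BSP running cost. Each superstep contributes its maximum
    local work w, plus g times its maximum h-relation (the largest
    number of words any single processor sends or receives), plus the
    barrier synchronisation cost l."""
    return sum(w + g * h + l for (w, h) in supersteps)

# Two supersteps on a machine with gap g=4 and latency l=3:
# (10 + 4*2 + 3) + (5 + 4*0 + 3) = 21 + 8 = 29
total = bsp_cost([(10, 2), (5, 0)], g=4, l=3)
```

Separating the work, communication, and synchronisation terms is what lets BSP algorithms be compared independently of any particular machine.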
Diffusion: Calculating Efficient Parallel Programs
In 1999 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation (PEPM '99)
, 1999
Abstract

Cited by 9 (7 self)
Parallel primitives (skeletons) are intended to encourage programmers to build a parallel program from ready-made components for which efficient implementations are known to exist, making the parallelization process easier. However, programmers often find it difficult to choose a combination of proper parallel primitives so as to construct efficient parallel programs. To overcome this difficulty, we propose a new transformation, called diffusion, which can efficiently decompose a recursive definition into several functions such that each function can be described by some parallel primitive. This allows programmers to describe algorithms in a more natural recursive form. We demonstrate our idea with several interesting examples. Our diffusion transformation should be significant not only in the development of new parallel algorithms, but also in the construction of parallelizing compilers.
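The flavour of the transformation can be conveyed with a toy example of our own (not taken from the paper): a directly recursive definition of a sum of squares is rewritten into a composition of map and reduce skeletons, each of which has a known efficient parallel implementation.

```python
from functools import reduce
import operator

# Stand-ins for the skeleton primitives; a real skeleton library would
# run these in parallel, here they execute sequentially.
def parallel_map(f, xs):
    return [f(x) for x in xs]

def parallel_reduce(op, xs, unit):
    return reduce(op, xs, unit)

# Recursive definition:
#   sumsq([])     = 0
#   sumsq(x:xs)   = x*x + sumsq(xs)
# "diffused" into skeleton form: a map followed by a reduce.
def sumsq(xs):
    return parallel_reduce(operator.add,
                           parallel_map(lambda x: x * x, xs), 0)
```

The point is that the programmer writes the recursion; the transformation, not the programmer, picks the skeleton combination.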
Nested dissection: A survey and comparison of various nested dissection algorithms
, 1992
Abstract

Cited by 8 (1 self)
Methods for solving sparse linear systems of equations can be categorized under two broad classes: direct and iterative. Direct methods are based on Gaussian elimination. This report discusses one such direct method, namely nested dissection. Nested dissection, originally proposed by Alan George, is a technique for solving sparse linear systems efficiently. This report surveys some of the work in the area of nested dissection and attempts to put it together using a common framework.
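The recursive idea is easy to state: find a small separator, order the vertices of the two halves first and the separator last, and recurse. A minimal Python sketch for the special case of a path graph, where the middle vertex is a separator (our own illustration; real nested dissection uses graph separators, e.g. for grids or planar graphs):

```python
def nested_dissection_order(vertices):
    """Elimination order for a path graph by nested dissection:
    recursively split at the middle vertex (a one-vertex separator),
    eliminate the two halves first and the separator last."""
    if len(vertices) <= 1:
        return list(vertices)
    mid = len(vertices) // 2
    left = nested_dissection_order(vertices[:mid])
    right = nested_dissection_order(vertices[mid + 1:])
    return left + right + [vertices[mid]]
```

Eliminating separators last is what limits fill-in during the factorization.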
Parallel External Memory Graph Algorithms
Abstract

Cited by 8 (3 self)
In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking, which leads to solutions for many problems on trees, such as computing the Euler tour, preorder and postorder numbering of the vertices, the depth of each vertex and the sizes of subtrees rooted at each vertex of the tree. We also study the problems of computing the connected components of a graph and the minimum spanning tree of a connected graph. All our solutions provide an optimal speedup of O(p) in parallel I/O complexity compared to the single-processor external memory versions of the algorithms.
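One of the studied problems, connected components, can be illustrated by a sequential stand-in for the label-propagation (hooking) idea used in parallel connectivity algorithms (our own simplified sketch, with none of the I/O-efficiency of the paper's solution):

```python
def connected_components(n, edges):
    """Connected components by iterated label propagation: every
    endpoint of an edge repeatedly adopts the smaller of the two
    labels, until no label changes. The final label of a vertex is
    the smallest vertex id in its component."""
    label = list(range(n))
    changed = True
    while changed:
        changed = False
        for (u, v) in edges:
            m = min(label[u], label[v])
            if label[u] != m or label[v] != m:
                label[u] = label[v] = m
                changed = True
    return label
```

Each pass over the edge list corresponds roughly to one hooking round of the parallel algorithms.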
Real-Time Minimum Vertex Cover For Two-Terminal Series-Parallel Graphs
 Proceedings of the Thirteenth Conference on Parallel and Distributed Computing and Systems
, 2000
Abstract

Cited by 7 (7 self)
Tree contraction is a powerful technique for solving a large number of graph problems on families of recursively definable graphs. The method is based on processing the parse tree associated with a member of such a family of graphs in a bottom-up fashion, such that the solution to the problem is obtained at the root of the tree. Sequentially, this can be done in linear time with respect to the size of the input graph. In parallel, efficient and even cost-optimal tree contraction algorithms have also been developed. In this paper we show how the method can be applied to compute the cardinality of the minimum vertex cover of a two-terminal series-parallel graph. We then construct a real-time paradigm for this problem and show that in the new computational environment, a parallel algorithm is superior to the best possible sequential algorithm, in terms of the accuracy of the solution computed. Specifically, there are cases in which the solution produced by a parallel algorithm ...
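The bottom-up computation over the parse tree can be made concrete. The sketch below (our own illustrative formulation, not the paper's algorithm) keeps, for every series-parallel subgraph, a 2×2 table of minimum cover sizes indexed by whether each terminal is in the cover; series and parallel composition combine child tables, and the minimum entry at the root is the answer:

```python
INF = float('inf')

def sp_edge():
    """Cover table for a single edge between the terminals s and t.
    table[a][b] = min cover size with s in the cover iff a, t iff b;
    leaving both out cannot cover the edge, hence INF."""
    return [[INF, 1], [1, 2]]

def sp_series(t1, t2):
    """Series composition: identify G1's t-terminal with G2's
    s-terminal; minimise over the shared vertex's state x, subtracting
    x so the shared vertex is not counted twice."""
    return [[min(t1[a][x] + t2[x][b] - x for x in (0, 1))
             for b in (0, 1)] for a in (0, 1)]

def sp_parallel(t1, t2):
    """Parallel composition: identify both terminal pairs, subtracting
    a + b for the doubly counted shared terminals."""
    return [[t1[a][b] + t2[a][b] - a - b for b in (0, 1)] for a in (0, 1)]

def min_vertex_cover(table):
    """Cardinality of the minimum vertex cover at the parse-tree root."""
    return min(table[a][b] for a in (0, 1) for b in (0, 1))
```

For example, a triangle built as the parallel composition of an edge with a two-edge path has minimum vertex cover 2, and the tables compute exactly that.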
Parallel Priority Queue and List Contraction: The BSP Approach
 In Proc. Euro-Par '97, LNCS
, 1997
Abstract

Cited by 5 (0 self)
In this paper we present efficient and practical extensions of the randomized Parallel Priority Queue (PPQ) algorithms of Ranade et al., and efficient randomized and deterministic algorithms for the problem of list contraction on the Bulk-Synchronous Parallel (BSP) model. We also present an experimental study of their performance. We show that our algorithms are communication efficient and achieve small multiplicative constant factors for a wide range of parallel machines.

1 Introduction

We present an architecture-independent study of the computation and communication requirements of an efficient Parallel Priority Queue (PPQ) implementation and list contraction algorithms, along with an experimental study. The computational model adopted is the Bulk-Synchronous Parallel (BSP) model, proposed by L. G. Valiant [20], which deals explicitly with the notion of communication and synchronization among computational threads. A detailed discussion of the BSP model appears in [20]. The first a...