Results 1 - 10 of 16
A Decomposition of Multi-Dimensional Point Sets with Applications to k-Nearest-Neighbors and n-Body Potential Fields
- J. ACM
, 1992
Cited by 214 (4 self)
We define the notion of a well-separated pair decomposition of points in d-dimensional space. We then develop efficient sequential and parallel algorithms for computing such a decomposition. We apply the resulting decomposition to the efficient computation of k-nearest neighbors and n-body potential fields.
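The abstract does not restate the definition, so the following is only a minimal sketch of the commonly cited one: two point sets are s-well-separated if they fit in two balls of equal radius r whose gap is at least s·r. The function names and the bounding-box-centered ball (which is enclosing but not minimal, so the check is conservative) are assumptions made for illustration, not the paper's construction.

```python
import math

def enclosing_ball(points):
    """Return (center, radius) of a ball containing all points.

    Assumption: the ball is centered at the bounding-box center, so it is
    a valid enclosing ball but not necessarily the smallest one.
    """
    d = len(points[0])
    lo = [min(p[i] for p in points) for i in range(d)]
    hi = [max(p[i] for p in points) for i in range(d)]
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    radius = max(math.dist(center, p) for p in points)
    return center, radius

def well_separated(a, b, s):
    """Conservative test: True implies a and b are s-well-separated."""
    ca, ra = enclosing_ball(a)
    cb, rb = enclosing_ball(b)
    r = max(ra, rb)                      # use one common radius for both balls
    gap = math.dist(ca, cb) - 2 * r      # distance between the two balls
    return gap >= s * r
```

For example, `well_separated([(0, 0), (1, 0)], [(10, 0), (11, 0)], 2)` holds, while moving the second set next to the first makes the test fail for a larger separation constant.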
Parallel Algorithms with Optimal Speedup for Bounded Treewidth
- Proceedings 22nd International Colloquium on Automata, Languages and Programming
, 1995
Cited by 29 (10 self)
We describe the first parallel algorithm with optimal speedup for constructing minimum-width tree decompositions of graphs of bounded treewidth. On n-vertex input graphs, the algorithm works in O((log n)^2) time using O(n) operations on the EREW PRAM. We also give faster parallel algorithms with optimal speedup for the problem of deciding whether the treewidth of an input graph is bounded by a given constant and for a variety of problems on graphs of bounded treewidth, including all decision problems expressible in monadic second-order logic. On n-vertex input graphs, the algorithms use O(n) operations together with O(log n log* n) time on the EREW PRAM, or O(log n) time on the CRCW PRAM.
Optimal Parallel All-Nearest-Neighbors Using the Well-Separated Pair Decomposition
- In Proc. 34th IEEE Symposium on Foundations of Computer Science
, 1993
Cited by 25 (1 self)
We present an optimal parallel algorithm to construct the well-separated pair decomposition of a point set P in ℝ^d. We show how this leads to a deterministic optimal O(log n) time parallel algorithm for finding the k-nearest-neighbors of each point in P, where k is a constant. We discuss several additional applications of the well-separated pair decomposition for which we can derive faster parallel algorithms. 1 Introduction. In [4] we introduced the well-separated pair decomposition of a set P of n points in ℝ^d, and showed how to apply this decomposition to develop efficient parallel algorithms for two problems posed on multidimensional point sets. One of these applications led to the fastest known deterministic parallel algorithm for finding the k-nearest-neighbors of each point in P using O(n) processors. The time required for this algorithm is Θ(log^2 n), which is within a log n factor of optimal. In this paper, we close the gap by developing an optimal O(log n) ti...
The Owner Concept for PRAMs
, 1991
Cited by 17 (5 self)
We analyze the owner concept for PRAMs. In OROW-PRAMs each memory cell has one distinct processor that is the only one allowed to write into this memory cell and one distinct processor that is the only one allowed to read from it. By symmetric pointer doubling, a new proof technique for OROW-PRAMs, it is shown that list ranking can be done in O(log n) time by an OROW-PRAM and that LOGSPACE ⊆ OROW-TIME(log n). Then we prove that OROW-PRAMs are a fairly robust model and recognize the same class of languages when the model is modified in several ways, and that all kinds of PRAMs intertwine with the NC-hierarchy without time loss. Finally it is shown that EREW-PRAMs can be simulated by OREW-PRAMs and ERCW-PRAMs by ORCW-PRAMs. This research was partially supported by the Deutsche Forschungsgemeinschaft, SFB 342, Teilprojekt A4 "Klassifikation und Parallelisierung durch Reduktionsanalyse". E-mail: rossmani@lan.informatik.tu-muenchen.dbp.de Introduction. Fortune and Wyllie introduced in...
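As a rough illustration of list ranking by pointer doubling, here is a sequential simulation of the classic synchronous pointer-jumping scheme. It is not the paper's symmetric pointer doubling and it ignores the OROW read/write ownership restrictions; the function name and the convention that the tail points to itself are assumptions of this sketch.

```python
def list_rank(succ):
    """Rank each node of a linked list by its distance to the tail.

    succ[i] is the successor of node i; the tail satisfies succ[t] == t.
    Each loop iteration simulates one synchronous parallel round in which
    every node doubles the length of its pointer.
    Returns (rank, rounds).
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    rounds = 0
    # Continue while some pointer has not yet reached the tail.
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # Read only the old arrays, then swap, to mimic a synchronous step.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
        rounds += 1
    return rank, rounds
```

On a chain of n nodes the number of simulated rounds grows logarithmically in n, which is the behavior the O(log n) bound refers to; the total work of this naive scheme is O(n log n).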
The Design and Analysis of Bulk-Synchronous Parallel Algorithms
, 1998
Cited by 9 (1 self)
The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. This thesis presents a systematic approach to the design and analysis of BSP algorithms. We introduce an extension of the BSP model, called BSPRAM, which reconciles shared-memory style programming with efficient exploitation of data locality. The BSPRAM model can be optimally simulated by a BSP computer for a broad range of algorithms possessing certain characteristic properties: obliviousness, slackness, granularity. We use BSPRAM to design BSP algorithms for problems from three large, partially overlapping domains: combinatorial computation, dense matrix computation, graph computation. Some of the presented algorithms are adapted from known BSP algorithms (butterfly dag computation, cube dag computation, matrix multiplication). Other algorithms are obtained by application of established non-BSP techniques (sorting, randomised list contraction, Gaussian elimination without pivoting and with column pivoting, algebraic path computation), or use original techniques specific to the BSP model (deterministic list contraction, Gaussian elimination with nested block pivoting, communication-efficient multiplication of Boolean matrices, synchronisation-efficient shortest paths computation). The asymptotic BSP cost of each algorithm is established, along with its BSPRAM characteristics. We conclude by outlining some directions for future research.
Diffusion: Calculating Efficient Parallel Programs
- In 1999 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation (PEPM ’99)
, 1999
Cited by 8 (7 self)
Parallel primitives (skeletons) are intended to encourage programmers to build a parallel program from ready-made components for which efficient implementations are known to exist, making the parallelization process easier. However, programmers often have difficulty choosing a suitable combination of parallel primitives from which to construct efficient parallel programs. To overcome this difficulty, we propose a new transformation, called diffusion, which can efficiently decompose a recursive definition into several functions such that each function can be described by some parallel primitive. This allows programmers to describe algorithms in a more natural recursive form. We demonstrate our idea with several interesting examples. Our diffusion transformation should be significant not only in the development of new parallel algorithms, but also in the construction of parallelizing compilers.
Real-Time Minimum Vertex Cover For Two-Terminal Series-Parallel Graphs
- Proceedings of the Thirteenth Conference on Parallel and Distributed Computing and Systems
, 2000
Cited by 8 (8 self)
Tree contraction is a powerful technique for solving a large number of graph problems on families of recursively definable graphs. The method is based on processing the parse tree associated with a member of such a family of graphs in a bottom-up fashion, such that the solution to the problem is obtained at the root of the tree. Sequentially, this can be done in linear time with respect to the size of the input graph. In parallel, efficient and even cost optimal tree contraction algorithms have also been developed. In this paper we show how the method can be applied to compute the cardinality of the minimum vertex cover of a two-terminal series-parallel graph. We then construct a real-time paradigm for this problem and show that in the new computational environment, a parallel algorithm is superior to the best possible sequential algorithm, in terms of the accuracy of the solution computed. Specifically, there are cases in which the solution produced by a parallel algorithm ...
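The bottom-up processing of a parse tree can be sketched sequentially as follows. This is a plain dynamic program over the series-parallel parse tree, not the paper's parallel tree contraction or its real-time variant; the nested-tuple encoding `("e",)`, `("s", left, right)`, `("p", left, right)` and the function names are assumptions made for this example.

```python
INF = float("inf")

def cover_table(tree):
    """Return c where c[a][b] is the minimum vertex cover size of the
    subgraph, given that the source terminal is in the cover iff a == 1
    and the sink terminal is in the cover iff b == 1."""
    kind = tree[0]
    if kind == "e":
        # A single edge: at least one endpoint must be chosen.
        return [[INF, 1], [1, 2]]
    c1 = cover_table(tree[1])
    c2 = cover_table(tree[2])
    c = [[INF, INF], [INF, INF]]
    for a in (0, 1):
        for b in (0, 1):
            if kind == "s":
                # Series: the shared middle vertex x is counted in both
                # tables, so subtract it once.
                c[a][b] = min(c1[a][x] + c2[x][b] - x for x in (0, 1))
            else:
                # Parallel: both terminals are shared and counted twice.
                c[a][b] = c1[a][b] + c2[a][b] - a - b
    return c

def min_vertex_cover_size(tree):
    """Cardinality of a minimum vertex cover, read off at the root."""
    c = cover_table(tree)
    return min(c[a][b] for a in (0, 1) for b in (0, 1))
```

For instance, a triangle can be written as `("p", ("e",), ("s", ("e",), ("e",)))`, and the value obtained at the root is 2.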
Locating The Median Of A Tree In Real Time
- Proceedings of the Fourteenth Conference on Parallel and Distributed Computing and Systems
, 2001
Cited by 6 (6 self)
Determining the optimal location of a switching center in a tree network of users is accurately modeled by the median problem. A real-time approach is used in this paper to investigate the dynamics of such a communication network in two cases: (1) a growing tree of nodes associated with equal demand rates, and (2) a stream of corrections that arbitrarily change the demand rates at the nodes. The worst-case analysis performed in both situations clearly demonstrates the importance of parallelism in such real-time paradigms. It is shown that the error generated by the best sequential algorithm in the first case can be arbitrarily large. A synergistic behavior is revealed when the quality-up is investigated in the second case.
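For orientation, the static version of the median problem on a tree can be sketched as below. It relies on the well-known characterization that a vertex is a median when no subtree hanging off it carries more than half of the total demand. This is a sequential, offline sketch and does not model the paper's real-time setting; the adjacency-list input format and the function name are assumptions.

```python
def tree_median(adj, weight, root=0):
    """Return a vertex minimizing the demand-weighted sum of distances.

    adj[v] lists the neighbors of v in an undirected tree; weight[v] is
    the non-negative demand rate at v.
    """
    n = len(adj)
    parent = [-1] * n
    seen = [False] * n
    seen[root] = True
    order = []
    stack = [root]
    while stack:                      # iterative DFS; parents precede children
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                stack.append(u)
    sub = list(weight)                # sub[v] = total demand in v's subtree
    for v in reversed(order):
        if parent[v] != -1:
            sub[parent[v]] += sub[v]
    total = sub[root]
    v = root
    while True:
        # Step into the child whose subtree holds more than half the demand.
        heavy = next((u for u in adj[v]
                      if parent[u] == v and 2 * sub[u] > total), None)
        if heavy is None:
            return v
        v = heavy
```

On the path 0 - 1 - 2 with demands 1, 1, 5, the walk moves from the root to vertex 2, where most of the demand sits.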
Nested dissection: A survey and comparison of various nested dissection algorithms
, 1992
Cited by 6 (1 self)
Methods for solving sparse linear systems of equations fall into two broad classes: direct and iterative. Direct methods are based on Gaussian elimination. This report discusses one such direct method, namely nested dissection. Nested dissection, originally proposed by Alan George, is a technique for solving sparse linear systems efficiently. This report surveys some of the work in the area of nested dissection and attempts to put it together using a common framework.
Parallel Priority Queue and List Contraction: The BSP Approach
- In Proc. Euro-Par 97. LNCS
, 1997
Cited by 5 (0 self)
In this paper we present efficient and practical extensions of the randomized Parallel Priority Queue (PPQ) algorithms of Ranade et al., and efficient randomized and deterministic algorithms for the problem of list contraction on the Bulk-Synchronous Parallel (BSP) model. We also present an experimental study of their performance. We show that our algorithms are communication efficient and achieve small multiplicative constant factors for a wide range of parallel machines. 1 Introduction. We present an architecture-independent study of the computation and communication requirements of an efficient Parallel Priority Queue (PPQ) implementation and list contraction algorithms along with an experimental study. The computational model adopted is the Bulk-Synchronous Parallel (BSP) model, proposed by L. G. Valiant [20], which deals explicitly with the notion of communication and synchronization among computational threads. A detailed discussion of the BSP model appears in [20]. The first a...

