Results 1  10
of
27
Programming Parallel Algorithms
, 1996
"... In the past 20 years there has been treftlendous progress in developing and analyzing parallel algorithftls. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some ofthese algorithms are efficient only in a th ..."
Abstract

Cited by 191 (9 self)
 Add to MetaCart
In the past 20 years there has been treftlendous progress in developing and analyzing parallel algorithftls. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some ofthese algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding ofparallelism but in several cases has led to improvements in sequential algorithms. Unf:ortunately there has been less success in developing good languages f:or prograftlftling parallel algorithftls, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
The NPcompleteness column: an ongoing guide
 Journal of Algorithms
, 1985
"... This is the nineteenth edition of a (usually) quarterly column that covers new developments in the theory of NPcompleteness. The presentation is modeled on that used by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NPCompleteness,’ ’ W. H. Freeman & Co ..."
Abstract

Cited by 188 (0 self)
 Add to MetaCart
This is the nineteenth edition of a (usually) quarterly column that covers new developments in the theory of NPcompleteness. The presentation is modeled on that used by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NPCompleteness,’ ’ W. H. Freeman & Co., New York, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed, and, when appropriate, crossreferences will be given to that book and the list of problems (NPcomplete and harder) presented there. Readers who have results they would like mentioned (NPhardness, PSPACEhardness, polynomialtimesolvability, etc.) or open problems they would like publicized, should
merging, and sorting in parallel models of computation
 in “Proc. 14th Annual ACM Sympos. on Theory of Cornput
, 1982
"... A variety of models have been proposed for the study of synchronous parallel computation. These models are reviewed and some prototype problems are studied further. Two classes of models are recognized, fixed connection networks and models based on a shared memory. Routing and sorting are prototype ..."
Abstract

Cited by 105 (3 self)
 Add to MetaCart
A variety of models have been proposed for the study of synchronous parallel computation. These models are reviewed and some prototype problems are studied further. Two classes of models are recognized, fixed connection networks and models based on a shared memory. Routing and sorting are prototype problems for the networks; in particular, they provide the basis for simulating the more powerful shared memory models. It is shown that a simple but important class of deterministic strategies (oblivious routing) is necessarily inefficient with respect to worst case analysis. Routing can be viewed as a special case of sorting, and the existence of an O(log n) sorting algorithm for some n processor fixed connection network has only recently been established by Ajtai, Komlos, and Szemeredi (“15th ACM Sympos. on Theory of Cornput., ” Boston, Mass., 1983, pp. l9). If the more powerful class of shared memory models is considered then it is possible to simply achieve an O(log n loglog n) sort via Valiant’s parallel merging algorithm, which it is shown can be implemented on certain models. Within a spectrum of shared memory models, it is shown that loglogn is asymptotically optimal for n processors to merge two sorted lists containing n elements. 0 1985 Academic Press, Inc.
Horizons of Parallel Computation
 JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1993
"... This paper considers the ultimate impact of fundamental physical limitationsnotably, speed of light and device sizeon parallel computing machines. Although we fully expect an innovative and very gradual evolution to the limiting situation, we take here the provocative view of exploring the ..."
Abstract

Cited by 39 (3 self)
 Add to MetaCart
This paper considers the ultimate impact of fundamental physical limitationsnotably, speed of light and device sizeon parallel computing machines. Although we fully expect an innovative and very gradual evolution to the limiting situation, we take here the provocative view of exploring the consequences of the accomplished attainment of the physical bounds. The main result is that scalability holds only for neighborly interconnections, such as the square mesh, of boundedsize synchronous modules, presumably of the areauniversal type. We also discuss the ultimate infeasibility of latencyhiding, the violation of intuitive maximal speedups, and the emerging novel processortime tradeoffs.
Fast Parallel Absolute Irreducibility Testing
 J. Symbolic Comput
, 1985
"... We present a fast parallel deterministic algorithm for testing multivariate integral polynomials for absolute irreducibility, that is irreducibility over the complex numbers. More precisely, we establish that the set of absolutely irreducible integral polynomials belongs to the complexity class NC o ..."
Abstract

Cited by 31 (7 self)
 Add to MetaCart
We present a fast parallel deterministic algorithm for testing multivariate integral polynomials for absolute irreducibility, that is irreducibility over the complex numbers. More precisely, we establish that the set of absolutely irreducible integral polynomials belongs to the complexity class NC of Boolean circuits of polynomial size and logarithmic depth. Therefore it also belongs to the class of sequentially polynomialtime problems. Our algorithm can be extended to compute in parallel one irreducible complex factor of a multivariate integral polynomial. However, the coefficients of the computed factor are only represented modulo a not necessarily irreducible polynomial specifying a splitting field. A consequence of our algoithm is that multivariate polynomials over finite fields can be tested for absolute irreducibility in deterministic sequential polynomial time in the size of the input. We also obtain a sharp bound for the last prime p for which, when taking an absolutely irreducible integral polynomial modulo p, the polynomial's irreducibility in the algebraic closure of the finite field order p is not preserved.
Efficient Parallel Evaluation of Straightline Code and Arithmetic Circuits
 SIAM J. Comput
, 1988
"... A new parallel algorithm is given to evaluate a straight line program. The algorithm evaluates a program over a commutative semiring R of degree d and size n in time O(log n(log nd)) using M(n) processors, where M(n) is the number of processors required for multiplying n \Theta n matrices over the ..."
Abstract

Cited by 31 (5 self)
 Add to MetaCart
A new parallel algorithm is given to evaluate a straight line program. The algorithm evaluates a program over a commutative semiring R of degree d and size n in time O(log n(log nd)) using M(n) processors, where M(n) is the number of processors required for multiplying n \Theta n matrices over the semiring R in O(log n) time. Appears in SIAM J. Comput., 17/4, pp. 687695 (1988). Preliminary version of this paper appeared in [6]. y Research supported in part by National Science Foundation Grant MCS800756 A01. z Research supported by NSF under ECS8404866, the Semiconductor Research Corporation under RSCH 84060496, and by an IBM Faculty Development Award. x Research Supported in part by NSF Grant DCR8504391 and by an IBM Faculty Development Award. 1 INTRODUCTION 1 1 Introduction In this paper we consider the problem of dynamic evaluation of a straight line program in parallel. This is a generalization of the result of Valiant et al [10]. They consider the problem of ta...
Parallel RAMs with Owned Global Memory and Deterministic ContextFree Language Recognition
, 1997
"... We identify and study a natural and frequently occurring subclass of ConcurrentRead, ExclusiveWrite Parallel Random Access Machines (CREWPRAMs). Called ConcurrentRead, OwnerWrite, or CROWPRAMs, these are machines in which each global memory location is assigned a unique "owner" processor, whi ..."
Abstract

Cited by 26 (0 self)
 Add to MetaCart
We identify and study a natural and frequently occurring subclass of ConcurrentRead, ExclusiveWrite Parallel Random Access Machines (CREWPRAMs). Called ConcurrentRead, OwnerWrite, or CROWPRAMs, these are machines in which each global memory location is assigned a unique "owner" processor, which is the only processor allowed to write into it. Considering the difficulties that would be involved in physically realizing a full CREWPRAM model, it is interesting to observe that in fact, most known CREWPRAM algorithms satisfy the CROW restriction or can be easily modified to do so. This paper makes three main contributions. First, we formally define the CROWPRAM model and demonstrate its stability
Open Problems in Number Theoretic Complexity, II
"... this paper contains a list of 36 open problems in numbertheoretic complexity. We expect that none of these problems are easy; we are sure that many of them are hard. This list of problems reflects our own interests and should not be viewed as definitive. As the field changes and becomes deeper, new ..."
Abstract

Cited by 26 (0 self)
 Add to MetaCart
this paper contains a list of 36 open problems in numbertheoretic complexity. We expect that none of these problems are easy; we are sure that many of them are hard. This list of problems reflects our own interests and should not be viewed as definitive. As the field changes and becomes deeper, new problems will emerge and old problems will lose favor. Ideally there will be other `open problems' papers in future ANTS proceedings to help guide the field. It is likely that some of the problems presented here will remain open for the forseeable future. However, it is possible in some cases to make progress by solving subproblems, or by establishing reductions between problems, or by settling problems under the assumption of one or more well known hypotheses (e.g. the various extended Riemann hypotheses, NP 6= P; NP 6= coNP). For the sake of clarity we have often chosen to state a specific version of a problem rather than a general one. For example, questions about the integers modulo a prime often have natural generalizations to arbitrary finite fields, to arbitrary cyclic groups, or to problems with a composite modulus. Questions about the integers often have natural generalizations to the ring of integers in an algebraic number field, and questions about elliptic curves often generalize to arbitrary curves or abelian varieties. The problems presented here arose from many different places and times. To those whose research has generated these problems or has contributed to our present understanding of them but to whom inadequate acknowledgement is given here, we apologize. Our list of open problems is derived from an earlier `open problems' paper we wrote in 1986 [AM86]. When we wrote the first version of this paper, we feared that the problems presented were so difficult...
Scalability of Parallel Sorting on Mesh Multicomputers
, 1991
"... This paper presents two new parallel algorithms QSP1 and QSP2 based on sequential quicksort for sorting data on a mesh multicomputer, and analyzes their scalability using the isoefficiency metric. We show that QSP2 matches the lower bound on the isoefficiency function for mesh multicomputers. The is ..."
Abstract

Cited by 18 (12 self)
 Add to MetaCart
This paper presents two new parallel algorithms QSP1 and QSP2 based on sequential quicksort for sorting data on a mesh multicomputer, and analyzes their scalability using the isoefficiency metric. We show that QSP2 matches the lower bound on the isoefficiency function for mesh multicomputers. The isoefficiency of QSP1 is also fairly close to optimal. Lang et al. and Schnorr et al. have developed parallel sorting algorithms for the mesh architecture that have either optimal (Schnorr) or close to optimal (Lang) runtime complexity for the oneelementperprocessor case. Both QSP1 and QSP2 have worse performance than these algorithms for the oneelementperprocessor case. But QSP1 and QSP2 have better scalability than the scaleddown variants of these algorithms (for the case in which there are more elements than processors). As a result, our new parallel formulations are better than these scaleddown variants in terms of speedup w.r.t the best sequential algorithms. We also present a dif...