Results 1 to 10 of 19
Analyzing Scalability of Parallel Algorithms and Architectures
 Journal of Parallel and Distributed Computing
, 1994
Abstract

Cited by 90 (18 self)
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithm-architecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
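The paper's own formalisms are not reproduced in this listing, but the two base quantities that any such scalability analysis builds on, speedup and efficiency, can be illustrated with a small sketch. The function names and the timing numbers below are illustrative assumptions, not taken from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(p) = T_1 / T_p: how many times faster the parallel run is
    than the best serial run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """Efficiency E(p) = S(p) / p: the fraction of ideal linear speedup
    actually achieved on p processors."""
    return speedup(t_serial, t_parallel) / p

# Toy example: a computation taking 100 s serially and 16 s on 8 processors.
s = speedup(100.0, 16.0)        # 6.25x faster
e = efficiency(100.0, 16.0, 8)  # 0.78125, i.e. ~78% of ideal
```

Scalability analysis in the paper's sense then asks how fast the problem size must grow with p to keep E(p) from decaying, which is what makes it possible to rank algorithm-architecture combinations.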
Can Parallel Algorithms Enhance Serial Implementation? (Extended Abstract)
, 1996
Abstract

Cited by 14 (4 self)
The broad thesis presented in this paper suggests that the serial emulation of a parallel algorithm has the potential advantage of running on a serial machine faster than a standard serial algorithm for the same problem. It is too early to reach definite conclusions regarding the significance of this thesis. However, using some imagination, validity of the thesis and some arguments supporting it may lead to several far-reaching outcomes: (1) Reliance on "predictability of reference" in the design of computer systems will increase. (2) Parallel algorithms will be taught as part of the standard computer science and engineering undergraduate curriculum irrespective of whether (or when) parallel processing will become ubiquitous in the general-purpose computing world. (3) A strategic agenda for high-performance parallel computing: a multi-stage agenda, which in no stage compromises user-friendliness of the programmer's...
PRO: a model for Parallel Resource-Optimal computation
 In 16th Annual International Symposium on High Performance Computing Systems and Applications, IEEE
, 2002
Abstract

Cited by 11 (4 self)
We present a new parallel computation model that enables the design of resource-optimal scalable parallel algorithms and simplifies their analysis. The model rests on the novel idea of incorporating relative optimality as an integral part and measuring the quality of a parallel algorithm in terms of granularity.
Local Consistency in Parallel Constraint-Satisfaction Networks
 Artificial Intelligence
, 1994
Abstract

Cited by 10 (1 self)
We summarize our work on the parallel complexity of local consistency in constraint networks, and present several basic techniques for achieving parallel execution of constraint networks. We are interested primarily in developing a classification of constraint networks according to whether they admit massively parallel execution. The major result supported by our investigations is that the parallel complexity of constraint networks is critically dependent on subtle properties of the network that do not influence its sequential complexity.

1 Introduction

In this position paper we summarize our work on the parallel complexity of local consistency in constraint networks [Kas90, Kas86, Kas89, KRS87, KD90]. Our research is aimed at deriving a precise characterization of the utility of parallelism in such networks. We are interested primarily in developing a classification of constraint networks according to whether they admit massively parallel execution. We have analyzed parallel executio...
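The paper's parallel techniques are not reproduced in this listing; as background, the local-consistency primitive it studies corresponds to classic arc consistency. A minimal sequential AC-3-style sketch, with an illustrative data representation of my own choosing (dict domains, directed arcs mapped to predicates), looks like this:

```python
from collections import deque

def ac3(domains, constraints):
    """Enforce arc consistency on a binary constraint network.

    domains: dict var -> set of candidate values
    constraints: dict (x, y) -> predicate(vx, vy) that must hold
    Returns False if some domain is wiped out, True otherwise.
    Mutates `domains` in place.
    """
    queue = deque(constraints)  # all directed arcs (x, y)
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        # Remove values of x that have no supporting value in y's domain.
        revised = {vx for vx in domains[x]
                   if not any(pred(vx, vy) for vy in domains[y])}
        if revised:
            domains[x] -= revised
            if not domains[x]:
                return False
            # A shrunken domain can invalidate support for arcs into x.
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True

# Toy network encoding x < y over domains {1, 2, 3}.
doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
cons = {("x", "y"): lambda vx, vy: vx < vy,
        ("y", "x"): lambda vy, vx: vx < vy}
ok = ac3(doms, cons)
# Propagation removes 3 from x (no larger y) and 1 from y (no smaller x).
```

The paper's point is precisely that whether such propagation parallelizes well depends on subtle structural properties of the network, not on the cost of this sequential loop.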
A Theory of Strict P-Completeness
 STACS 1992, in Lecture Notes in Computer Science 577
, 1992
Abstract

Cited by 10 (0 self)
A serious limitation of the theory of P-completeness is that it fails to distinguish between those P-complete problems that do have polynomial speedup on parallel machines from those that don't. We introduce the notion of strict P-completeness and develop tools to prove precise limits on the possible speedups obtainable for a number of P-complete problems.

Key words. Parallel computation; P-completeness. Subject classifications. 68Q15, 68Q22.

1. Introduction

A major goal of the theory of parallel computation is to understand how much speedup is obtainable in solving a problem on parallel machines over sequential machines. The theory of P-completeness has successfully classified many problems as unlikely to have polylog time algorithms on a parallel machine with a polynomial number of processors. However, the theory fails to distinguish between those P-complete problems that do have significant, polynomial speedup on parallel machines from those that don't. Yet this distinction is e...
Analysis and Design of Scalable Parallel Algorithms for Scientific Computing
, 1995
Abstract

Cited by 8 (5 self)
This dissertation presents a methodology for understanding the performance and scalability of algorithms on parallel computers and the scalability analysis of a variety of numerical algorithms. We demonstrate the analytical power of this technique and show how it can guide the development of better parallel algorithms. We present some new highly scalable parallel algorithms for sparse matrix computations that were widely considered to be poorly suited for large-scale parallel computers. We present some laws governing the performance and scalability properties that apply to all parallel systems. We show that our results generalize or extend a range of earlier research results concerning the performance of parallel systems. Our scalability analysis of algorithms such as fast Fourier transform (FFT), dense matrix multiplication, sparse matrix-vector multiplication, and the preconditioned conjugate gradient (PCG) provides many interesting insights into their behavior on parallel computer...
Graph Coloring on Coarse Grained Multicomputers
, 2002
Abstract

Cited by 6 (1 self)
We present an efficient and scalable Coarse Grained Multicomputer (CGM) coloring algorithm that colors a graph G with at most D + 1 colors, where D is the maximum degree in G. This algorithm is given in two variants: randomized and deterministic. We show that on a p-processor CGM model the proposed algorithms require a parallel time of O(|G|/p) and a total work and overall communication cost of O(|G|). These bounds correspond to the average case for the randomized version and to the worst case for the deterministic variant.

Key words: graph algorithms, parallel algorithms, graph coloring, Coarse Grained Multicomputers
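The D + 1 bound itself comes from a simple sequential observation: a vertex of degree at most D can always find a free color among D + 1 candidates. The sketch below shows that greedy argument only; it is not the paper's CGM algorithm, and the adjacency-dict representation is an illustrative assumption.

```python
def greedy_coloring(adj):
    """Color an undirected graph with at most D + 1 colors, where D is the
    maximum degree. adj: dict mapping each vertex to its neighbours."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        # At most deg(v) <= D colors are blocked, so some color in 0..D is free.
        color[v] = next(c for c in range(len(adj)) if c not in used)
    return color

# 4-cycle: maximum degree D = 2, so at most 3 colors can ever be used.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
col = greedy_coloring(adj)
```

The difficulty the paper addresses is making this inherently sequential scan run in O(|G|/p) parallel time on a CGM without neighbouring vertices racing to pick the same color.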
Computers for Symbolic Processing
 Proceedings of the IEEE
, 1989
Abstract

Cited by 4 (1 self)
In this paper, we provide a detailed survey of the motivations, design, applications, current status, and limitations of computers designed for symbolic processing. Symbolic processing applications are computations that are performed at the word, relation, or meaning levels. A major difference between symbolic and conventional numeric applications is that the knowledge used in symbolic applications may be fuzzy, uncertain, indeterminate, and ill represented. As a result, the collection, representation, and management of knowledge is more difficult in symbolic applications than in conventional numeric applications. We survey various techniques for knowledge representation and processing, from both the designers' and users' points of view. The design and choice of a suitable language for symbolic processing and the mapping of applications into a software architecture are then presented. We examine the design process of refining the application requirements into hardware and software architectures and discuss state-of-the-art sequential and parallel computers designed for symbolic processing.
Practical Parallel Algorithms for Graph Coloring Problems in Numerical Optimization
, 2003
Abstract

Cited by 2 (1 self)
This work was financially supported by the University of Bergen through a
Matching and Unification for the Object-Oriented Symbolic Computation System AlgBench
 In Proc. of the 3rd Intern. Symposium on Design and Implementation of Symbolic Computation Systems (DISCO'93), Springer-Verlag, LNCS 722
, 1993
Abstract

Cited by 2 (1 self)
Term matching has become one of the most important primitive operations for symbolic computation. This paper describes the extension of the object-oriented symbolic computation system AlgBench with pattern matching and unification facilities. The various pattern objects are organized in subclasses of the class of composite expressions. This leads to a clear design and to a distributed implementation of the pattern matcher in the subclasses. New pattern object classes can consequently be added easily to the system. Huet's algorithm and our simple mark-and-retract algorithm for standard unification, as well as Stickel's algorithm for associative-commutative unification, have been implemented in an object-oriented style. Unifiers are selected at run time. We extend Mathematica's type-constrained pattern matching by taking into account inheritance information from a user-defined hierarchy of object types. The argument unification is basically instance variable unification. The improvement of the ...
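AlgBench's object-oriented matcher is not shown in this listing; to illustrate the primitive operation the abstract refers to, here is a minimal one-way term matcher. The representation (tuples for compound terms, "?"-prefixed strings for pattern variables) is an illustrative assumption of this sketch, not AlgBench's.

```python
def match(pattern, term, bindings=None):
    """One-way term matching: variables like '?x' in the pattern bind to
    subterms of the (ground) term. Compound terms are tuples such as
    ('f', a, b). Returns a bindings dict on success, None on failure.
    Note: on failure, partial bindings may already have been recorded."""
    if bindings is None:
        bindings = {}
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:  # variable already bound: must agree
            return bindings if bindings[pattern] == term else None
        bindings[pattern] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            if match(p, t, bindings) is None:
                return None
        return bindings
    return bindings if pattern == term else None

# f(?x, g(?x)) matches f(a, g(a)), binding ?x -> 'a';
# it fails against f(a, g(b)) because ?x cannot be both 'a' and 'b'.
```

Unification proper, as in the Huet and Stickel algorithms the paper implements, generalizes this by allowing variables on both sides.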