Results 1 - 10 of 70
Analyzing Scalability of Parallel Algorithms and Architectures
Journal of Parallel and Distributed Computing, 1994
Cited by 84 (17 self)
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithm-architecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
HERMES: an Infrastructure for Low Area Overhead Packet-switching Networks on Chip
Integration, the VLSI Journal, 2004
Cited by 32 (1 self)
The increasing complexity of integrated circuits drives research into new intra-chip interconnection architectures. A network on chip draws on concepts from distributed systems and computer networks to interconnect IP cores in a structured and scalable way. The main goal is to achieve higher bandwidth than conventional intra-chip bus architectures. This paper reviews the state of the art in networks on chip. It also describes an infrastructure called Hermes, targeted at implementing packet-switching mesh and related interconnection architectures. The basic element of Hermes is a switch with five bi-directional ports, connecting to four other switches and to a local IP core.
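The five-port switch described in this abstract can be pictured with a minimal Python sketch. It is not taken from the Hermes paper: the function name switch_ports, the port names (east, west, north, south, local), the (x, y) addressing, and the choice to leave border ports unconnected are all assumptions made for illustration of a plain 2D mesh.

```python
from typing import Dict, Optional, Tuple

Coord = Tuple[int, int]


def switch_ports(x: int, y: int, width: int, height: int) -> Dict[str, Optional[tuple]]:
    """Return the five ports of the switch at (x, y) in a width x height mesh.

    Hypothetical model: four ports lead to neighbouring switches and one
    port leads to the local IP core. A port that would point outside the
    mesh is reported as None (assumed unconnected).
    """
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("switch coordinates lie outside the mesh")

    # Candidate neighbours in the four mesh directions (names are assumptions).
    candidates: Dict[str, Coord] = {
        "east": (x + 1, y),
        "west": (x - 1, y),
        "north": (x, y + 1),
        "south": (x, y - 1),
    }

    # The fifth port attaches the local IP core to this switch.
    ports: Dict[str, Optional[tuple]] = {"local": ("core", x, y)}
    for name, (nx, ny) in candidates.items():
        inside = 0 <= nx < width and 0 <= ny < height
        ports[name] = (nx, ny) if inside else None
    return ports
```

Under these assumptions, an interior switch of a 3 x 3 mesh has all four neighbour ports connected, while a corner switch has two of them left as None.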
Local-Area MultiProcessor: the Scalable Coherent Interface
DEFINING THE GLOBAL INFORMATION INFRASTRUCTURE: INFRASTRUCTURE, SYSTEMS, AND SERVICES, 1994
Cited by 29 (0 self)
There is rapidly increasing demand for very high performance shared access to distributed data, for multiprocessors, networked workstation clusters, distributed databases, industrial data acquisition and control systems, etc. The objective is to satisfy this demand at the lowest long-term cost. This paper first considers the general properties that an appropriate system architecture should have. A new architectural model, Local-Area MultiProcessor, is introduced. These properties are then considered in more detail, and practical design decisions are made, illustrated by the evolution of the ISO/ANSI/IEEE standard Scalable Coherent Interface (SCI) as it addressed these issues. Finally, the current status of the various SCI follow-on and support projects is reported.
Performance and scalability of preconditioned conjugate gradient methods on parallel computers
Department of Computer Science, University of Minnesota, 1995
Proactive Management of Software Aging
2001
Cited by 26 (2 self)
this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.
Scalability of Parallel Algorithms for Matrix Multiplication
In Proc. of Int. Conf. on Parallel Processing, 1991
Cited by 19 (0 self)
A number of parallel formulations of the dense matrix multiplication algorithm have been developed. For an arbitrarily large number of processors, any of these algorithms or their variants can provide near-linear speedup for sufficiently large matrix sizes, and none of the algorithms can be clearly claimed to be superior to the others. In this paper we analyze the performance and scalability of a number of parallel formulations of the matrix multiplication algorithm and predict the conditions under which each formulation is better than the others. We present a parallel formulation for hypercube and related architectures that performs better than any of the schemes described in the literature so far for a wide range of matrix sizes and numbers of processors. The superior performance and the analytical scalability expressions for this algorithm are verified through experiments on the Thinking Machines Corporation's CM-5 parallel computer for up to 512 processors. We show that special har...
Direct Parallel Algorithms for Banded Linear Systems
1994
Cited by 19 (8 self)
We investigate direct algorithms to solve linear banded systems of equations on MIMD multiprocessor computers with distributed memory. We show that it is hard to beat ordinary one-processor Gaussian elimination. Numerical computation results from the Intel Paragon are given. 1. Introduction In a project on divide and conquer algorithms in numerical linear algebra, the authors studied parallel algorithms to solve systems of linear equations and eigenvalue problems. The latter consisted of a study of the divide and conquer algorithm proposed by Cuppen [4] and stabilized by Sorensen and Tang [11]. This algorithm is evolving as the standard algorithm for solving the symmetric tridiagonal eigenvalue problem on sequential as well as on parallel computers. In [7], Gates and Arbenz report on the first successful parallel implementation of the algorithm. They observed almost optimal speedups on the Intel Paragon. The accuracy observed is as good as with any other known (fast) algorithm. The...
Knowledge-Independent Data Mining with Fine-Grained Parallel Evolutionary Algorithms
In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’2001), 2001
Cited by 16 (8 self)
This paper illustrates the application of evolutionary algorithms (EA) to data mining problems. The objectives are to demonstrate that EA can provide a competitive general-purpose data mining scheme for classification tasks without constraining the knowledge representation, and that this can be achieved while reducing the time required by exploiting the inherently parallel nature of EA.
Selected problems of scheduling tasks in multiprocessor computing systems
PhD thesis, Instytut Informatyki Politechnika Poznanska, 1997

