Results 1 - 10 of 118
A low area overhead packet-switched network on chip: architecture and prototyping
 in: IFIP Very Large Scale Integration (VLSI-SOC
"... packet-switching networks on chip ..."
Analyzing Scalability of Parallel Algorithms and Architectures
 Journal of Parallel and Distributed Computing
, 1994
Cited by 99 (20 self)
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithm-architecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
Proactive Management of Software Aging
, 2001
Cited by 57 (3 self)
this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.
Local-Area MultiProcessor: the Scalable Coherent Interface
 Proceedings of the Second International Workshop on SCI-based High-Performance Low-Cost Computing
, 1995
Performance and scalability of preconditioned conjugate gradient methods on parallel computers
 Department of Computer Science, University of Minnesota
, 1995
Selected problems of scheduling tasks in multiprocessor computing systems
 PhD thesis, Instytut Informatyki, Politechnika Poznanska
, 1997
Knowledge-Independent Data Mining with Fine-Grained Parallel Evolutionary Algorithms
 In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'2001
, 2001
Cited by 27 (9 self)
This paper illustrates the application of evolutionary algorithms (EA) to data mining problems. The objectives are to demonstrate that EA can provide a competitive general-purpose data mining scheme for classification tasks without constraining the knowledge representation, and that this can be achieved while reducing the time required by exploiting the inherently parallel nature of EA.
Scalability of Parallel Algorithms for Matrix Multiplication
 in Proc. of Int. Conf. on Parallel Processing
, 1991
Cited by 27 (0 self)
A number of parallel formulations of the dense matrix multiplication algorithm have been developed. For an arbitrarily large number of processors, any of these algorithms or their variants can provide near-linear speedup for sufficiently large matrix sizes, and none of the algorithms can be clearly claimed to be superior to the others. In this paper we analyze the performance and scalability of a number of parallel formulations of the matrix multiplication algorithm and predict the conditions under which each formulation is better than the others. We present a parallel formulation for hypercube and related architectures that performs better than any of the schemes described in the literature so far for a wide range of matrix sizes and number of processors. The superior performance and the analytical scalability expressions for this algorithm are verified through experiments on the Thinking Machines Corporation's CM-5 parallel computer for up to 512 processors. We show that special har...
Direct Parallel Algorithms for Banded Linear Systems
, 1994
Cited by 22 (8 self)
We investigate direct algorithms to solve linear banded systems of equations on MIMD multiprocessor computers with distributed memory. We show that it is hard to beat ordinary one-processor Gaussian elimination. Numerical computation results from the Intel Paragon are given. 1. Introduction In a project on divide and conquer algorithms in numerical linear algebra, the authors studied parallel algorithms to solve systems of linear equations and eigenvalue problems. The latter consisted of a study of the divide and conquer algorithm proposed by Cuppen [4] and stabilized by Sorensen and Tang [11]. This algorithm is evolving as the standard algorithm for solving the symmetric tridiagonal eigenvalue problem on sequential as well as on parallel computers. In [7], Gates and Arbenz report on the first successful parallel implementation of the algorithm. They observed almost optimal speedups on the Intel Paragon. The accuracy observed is as good as with any other known (fast) algorithm. The...