### Table 1. A spectrum of heterogeneity

1995

"... In PAGE 2: ... Choosing the best set of available resources is a difficult problem and is the subject of this paper. Consider the set of machines in Table 1 and observe that they have different computation and communication capacities. Loosely-coupled parallel computations with infrequent communication would likely benefit by applying the fastest set of computational resources (perhaps the DEC-Alpha cluster), and may benefit from distribution across many machines. ..."
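
The selection heuristic the excerpt hints at, for the loosely-coupled case, can be sketched as follows. This is a minimal illustration, not the paper's algorithm; the machine names and speed values are invented for the example.

```python
# Hypothetical sketch: for a loosely-coupled job with infrequent communication,
# simply pick the k fastest machines, ignoring interconnect capacity.
# Machine names and MFLOPS figures below are illustrative, not from Table 1.

def pick_fastest(machines, k):
    """Return the k machines with the highest compute speed."""
    return sorted(machines, key=lambda m: m[1], reverse=True)[:k]

cluster = [("alpha-1", 180.0), ("sparc-1", 40.0),
           ("alpha-2", 175.0), ("rs6000-1", 90.0)]
print(pick_fastest(cluster, 2))  # selects the two Alpha-class nodes
```

For tightly-coupled computations the choice would also have to weigh communication capacity, which is exactly what makes the selection problem difficult.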

Cited by 18

### Table 2 Specifications of the Eleven Heterogeneous Computers Machine

"... In PAGE 12: ... 5.2 Applications A small heterogeneous local network of 11 different Solaris and Linux workstations shown in Table 2 is used in the experiments. The network is based on 100 Mbit Ethernet with a switch enabling parallel communications between the computers. ... In PAGE 15: ... 7. Determination of a set with relatively few points used to build the speed functions of the processors X2-X5, whose specifications are shown in Table 2. As few as 6 points and 5 points are used to build an efficient speed function for matrix multiplication and LU factorization respectively, with a deviation of approximately 5% from speed functions built with a larger number of points. ... In PAGE 15: ... Though the absolute speed must be obtained by multiplication of two dense non-square matrices, we observed that our serial version gives almost the same speeds for multiplication of two dense square matrices if the number of elements in a dense non-square matrix is the same as the number of elements in a dense square matrix. This is illustrated in Table 3 for computers X2-X5, whose specifications are shown in Table 2. Thus speed functions of the processors built using dense square matrices will be the same as those built using dense non-square matrices. ... In PAGE 17: ... However, allocation to these computers of a task whose size is greater than 36000000 for matrix-matrix multiplication or 81000000 for LU factorization will result in severe performance degradation of the parallel application. For each of these two applications, the largest problem size that can be solved on the network of heterogeneous computers shown in Table 2 is just the sum of the largest task sizes that can be solved on each computer. There are three important issues in selecting a set of points to build a speed function of a processor: 1. ... In PAGE 18: ... Speeds of the processors are assumed to be zero for problem sizes beyond their upper bounds.
multiplication obtained using three sets of 6, 7, and 8 points, and speed functions for LU factorization obtained using three sets of 5, 7, and 8 points, for the computers X2-X5 whose specifications are shown in Table 2. It can be seen that 6 points and 5 points are enough to build an efficient speed function that falls within acceptable limits of deviation for matrix multiplication and LU factorization respectively. ..."
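
The "speed function" idea in the excerpt, a processor's speed measured at a handful of problem sizes, interpolated in between, and taken as zero beyond the processor's upper bound, can be sketched like this. The measurement points are invented for illustration; the paper's actual fitting procedure may differ.

```python
# Hedged sketch of a piecewise-linear speed function built from few
# measured points, with speed assumed zero beyond the upper bound.
import bisect

def make_speed_function(points):
    """points: (problem_size, speed) pairs, sorted by problem size."""
    sizes = [p[0] for p in points]
    speeds = [p[1] for p in points]

    def speed(n):
        if n > sizes[-1]:      # beyond the upper bound: assume zero speed
            return 0.0
        if n <= sizes[0]:
            return speeds[0]
        i = bisect.bisect_left(sizes, n)
        x0, x1 = sizes[i - 1], sizes[i]
        y0, y1 = speeds[i - 1], speeds[i]
        return y0 + (y1 - y0) * (n - x0) / (x1 - x0)  # linear interpolation

    return speed

# Illustrative points only; the 36000000 bound echoes the excerpt's
# matrix-multiplication upper bound.
s = make_speed_function([(1_000, 120.0), (10_000, 110.0), (36_000_000, 15.0)])
```

A partitioner can then query `s(n)` for any candidate task size instead of assuming a single constant speed per processor.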

### Table 4: Pseudospectra computation: results for heterogeneous cluster

"... In PAGE 22: ... Table 3 (resp. Table 4) gives the results for the field of values (resp. pseudospectra) computation. ... In PAGE 23: ... We can see the improvement that results from omitting the continuation on heterogeneous clusters (see the rows labeled 13(nc) in Table 4). The speed-ups are approximately twice those obtained when the continuation is used. ..."

### Table 3. Performance measurements for the HPGA on the heterogeneous cluster environment.

"... In PAGE 7: ... Performance measurements for the HPGA on the heterogeneous cluster environment. Table 3 shows the performance measurements for the protein folding simulations using the HPGA on the hierarchical grid environment. The processors used are distributed evenly on the four clusters. ... In PAGE 7: ... The island model and the stepping stone model are used on the grid level and the cluster level respectively. From Table 3 we can see that the hierarchical communication architecture in Figure 5 for HPGAs can be efficiently applied to a hierarchical grid environment. From Table 2 and Table 3 we can also find that the speedups predicted by Eq. (4) are very close to the experimental speedups. ..."
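
The two-level migration scheme the excerpt describes, an island model between clusters and a stepping-stone model inside each cluster, can be sketched as the two partner-selection rules below. This is a hypothetical illustration of the topology, not the paper's implementation; the all-to-all choice at the grid level and the ring at the cluster level are assumptions.

```python
# Hypothetical sketch of a hierarchical migration topology for a
# hierarchical parallel genetic algorithm (HPGA).

def grid_level_partners(cluster_id, n_clusters):
    """Island model: a cluster may exchange migrants with every other cluster."""
    return [c for c in range(n_clusters) if c != cluster_id]

def cluster_level_partners(deme_id, demes_per_cluster):
    """Stepping-stone model: a deme talks only to its ring neighbours."""
    left = (deme_id - 1) % demes_per_cluster
    right = (deme_id + 1) % demes_per_cluster
    return [left, right]
```

The point of the hierarchy is that frequent, cheap stepping-stone exchanges stay inside a cluster, while the rarer island-level migrations are the only traffic crossing the slower grid links.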

### Table 1. Comparison of networks for parallel cluster computing (Fast-Ethernet, Gigabit-…)

"... In PAGE 2: ... (Network comparisons regarding pure technical aspects can be found e.g. in [1] and [13].) Table 1 provides a short comparison of essential network technologies. The table reflects, besides basic technical characteristics such as network structure, minimal one-way latency, maximal access bandwidth and the existence of support for isochronous communication, also aspects of operating system support, manufacturer support and price. ..."

### Table 1. The first three columns show the characteristics of available machine platforms for parallel computing (that is, multicomputers and clusters). The remaining two describe features of parallel virtual machines built on top of clusters. (See also the Glossary sidebar for details about the adopted attributes.) Because hardware heterogeneity is unavoidable, the virtual machines must overcome four main obstacles to resemble the ideal SPMD machine. The attributes of the ideal machine that have been reached by DAME are written in italics.

in DAME: An Environment for Preserving Efficiency of Data Parallel Computations on Distributed Systems

"... In PAGE 2: ... They hide the heterogeneity due to the different operating systems and communication layers by providing the programmer with a common set of message-passing primitives. Even if they allow the programmer to write SPMD programs, the resulting parallel virtual machine is substantially different from the traditional multicomputers for which SPMD was born (the features of the ideal SPMD machine are described in the first column of Table 1). A comparison between the first and the fourth column of Table 1 shows that PVM and MPI eliminate only one of the four main obstacles between the parallel virtual machine and the ideal SPMD machine. The potential inefficiencies due both to computing speed nonuniformity and the unpredictable variability of shared resources are still present. ... In PAGE 5: ... For example, the parallel implementation of toy algorithms on a nonuniform machine would become as complex as that of irregular programs on homogeneous and uniform platforms. In fact, as the first column of Table 1 remarks, the simplicity of the SPMD style is preserved only if the programmer can assume uniform data distribution, regular network topology, and uniform and static node computational powers. In order to preserve the ideal SPMD style without losing efficiency, DAME provides the programmer with five supports for virtual topology (VTS), data distribution (DDS), data management (DMS), interprocess communication (ICS) and workload reconfiguration (WRS). ..."
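
The kind of nonuniform data distribution a support like DAME's DDS would have to perform can be sketched as a proportional split: each node receives a block sized to its relative computing power. This is a minimal illustration under assumed power values, not DAME's actual distribution strategy.

```python
# Minimal sketch: split n_elements among nodes proportionally to each
# node's computational power, so faster nodes get larger blocks.

def distribute(n_elements, powers):
    """Return per-node block sizes proportional to the given powers."""
    total = sum(powers)
    blocks = [int(n_elements * p / total) for p in powers]
    blocks[-1] += n_elements - sum(blocks)  # hand rounding remainder to one node
    return blocks

# A node twice as fast receives twice as many elements.
print(distribute(100, [1, 1, 2]))
```

With equal powers this degenerates to the uniform distribution that the ideal SPMD machine assumes, which is why the uniform case needs no such support.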

### TABLE 2. Steps for computing optimal partitions with Rmax = 4

1997

Cited by 2

### Table 1: Comparison of parallel computational models

"... In PAGE 8: ... Our tabular format for comparison is inspired by a similar presentation in [13], where the Queuing Shared Memory (QSM) model is proposed. The columns of Table 1 are labeled with the names of the selected models in our comparison, and some relevant features of a model are listed along the rows. ..."

### Table 1. Comparison of parallel computational models

2002

"... In PAGE 4: ... Our tabular format for comparison is inspired by a similar presentation in [13], where the Queuing Shared Memory (QSM) model is proposed. The columns of Table 1 are labeled with the names of the selected models in our comparison, and some relevant features of a model are listed along the rows. The synchrony assumption of the model is indicated in the row labeled synch. ..."

Cited by 7