### Table 2: Execution time related to the coarse-grain parallel version of SFPar.

"... In PAGE 14: ... Indicated times include the block loading, the search for the starting voxel, and the skin building. Table 2 shows the elapsed time of the coarse-grain version of Sewing Faces on any elementary sub-block. The measured time includes the sub-block loading, the search for the starting voxel, the sub-skin building, and the half-sews detection.... ..."

### Table 3. Speed-up obtained with the coarse-grain parallelization. Columns: Size of the 3D block | Time saved (%) | Speed-up

"... In PAGE 17: ... Table 3 points out the time saved by the coarse-grain approach. The last column indicates the speed-up factor due to the coarse-grain approach and underlines the fact that using 8 processors we get a speed-up factor of about 5.... ..."
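The speed-up figure quoted above is easy to sanity-check with a few lines of arithmetic. The sketch below is illustrative only: the timings are hypothetical stand-ins, since the excerpt reports just that 8 processors give a speed-up of about 5.

```python
# Speed-up and parallel efficiency from measured wall-clock times.
# The timings below are hypothetical; only the 8-processor count and the
# target speed-up of ~5 come from the excerpt.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Classic speed-up: serial time divided by parallel time."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Speed-up normalised by the number of processors."""
    return speedup(t_serial, t_parallel) / n_procs

t_serial, t_parallel, p = 100.0, 20.0, 8   # hypothetical timings in seconds
print(speedup(t_serial, t_parallel))        # 5.0
print(efficiency(t_serial, t_parallel, p))  # 0.625
```

A speed-up of 5 on 8 processors corresponds to a parallel efficiency of 62.5%, which is plausible given the sequential gluing phase described in the excerpts.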

### Table 2. Execution time related to the coarse-grain parallel version of SFPar. Columns: Size of the 3D block | Time to compute on a sub-block | Time of the gluing phase | Total (s.)

"... In PAGE 16: ... Table 2 shows the elapsed time of the coarse-grain version of Sewing Faces on any elementary sub-block. The measured durations include the sub-block loading, the starting voxel detection, the sub-skin building, and the half-sews detection.... ..."

### Table 6. Memory usage per processor in MB for the serial and coarse-grained parallel codes. The last column shows the ratio of the memory used by the parallel code to that used by the serial code.

"... In PAGE 29: ... For the studies, we use Nx = 28, 56, 122, 224, 448. Table 6 shows the observed memory use per processor for the serial and parallel codes. That is, for instance for Nx = 28, 41.... In PAGE 29: ... The final test case (Nx = 448) corresponds to Lx = 31.76, which approaches the domain in Section 1. The improvement in memory usage for this initial design of the memory code is not optimal, but the parallelism used does allow for solutions to be computed over meshes which are too fine for the serial code, as described in Table 6. Specifically, on a serial machine, the most memory-optimal implementation allowed for the solution of a problem with over 11 million degrees of freedom; the parallel implementation managed to double the problem size to over 22 million degrees of freedom.... In PAGE 29: ... Specifically, on a serial machine, the most memory-optimal implementation allowed for the solution of a problem with over 11 million degrees of freedom; the parallel implementation managed to double the problem size to over 22 million degrees of freedom. Table 7 compares runtimes of the serial and parallel codes for the smaller domains in Table 6. The timings shown are for runs with final time 2 ms.... ..."
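The last-column ratio described in the Table 6 caption is straightforward to reproduce. The sketch below uses made-up memory figures (the excerpt's actual per-processor values are truncated); only the bookkeeping mirrors the table's layout.

```python
# Sketch of the bookkeeping behind Table 6: per-processor memory of the
# parallel code relative to the serial code. All values here are
# hypothetical; only the column structure follows the caption.

def memory_ratio(serial_mb: float, parallel_mb_per_proc: float) -> float:
    """Last column of Table 6: parallel MB per processor / serial MB."""
    return parallel_mb_per_proc / serial_mb

# Nx -> (serial MB, parallel MB per processor); hypothetical values
cases = {
    28: (40.0, 25.0),
    56: (160.0, 90.0),
}
for nx, (serial_mb, par_mb) in cases.items():
    print(nx, round(memory_ratio(serial_mb, par_mb), 3))
```

A ratio below 1 but well above 1/p indicates the memory savings are real but, as the excerpt notes, not optimal for this initial design.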

### Table 3: Algorithm 4.3 (O(p ln(p) n^3)), coarse-grain parallelism.

2001

Cited by 11
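The stated O(p ln(p) n^3) bound can be evaluated directly to see how the predicted work grows as the processor count doubles. The constant factor and the sample sizes below are arbitrary assumptions for illustration; the growth ratios are what the bound itself implies.

```python
import math

# Toy evaluation of the O(p ln(p) n^3) cost model from the caption of
# Table 3, up to an arbitrary constant c. It shows the slightly
# super-linear growth in p at fixed problem size n.

def work(p: int, n: int, c: float = 1.0) -> float:
    """Predicted work under the stated bound: c * p * ln(p) * n^3."""
    return c * p * math.log(p) * n ** 3

n = 64  # arbitrary problem size
for p in (2, 4, 8, 16):
    print(p, work(p, n) / work(2, n))  # work relative to p = 2
```

Doubling p from 2 to 4 quadruples the bound (a factor 2 from p and another 2 from ln p), so the model predicts super-linear growth in total work even though per-processor work may still shrink.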

### Table 3 points out the time saved by the coarse-grain approach. The last column indicates the speed-up factor due to the coarse-grain approach and underlines the fact that using 8 processors we get a speed-up factor of about 5.

"... In PAGE 17: ... It is obtained by adding the gluing time and the elementary sub-block time. Table 3. Speed-up obtained with the coarse-grain parallelization.... ..."

### Table 2: The performance of different implementations of the multilevel k-way partitioning algorithm. This table shows the performance of the MPI- and SHMEM-based parallel algorithms, of the coarse-grain parallel multilevel refinement algorithm, and of the serial algorithm on an SGI workstation. For the parallel algorithms, the performance on each graph is shown for 16-, 32-, 64-, and 128-way partitions on 16, 32, 64, and 128 processors, respectively. All times are in seconds.

1997

"... In PAGE 9: ... of Edges Description AUTO 448695 3314611 3D Finite element mesh MDUAL 258569 513132 Dual of a 3D Finite element mesh MDUAL2 988605 1947069 Dual of a 3D Finite element mesh Table 1: Various graphs used in evaluating the parallel multilevel k-way graph partitioning algorithm. Table 2 shows the performance of various implementations of the multilevel k-way partitioning algorithm. The first two subtables show the performance of the coarse-grain and SHMEM-based parallel partitioning algorithms, respec-... In PAGE 10: ... Also, because the coarse-grain implementation is memory efficient, this increases the amount of time spent in the algorithm to set up the appropriate data structures. The third subtable in Table 2 shows the performance achieved by the coarse-grain parallel multilevel refinement algorithm. These results were obtained by using, as the initial graph distribution, the partitioning obtained by the parallel multilevel k-way partitioning algorithm.... ..."

Cited by 25
