### Table 1: Numerical results for out-of-core construction. Tests performed on a network of 16 PCs.

2005

"... In PAGE 6: ... Preprocessing. Table 1 lists numerical results for our out-of-core preprocessing method for all the test datasets. The tests were executed on a moderately loaded network of 16 PCs running Linux 2.... ..."

Cited by 11

### Table 1: Numerical results for out-of-core construction. Tests performed on a network of PCs. All times are in seconds.

2004

"... In PAGE 6: ...ork of PCs. All times are in seconds. Preprocessing. Table 1 lists numerical results for our out-of-core preprocessing method for a number of runs on all the test datasets. The tests were executed on a moderately loaded network of PCs running Linux 2.... ..."

Cited by 24

### Table 2: Extra I/O (relative to |L|) and read grain size (in MBytes) for out-of-core methods.

1997

"... In PAGE 9: ... 5.1 I/O Performance The first four columns of Table 2 show the amount of extra I/O performed by the PPLL, PPLL_U, and MF methods, over and above the amount required to write L to disk once. An entry of 1.0 means that the number of matrix entries read and written during the factorization is 2|L|.... In PAGE 10: ... The table shows extra I/O for the method and the average read grain size. Comparing this table to Table 2, we find that I/O volumes decrease significantly for several problems. Extra I/O for algorithm PPLL_U on matrix PRODPLAN drops from 11.... ..."
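The extra-I/O convention quoted in this excerpt (traffic measured over and above the one mandatory write of the factor L) can be sketched as a small helper. This is an illustration only; the function name and the example entry counts are assumptions, not values from the cited paper:

```python
def extra_io_ratio(entries_read: int, entries_written: int, nnz_L: int) -> float:
    """Extra I/O relative to |L|: matrix entries moved during factorization
    beyond the one required write of L to disk, as a multiple of |L|."""
    total_traffic = entries_read + entries_written
    return (total_traffic - nnz_L) / nnz_L

# Per the excerpt's convention: if the factorization reads |L| entries back
# in addition to writing L once, total traffic is 2|L| and extra I/O is 1.0.
nnz_L = 1_000_000
print(extra_io_ratio(entries_read=nnz_L, entries_written=nnz_L, nnz_L=nnz_L))  # 1.0
```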

### Table 3: Timings in seconds for the main phases of out-of-core LU factorization.

1996

"... In PAGE 23: ...hen reading, so we would not expect Eq. 22 to hold. Thus, the version of the algorithm that stores the matrix in pivoted form is expected to be faster. This is borne out by the timings presented in Table 3 for an 8 × 8 process mesh. These timings are directly comparable with those of Table 2, and show that the version of the algorithm that stores the matrix in pivoted form is faster by 10-15%.... In PAGE 25: ... Thus, for the parameters of Table 4 the M = 5000 and M = 8000 cases fit in core, so we just read in the whole matrix, factorize it using the standard ScaLAPACK routine P GETRF, and then write it out again. In Table 4 it takes about 58 seconds to perform an in-core factorization of a 5000 × 5000 matrix, compared with 191 seconds for an out-of-core factorization (see Table 3). The M = 8000 case in Table 4 failed, presumably because PFS was not able to handle the need to simultaneously read 8 Mbytes from each of 64 separate files.... In PAGE 25: ... The M = 8000 case in Table 4 failed, presumably because PFS was not able to handle the need to simultaneously read 8 Mbytes from each of 64 separate files. The M = 10000 case ran successfully out-of-core, and the results in Table 4 should be compared with those in Table 3, from which we observe that increasing n_g increases the time for I/O and factorization, but decreases the times for all other phases of the algorithm. The increase in I/O is an unexpected result since increasing n_g should decrease the I/O cost.... In PAGE 27: ... Communication overhead, together with the floating-point operation count, determines the performance of the computational phases of the algorithm as n_g changes. The failure of the M = 8000 case in Table 3 prompted us to devise a second way of implementing logically distributed files. Instead of opening a separate file for each process, the new method opens a single file and divides it into blocks, assigning one block to each process.... ..."
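The single-file scheme described at the end of this excerpt (one shared file divided into contiguous blocks, one block per process, instead of 64 separate per-process files) can be sketched as follows. The function name and the remainder-handling choice are assumptions for illustration, not the paper's implementation:

```python
def block_offset(rank: int, nprocs: int, file_size: int) -> tuple[int, int]:
    """Return (offset, length) in bytes of the block owned by `rank` when a
    single file of `file_size` bytes is split into `nprocs` contiguous
    blocks. The last rank absorbs any remainder."""
    base = file_size // nprocs
    offset = rank * base
    length = base if rank < nprocs - 1 else file_size - offset
    return offset, length

# Each process seeks to its own offset within the one shared file rather
# than opening its own file, avoiding many simultaneous file opens.
for rank in range(4):
    print(rank, block_offset(rank, nprocs=4, file_size=10))
```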

Cited by 19

### Table 1: Simplification results of running QSlim [5], Memoryless Simplification [8], and the out-of-core method (OoCS). All results were gathered on a 195 MHz R10000 SGI Origin with 4 GB of RAM and a standard SCSI disk drive.

2000

"... In PAGE 3: ... We applied two levels of Loop subdivision to the blade model to increase its triangle count by a factor of 16, thus making it more challenging to simplify. Table 1 includes the triangle counts, memory usage, and timing results of simplifying these models using our method as well as the in-core methods presented in [5, 8]. While being much more memory efficient than these two methods, our new algorithm is also orders of magnitude faster.... ..."

Cited by 102
