Results 1 - 10 of 96,489
Table XVI. Comparison of Memory System Interference for Parallel and Multiprogrammed Workloads
in Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading
1997
Cited by 110
Table 10. Number of iterations, CPU times, T_p, and speedups, S_p, obtained for the computation of the s = 10 leftmost eigenpairs of the six matrices with enlarged FSAI preconditioning on the CLX parallel computer
"... In PAGE 18: ... The theo- retical maximum bandwidth allowed is 200 MB/s. The results, reported in Table10 , gives a further evidence that the proposed 18 L.... ..."
Table 1: Evolution of scheduling algorithms with parallel and distributed computing systems
2006
"... In PAGE 4: ... Looking back at such efforts, we find that scheduling algorithms are evolving with the architecture of parallel and distributed systems. Table1 captures some important features of parallel and distributed systems and typical scheduling algorithms they adopt. ... ..."
Table 1. Steps of parallel computation
1995
"... In PAGE 7: ... Finally the computation of the slave portion xs corresponding to an eigenvector u of the slave problem can be done in parallel as well (Kssj ? p( u)Mssj)(xs)j = ?(Ksmj ? p( u)Msmj) u; j = 1; ; r : 4 Substructuring and parallel processes To each substructure we attach one process named `Sj apos; and with the master infor- mation we associate one further process called `Ma apos;. These processes work in parallel as shown in Table1 . For the necessary communication each `Sj apos; is connected to `Ma apos; directly or indirectly.... In PAGE 7: ... A detailed description is contained in [14]. Table1 shows how this parallel eigensolver which consists of the processes `Ma apos; and `R1 apos;,.... In PAGE 11: ... 6 Numerical results The parallel concept was tested on a distributed memory PARSYTEC transputer system equipped with T800 INMOS transputers (25MHz, 4 MB RAM) under the distributed operating system `helios apos;. Since each processor has a multiprocessing capability we were able to execute more than one process from Table1 on every processor which turned out to be extremly important for a good load balancing of the system. We do not discuss the mapping of the process topology to the processor network.... In PAGE 13: ... Table 4). For the parallel solution of the matrix eigenvalue problem via condensation and improvement using the Rayleigh functional according to Table1 we proposed in [16] the following proceeding.... ..."
Cited by 10
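The snippet above attaches one process `Sj' to each substructure and one master process `Ma', with the r slave solves running independently in parallel. Purely as an illustrative sketch (Python with multiprocessing and NumPy rather than the authors' transputer/Helios processes; the matrix names, the scalar shift p_u, and the dict layout are assumptions, not taken from the paper):

from multiprocessing import Pool

import numpy as np


def solve_slave_part(args):
    """Solve one substructure's slave system (hypothetical data layout)."""
    K_ss, M_ss, K_sm, M_sm, p_u, u = args
    lhs = K_ss - p_u * M_ss            # (K_ss^j - p(u) M_ss^j)
    rhs = -(K_sm - p_u * M_sm) @ u     # -(K_sm^j - p(u) M_sm^j) u
    return np.linalg.solve(lhs, rhs)   # slave portion (x_s)_j


def solve_all_slaves(substructures, p_u, u, processes=4):
    """'Ma'-style driver: the r substructure solves are independent,
    so they can be farmed out to worker processes ('Sj')."""
    tasks = [(s["K_ss"], s["M_ss"], s["K_sm"], s["M_sm"], p_u, u)
             for s in substructures]
    # On spawn-based platforms this must be called under `if __name__ == "__main__":`.
    with Pool(processes) as pool:
        return pool.map(solve_slave_part, tasks)

Each task is independent once $u$ and $p(u)$ are known, which is what makes the per-substructure back-substitution naturally parallel, as the snippet notes.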
Tables I, II and III summarize the results for the cubic capacitor, bus crossing and woven bus structures, respectively, employing up to eight processors. It is observed that on one processor the grid convolution algorithm takes between 50-70% of the total CPU time for larger problems. When the computation of the convolution is significant, good speedups and parallel efficiencies are obtained as expected. Typical results include a speedup of about 5 on 8 processors and a parallel efficiency of about 60%. The speedup on two processors is ...
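The quoted figures are mutually consistent under the standard definition of parallel efficiency, $E_p = S_p / p$ (a textbook definition, not wording from the paper): a speedup of $S_8 \approx 5$ on 8 processors gives $E_8 \approx 5/8 = 62.5\%$, in line with the reported efficiency of about 60%.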
Table 1: Run time, speedup and efficiency for p-processor steady state solution for the FMS model with k=7. Results are presented for an AP3000 distributed memory parallel computer and a PC cluster.
2002
"... In PAGE 5: ... Setting k (the number of unprocessed parts in the system) to 7 results in the underlying Markov chain of the GSPN having 1 639 440 tangible states and produces 13 552 968 off-diagonal entries in its generator matrix Q. Table1 summarises the performance of the implementation on a distributed memory parallel computer and a cluster of workstations. The parallel computer is a Fujitsu AP3000 which has 60 processing nodes (each with an UltraSparc 300MHz processor and 256MB RAM) connected by a 2D wraparound mesh network.... ..."