Results 1 - 10 of 10,101

Table IV. Criteria that relate interaction with artifact (columns: Criterion, Reference, Division). Interactive: each participant has a parallel electronic communication channel towards a group memory.

in Refining Temporal Criteria to Classify Collaborative Systems
by Hector B. Antillanca, David A. Fuller 1999
Cited by 1

Table 1 summarizes the breakdown of execution time. Some of the above categories can be further divided depending on the objective of profiling. A fine-grained profiling tool may want to include categories like cache misses and page-fault overheads. In our implementation, we provide a finer division of communication overhead. The point is that the non-scalable code can be classified into meaningful categories for program profiling.

in A Performance Debugging Tool for High Performance Fortran Programs
by Takashi Suzuoka, Jaspal Subhlok, Thomas Gross
"... In PAGE 4: ... Table1 : Processor states in parallel execution We believe that it is important that the profiler provide a precise measurement of these categories and not just a summary judgement. In general, a simple verdict (such as poor scalability, load imbalance, or poor mapping of distributed arrays) cannot be made accurately as there can be many possible causes of poor performance.... ..."

Table 9.4: Forbidden Operators for Communication Objects

in Implementation of the Coordination Language C&Co
by Alexander Forst

Table 2: Communication Overheads

in Parallelizing molecular dynamics programs for distributed memory machines
by Yuan-shin Hwang, Raja Das, Joel H. Saltz, Bernard Brooks 1995
"... In PAGE 12: ... Several regular and irregular data partitioning methods have been implemented to compare the communication overheads. Table2 presents average communication times of different data partitioning methods from 16 to 128 processors. Atom decomposition was used as the iteration partitioning algorithm.... In PAGE 12: ... BLOCK divides an array into contiguous chunks of size N=P and assigns one block to each processor, whereas CYCLIC specifies a round-robin division of an array and assigns every P th element to the same processor. Table2 shows that both BLOCK and CYCLIC do not exploit locality and, therefore, cause higher communication overheads. Weighted BLOCK divides an array into contiguous chunks with different sizes so that each chunk would have the same amount of computational work.... ..."
Cited by 47
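
The snippet above describes three data partitioning schemes compared in the paper: BLOCK (contiguous chunks of size N/P), CYCLIC (round-robin assignment), and a weighted BLOCK that balances computational work. A minimal sketch of those index-to-processor mappings follows; it is not the CHAOS library's implementation, and the array length N, processor count P, and per-element weights are made up for illustration:

    # Sketch of the three index-to-processor mappings described above.
    # Not the paper's code; N, P, and weights are illustrative only.

    def block(N, P):
        """BLOCK: contiguous chunks of size ceil(N/P), one chunk per processor."""
        chunk = -(-N // P)  # ceiling division
        return [min(i // chunk, P - 1) for i in range(N)]

    def cyclic(N, P):
        """CYCLIC: round-robin; every P-th element goes to the same processor."""
        return [i % P for i in range(N)]

    def weighted_block(weights, P):
        """Weighted BLOCK: contiguous chunks sized for roughly equal total work."""
        total = sum(weights)
        owner, acc, owners = 0, 0.0, []
        for w in weights:
            owners.append(owner)
            acc += w
            if owner < P - 1 and acc >= total * (owner + 1) / P:
                owner += 1
        return owners

    if __name__ == "__main__":
        print(block(10, 4))    # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3]
        print(cyclic(10, 4))   # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
        print(weighted_block([1, 1, 1, 1, 4, 4, 4, 4, 1, 1], 2))  # chunks hold ~equal total weight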

Table 2: Communication Overheads

in Parallelizing Molecular Dynamics Programs for Distributed Memory Machines: An Application of the CHAOS Runtime Support Library
by Yuan-shin Hwang, Raja Das, Joel H. Saltz, Milan Hodoscek, Bernard Brooks
"... In PAGE 12: ... Several regular and irregular data partitioning methods have been implemented to compare the communication overheads. Table2 presents average communication times of different data partitioning methods from 16 to 128 processors. Atom decomposition was used as the iteration partitioning algorithm.... In PAGE 12: ... BLOCK divides an array into contiguous chunks of size N=P and assigns one block to each processor, whereas CYCLIC specifies a round-robin division of an array and assigns every Pth element to the same processor. Table2 shows that both BLOCK and CYCLIC do not exploit locality and, therefore, cause higher communication overheads. Weighted BLOCK divides an array into contiguous chunks with different sizes so that each chunk would have the same amount of computational work.... ..."

Table 2: Communication Overheads

in Parallelizing Molecular Dynamics Programs for Distributed Memory Machines: An Application of the Chaos Runtime Support Library
by Yuan-shin Hwang, Raja Das, Joel Saltz, Bernard Brooks, Milan Hodoscek
"... In PAGE 12: ... Several regular and irregular data partitioning methods have been implemented to compare the communication overheads. Table2 presents average communication times of different data partitioning methods from 16 to 128 processors. Atom decomposition was used as the iteration partitioning algorithm.... In PAGE 12: ... BLOCK divides an array into contiguous chunks of size N=P and assigns one block to each processor, whereas CYCLIC specifies a round-robin division of an array and assigns every Pth element to the same processor. Table2 shows that both BLOCK and CYCLIC do not exploit locality and, therefore, 1Not available due to memory limitation of iPSC/860... ..."

Table 2: Communication Overheads

in Parallelizing Molecular Dynamics Programs for Distributed Memory Machines: An Application of the Chaos Runtime Support Library
by Yuan-Shin Hwang, Raja Das, Joel Saltz, Bernard Brooks, Milan Hodoscek
"... In PAGE 12: ... Several regular and irregular data partitioning methods have been implemented to comparethe communication overheads. Table2 presents average communication times of different data partitioning methods from 16 to 128 processors. Atom decomposition was used as the iteration partitioning algorithm.... In PAGE 12: ... BLOCK divides an array into contiguous chunks of size a194a37a210a87a211 and assigns one block to each processor, whereas CYCLIC specifies a round-robin division of an array and assigns every a211a69a218a110a219 element to the same processor. Table2 shows that both BLOCK and CYCLIC do not exploit locality and, therefore, 1Not available due to memory limitation of iPSC/860... ..."

Table 3: Communication and synchronisation cost for data distributions with p = 100

in Scientific computing on bulk synchronous parallel architectures
by R. H. Bisseling, W. F. McColl 1994
"... In PAGE 21: ... It does not perform very well on small problems and even for larger problems there are superior distributions, such as the diagonal quot; distribution, which imposes an equal division of the matrix diagonal over the processors and hence causes a good load balance in the summation of partial sums. The results of Table3 show that it is quite hard to achieve a low communication cost for general sparse matrices, i.e.... ..."
Cited by 81
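
The snippet above mentions a "diagonal" distribution that divides the matrix diagonal equally over the p processors so that the summation of partial sums stays load balanced. A toy sketch of that assignment rule for the diagonal entries only; the matrix dimension n and processor count p are illustrative, and this is not the paper's BSP code:

    # Toy sketch: spread the n diagonal entries of an n x n matrix equally
    # over p processors, in the spirit of the "diagonal" distribution above.

    def diagonal_owner(i, n, p):
        """Processor owning diagonal element (i, i), for 0 <= i < n."""
        return (i * p) // n

    if __name__ == "__main__":
        n, p = 10, 4
        print([diagonal_owner(i, n, p) for i in range(n)])
        # [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]: each processor gets floor(n/p) or
        # ceil(n/p) diagonal entries, so the partial-sum work is balanced.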

Table 3: Communication and synchronisation cost for data distributions with p = 100

in Scientific Computing on Bulk Synchronous Parallel Architectures
by R. H. Bisseling, W. F. McColl
"... In PAGE 21: ... It does not perform very well on small problems and even for larger problems there are superior distributions, such as the diagonal quot; distribution, which imposes an equal division of the matrix diagonal over the processors and hence causes a good load balance in the summation of partial sums. The results of Table3 show that it is quite hard to achieve a low communication cost for general sparse matrices, i.e.... ..."

Table 3: Divisibility of the discriminants

in Elliptic curves and primality proving
by A. O. L. Atkin, F. Morain 1993
"... In PAGE 12: ... is a norm in K (provided that (;D=N) = +1). This yields Table 2. Let S be a nite set of primes (here 4 and 8 are assumed to be distinct primes). We de ne N p (S) to be the number of D in D which are divisible by at least one prime of S: This quantity is tabulated in Table3 . From the above results, it is quite clear that bad numbers are those which are quadratic nonresidue modulo small primes, suchasN ;1 mod 12, which kill o one third of our discriminants.... ..."
Cited by 124
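
The snippet above defines N_p(S) as the number of discriminants D (from the set of candidate discriminants) divisible by at least one prime of S, with 4 and 8 treated as distinct "primes". A small sketch of that count; the discriminant list below is invented for illustration and is not the paper's data:

    # Hypothetical sketch of the N_p(S) count from the snippet above:
    # how many discriminants are divisible by at least one element of S.
    # The discriminant list is made up; 4 and 8 are treated as distinct
    # "primes", as in the paper.

    def n_p(discriminants, S):
        return sum(1 for D in discriminants if any(D % s == 0 for s in S))

    if __name__ == "__main__":
        discs = [3, 4, 7, 8, 11, 15, 19, 20, 24, 35, 40, 43]  # illustrative only
        print(n_p(discs, S={3, 4, 8}))  # counts D divisible by 3, 4, or 8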