CiteSeerX

Results 1 - 10 of 11,070

Table 1: Sparse Cholesky, 494 bps | Effects of private cache miss rates.

in Communication Mechanisms in Shared Memory Multiprocessors
by Gregory T. Byrd, Bruce A. Delagi, Michael J. Flynn
"... In PAGE 4: ... approach less attractive than the lock mechanism by itself. Varying Private Miss Rate: Table 1 shows the effects of varying the miss rate for private data and instructions between 0% and 10%. The model for private (first-level) cache misses is a statistical one, which makes no distinction between instruction and data references. ... ..."
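The snippet describes a statistical first-level miss model that treats instruction and data references identically. As an illustrative sketch (not code from the paper), such a model can be expressed as independent Bernoulli trials at a fixed miss rate; the function name and parameters here are hypothetical:

```python
import random

def simulate_private_misses(num_refs, miss_rate, seed=0):
    # Statistical private-cache miss model: each reference, whether an
    # instruction or a data access, misses independently with
    # probability `miss_rate` (no distinction between the two).
    rng = random.Random(seed)
    return sum(rng.random() < miss_rate for _ in range(num_refs))

# Sweep the private miss rate between 0% and 10%, as in the experiment.
for rate in (0.0, 0.01, 0.05, 0.10):
    misses = simulate_private_misses(100_000, rate)
    print(rate, misses)
```

Sampling per reference (rather than using a fixed count) is what makes the model "statistical": the observed miss count fluctuates around `num_refs * miss_rate`.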

Table 6. Throughput, response time, and hit rate for the browsing mix for 4 Web servers, with a single shared cache, a private cache on each Web server, and a two-level cache consisting of a private cache on each front-end and a shared cache on a dedicated machine

in Transparent Caching with Strong Consistency in Dynamic Content Web Sites
by Cristiana Amza, Emmanuel Cecchet, Alan L. Cox, Julie Marguerite, Gokul Soundararajan, Willy Zwaenepoel 2005
"... In PAGE 10: ... locating the cache on the front-end vs. locating it elsewhere. This result has to be re-examined when there are multiple front-ends, because of the need to enforce consistency between the front-end caches and because of the fact that each front-end cache only sees the traffic going through its front-end. Table 6 shows throughput, response time and hit rate for the browsing mix for a single shared cache on a dedicated machine, a private cache on each of the front-ends, and a two-level cache consisting of a private cache on each front-end and a shared cache on a dedicated machine. Table 7 ... ..."
Cited by 2

Table 7. Throughput, response time, and hit rate for the shopping mix for 4 Web servers, with a single shared cache, a private cache on each Web server, and a two-level cache consisting of a private cache on each front-end and a shared cache on a dedicated machine

in Transparent Caching with Strong Consistency in Dynamic Content Web Sites
by Cristiana Amza, Emmanuel Cecchet, Alan L. Cox, Julie Marguerite, Gokul Soundararajan, Willy Zwaenepoel 2005
Cited by 2

Table 6. Throughput, response time, and hit rate for the shopping mix for 4 Web servers, with a single shared cache, a private cache on each Web server, and a two-level cache consisting of a private cache on each front-end and a shared cache on a dedicated machine

in Transparent Caching and Consistency in Dynamic Content Web Sites
by Cristiana Amza

Table 7. Throughput, response time, and hit rate for the browsing mix for 4 Web servers, with a single shared cache, a private cache on each Web server, and a two-level cache consisting of a private cache on each front-end and a shared cache on a dedicated machine

in Abstract Transparent Caching with Strong Consistency in Dynamic Content Web Sites
by Cristiana Amza

Table 1. Benchmarks used in our experiments, their input parameters, and cache energy consumptions (for a private-cache-based system). We observe that in L1, leakage and dynamic energy consumptions are of similar magnitude, whereas in L2, dynamic energy consumption dominates (due to the leakage control mechanism).

in CCC: Crossbar Connected Caches for Reducing Energy Consumption of On-Chip Multiprocessors
by Lin Li, N. Vijaykrishnan, Mahmut Kandemir, Mahmut K, Mary Jane Irwin, Ismail Kadayif
"... In PAGE 4: ... We use a set of benchmarks from the SPLASH-2 suite [13]: barnes, ocean1, ocean2, radix, raytrace, and water. The important characteristics of these benchmarks are listed in Table 1. These codes differ from each other in their degree of instruction and data sharing (as pointed out earlier). ... ..."

Table II gives further insight into the individual workloads that are consolidated in this study. The percentage of misses to the last level of private cache that result in an on-chip cache-to-cache transfer is given, as well as the number of cache-line-sized blocks that are touched during the simulation. These workloads exhibit a range of misses that are satisfied by cache-to-cache transfers and different working-set sizes; combining workloads gives insight into the different stresses placed on an architecture.
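The two metrics described above reduce to simple arithmetic on simulator counters. As a hedged sketch (the function names and the 64-byte line size are assumptions, not from the paper):

```python
def c2c_share(llc_private_misses, c2c_transfers):
    # Fraction of misses in the last level of private cache that are
    # satisfied by an on-chip cache-to-cache transfer rather than memory.
    if llc_private_misses == 0:
        return 0.0
    return c2c_transfers / llc_private_misses

def working_set_bytes(blocks_touched, line_size=64):
    # Footprint implied by the number of distinct cache-line-sized
    # blocks touched during simulation (64 B lines assumed here).
    return blocks_touched * line_size
```

For example, a workload with 1,000 last-level private-cache misses of which 250 hit in a peer cache has a 25% cache-to-cache share, and touching one million 64 B blocks implies a 64 MB working set.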

in An evaluation of server consolidation workloads for multi-core designs
by Natalie Enright Jerger, Dana Vantrease, Mikko Lipasti 2007
Cited by 4

Table 3: Important working sets and their growth rates. DS represents the data set size and C is the number of cores. Working set sizes are taken from Figure 3. Values for native input set are analytically derived estimates. Working sets that grow proportional to the number of cores C are aggregated private working sets and can be split up to fit into correspondingly smaller, private caches.

in The PARSEC benchmark suite: Characterization and architectural implications
by Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, Kai Li
"... In PAGE 16: ... Our results are presented in Figure 3. In Table 3 we summarize the important characteristics of the identified working sets. Most workloads exhibit well-defined working sets with clearly identifiable points of inflection. ... In PAGE 17: ... Data assumes a shared 4-way associative cache with 64 byte lines. WS1 and WS2 refer to important working sets which we analyze in more detail in Table 3. Cache requirements of PARSEC benchmark programs can reach hundreds of megabytes. ... In PAGE 19: ... Figure 6 shows a large amount of writes to shared data, but contrary to intuition its share diminishes rapidly as the number of cores is increased. This effect is caused by a growth of the working sets of x264: Table 3 shows that both working set WS1 and WS2 grow proportional to the number of cores. WS1 is mostly composed of thread-private data and is the one which is used more intensely. ... ..."
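The caption's observation that working sets growing proportionally to the core count C are aggregates of per-thread private data implies a simple sizing rule: each core only needs its 1/C share. A minimal sketch of that rule, with hypothetical names and sizes:

```python
def per_core_private_ws(aggregate_ws_bytes, num_cores):
    # A working set that grows proportionally to the number of cores C
    # is an aggregate of per-thread private data, so each core holds
    # only a 1/C share of it.
    return aggregate_ws_bytes / num_cores

def fits_private_cache(aggregate_ws_bytes, num_cores, private_cache_bytes):
    # The aggregate working set can be split across correspondingly
    # smaller private caches iff one core's share fits in one cache.
    return per_core_private_ws(aggregate_ws_bytes, num_cores) <= private_cache_bytes
```

For instance, a 64 MiB aggregate private working set split across 16 cores needs only 4 MiB per private cache, whereas the same aggregate on 2 cores would not fit in 8 MiB caches.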

Table 5-2. Microarchitecture configuration. [Flattened table excerpt: single processor core (PE); slipstream memory hierarchy; caches: private L1 instr. cache, size = 64 KB (see memory hier. column)]

in SLIPSTREAM PROCESSORS
by Zachary Robert, Slipstream Processors
"... In PAGE 10: ... Table 4-4. IR-misprediction rate, recovery latency, slack, and delay buffer length ... Table 5-1. Qualitative comparisons of duplication and recovery methods ... Table 5-2. Microarchitecture configuration ... Table 5-3 ... In PAGE 73: ... 5.4 Qualitative Comparisons of Duplication and Recovery Methods. Table 5-1 summarizes the advantages, disadvantages, and required hardware support of the two memory duplication methods (top half) and three memory recovery methods (bottom half). Notice the four useful measurements introduced in Sections 5. ... Results in Section 5.6 quantify much of the information summarized in Table 5-1. Note that the cache-based value prediction technique is not listed in Table 5-1, but is used in conjunction with either invalidation-based recovery model to reduce the performance impact of recovery-induced misses. ... skipped-write relate to recovery. ... Figure 5-4 shows the original slipstream microarchitecture with software-based memory duplication. ... In PAGE 74: ... Table 5-1. Qualitative comparisons of duplication and recovery methods. ... In PAGE 76: ... The functional simulator checks retired R-stream control flow and data flow outcomes. Microarchitecture parameters are listed in Table 5-2. The top-left portion of the table lists parameters for individual processors within a CMP. ... In PAGE 77: ... inv.-dirty, or inv./inv.-dirty with value prediction. The Simplescalar [5] compiler and ISA are used. We use eight SPEC2000 integer benchmarks compiled with -O3 optimization and run with ref input datasets (Table 5-3). The first billion instructions are skipped and then 100 million instructions are simulated. ... In PAGE 78: ... have to maintain several full memory images to measure the number of stale, self-repair, persistent-stale, and persistent-skipped-write references (this is a statistics-gathering issue). Table 5-3. Benchmarks. ... ..."

Table 2. Contemporary RISC processor features

in unknown title
by unknown authors

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University