Results 1 - 10 of 1,135

An adaptive, nonuniform cache structure for wire-delay dominated on-chip caches

by Changkyu Kim, Doug Burger, Stephen W. Keckler - In International Conference on Architectural Support for Programming Languages and Operating Systems , 2002
"... Growing wire delays will force substantive changes in the designs of large caches. Traditional cache architectures assume that each level in the cache hierarchy has a single, uniform access time. Increases in on-chip communication delays will make the hit time of large on-chip caches a function of a ..."
Abstract - Cited by 314 (39 self) - Add to MetaCart
within the same level of the cache. We show that, for multi-megabyte level-two caches, an adaptive, dynamic NUCA design achieves 1.5 times the IPC of a Uniform Cache Architecture of any size, outperforms the best static NUCA scheme by 11%, outperforms the best three-level hierarchy while using less
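The snippet describes the core dynamic-NUCA idea: a large on-chip cache is split into many banks whose hit latency grows with their distance from the cache controller, and frequently used lines migrate toward the closer, faster banks. The minimal Python sketch below models that behaviour; the bank count, per-bank hop latency, and promote-one-bank-on-hit policy are illustrative assumptions, not the parameters or policies evaluated in the paper.

# Minimal sketch of a dynamic-NUCA-style bank model (illustrative parameters,
# not the values from the Kim/Burger/Keckler paper).

class DNucaModel:
    def __init__(self, num_banks=16, base_latency=3, per_bank_hop=2):
        self.num_banks = num_banks            # banks ordered by distance from the controller
        self.base_latency = base_latency      # cycles for the closest bank
        self.per_bank_hop = per_bank_hop      # extra cycles per bank of distance
        self.bank_of = {}                     # cache line -> bank index (0 = closest)

    def latency(self, bank):
        return self.base_latency + bank * self.per_bank_hop

    def access(self, line):
        """Return the hit latency; promote the line one bank closer on a hit."""
        if line not in self.bank_of:
            # Miss: install in the farthest bank (simplified placement policy).
            self.bank_of[line] = self.num_banks - 1
            return None
        bank = self.bank_of[line]
        # Gradual migration: hot lines drift toward the controller.
        self.bank_of[line] = max(0, bank - 1)
        return self.latency(bank)


model = DNucaModel()
model.access("0xA0")                  # first touch: miss, placed in the far bank
print(model.access("0xA0"))           # hit in the far bank, then the line migrates closer
print(model.access("0xA0"))           # latency falls as the line moves toward the controller

Repeated accesses to the same line show the intended effect: each hit moves the line one bank closer, so its access latency falls over time.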

Effective Use of The Level-Two Cache for Skewed Tiling

by Yonghong Song, Zhiyuan Li , 2001
"... Tiling is a well-known loop transformation technique to enhance temporal data locality. In our previous work, we have developed a skewed tiling technique for relaxation codes, which requires to apply loop skewing before loop tiling. In this paper, we study how to effectively usc the level-two cache ..."
Abstract - Add to MetaCart
Tiling is a well-known loop transformation technique to enhance temporal data locality. In our previous work, we have developed a skewed tiling technique for relaxation codes, which requires applying loop skewing before loop tiling. In this paper, we study how to effectively use the level-two cache
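The abstract's key step, applying loop skewing before loop tiling so that a relaxation sweep can be tiled legally, can be illustrated on a 1-D in-place relaxation. In the sketch below, the skewed index jp = j + t makes every dependence of the loop nest point lexicographically forward, so rectangular tiles over (t, jp) may be executed in order; the Gauss-Seidel-style update, the array size, and the tile sizes are illustrative choices rather than the kernels studied in the paper.

import numpy as np

def relax_untiled(a, steps):
    # Reference in-place 1-D relaxation sweep (Gauss-Seidel style).
    a = a.copy()
    n = len(a)
    for t in range(steps):
        for j in range(1, n - 1):
            a[j] = 0.5 * (a[j - 1] + a[j + 1])
    return a

def relax_skewed_tiled(a, steps, tile_t=4, tile_j=32):
    # Same sweep after skewing the space loop by the time step and tiling.
    # With the skewed index jp = j + t, both dependences of the original nest
    # become (1, 0) and (0, 1), so rectangular tiles over (t, jp) can run in
    # lexicographic order while a tile's data stays in cache across time steps.
    a = a.copy()
    n = len(a)
    for tt in range(0, steps, tile_t):
        for jj in range(1, n - 1 + steps, tile_j):
            for t in range(tt, min(tt + tile_t, steps)):
                lo = max(1, jj - t)                      # unskew: j = jp - t
                hi = min(n - 1, jj + tile_j - t)
                for j in range(lo, hi):
                    a[j] = 0.5 * (a[j - 1] + a[j + 1])
    return a

x = np.random.rand(257)
assert np.allclose(relax_untiled(x, 8), relax_skewed_tiled(x, 8))
print("skewed-tiled sweep matches the untiled reference")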

An Analysis of Adding a Backside Level-Two Cache to an Existing Microprocessor

by Kathryn Christine Hammel, supervised by Lizy Kurian John
"... Copyright by ..."
Abstract - Add to MetaCart
Copyright by

Prefetching using Markov predictors

by Doug Joseph, Dirk Grunwald - In ISCA , 1997
"... Prefetching is one approach to reducing the latency of memory op-erations in modem computer systems. In this paper, we describe the Markov prefetcher. This prefetcher acts as an interface between the on-chip and off-chip cache, and can be added to existing com-puter designs. The Markov prefetcher is ..."
Abstract - Cited by 308 (1 self) - Add to MetaCart
by the processor. In our cycle-level simulations, the Markov Prefetcher reduces the overall execution stalls due to instruction and data memory operations by an average of 54% for various commercial benchmarks while only using two thirds the memory of a demand-fetch cache organization.
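As the snippet indicates, the Markov prefetcher keeps a table that maps a miss address to the miss addresses that tended to follow it, and uses that table to issue prefetches between the on-chip and off-chip caches. A toy table-driven version of the idea is sketched below; the table capacity, the number of successors kept per entry, and the LRU-style replacement are assumptions for illustration, not the configuration evaluated in the paper.

from collections import OrderedDict

class MarkovPrefetcher:
    """Toy Markov prefetcher: maps a miss address to recently observed successors."""

    def __init__(self, max_entries=256, successors_per_entry=4):
        self.table = OrderedDict()              # miss address -> list of successor addresses
        self.max_entries = max_entries
        self.successors = successors_per_entry
        self.prev_miss = None

    def on_miss(self, addr):
        """Record the (previous miss -> current miss) transition and return prefetch candidates."""
        if self.prev_miss is not None:
            succ = self.table.setdefault(self.prev_miss, [])
            if addr in succ:
                succ.remove(addr)
            succ.insert(0, addr)                # most recent successor first
            del succ[self.successors:]
            self.table.move_to_end(self.prev_miss)
            if len(self.table) > self.max_entries:
                self.table.popitem(last=False)  # evict the least recently updated entry
        self.prev_miss = addr
        return list(self.table.get(addr, []))   # addresses worth prefetching next


pf = MarkovPrefetcher()
for miss in [0x100, 0x200, 0x100, 0x300, 0x100]:
    print(hex(miss), "-> prefetch", [hex(a) for a in pf.on_miss(miss)])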

Tempest and Typhoon: User-level Shared Memory

by Steven K. Reinhardt, James R. Larus, David A. Wood - In Proceedings of the 21st Annual International Symposium on Computer Architecture , 1994
"... Future parallel computers must efficiently execute not only hand-coded applications but also programs written in high-level, parallel programming languages. Today’s machines limit these programs to a single communication paradigm, either message-passing or shared-memory, which results in uneven perf ..."
Abstract - Cited by 309 (27 self) - Add to MetaCart
-programmable, user-level processor in the network interface. We demonstrate the utility of Tempest with two examples. First, the Stache protocol uses Tempest’s fine-grain access control mechanisms to manage part of a processor’s local memory as a large, fully-associative cache for remote data. We simulated Typhoon
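The Stache protocol mentioned here treats part of local memory as a large, fully associative cache for remote data: a fine-grain access-control fault on a block that is not locally valid invokes a user-level handler that fetches and installs the block. A very small sketch of that control flow follows; the block size, the dictionary-backed block store, and the fetch_remote callback are hypothetical stand-ins, not Tempest's actual interface.

BLOCK = 64  # assumed block size in bytes (illustrative)

class Stache:
    """Toy model of a user-level, fully associative cache for remote memory blocks."""

    def __init__(self, fetch_remote):
        self.blocks = {}                 # block number -> bytes; any block can live anywhere
        self.fetch_remote = fetch_remote # user-level handler that fetches a block from its home node

    def read(self, addr):
        blk, off = divmod(addr, BLOCK)
        if blk not in self.blocks:       # "access fault": block not locally valid
            self.blocks[blk] = self.fetch_remote(blk)   # handler brings the block in
        return self.blocks[blk][off]


# Stand-in for a remote node: serves blocks filled with a recognisable pattern.
def fake_remote_fetch(blk):
    return bytes((blk + i) % 256 for i in range(BLOCK))

cache = Stache(fake_remote_fetch)
print(cache.read(0x1000))    # first access "faults" and fetches the block
print(cache.read(0x1001))    # later accesses in the same block hit locally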

The filter cache: An energy efficient memory structure

by Johnson Kin, Munish Gupta, William H. Mangione-Smith - In Proceedings of the 1997 International Symposium on Microarchitecture , 1997
"... Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. These caches are typically implemented with static RAM cells and often occupy a large portion of the chip area. Not surprisingly, these caches often consume a significant amount of power. In many ..."
Abstract - Cited by 222 (4 self) - Add to MetaCart
Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. These caches are typically implemented with static RAM cells and often occupy a large portion of the chip area. Not surprisingly, these caches often consume a significant amount of power. In many
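The filter cache places a very small, low-energy cache in front of the L1 so that most accesses are served by the tiny structure and the larger cache is probed only on filter misses. The sketch below estimates the filter hit rate and a rough relative energy figure for a trace; the cache sizes, direct-mapped organisation, and per-access energy numbers are made-up illustrative values, not the paper's measurements.

def simulate_filter_cache(addresses, filter_lines=32, l1_lines=1024, line_bytes=16,
                          e_filter=0.1, e_l1=1.0):
    """Direct-mapped filter cache in front of a direct-mapped L1 (toy model).

    Returns (filter hit rate, total energy) for a trace of byte addresses.
    Energy numbers are arbitrary relative units, not measured values.
    """
    filt = [None] * filter_lines
    l1 = [None] * l1_lines
    hits = 0
    energy = 0.0
    for addr in addresses:
        line = addr // line_bytes
        energy += e_filter                     # the small cache is always probed first
        if filt[line % filter_lines] == line:
            hits += 1
            continue
        energy += e_l1                         # filter miss: probe the larger L1
        l1[line % l1_lines] = line             # (L1 misses and refills are not modelled)
        filt[line % filter_lines] = line
    return hits / len(addresses), energy


# A loop-heavy trace reuses a few lines, so most accesses stay in the filter cache.
trace = [0x100 + (i % 64) for i in range(10_000)]
print(simulate_filter_cache(trace))

A loop-heavy trace like the one above keeps nearly all accesses in the filter cache, which is exactly the case where the energy savings are largest.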

UNIX Disk Access Patterns

by Chris Ruemmler, John Wilkes , 1993
"... Disk access patterns are becoming ever more important to understand as the gap between processor and disk performance increases. The study presented here is a detailed characterization of every lowlevel disk access generated by three quite different systems over a two month period. The contributions ..."
Abstract - Cited by 277 (20 self) - Add to MetaCart
of write caching at the disk level, we found that using a small non-volatile cache at each disk allowed writes to be serviced considerably faster than with a regular disk. In particular, short bursts of writes go much faster -- and such bursts are common: writes rarely come singly. Adding even 8KB of non
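The observation that a small non-volatile cache at each disk lets bursts of writes complete quickly and be destaged later can be illustrated with a very coarse latency model. In the sketch below the cache capacity, service times, and drain-between-bursts assumption are all invented for illustration; they are not figures from the study.

def write_burst_latency(burst_sizes, nvram_blocks=16, t_cache_us=50, t_disk_us=12000):
    """Toy model of a small non-volatile write cache in front of a disk.

    Each write completes in t_cache_us if a cache slot is free; otherwise it
    must wait for one cached block to be destaged to the disk first. Between
    bursts the cache is assumed to drain completely. Returns the mean latency
    per write in microseconds.
    """
    total_us = 0
    writes = 0
    for burst in burst_sizes:
        cached = 0
        for _ in range(burst):
            if cached < nvram_blocks:
                cached += 1
                total_us += t_cache_us
            else:
                total_us += t_disk_us + t_cache_us   # stall for a destage, then cache the write
            writes += 1
    return total_us / writes


# Short bursts fit entirely in the cache and finish at near-memory speed;
# long bursts fall back toward raw disk latency.
print(write_burst_latency([4, 6, 3]))
print(write_burst_latency([200]))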

Evaluating Stream Buffers as a Secondary Cache Replacement

by Subbarao Palacharla - In Proceedings of the 21st Annual International Symposium on Computer Architecture , 1994
"... Today’s commodity microprocessors require a low latency memory system to achieve high sustained performance. The conventional high-performance memory system provides fast data access via a large secondary cache. But large secondary caches can be expensive, particularly in large-scale parallel system ..."
Abstract - Cited by 204 (0 self) - Add to MetaCart
systems with many processors (and thus many caches). We evaluate a memory system design that can be both cost-effective and provide better performance, particularly for scientific workloads: a single level of (on-chip) cache backed up only by Jouppi’s stream buffers [10] and a main memory
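The design evaluated here backs the on-chip cache with a handful of Jouppi-style stream buffers instead of a large secondary cache: a miss that also misses the buffers allocates one, which then prefetches the following sequential blocks, so later misses in the same stream hit in a buffer head. The sketch below models that behaviour; the number of buffers, their depth, and the round-robin reallocation are illustrative assumptions.

from collections import deque

class StreamBuffers:
    """Toy model of Jouppi-style stream buffers backing an on-chip cache."""

    def __init__(self, num_buffers=4, depth=4):
        self.buffers = [deque() for _ in range(num_buffers)]  # each holds prefetched block numbers
        self.depth = depth
        self.next_victim = 0                                   # round-robin allocation

    def access(self, block):
        """Called on an L1 miss for `block`. Returns True if a buffer head hits."""
        for buf in self.buffers:
            if buf and buf[0] == block:
                buf.popleft()                                  # consume the head entry
                last = buf[-1] if buf else block
                buf.append(last + 1)                           # keep prefetching the stream
                return True
        # Miss in every buffer: (re)allocate one to prefetch the blocks after `block`.
        victim = self.buffers[self.next_victim]
        victim.clear()
        victim.extend(range(block + 1, block + 1 + self.depth))
        self.next_victim = (self.next_victim + 1) % len(self.buffers)
        return False


sb = StreamBuffers()
hits = [sb.access(b) for b in range(100, 110)]     # a sequential miss stream
print(hits)     # the first access allocates a buffer, the rest hit in it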

The HP AutoRAID hierarchical storage system

by John Wilkes, Richard Golding, Carl Staelin, Tim Sullivan - ACM Transactions on Computer Systems , 1995
"... Configuring redundant disk arrays is a black art. To configure an array properly, a system administrator must understand the details of both the array and the workload it will support. Incorrect understanding of either, or changes in the workload over time, can lead to poor performance. We present a ..."
Abstract - Cited by 263 (15 self) - Add to MetaCart
a solution to this problem: a two-level storage hierarchy implemented inside a single disk-array controller. In the upper level of this hierarchy, two copies of active data are stored to provide full redundancy and excellent performance. In the lower level, RAID 5 parity protection is used to provide
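The two-level hierarchy described in the abstract keeps active data mirrored in an upper level and demotes colder data to a RAID 5 lower level as activity changes. The sketch below captures only the promotion/demotion flow; the mirrored capacity and the least-recently-written demotion policy are illustrative assumptions, not HP AutoRAID's actual policies, and parity maintenance is not modelled.

from collections import OrderedDict

class TwoLevelStore:
    """Toy AutoRAID-style hierarchy: a small mirrored level over a RAID-5 level."""

    def __init__(self, mirrored_capacity=4):
        self.mirrored = OrderedDict()    # block -> data, kept in least-recently-written order
        self.raid5 = {}                  # block -> data (parity protection not modelled)
        self.capacity = mirrored_capacity

    def write(self, block, data):
        """Writes go to the mirrored level; cold blocks are demoted to RAID 5."""
        self.mirrored[block] = data
        self.mirrored.move_to_end(block)
        self.raid5.pop(block, None)
        if len(self.mirrored) > self.capacity:
            cold_block, cold_data = self.mirrored.popitem(last=False)
            self.raid5[cold_block] = cold_data          # demote the least recently written block

    def read(self, block):
        if block in self.mirrored:                      # fast path: mirrored copy
            self.mirrored.move_to_end(block)
            return self.mirrored[block]
        return self.raid5[block]                        # slower path: RAID-5 level


store = TwoLevelStore()
for b in range(6):
    store.write(b, f"data-{b}")
print(sorted(store.mirrored))   # the most recently written blocks stay mirrored
print(sorted(store.raid5))      # older blocks have been demoted to RAID 5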