CiteSeerX

Results 11 - 20 of 4,465

The SGI Origin: A ccNUMA highly scalable server

by James Laudon, Daniel Lenoski - In Proceedings of the 24th International Symposium on Computer Architecture (ISCA ’97), 1997
"... The SGI Origin 2000 is a cache-coherent non-uniform memory access (ccNUMA) multiprocessor designed and manufactured by Silicon Graphics, Inc. The Origin system was designed from the ground up as a multiprocessor capable of scaling to both small and large processor counts without any bandwidth, laten ..."
Abstract - Cited by 497 (0 self)

Cache Equalizer: A Placement Mechanism for Chip Multiprocessor Distributed Shared Caches

by Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem
"... This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large-scale chip multiprocessors (CMPs). Our work is motivated by large asymmetry in cache sets ’ usages. CE decouples the physical locations of cache blocks from their addresses for the sake of reducing misse ..."
Abstract - Cited by 2 (0 self)
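A rough sketch of the general idea the snippet describes, under stated assumptions: an indirection table sits between an address and the physical set it lands in, so a pressured set can hand blocks to a lightly used one. The counters, table, and rebalancing trigger below are invented for illustration and are not CE's actual mechanism.

    /* Toy illustration, not Cache Equalizer itself: a per-set indirection table
     * lets a block's physical set differ from the set its address selects, so
     * heavily pressured sets can be relieved by lightly used ones. */
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_SETS 8u

    static uint32_t usage[NUM_SETS];   /* made-up per-set pressure counters         */
    static uint32_t remap[NUM_SETS];   /* address-derived set index -> physical set */

    static uint32_t least_used_set(void)
    {
        uint32_t best = 0;
        for (uint32_t i = 1; i < NUM_SETS; i++)
            if (usage[i] < usage[best])
                best = i;
        return best;
    }

    /* Decide where to place the 64-byte block at address 'addr'. */
    static uint32_t place_block(uint64_t addr)
    {
        uint32_t idx  = (uint32_t)(addr >> 6) % NUM_SETS;  /* conventional index */
        uint32_t dest = remap[idx];                        /* indirection step   */

        /* If the destination is much busier than the least-used set, retarget
         * future blocks carrying this index (a crude, made-up trigger). */
        if (usage[dest] > usage[least_used_set()] + 4)
            remap[idx] = dest = least_used_set();

        usage[dest]++;
        return dest;
    }

    int main(void)
    {
        for (uint32_t i = 0; i < NUM_SETS; i++)
            remap[i] = i;                                  /* start conventional */
        for (uint64_t a = 0; a < 64 * 64; a += 64)
            printf("block %#llx -> set %u\n", (unsigned long long)a, place_block(a));
        return 0;
    }

The only point of the sketch is the extra lookup: once placement is decoupled from the address, every access must consult the remap table (or an equivalent tracking structure) instead of deriving the set directly from address bits.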

C-AMTE: A Location Mechanism for Flexible Cache Management in Chip Multiprocessors

by Mohammad Hammoud, Sangyeun Cho, Rami Melhem, 2009
"... This paper describes Constrained Associative-Mapping-of-Tracking-Entries (C-AMTE), a scalable mechanism to facilitate flexible and efficient distributed cache management in large-scale chip multiprocessors (CMPs). C-AMTE enables fast locating of cache blocks in CMP cache schemes that employ one-to-o ..."
Abstract - Cited by 1 (1 self)

Efficient Cache Coherence Protocol in Tiled Chip Multiprocessors

by Alberto Ros, Manuel E. Acacio, José M. García
"... Abstract — Although directory-based cache coher-ence protocols are the best choice when designing large-scale chip multiprocessors (CMPs), they in-troduce indirection to access directory information, which negatively impacts performance. In this work, we present DiCo-CMP, a cache coherence protocol ..."
Abstract
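For context on the indirection this snippet refers to, a minimal hedged sketch with made-up types: under a conventional directory protocol a miss first visits the block's home tile to read the directory entry and only then reaches the owner, so a cache-to-cache transfer takes three network traversals instead of two. This illustrates the problem DiCo-CMP targets, not its solution.

    /* Hypothetical hop count for a miss resolved by a cache-to-cache transfer
     * under a directory protocol; the extra first hop is the indirection. */
    #include <stdio.h>

    enum dir_state { UNOWNED, OWNED };

    struct dir_entry {
        enum dir_state state;
        int owner;                 /* tile holding the dirty copy when OWNED    */
    };

    static int miss_hops_directory(const struct dir_entry *e)
    {
        int hops = 1;              /* requester -> home tile (directory lookup) */
        if (e->state == OWNED)
            hops += 2;             /* home -> owner, then owner -> requester    */
        else
            hops += 1;             /* home replies with the data itself         */
        return hops;
    }

    int main(void)
    {
        struct dir_entry e = { OWNED, 3 };
        printf("directory miss hops: %d\n", miss_hops_directory(&e));  /* prints 3 */
        return 0;
    }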

Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors

by Anant Agarwal, John Kubiatowicz, David Kranz, Beng-Hong Lim, Donald Yeung, Godfrey D'Souza, Mike Parkin - IEEE Micro, 1993
"... Sparcle is a processor chip developed jointly by MIT, LSI Logic, and SUN Microsystems, by evolving an existing RISC architecture towards a processor suited for large-scale multiprocessors. Sparcle supports three multiprocessor mechanisms: fast context switching, fast, user-level message handling, a ..."
Abstract - Cited by 112 (21 self)

The Stanford FLASH multiprocessor

by Jeffrey Kuskin, David Ofelt, Mark Heinrich, John Heinlein, Richard Simoni, Kourosh Gharachorloo, John Chapin, David Nakahira, Joel Baxter, Mark Horowitz, Anoop Gupta, Mendel Rosenblum, John Hennessy - In Proceedings of the 21st International Symposium on Computer Architecture, 1994
"... The FLASH multiprocessor efficiently integrates support for cache-coherent shared memory and high-performance message passing, while minimizing both hardware and software overhead. Each node in FLASH contains a microprocessor, a portion of the machine’s global memory, a port to the interconnection n ..."
Abstract - Cited by 349 (20 self)

Address Remapping for Static NUCA in NoC-based Degradable Chip-Multiprocessors

by Ying Wang, Lei Zhang, Yinhe Han, Huawei Li, Xiaowei Li
"... Abstract—Large scale Chip-Multiprocessors (CMPs) generally employ Network-on-Chip (NoC) to connect the last level cache (LLC), which is generally organized as distributed NUCA (non-uniform cache access) arrays for scalability and efficiency. On the other hand, aggressive technology scaling induces s ..."
Abstract - Cited by 1 (0 self)
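A minimal sketch of the setting the snippet describes, with assumed names and sizes: a static NUCA selects an LLC bank from address bits, and on a degraded part whose banks are partially disabled, addresses of a dead bank are folded onto surviving ones. This is only an illustration of the idea, not the paper's remapping scheme.

    /* Static NUCA bank selection with naive remapping around a disabled bank.
     * Assumes 64-byte blocks, 16 banks, and at least one surviving bank. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_BANKS 16u

    static bool bank_ok[NUM_BANKS];

    static unsigned snuca_bank(uint64_t addr)
    {
        return (unsigned)((addr >> 6) % NUM_BANKS);   /* conventional static map */
    }

    static unsigned remapped_bank(uint64_t addr)
    {
        unsigned b = snuca_bank(addr);
        while (!bank_ok[b])                           /* skip disabled banks      */
            b = (b + 1) % NUM_BANKS;
        return b;
    }

    int main(void)
    {
        for (unsigned i = 0; i < NUM_BANKS; i++)
            bank_ok[i] = true;
        bank_ok[5] = false;                           /* pretend bank 5 is faulty */

        uint64_t addr = 5u << 6;                      /* would map to bank 5      */
        printf("bank = %u\n", remapped_bank(addr));   /* prints 6                 */
        return 0;
    }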

Predictive Analysis of a Hydrodynamics Application on Large-Scale CMP Clusters

by J. A. Davis, G. R. Mudalige, S. D. Hammond, J. A. Herdman, I. Miller, S. A. Jarvis
"... Abstract We present the development of a predictive performance model for the high-performance computing code Hydra, a hydrodynamics benchmark developed and maintained by the United Kingdom Atomic Weapons Establishment (AWE). The developed model elucidates the parallel computation of Hydra, with whi ..."
Abstract - Add to MetaCart
, with which it is possible to predict its run-time and scaling performance on varying large-scale chip multiprocessor (CMP) clusters. A key feature of the model is its granularity; with the model we are able to separate the contributing costs, including computation, point-topoint communications, collectives

Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors

by Todd Mowry, Anoop Gupta - Journal of Parallel and Distributed Computing, 1991
"... The large latency of memory accesses is a major obstacle in obtaining high processor utilization in large scale shared-memory multiprocessors. Although the provision of coherent caches in many recent machines has alleviated the problem somewhat, cache misses still occur frequently enough that they s ..."
Abstract - Cited by 302 (18 self)
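The snippet refers to software-controlled prefetching, i.e. issuing non-blocking prefetch instructions far enough ahead of a load that the miss latency overlaps with computation. A generic sketch follows using the GCC/Clang __builtin_prefetch intrinsic; the loop, array, and prefetch distance are illustrative and not taken from the paper.

    /* Sum an array while prefetching PF_DIST iterations ahead so that, on a
     * machine with coherent caches, most of the miss latency is hidden. */
    #include <stdio.h>

    #define N       4096
    #define PF_DIST 16          /* prefetch distance, tuned to the miss latency */

    static double sum_with_prefetch(const double *a)
    {
        double s = 0.0;
        for (int i = 0; i < N; i++) {
            if (i + PF_DIST < N)
                __builtin_prefetch(&a[i + PF_DIST], 0 /* read */, 1 /* low reuse */);
            s += a[i];
        }
        return s;
    }

    int main(void)
    {
        static double a[N];
        for (int i = 0; i < N; i++)
            a[i] = (double)i;
        printf("sum = %f\n", sum_with_prefetch(a));
        return 0;
    }

The general trade-off is that each prefetch costs instruction overhead, so prefetches are only worth issuing for references that are likely to miss.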

Disco: Running commodity operating systems on scalable multiprocessors

by Edouard Bugnion, Scott Devine, Mendel Rosenblum - ACM Transactions on Computer Systems, 1997
"... In this paper we examine the problem of extending modern operating systems to run efficiently on large-scale shared memory multiprocessors without a large implementation effort. Our approach brings back an idea popular in the 1970s, virtual machine monitors. We use virtual machines to run multiple c ..."
Abstract - Cited by 253 (10 self)