Results 1 - 10 of 8,703

Studying Multicore Processor Scaling via Reuse Distance Analysis

by Meng-ju Wu, Minshu Zhao, Donald Yeung
"... The trend for multicore processors is towards increasing numbers of cores, with 100s of cores–i.e. large-scale chip multiprocessors (LCMPs)–possible in the future. The key to realizing the potential of LCMPs is the cache hierarchy, so studying how memory performance will scale is crucial. Reuse dist ..."
Cited by 10 (2 self)

Studying the Impact of Multicore Processor Scaling on Cache Coherence Directories via Reuse Distance Analysis

by unknown authors
"... Directories are one key part of a processor’s cache coherence hardware, and constitute one of the main bottlenecks in multicore processor scaling, e.g. core count and cache size scaling. Many research efforts have tried to improve the scalability of the directory, but most of them only simulate a fe ..."

Studying the Impact of Multicore Processor Scaling on Directory Techniques via Reuse Distance Analysis

by Minshu Zhao, Donald Yeung
"... Researchers have proposed numerous directory techniques to address multicore scalability whose behavior depends on the CPU’s particular configuration, e.g. core count and cache size. As CPUs continue to scale, it is essential to explore the directory’s architecture dependences. However, th ..."

Real-Time Dynamic Voltage Scaling for Low-Power Embedded Operating Systems

by Padmanabhan Pillai, Kang G. Shin, 2001
"... In recent years, there has been a rapid and wide spread of nontraditional computing platforms, especially mobile and portable computing devices. As applications become increasingly sophisticated and processing power increases, the most serious limitation on these devices is the available battery life. Dynamic Voltage Scaling (DVS) has been a key technique in exploiting the hardware characteristics of processors to reduce energy dissipation by lowering the supply voltage and operating frequency. The DVS algorithms are shown to be able to make dramatic energy savings while providing ..."
Cited by 501 (4 self)

Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors

by John M. Mellor-Crummey, Michael L. Scott - ACM Transactions on Computer Systems, 1991
"... Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become markedly more pronounced as applications scale. We argue that this problem is not fundamental, and that one can in fact construct busy-wait synchronization algorithms that induce no memory or interconnect contention. The key to these algorithms is for every processor to spin on separate locally ..."
Cited by 573 (32 self)

Active Messages: a Mechanism for Integrated Communication and Computation

by Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, Klaus Erik Schauser, 1992
"... The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high com ..."
Cited by 1054 (75 self)

Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks

by Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly - In EuroSys, 2007
"... Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational “vertices” with communication “channels” to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of availa ..."
Cited by 762 (27 self)
simultaneously on multiple computers, or on multiple CPU cores within a computer. The application can discover the size and placement of data at run time, and modify the graph as the computation progresses to make efficient use of the available resources. Dryad is designed to scale from powerful multi-core sin

The SPLASH-2 programs: Characterization and methodological considerations

by Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, Anoop Gupta - International Symposium on Computer Architecture, 1995
"... The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental propertie ..."
Cited by 1420 (12 self)
scale with problem size and the number of processors. The other, related goal is methodological: to assist people who will use the programs in architectural evaluations to prune the space of application and machine parameters in an informed and meaningful way. For example, by characterizing the working

The SGI Origin: A ccNUMA highly scalable server

by James Laudon, Daniel Lenoski - In Proceedings of the 24th International Symposium on Computer Architecture (ISCA’97), 1997
"... The SGI Origin 2000 is a cache-coherent non-uniform memory access (ccNUMA) multiprocessor designed and manufactured by Silicon Graphics, Inc. The Origin system was designed from the ground up as a multiprocessor capable of scaling to both small and large processor counts without any bandwidth, laten ..."
Cited by 497 (0 self)

Scalable molecular dynamics with NAMD.

by James C Phillips, Rosemary Braun, Wei Wang, James Gumbart, Emad Tajkhorshid, Elizabeth Villa, Christophe Chipot, Robert D Skeel, Laxmikant Kalé, Klaus Schulten - J Comput Chem, 2005
"... NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD scales to hundreds of processors on high-end parallel platforms, as well as tens of processors on low-cost commodity clusters, and also runs on individual desktop and la ..."
Cited by 849 (63 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University