Results 1 - 10 of 316

Tempest and Typhoon: User-level Shared Memory

by Steven K. Reinhardt, James R. Larus, David A. Wood - In Proceedings of the 21st Annual International Symposium on Computer Architecture, 1994
"... Future parallel computers must efficiently execute not only hand-coded applications but also programs written in high-level, parallel programming languages. Today’s machines limit these programs to a single communication paradigm, either message-passing or shared-memory, which results in uneven perf ..."
Cited by 309 (27 self)
performance. This paper addresses this problem by defining an interface, Tempest, that exposes low-level communication and memory-system mechanisms so programmers and compilers can customize policies for a given application. Typhoon is a proposed hardware platform that implements these mechanisms with a fully

McRT-STM: a High Performance Software Transactional Memory System for a Multi-Core Runtime

by Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Chi Cao Minh, Benjamin Hertzberg - In Proc. of the 11th ACM Symp. on Principles and Practice of Parallel Programming, 2006
"... Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed this concurrency using locks (mutex based synchronization). Unfortunately, lock based synchronization often leads to deadl ..."
Cited by 241 (14 self)
McRT-STM exports interfaces that can be used from C/C++ programs directly or as a target for compilers translating higher level linguistic constructs. We present a detailed performance analysis of various STM design tradeoffs such as pessimistic versus optimistic concurrency, undo logging versus write buffering

Compiler-directed Data Prefetching in Multiprocessors with Memory Hierarchies

by Edward H. Gornish, Elana D. Granston, Alexander V. Veidenbaum - In International Conference on Supercomputing, 1990
"... Memory hierarchies are used by multiprocessor systems to reduce large memory access times. It is necessary to automatically manage such a hierarchy, to obtain effective memory utilization. In this paper, we discuss the various issues involved in obtaining an optimal memory management strategy for a ..."
Cited by 92 (7 self)
memory hierarchy. We present an algorithm for finding the earliest point in a program that a block of data can be prefetched. This determination is based on the control and data dependences in the program. Such a method is an integral part of more general memory management algorithms. We demonstrate our

Compiler-directed page coloring for multiprocessors

by Edouard Bugnion, Jennifer M. Anderson, Todd C. Mowry, Mendel Rosenblum, Monica S. Lam - In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), 1996
"... This paper presents a new technique, compiler-directed page coloring, that eliminates conflict misses in multiprocessor applications. It enables applications to make better use of the increased aggregate cache size available in a multiprocessor. This technique uses the compiler’s knowledge of the ac ..."
Cited by 66 (8 self)
of numeric programs. We used the SimOS machine simulator to analyze the applications and isolate their performance bottlenecks. We also validated these results on a real machine, an eight-processor 350MHz Digital AlphaServer. Compiler-directed page coloring leads to significant performance improvements

DyC: An Expressive Annotation-Directed Dynamic Compiler for C

by Brian Grant, Markus Mock, Matthai Philipose, Craig Chambers, Susan J. Eggers
"... We present the design of DyC, a dynamic-compilation system for C based on run-time specialization. Directed by a few declarative user annotations that specify the variables and code on which dynamic compilation should take place, a binding-time analysis computes the set of run-time constants at each ..."
Cited by 110 (4 self)

Memory-Hierarchy Management

by Steve Carr, 1994
"... The trend in high-performance microprocessor design is toward increasing computational power on the chip. Microprocessors can now process dramatically more data per machine cycle than previous models. Unfortunately, memory speeds have not kept pace. The result is an imbalance between computation spe ..."
Cited by 56 (14 self)
is a step in the wrong direction. Compilers, not programmers, should handle machine-specific implementation details. To this end, this thesis develops and experiments with compiler algorithms that manage the memory hierarchy of a machine for floating-point intensive numerical codes. Specifically, we

The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor

by Anant Agarwal, David Chaiken, David Kranz, John Kubiatowicz, Kiyoshi Kurihara, Gino Maa, Dan Nussbaum, Mike Parkin, Donald Yeung - In Proceedings of Workshop on Scalable Shared Memory Multiprocessors, 1991
"... The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimensional direct interconnection network to provide scalable communication bandwidth, while allowing the exploitation of locality. Despite its distributed-memory arch ..."
Cited by 148 (25 self)
architecture, Alewife allows efficient shared-memory programming through a multilayered approach to locality management. A new scalable cache-coherence scheme called LimitLESS directories allows the use of caches for reducing communication latency and network bandwidth requirements. Alewife also employs run

Data-centric Multi-level Blocking

by Induprakas Kodukula, Nawaaz Ahmed, Keshav Pingali, 1997
"... We present a simple and novel framework for generating blocked codes for high-performance machines with a memory hierarchy. Unlike traditional compiler techniques like tiling, which are based on reasoning about the control flow of programs, our techniques are based on reasoning directly about the fl ..."
Cited by 155 (10 self)

Compiler-Directed Scratchpad Memory Management via Graph Coloring

by Lian Li, Hui Feng, Jingling Xue
"... Scratchpad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. This paper introduces a general-purpose compiler approach, called memory coloring, to assign static data aggregates such as arrays and structs in a program to an SPM. The novelty of this approach li ..."
Cited by 5 (1 self)

Iterative Compilation and Performance Prediction for Numerical Applications

by Grigori G. Fursin, 2004
"... As the current rate of improvement in processor performance far exceeds the rate of memory performance, memory latency is the dominant overhead in many performance critical applications. In many cases, automatic compiler-based approaches to improving memory performance are limited and programmers fr ..."
Cited by 17 (10 self)
and there are no simple criteria to stop optimisations i.e. when optimal memory performance has been achieved or sufficiently approached. This thesis presents a platform independent optimisation approach for numerical applications based on iterative feedback-directed program restructuring using a new reasonably fast
