Results 1 - 7 of 7
The Uniform Memory Hierarchy Model of Computation
- Algorithmica, 1992
Cited by 108 (9 self)
The Uniform Memory Hierarchy (UMH) model introduced in this paper captures performance-relevant aspects of the hierarchical nature of computer memory. It is used to quantify architectural requirements of several algorithms and to ratify the faster speeds achieved by tuned implementations that use improved data-movement strategies. A sequential computer's memory is modelled as a sequence $\langle M_0, M_1, \ldots \rangle$ of increasingly large memory modules. Computation takes place in $M_0$. Thus, $M_0$ might model a computer's central processor, while $M_1$ might be cache memory, $M_2$ main memory, and so on. For each module $M_U$, a bus $B_U$ connects it with the next larger module $M_{U+1}$. All buses may be active simultaneously. Data is transferred along a bus in fixed-sized blocks. The size of these blocks, the time required to transfer a block, and the number of blocks that fit in a module are larger for modules farther from the processor. The UMH model is parameterized by the rate at which the blocksizes i...
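The structure described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only: the names `Module`, `build_hierarchy`, `bus_time`, and `overlapped_lower_bound` are hypothetical, and the growth rule used to size the modules is an assumption made for the example, not the parameterization from the paper (which is cut off above).

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Module:
    level: int          # 0 is the module where computation happens
    block_size: int     # items per block on the bus to the next larger module
    num_blocks: int     # blocks that fit in this module
    transfer_time: int  # time units to move one block on that bus


def build_hierarchy(levels, growth=4):
    # Hypothetical growth rule (not taken from the paper): block size,
    # block count, and transfer time all increase with the level, which
    # is the only property the abstract states.
    return [
        Module(u, growth ** u, 2 * growth ** u, (u + 1) * growth ** u)
        for u in range(levels)
    ]


def bus_time(module, items):
    # Time for one bus to carry `items` items in whole blocks.
    return ceil(items / module.block_size) * module.transfer_time


def overlapped_lower_bound(hierarchy, items):
    # All buses may be active at once, so streaming `items` items across
    # every bus takes at least as long as the slowest single bus.
    return max(bus_time(m, items) for m in hierarchy)
```

With three levels and the default growth factor, `overlapped_lower_bound(build_hierarchy(3), 64)` is determined by the bus farthest from the processor, which moves the fewest blocks but takes the longest per block under the assumed rule.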
Modeling Parallel Computers as Memory Hierarchies
- In Proc. Programming Models for Massively Parallel Computers, 1993
Cited by 41 (6 self)
A parameterized generic model that captures the features of diverse computer architectures would facilitate the development of portable programs. Specific models appropriate to particular computers are obtained by specifying parameters of the generic model. A generic model should be simple, and for each machine that it is intended to represent, it should have a reasonably accurate specific model. The Parallel Memory Hierarchy (PMH) model of computation uses a single mechanism to model the costs of both interprocessor communication and memory hierarchy traffic. A computer is modeled as a tree of memory modules with processors at the leaves. All data movement takes the form of block transfers between children and their parents. This paper assesses the strengths and weaknesses of the PMH model as a generic model. 1 Introduction The raw computing power of multiprocessor computers is exploding. The challenge is to create software that can take advantage of this computing power. The diversit...
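The tree structure described in this abstract can be illustrated with a short Python sketch. It assumes a hypothetical `Node` class and a hypothetical helper `transfers_between`, neither of which comes from the paper; the sketch only counts child-to-parent and parent-to-child hops and ignores block sizes and transfer costs.

```python
class Node:
    """A memory module in the tree; leaves stand for processors."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def path_to_root(node):
    # Modules visited when moving a block upward from `node` to the root.
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def transfers_between(a, b):
    # Number of child<->parent block transfers needed to move one block
    # from module `a` to module `b` through their nearest common ancestor.
    depth_in_b = {id(n): i for i, n in enumerate(path_to_root(b))}
    for hops_up, n in enumerate(path_to_root(a)):
        if id(n) in depth_in_b:
            return hops_up + depth_in_b[id(n)]
    raise ValueError("modules are not in the same tree")


# Hypothetical machine: four processors, two shared caches, one main memory.
p0, p1, p2, p3 = (Node(f"p{i}") for i in range(4))
cache0 = Node("cache0", [p0, p1])
cache1 = Node("cache1", [p2, p3])
main = Node("main", [cache0, cache1])
```

In this example tree, `transfers_between(p0, p1)` is 2 because the two processors share a cache, while `transfers_between(p0, p2)` is 4 because the block must travel through main memory. This is one way to see how a single mechanism can cover both memory traffic and interprocessor communication.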
Large-Scale Sorting in Uniform Memory Hierarchies
- 1992
Cited by 24 (5 self)
We present several efficient algorithms for sorting on the uniform memory hierarchy (UMH), introduced by Alpern, Carter, and Feig, and its parallelization P-UMH. We give optimal and nearly-optimal algorithms for a wide range of bandwidth degradations, including a parsimonious algorithm for constant bandwidth. We also develop optimal sorting algorithms for all bandwidths for other versions of UMH and P-UMH, including natural restrictions we introduce called RUMH and P-RUMH, which more closely correspond to current programming languages.
Towards a Model for Portable Parallel Performance: Exposing the Memory Hierarchy
- 1992
Cited by 12 (2 self)
The challenge of building a program that attains high performance on a variety of parallel computers is formidable. Actually, attaining high performance on a variety of sequential computers is challenging. Indeed, it's hard enough to get high performance on a single sequential computer. Constructing a high-performance program requires detailed knowledge of the computer's architectural features --- its memory hierarchy in particular. This knowledge constitutes a detailed, albeit informal, model of computation against which the performance program is written. Similar characteristics must be considered in building a portable high-performance program but the appropriate details are elusive and often unavailable when the program is written. In order to support this type of programming, we call for a generic model. Such a model is parameterized by machine parameters. Judicious specification of these parameters results in a specific model that should capture the performance-relevant features...
Using Visualization To Understand The Behavior Of Computer Systems
- 2001
Cited by 8 (1 self)
As computer systems continue to grow rapidly in both complexity and scale, developers need tools to help them understand the behavior and performance of these systems. While information visualization is a promising technique, most existing computer systems visualizations have focused on very specific problems and data sources, limiting their applicability. This dissertation introduces Rivet, a general-purpose environment for the development of computer systems visualizations. Rivet can be used for both real-time and post-mortem analyses of data from a wide variety of sources. The modular architecture of Rivet enables sophisticated visualizations to be assembled using simple building blocks representing the data, the visual representations, and the mappings between them. The implementation of Rivet enables the rapid prototyping of visualizations through a scripting language interface while still providing high-performance graphics and data management. The effectiveness of Rivet as a tool for computer systems analysis is demonstrated through a collection of case studies. Visualizations created using Rivet have been used to display: (a) line-by-line execution data from the SUIF Explorer interactive parallelizing compiler, enabling programmers to maximize the parallel speedups of their applications; (b) detailed memory system utilization data from the FlashPoint memory profiler, providing insights on both sequential and parallel program bottlenecks; (c) the behavior of applications running on superscalar processors, allowing developers to take full advantage of these complex CPUs; and (d) the real-time performance of computer systems and clusters, drawing attention to interesting or anomalous behavior. In addition to these focused examples, Rivet has been also used in co...
A Case-study in Performance Programming: Seismic Migration
- 1991
Cited by 3 (2 self)
This paper discusses methods which can be used to achieve improved computing performance from the available hardware. We illustrate general principles with a case study of the development and tuning of a seismic migration program on an IBM RISC System/6000 computer. We improved the uniprocessor performance from 8 to 26 Mflops on a 20 MHertz machine and restructured the code to run on a network of workstations. The principles can be applied to many scientific codes and are relevant to attaining high performance on a wide class of machines, including parallel and distributed systems. 1 INTRODUCTION Progress in numerically intensive computation (NIC) has relied largely on the ability of application experts to devise better algorithms and on the ability of engineers to build more powerful computing machinery and to connect more machines together in more powerful configurations. Unfortunately, there have not been comparable advances in the art and science of efficiently using these machine...
Minimizing the Input/Output Bottleneck
- 1992
... this paper, we assume that all graphs are undirected, an assumption that may not hold for certain applications such as hypertext and object-oriented databases. One important assumption of our model is that data may be multiply represented in blocks. This is a stronger assumption than that used, for example, by external ...

