Tempest and Typhoon: User-Level Shared Memory
 In Proceedings of the 21st Annual International Symposium on Computer Architecture
, 1994
Cited by 291 (25 self)
Future parallel computers must efficiently execute not only hand-coded applications but also programs written in high-level, parallel programming languages. Today’s machines limit these programs to a single communication paradigm, either message-passing or shared-memory, which results in uneven performance. This paper addresses this problem by defining an interface, Tempest, that exposes low-level communication and memory-system mechanisms so programmers and compilers can customize policies for a given application. Typhoon is a proposed hardware platform that implements these mechanisms with a fully-programmable, user-level processor in the network interface. We demonstrate the utility of Tempest with two examples. First, the Stache protocol uses Tempest’s fine-grain access control mechanisms to manage part of a processor’s local memory as a large, fully-associative cache for remote data. We simulated Typhoon on the Wisconsin Wind Tunnel and found that Stache running on Typhoon performs comparably (±30%) to an all-hardware Dir N NB cache-coherence protocol for five shared-memory programs. Second, we illustrate how programmers or compilers can use Tempest’s flexibility to exploit an application’s sharing patterns with a custom protocol. For the EM3D application, the custom protocol improves performance up to 35% over the all-hardware protocol.
The Landscape of Parallel Computing Research: A View from Berkeley
 TECHNICAL REPORT, UC BERKELEY
, 2006
Point Set Surfaces
, 2001
Cited by 240 (34 self)
We advocate the use of point sets to represent shapes. We provide a definition of a smooth manifold surface from a set of points close to the original surface. The definition is based on local maps from differential geometry, which are approximated by the method of moving least squares (MLS). We present tools to increase or decrease the density of the points, thus allowing an adjustment of the spacing among the points to control the fidelity of the representation. To display the point set surface, we introduce a novel point rendering technique. The idea is to evaluate the local maps according to the image resolution. This results in high-quality shading effects and smooth silhouettes at interactive frame rates.
Prefuse: A toolkit for interactive information visualization
 In ACM Human Factors in Computing Systems (CHI)
, 2005
Cited by 212 (4 self)
In this demonstration we present prefuse, an extensible user interface toolkit for building interactive information visualization applications, including node-link diagrams, containment diagrams, and visualizations of unstructured (edge-free) data such as scatter plots and timelines. prefuse data into visual forms and then manipulating visual data in aggregate, including layout, animation, and distortion routines. The result is a platform for creating scalable, highly-interactive visualizations of large data sets in a modular and principled fashion. We have used prefuse to implement both novel and existing visualizations, validating the toolkit’s power and expressiveness.
Computing and Rendering Point Set Surfaces
, 2002
Cited by 167 (20 self)
We advocate the use of point sets to represent shapes. We provide a definition of a smooth manifold surface from a set of points close to the original surface. The definition is based on local maps from differential geometry, which are approximated by the method of moving least squares (MLS). The computation of points on the surface is local, which results in an out-of-core technique that can handle any point set.
Spectral Partitioning Works: Planar graphs and finite element meshes
 In IEEE Symposium on Foundations of Computer Science
, 1996
Cited by 144 (8 self)
Spectral partitioning methods use the Fiedler vector, the eigenvector of the second-smallest eigenvalue of the Laplacian matrix, to find a small separator of a graph. These methods are important components of many scientific numerical algorithms and have been demonstrated by experiment to work extremely well. In this paper, we show that spectral partitioning methods work well on bounded-degree planar graphs and finite element meshes, the classes of graphs to which they are usually applied. While naive spectral bisection does not necessarily work, we prove that spectral partitioning techniques can be used to produce separators whose ratio of vertices removed to edges cut is $O(\sqrt{n})$ for bounded-degree planar graphs and two-dimensional meshes and $O(n^{1/d})$ for well-shaped d-dimensional meshes. The heart of our analysis is an upper bound on the second-smallest eigenvalues of the Laplacian matrices of these graphs. 1. Introduction Spectral partitioning has become one of the mos...
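To make the Fiedler-vector idea in the abstract above concrete, here is a minimal sketch of naive spectral bisection in plain Python. It is not the technique analyzed in the paper (the abstract itself notes that naive bisection does not always work). The function names, the adjacency-list input, the power-iteration solver, and the median split are all assumptions chosen for illustration, and the graph is assumed to be connected, undirected, and simple.

```python
def fiedler_vector(adj, iters=300):
    """Approximate the Fiedler vector of a connected graph given as adjacency lists."""
    n = len(adj)
    # Every Laplacian eigenvalue is at most 2 * (max degree), so c*I - L is
    # positive semidefinite and reverses the order of the eigenvalues.
    c = 2.0 * max(len(nbrs) for nbrs in adj)
    x = [float(i) for i in range(n)]  # arbitrary non-constant starting vector
    for _ in range(iters):
        # Project out the constant vector (the eigenvector for eigenvalue 0).
        mean = sum(x) / n
        x = [v - mean for v in x]
        # y = (c*I - L) x, where (L x)_i = deg(i) * x_i - sum of x over neighbours.
        y = [c * x[i] - (len(adj[i]) * x[i] - sum(x[j] for j in adj[i]))
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def spectral_bisection(adj):
    """Split the vertices into two halves at the median Fiedler-vector entry."""
    f = fiedler_vector(adj)
    order = sorted(range(len(adj)), key=lambda i: f[i])
    half = len(adj) // 2
    return set(order[:half]), set(order[half:])


# Hypothetical test graph: two triangles joined by the single edge 2-3.
barbell = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
part_a, part_b = spectral_bisection(barbell)
```

On this small graph the split should separate the two triangles, cutting only the bridging edge. Power iteration is used here only to keep the sketch free of dependencies; a sparse eigensolver would be the usual choice for graphs of realistic size.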
Efficient Support for Irregular Applications on Distributed-Memory Machines
, 1995
Cited by 87 (13 self)
Irregular computation problems underlie many important scientific applications. Although these problems are computationally expensive, and so would seem appropriate for parallel machines, their irregular and unpredictable runtime behavior makes this type of parallel program difficult to write and adversely affects runtime performance. This paper explores three issues (partitioning, mutual exclusion, and data transfer) crucial to the efficient execution of irregular problems on distributed-memory machines. Unlike previous work, we studied the same programs running in three alternative systems on the same hardware base (a Thinking Machines CM-5): the CHAOS irregular application library, Transparent Shared Memory (TSM), and eXtensible Shared Memory (XSM). CHAOS and XSM performed equivalently for all three applications. Both systems were somewhat (13%) to significantly faster (991%) than TSM.
Special Purpose Parallel Computing
 Lectures on Parallel Computation
, 1993
Cited by 77 (5 self)
A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given. 1. Introduction Turing [365] demonstrated that, in principle, a single general purpose sequential machine could be designed which would be capable of efficiently performing any computation which could be performed by a special purpose sequential machine. The importance of this universality result for subsequent practical developments in computing cannot be overstated. It showed that, for a given computational problem, the additional efficiency advantages which could be gained by designing a special purpose sequential machine for that problem would not be great. Around 1944, von Neumann produced a proposal [66, 389] for a general purpose stored-program sequential computer which captured the fundamental principles of...
Load Balancing and Data Locality in Adaptive Hierarchical N-body Methods: Barnes-Hut, Fast Multipole, and Radiosity
 Journal of Parallel and Distributed Computing
, 1995
Cited by 62 (2 self)
processes, are increasingly being used to solve large-scale problems in a variety of scientific/engineering domains. Applications that use these methods are challenging to parallelize effectively, however, owing to their nonuniform, dynamically changing characteristics and their need for long-range communication.
The Cilk System for Parallel Multithreaded Computing
, 1996
Cited by 42 (2 self)
Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel programming. In particular, it is still difficult to build efficient implementations of parallel applications whose communication patterns are either highly irregular or dependent upon dynamic information. Multithreading has become an increasingly popular way to implement these dynamic, asynchronous, concurrent programs. Cilk (pronounced "silk") is our C-based multithreaded computing system that provides provably good performance guarantees. This thesis describes the evolution of the Cilk language and runtime system, and describes applications which affected the evolution of the system.