H.: A sorting classification of parallel rendering
- IEEE Computer Graphics and Applications, 1994
Cited by 193 (2 self)
communication traffic between sort-first processors rendering NCGA "head" Picture-Level benchmark [1]. Arrow color indicates the number of primitives transferred between processors between these two successive frames. Range is 0 (black) to 800 (white) using a heated-object spectrum. We describe three broad classes of parallel rendering methods, based on where the sort from object-space to screen space occurs. These classes encompass most feedforward parallel software and hardware rendering architectures that have been described to date. After introducing the classes, we perform a coarse analysis of the aggregate processing and communication costs of each and identify constraints they impose on the rendering application. The aim is to provide a conceptual model of the tradeoffs between the approaches as an aid to designers and implementers of high-performance, parallel rendering systems.
Cg: A system for programming graphics hardware in a c-like language
- ACM Transactions on Graphics, 2003
A user-programmable vertex engine
- In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH 2001)
Cited by 157 (1 self)
In this paper we describe the design, programming interface, and implementation of a very efficient user-programmable vertex engine. The vertex engine of NVIDIA’s GeForce3 GPU evolved from a highly tuned fixed-function pipeline requiring considerable knowledge to program. Programs operate only on a stream of independent vertices traversing the pipe. Embedded in the broader fixed function pipeline, our approach preserves parallelism sacrificed by previous approaches. The programmer is presented with a straightforward programming model, which is supported by transparent multi-threading and bypassing to preserve parallelism and performance. In the remainder of the paper we discuss the motivation behind our design and contrast it with previous work. We present the programming model, the instruction set selection process, and
Brook for GPUs: Stream Computing on Graphics Hardware
- ACM Transactions on Graphics, 2004
Cited by 114 (7 self)
In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming coprocessor. We present a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, we present an analysis of the effectiveness of the GPU as a compute engine compared to the CPU, to determine when the GPU can outperform the CPU for a particular algorithm. We evaluate our system with five applications, the SAXPY and SGEMV BLAS operators, image segmentation, FFT, and ray tracing. For these applications, we demonstrate that our Brook implementations perform comparably to hand-written GPU code and up to seven times faster than their CPU counterparts.
WireGL: A Scalable Graphics System for Clusters
- Computer Graphics (Proceedings of SIGGRAPH 01), 2001
Cited by 93 (2 self)
We describe WireGL, a system for scalable interactive rendering on a cluster of workstations. WireGL provides the familiar OpenGL API to each node in a cluster, virtualizing multiple graphics accelerators into a sort-first parallel renderer with a parallel interface. We also describe techniques for reassembling an output image from a set of tiles distributed over a cluster. Using flexible display management, WireGL can drive a variety of output devices, from standalone displays to tiled display walls. By combining the power of virtual graphics, the familiarity and ordered semantics of OpenGL, and the scalability of clusters, we are able to create time-varying visualizations that sustain rendering performance over 70,000,000 triangles per second at interactive refresh rates using 16 compute nodes and 16 rendering nodes.
A Real-Time Procedural Shading System for Programmable Graphics Hardware
- 2001
Cited by 75 (8 self)
Real-time graphics hardware is becoming programmable, but this programmable hardware is complex and difficult to use given current APIs. Higher-level abstractions would both increase programmer productivity and make programs more portable. However, it is challenging to raise the abstraction level while still providing high performance. We have developed a real-time procedural shading language system designed to achieve this goal. Our system is organized around multiple computation frequencies. For example, computations may be associated with vertices or with fragments/pixels. Our system’s shading language provides a unified interface that allows a single procedure to include operations from more than one computation frequency. Internally, our system virtualizes limited hardware resources to allow for arbitrarily-complex computations. We map operations to graphics hardware if possible, or to the host CPU as a last resort. This mapping is performed by compiler back-end modules associated with each computation frequency. Our system can map vertex operations to either programmable vertex hardware or to the host CPU, and can map fragment operations to either programmable fragment hardware or to multipass OpenGL. By carefully designing all the components of the system, we are able to generate highly-optimized code. We demonstrate our system running in real-time on a variety of hardware.
A Shading Language on Graphics Hardware: The PixelFlow Shading System
- 1998
Cited by 73 (10 self)
Over the years, there have been two main branches of computer graphics image-synthesis research; one focused on interactivity, the other on image quality. Procedural shading is a powerful tool, commonly used for creating high-quality images and production animation. A key aspect of most procedural shading is the use of a shading language, which allows a high-level description of the color and shading of each surface. However, shading languages have been beyond the capabilities of the interactive graphics hardware community. We have created a parallel graphics multicomputer, PixelFlow, that can render images at 30 frames per second using a shading language. This is the first system to be able to support a shading language in real-time. In this paper, we describe some of the techniques that make this possible. CR Categories and Subject Descriptors: D.3.2 [Language Classifications] Specialized Application Languages; I.3.1 [Computer Graphics] Hardware Architecture; I.3.3 [Computer Graphic...
B.: A predictor-corrector technique for visualizing unsteady flow
- IEEE Transactions on Visualization and Computer Graphics, 1995
Cited by 57 (0 self)
We present a method for visualizing unsteady flow by displaying its vortices. The vortices are identified by using a vorticity-predictor pressure-corrector scheme that follows vortex cores. The cross-sections of a vortex at each point along the core can be represented by a Fourier series. A vortex can be faithfully reconstructed from the series as a simple quadrilateral mesh, or its reconstruction can be enhanced to indicate helical motion. The mesh can reduce the representation of the flow features by a factor of one thousand or more compared with the volumetric dataset. With this amount of reduction it is possible to implement an interactive system on a graphics workstation to permit a viewer to examine, in three dimensions, the evolution of the vortical structures in a complex, unsteady flow.
Pipeline Rendering: Interaction And Realism Through Hardware-Based Multi-Pass Rendering
- 1996
Cited by 56 (1 self)
While large investments are made in sophisticated graphics hardware, most realistic rendering is still performed off-line using ray trace or radiosity systems. A coordinated use of hardware-provided bitplanes and rendering pipelines can, however, approximate ray trace quality illumination effects in a user-interactive environment, as well as provide the tools necessary for a user to declutter such a complex scene. A variety of common ray trace and radiosity illumination effects are presented using multi-pass rendering in a pipeline architecture. We provide recursive reflections through the use of secondary viewpoints, and present a method for using a homogeneous 2-D projective image mapping to extend this method for refractive transparent surfaces. This paper then introduces the Dual Z-buffer, or DZ-buffer, an evolutionary hardware extension which, along with current framebuffer functions such as stencil planes and accumulation buffers, provides the hardware platform to render non-refr...
The Sort-First Rendering Architecture for High-Performance Graphics
- In Proceedings of the 1995 Symposium on Interactive 3D Graphics, 1995
Cited by 47 (0 self)
Interactive graphics applications have long been challenging graphics system designers by demanding machines that can provide ever increasing polygon rendering performance. Another trend in interactive graphics is the growing use of display devices with pixel counts well beyond what is usually considered "high resolution." If we examine the architectural space of high-performance rendering systems, we discover only one architectural class that promises to deliver high polygon performance with very-high-resolution displays and do so in an efficient manner. It is known as "sort-first." We investigate the sort-first architecture, starting with a comparison to its architectural class mates (sort-middle and sort-last). We find that sort-first has an inherent ability to take advantage of the frame-to-frame coherence found in interactive applications. We examine this ability through simulation with a set of test applications and show how it reduces sort-first's communication needs and theref...

