Results 1 - 10
of
47
A survey of general-purpose computation on graphics hardware
, 2007
"... The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, have made graphics hardware acompelling platform for computationally demanding tasks in awide variety of application domains. In this report, we describe, summarize, and analyze the l ..."
Abstract
-
Cited by 231 (11 self)
- Add to MetaCart
The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, have made graphics hardware acompelling platform for computationally demanding tasks in awide variety of application domains. In this report, we describe, summarize, and analyze the latest research in mapping general-purpose computation to graphics hardware. We begin with the technical motivations that underlie general-purpose computation on graphics processors (GPGPU) and describe the hardware and software developments that have led to the recent interest in this field. We then aim the main body of this report at two separate audiences. First, we describe the techniques used in mapping general-purpose computation to graphics hardware. We believe these techniques will be generally useful for researchers who plan to develop the next generation of GPGPU algorithms and techniques. Second, we survey and categorize the latest developments in general-purpose application development on graphics hardware.
Sparse matrix solvers on the GPU: conjugate gradients and multigrid
- ACM Trans. Graph
, 2003
"... Permission to make digital/hard copy of part of all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given ..."
Abstract
-
Cited by 174 (2 self)
- Add to MetaCart
Permission to make digital/hard copy of part of all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission
Brook for GPUs: Stream Computing on Graphics Hardware
- ACM TRANSACTIONS ON GRAPHICS
, 2004
"... In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming coprocessor. We present a compiler and runtime system that abstracts and virtua ..."
Abstract
-
Cited by 114 (7 self)
- Add to MetaCart
In this paper, we present Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming coprocessor. We present a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, we present an analysis of the effectiveness of the GPU as a compute engine compared to the CPU, to determine when the GPU can outperform the CPU for a particular algorithm. We evaluate our system with five applications, the SAXPY and SGEMV BLAS operators, image segmentation, FFT, and ray tracing. For these applications, we demonstrate that our Brook implementations perform comparably to hand-written GPU code and up to seven times faster than their CPU counterparts.
Photon Mapping on Programmable Graphics Hardware
- GRAPHICS HARDWARE
, 2003
"... We present a modified photon mapping algorithm capable of running entirely on GPUs. Our implementation uses breadth-first photon tracing to distribute photons using the GPU. The photons are stored in a grid-based photon map that is constructed directly on the graphics hardware using one of two met ..."
Abstract
-
Cited by 112 (4 self)
- Add to MetaCart
We present a modified photon mapping algorithm capable of running entirely on GPUs. Our implementation uses breadth-first photon tracing to distribute photons using the GPU. The photons are stored in a grid-based photon map that is constructed directly on the graphics hardware using one of two methods: the first method is a multipass technique that uses fragment programs to directly sort the photons into a compact grid. The second method uses a single rendering pass combining a vertex program and the stencil buffer to route photons to their respective grid cells, producing an approximate photon map. We also present an efficient method for locating the nearest photons in the grid, which makes it possible to compute an estimate of the radiance at any surface location in the scene. Finally, we describe a breadth-first stochastic ray tracer that uses the photon map to simulate full global illumination directly on the graphics hardware. Our implementation demonstrates that current graphics hardware is capable of fully simulating global illumination with progressive, interactive feedback to the user.
A Multigrid Solver for Boundary Value Problems Using Programmable Graphics Hardware
- In Graphics Hardware 2003
, 2003
"... We present a method for using programmable graphics hardware to solve a variety of boundary value problems. The time-evolution of such problems is frequently governed by partial differential equations, which are used to describe a wide range of dynamic phenomena including heat transfer and fluid mec ..."
Abstract
-
Cited by 73 (3 self)
- Add to MetaCart
We present a method for using programmable graphics hardware to solve a variety of boundary value problems. The time-evolution of such problems is frequently governed by partial differential equations, which are used to describe a wide range of dynamic phenomena including heat transfer and fluid mechanics. The need to solve these equations efficiently arises in many areas of computational science. Finite difference methods are commonly used for solving partial differential equations; we show that this approach can be mapped onto a modern graphics processor. We demonstrate an implementation of the multigrid method, a fast and popular approach to solving boundary value problems, on two modern graphics architectures. Our initial tests with available hardware show speedups of roughly 15x compared to traditional software implementation. This work presents a novel use of computer hardware and raises the intriguing possibility that we can make the inexpensive power of modern commodity graphics hardware accessible to and useful for the simulation commuuity.
Physically-Based Visual Simulation on Graphics Hardware
, 2002
"... In this paper, we present a method for real-time visual simulation of diverse dynamic phenomena using programmable graphics hardware. The simulations we implement use an extension of cellular automata known as the coupled map lattice (CML). CML represents the state of a dynamic system as continuous ..."
Abstract
-
Cited by 73 (5 self)
- Add to MetaCart
In this paper, we present a method for real-time visual simulation of diverse dynamic phenomena using programmable graphics hardware. The simulations we implement use an extension of cellular automata known as the coupled map lattice (CML). CML represents the state of a dynamic system as continuous values on a discrete lattice. In our implementation we store the lattice values in a texture, and use pixel-level programming to implement simple next-state computations on lattice nodes and their neighbors. We apply these computations successively to produce interactive visual simulations of convection, reaction-diffusion, and boiling. We have built an interactive framework for building and experimenting with CML simulations running on graphics hardware, and have integrated them into interactive 3D graphics applications.
RPU: A Programmable Ray Processing Unit for Realtime Ray Tracing
- ACM Trans. Graph
, 2005
"... with shadows and refractions), a Conference room (5.5 fps, without shadows), reflective and refractive Spheres-RT in an office (4.5 fps), and UT2003 a scene from a current computer game (7.5 fps, precomputed illumination). Recursive ray tracing is a simple yet powerful and general approach for accur ..."
Abstract
-
Cited by 55 (3 self)
- Add to MetaCart
with shadows and refractions), a Conference room (5.5 fps, without shadows), reflective and refractive Spheres-RT in an office (4.5 fps), and UT2003 a scene from a current computer game (7.5 fps, precomputed illumination). Recursive ray tracing is a simple yet powerful and general approach for accurately computing global light transport and rendering high quality images. While recent algorithmic improvements and optimized parallel software implementations have increased ray tracing performance to realtime levels, no compact and programmable hardware solution has been available yet. This paper describes the architecture and a prototype implementation of a single chip, fully programmable Ray Processing Unit (RPU). It combines the flexibility of general purpose CPUs with the efficiency of current GPUs for data parallel computations. This design allows for realtime ray tracing of dynamic scenes with programmable material, geometry, and illumination shaders. Although, running at only 66 MHz the prototype FPGA implementation already renders images at up to 20 frames per second, which in many cases beats the performance of highly optimized software running on multi-GHz desktop CPUs. The performance and efficiency of the proposed architecture is analyzed using a variety of benchmark scenes.
Radiosity on graphics hardware
- In Proceedings of the 2004 Conference on Graphics Interface
, 2004
"... Radiosity is a widely used technique for global illumination. Typically the computation is performed offline and the result is viewed interactively. We present a technique for computing radiosity, including an adaptive subdivision of the model, using graphics hardware. Since our goal is to run at in ..."
Abstract
-
Cited by 47 (5 self)
- Add to MetaCart
Radiosity is a widely used technique for global illumination. Typically the computation is performed offline and the result is viewed interactively. We present a technique for computing radiosity, including an adaptive subdivision of the model, using graphics hardware. Since our goal is to run at interactive rates, we exploit the computational power and programmability of modern graphics hardware. Using our system on current hardware, we have been able to compute and display a radiosity solution for a 10,000 element scene in less than one second. Key words: Graphics Hardware, Global Illumination. 1
Ray Tracing on Programmable Graphics Hardware
, 2002
"... Recently a breakthrough has occurred in graphics hardware: fixed function pipelines have been replaced with programmable vertex and fragment processors. In the near future, the graphics pipeline is likely to evolve into a general programmable stream processor capable of more than simply feed-forward ..."
Abstract
-
Cited by 44 (1 self)
- Add to MetaCart
Recently a breakthrough has occurred in graphics hardware: fixed function pipelines have been replaced with programmable vertex and fragment processors. In the near future, the graphics pipeline is likely to evolve into a general programmable stream processor capable of more than simply feed-forward triangle rendering.
Real-time kd-tree construction on graphics hardware
- ACM Transactions on Graphics
, 2008
"... We present an algorithm for constructing kd-trees on GPUs. This algorithm achieves real-time performance by exploiting the GPU’s streaming architecture at all stages of kd-tree construction. Unlike previous parallel kd-tree algorithms, our method builds tree nodes completely in BFS (breadth-first se ..."
Abstract
-
Cited by 26 (4 self)
- Add to MetaCart
We present an algorithm for constructing kd-trees on GPUs. This algorithm achieves real-time performance by exploiting the GPU’s streaming architecture at all stages of kd-tree construction. Unlike previous parallel kd-tree algorithms, our method builds tree nodes completely in BFS (breadth-first search) order. We also develop a special strategy for large nodes at upper tree levels so as to further exploit the fine-grained parallelism of GPUs. For these nodes, we parallelize the computation over all geometric primitives instead of nodes at each level. Finally, in order to maintain kd-tree quality, we introduce novel schemes for fast evaluation of node split costs. As far as we know, ours is the first real-time kd-tree algorithm on the GPU. The kd-trees built by our algorithm are of comparable quality as those constructed by off-line CPU algorithms. In terms of speed, our algorithm is significantly faster than well-optimized single-core CPU algorithms and competitive with multi-core CPU algorithms. Our algorithm provides a general way for handling dynamic scenes on the GPU. We demonstrate the potential of our algorithm in applications involving dynamic scenes, including GPU ray tracing, interactive photon mapping, and point cloud modeling.

