A survey of general-purpose computation on graphics hardware
, 2007
Cited by 231 (11 self)
The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, has made graphics hardware a compelling platform for computationally demanding tasks in a wide variety of application domains. In this report, we describe, summarize, and analyze the latest research in mapping general-purpose computation to graphics hardware. We begin with the technical motivations that underlie general-purpose computation on graphics processors (GPGPU) and describe the hardware and software developments that have led to the recent interest in this field. We then aim the main body of this report at two separate audiences. First, we describe the techniques used in mapping general-purpose computation to graphics hardware. We believe these techniques will be generally useful for researchers who plan to develop the next generation of GPGPU algorithms and techniques. Second, we survey and categorize the latest developments in general-purpose application development on graphics hardware.
Linear Algebra Operators for GPU Implementation of Numerical Algorithms
- ACM Transactions on Graphics
, 2003
Cited by 195 (9 self)
In this work, the emphasis is on the development of strategies to realize techniques of numerical computing on the graphics chip. In particular, the focus is on the acceleration of techniques for solving sets of algebraic equations as they occur in numerical simulation. We introduce a framework for the implementation of linear algebra operators on programmable graphics processors (GPUs), thus providing the building blocks for the design of more complex numerical algorithms. In particular, we propose a stream model for arithmetic operations on vectors and matrices that exploits the intrinsic parallelism and efficient communication on modern GPUs. Besides performance gains due to improved numerical computations, graphics algorithms benefit from this model in that the transfer of computation results to the graphics processor for display is avoided. We demonstrate the effectiveness of our approach by implementing direct solvers for sparse matrices, and by applying these solvers to multi-dimensional finite difference equations, i.e. the 2D wave equation and the incompressible Navier-Stokes equations.
Real-Time Consensus-Based Scene Reconstruction using Commodity Graphics Hardware
, 2002
Cited by 44 (3 self)
that effectively combines a plane-sweeping algorithm with view synthesis for real-time, on-line 3D scene acquisition and view synthesis. Using real-time imagery from a few calibrated cameras, our method can generate new images from nearby viewpoints, estimate a dense depth map from the current viewpoint, or create a textured triangular mesh. We can do each of these without any prior geometric information or requiring any user interaction, in real time and on line. The heart of our method is to use programmable Pixel Shader technology to square intensity differences between reference image pixels, and then to choose final colors (or depths) that correspond to the minimum difference, i.e. the most consistent color.
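The selection step described in this abstract (square the intensity differences, then keep the candidate with the smallest total) can be sketched on the CPU for a single pixel. This is only an illustration under stated assumptions: the function name `consensus_pick`, the dictionary layout, and the choice to measure differences against the mean of the samples are hypothetical and are not taken from the paper, which runs this step in pixel shaders.

```python
def consensus_pick(samples_per_depth):
    """Return (depth, mean_color, ssd) for the most consistent depth plane.

    samples_per_depth maps a candidate depth to the intensities that the
    reference images project onto one output pixel at that depth
    (hypothetical layout, chosen for illustration).
    """
    best = None
    for depth, samples in samples_per_depth.items():
        mean = sum(samples) / len(samples)
        # Sum of squared differences from the mean: small when the
        # reference images agree on the color at this depth.
        ssd = sum((s - mean) ** 2 for s in samples)
        if best is None or ssd < best[2]:
            best = (depth, mean, ssd)
    return best


# Three candidate depth planes, three reference cameras each.
planes = {
    1.0: [10.0, 50.0, 90.0],
    2.0: [40.0, 42.0, 41.0],
    3.0: [0.0, 100.0, 20.0],
}
depth, color, ssd = consensus_pick(planes)
```

In this toy input the cameras agree most closely at depth 2.0, so that plane supplies both the depth estimate and the output color.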
Implementing Lattice Boltzmann Computation on Graphics Hardware
, 2003
Cited by 39 (8 self)
LBM is a physically-based approach that simulates the microscopic movement of fluid particles by simple, identical and local rules. We accelerate the computation of the LBM on general-purpose graphics hardware, by grouping particle packets into 2D textures and mapping the Boltzmann equations completely to the rasterization and frame buffer operations. We apply stitching and packing to further improve the performance. In addition, we propose techniques, namely range scaling and range separation, that systematically transform variables into the range required by graphics hardware and thus prevent overflow. These approaches can be extended to a compiler that automatically translates general calculations to operations on graphics hardware.
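The idea of transforming variables into the range required by the hardware can be illustrated with a simple affine mapping. The sketch below assumes, purely for illustration, that the hardware stores values in [0, 1] with 8 bits of precision; the function names and the interval bounds are hypothetical, and the paper's own range scaling and range separation schemes may differ in detail.

```python
def scale_to_unit(x, lo, hi):
    """Map x from the known physical range [lo, hi] to [0, 1]."""
    return (x - lo) / (hi - lo)


def unscale(u, lo, hi):
    """Inverse of scale_to_unit."""
    return lo + u * (hi - lo)


def quantize8(u):
    """Mimic an assumed 8-bit fixed-point channel: clamp, then round."""
    u = min(1.0, max(0.0, u))
    return round(u * 255) / 255


def roundtrip(x, lo, hi):
    """Store x in the assumed 8-bit channel and read it back."""
    return unscale(quantize8(scale_to_unit(x, lo, hi)), lo, hi)


stored = roundtrip(-0.3, -1.0, 1.0)   # close to -0.3, within one quantization step
clamped = roundtrip(3.0, -1.0, 1.0)   # out-of-range input saturates at hi
```

Choosing `lo` and `hi` so that every intermediate value stays inside the interval is what avoids the saturation seen in the second call.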
Nonlinear Diffusion in Graphics Hardware
- In Proceedings of EG/IEEE TCVG Symposium on Visualization VisSym ’01
, 2001
Cited by 30 (3 self)
Multiscale methods have proved to be successful tools in image denoising, edge enhancement and shape recovery. They are based on the numerical solution of a nonlinear diffusion problem in which a noisy or damaged image that has to be smoothed or restored is taken as initial data. Here a novel approach is presented which will soon be capable of ensuring real-time performance of these methods. It is based on an implementation of a corresponding finite element scheme in the texture hardware of modern graphics engines. The method regards vectors as textures and represents linear algebra operations as texture processing operations. Thus, the resulting performance can profit from the superior bandwidth and the built-in parallelism of the graphics hardware. Here the concept of this approach is introduced and perspectives are outlined, taking the basic Perona-Malik model on 2D images as an example.
Using Graphics Cards for Quantized FEM Computations
- in IASTED Visualization, Imaging and Image Processing Conference
, 2001
Cited by 24 (4 self)
Graphics cards offer ever more computing power and are highly optimized for high data transfer volumes. In contrast, typical workstations perform badly when data exceeds their processor caches. The performance of scientific computations is very often ruined by this deficiency. Here we present a novel approach that shifts the computational load from the CPU to the graphics card. We represent data as images and operations on vectors as graphics operations on images. Broad access to graphics memory and parallel processing of image operands thus turn the graphics card into an ultrafast vector coprocessor. The presented strategy opens up a wide area of numerical applications for hardware acceleration. The implementations of Finite Element solvers for the linear heat equation and the anisotropic diffusion method in image processing underline its practicability. We explain the vector processor usage of graphics cards in detail. An extensive correspondence between vector and graphics operations is given, and the decomposition of complex operations into hardware-supported ones is explained. We also sketch the realization of arbitrary number formats in graphics hardware and the consequences of the restricted precision. Finally, we propose slight modifications and extensions which would further improve computational benefits and extend the range of applicability of the proposed approach. Computing in image processing at ms for a Jacobi iteration on images is presented as an example of an ideal field, where Finite Element methods can be greatly accelerated and ultimate number precision is not required.
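To make the Jacobi iteration on images concrete, here is a minimal CPU sketch of one implicit time step of the linear heat equation. It is written under loudly stated assumptions: it uses a 5-point finite difference stencil rather than the paper's Finite Element scheme, nested Python lists rather than textures, and the names `jacobi_sweep` and `heat_step`, the time step `tau` and the sweep count are all hypothetical.

```python
def jacobi_sweep(u, b, tau):
    """One Jacobi sweep for (I - tau * Laplacian) u = b on a 2D grid.

    Every output pixel depends only on the previous iterate, which is
    why each sweep can be evaluated for all pixels in parallel.
    """
    h, w = len(u), len(u[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # In-bounds 4-neighbours (no-flux boundary: missing ones are skipped).
            nbrs = [u[y][x]
                    for y, x in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= y < h and 0 <= x < w]
            out[i][j] = (b[i][j] + tau * sum(nbrs)) / (1.0 + tau * len(nbrs))
    return out


def heat_step(image, tau=0.5, sweeps=20):
    """Approximate one implicit heat-equation step with a fixed number of sweeps."""
    u = [row[:] for row in image]
    for _ in range(sweeps):
        u = jacobi_sweep(u, image, tau)
    return u


# A single bright pixel is spread out over its neighbourhood.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
smoothed = heat_step(spike)
```

Each sweep is a weighted sum of shifted copies of the image, which is the kind of operation the abstract describes as mapping onto graphics operations on images.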
Accelerating Morphological Analysis with Graphics Hardware
- In Workshop on Vision, Modelling, and Visualization VMV ’00
, 2000
Cited by 19 (3 self)
Direct volume rendering is a common means of visualizing three-dimensional data nowadays. It is, however, still a very time-consuming process to create informative and visually appealing images. Semi-automatic volume analysis procedures such as morphological operators are a promising approach to improve the overall visualization cycle. However, these operators need quite some time for computation, reducing their usefulness for interactive visualization. Modern graphics hardware, on the other hand, has all the necessary functions for hardware-based morphological filtering. As the problem is mainly memory bandwidth bound, a solution based on graphics hardware can significantly reduce computation time in the filtering step, as the graphics hardware typically has much broader and faster memory paths. When using graphics hardware for mathematical computations, accuracy is usually quite a problem. However, morphological operators map so well onto the graphics pipeline that no loss of accuracy ...
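For readers unfamiliar with morphological operators, the two basic ones can be sketched as neighbourhood maximum and minimum. The example below is a CPU illustration only: it assumes a 2D image and a 3x3 square structuring element, and the function names are hypothetical; the paper itself works on volumes in graphics hardware.

```python
def morph(image, pick):
    """Apply pick (min or max) over each pixel's 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    return [[pick(image[y][x]
                  for y in range(max(0, i - 1), min(h, i + 2))
                  for x in range(max(0, j - 1), min(w, j + 2)))
             for j in range(w)]
            for i in range(h)]


def dilate(image):
    """Grey-scale dilation: neighbourhood maximum."""
    return morph(image, max)


def erode(image):
    """Grey-scale erosion: neighbourhood minimum."""
    return morph(image, min)


img = [[0, 0, 0],
       [0, 7, 0],
       [0, 0, 0]]
grown = dilate(img)           # the bright pixel spreads to its neighbours
opened = dilate(erode(img))   # opening removes the isolated bright pixel
```

Because minimum and maximum only select values that already exist in the input, these operators introduce no rounding of their own.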
Hardware Accelerated Wavelet Transformations
- In Proceedings of EG/IEEE TCVG Symposium on Visualization VisSym ’00
, 2000
Cited by 19 (3 self)
Wavelets and related multiscale representations are important means for edge detection and processing as well as for segmentation and registration. Due to the computational complexity of these approaches, no interactive visualization of the extraction process is possible nowadays. By using the hardware of modern graphics workstations to accelerate wavelet decomposition and reconstruction, we realize a first important step towards removing lags in the visualization cycle.
1 Introduction
Feature extraction has proven to be useful for segmentation and registration in volume visualization [7, 13]. Many edge detection algorithms used in this step employ wavelets or related basis functions for the internal representation of the volume. Additionally, wavelets can be used for fast volume visualization [5] using the Fourier rendering approach [8, 12]. Wavelet analysis is a mainly memory bound problem. Graphics hardware on the other hand regularly has memory systems that can be ad...
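As a reminder of what wavelet decomposition and reconstruction involve, here is one level of the simplest case on a 1D signal. The choice of the Haar wavelet in its unnormalized averaging form, the function names, and the even-length input are assumptions made for illustration; the entry above does not say which wavelets the paper uses.

```python
def haar_decompose(signal):
    """One Haar level: pairwise averages (approximation) and half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))  # assumes even length
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose: each (average, half-difference) pair yields two samples."""
    out = []
    for s, d in zip(approx, detail):
        out.extend((s + d, s - d))
    return out


signal = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_decompose(signal)
restored = haar_reconstruct(approx, detail)
```

Both directions read each input sample once and do almost no arithmetic per sample, which is consistent with the text's remark that wavelet analysis is mainly memory bound.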
Hardware-based nonlinear filtering and segmentation using high-level shading languages
- in Proceedings of IEEE Visualization
, 2003
Cited by 16 (1 self)
Non-linear filtering is an important task for volume analysis. This paper presents hardware-based implementations of various non-linear filters for volume smoothing with edge preservation. The Cg high-level shading language is used in combination with the latest PC consumer graphics hardware. Filtering is divided into per-vertex and per-fragment stages. In both stages we propose techniques to increase the filtering performance. The vertex program pre-computes texture coordinates in order to address all contributing input samples of the operator mask. Thus additional computations are avoided in the fragment program. The presented fragment programs preserve cache coherence and exploit 4D vector arithmetic and internal fixed-point arithmetic to increase performance. We show the applicability of non-linear filters as part of a GPU-based segmentation pipeline. The resulting binary mask is compressed and decompressed in the graphics memory on-the-fly.
Real-time motion estimation and visualization on graphics cards
- In: Proc. IEEE Visualization
, 2004
Cited by 15 (1 self)
We present a tool for real-time visualization of motion features in 2D image sequences. The motion is estimated through an eigenvector analysis of the spatio-temporal structure tensor at every pixel location. This approach is computationally demanding but allows reliable velocity estimates as well as quality indicators for the obtained results. We use a 2D color map and a region of interest selector for the visualization of the velocities. On the selected velocities we apply a hierarchical smoothing scheme which allows the choice of the desired scale of the motion field. We demonstrate several examples of test sequences in which some persons are moving with different velocities than others. These persons are visually marked in the real-time display of the image sequence. The tool is also applied to angiography sequences to emphasize the blood flow and its distribution. An efficient processing of the data streams is achieved by mapping the operations onto the stream architecture of standard graphics cards. The card receives the images and performs both the motion estimation and visualization, taking advantage of the parallelism in the graphics processor and the superior memory bandwidth. The integration of data processing and visualization also saves on unnecessary data transfers and thus allows the real-time analysis of 320x240 images. We expect that on the newest generation of graphics hardware our tool could run in real time for the standard VGA format.

