Larrabee: a many-core x86 architecture for visual computing
- In SIGGRAPH ’08: ACM SIGGRAPH 2008 papers
, 2008
Cited by 103 (6 self)
This paper presents a many-core visual computing architecture code named Larrabee, a new software rendering pipeline, a manycore programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as some fixed function logic blocks. This provides dramatically higher performance per watt and per unit of area than out-of-order CPUs on highly parallel workloads. It also greatly increases the flexibility and programmability of the architecture as compared to standard GPUs. A coherent on-die 2nd-level cache allows efficient inter-processor communication and high-bandwidth local data access by CPU cores. Task scheduling is performed entirely with software in Larrabee, rather than in fixed function logic. The customizable software graphics rendering pipeline for this
RPU: A Programmable Ray Processing Unit for Realtime Ray Tracing
- ACM Trans. Graph
, 2005
Cited by 55 (3 self)
with shadows and refractions), a Conference room (5.5 fps, without shadows), reflective and refractive Spheres-RT in an office (4.5 fps), and UT2003, a scene from a current computer game (7.5 fps, precomputed illumination). Recursive ray tracing is a simple yet powerful and general approach for accurately computing global light transport and rendering high quality images. While recent algorithmic improvements and optimized parallel software implementations have increased ray tracing performance to realtime levels, no compact and programmable hardware solution has been available yet. This paper describes the architecture and a prototype implementation of a single chip, fully programmable Ray Processing Unit (RPU). It combines the flexibility of general purpose CPUs with the efficiency of current GPUs for data parallel computations. This design allows for realtime ray tracing of dynamic scenes with programmable material, geometry, and illumination shaders. Although running at only 66 MHz, the prototype FPGA implementation already renders images at up to 20 frames per second, which in many cases beats the performance of highly optimized software running on multi-GHz desktop CPUs. The performance and efficiency of the proposed architecture are analyzed using a variety of benchmark scenes.
The irregular Z-buffer: Hardware acceleration for irregular data structures
, 2005
Cited by 18 (1 self)
The classical Z-buffer visibility algorithm samples a scene at regularly spaced points on an image plane. Previously, we introduced an extension of this algorithm called the irregular Z-buffer that permits sampling of the scene from arbitrary points on the image plane. These sample points are stored in a two-dimensional spatial data structure. Here we present a set of architectural enhancements to the classical Z-buffer acceleration hardware which supports efficient execution of the irregular Z-buffer. These enhancements enable efficient parallel construction and query of certain irregular data structures, including the grid of linked lists used by our algorithm. The enhancements include flexible atomic read-modify-write units located near the memory controller, an internal routing network between these units and the fragment processors, and a MIMD fragment processor design. We simulate the performance of this new architecture and demonstrate that it can be used to render high-quality shadows in geometrically complex scenes at interactive frame rates. We also discuss other uses of the irregular Z-buffer algorithm and the implications of our architectural changes in the design of chip-multiprocessors.
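The "grid of linked lists" mentioned in this abstract can be illustrated with a minimal software sketch. It is not the paper's hardware design: it builds the structure sequentially, uses Python lists in place of linked lists, and does not model the atomic read-modify-write units or the parallel construction. The function names and the `cell_size` parameter are hypothetical.

```python
from collections import defaultdict


def build_sample_grid(samples, cell_size):
    """Bucket irregular (x, y) sample points into a uniform 2D grid.

    Each occupied cell maps to a list of sample indices, standing in
    for the per-cell linked list described in the abstract.
    """
    grid = defaultdict(list)
    for index, (x, y) in enumerate(samples):
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell].append(index)
    return grid


def samples_in_cell(grid, x, y, cell_size):
    """Return the indices of the samples stored in the cell containing (x, y)."""
    cell = (int(x // cell_size), int(y // cell_size))
    # .get avoids creating an empty entry for cells that hold no samples.
    return grid.get(cell, [])
```

A rasterizer-style query would look up the cell covered by a fragment and test only the samples listed there, rather than every sample on the image plane.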
Queried virtual shadow maps
- In Proc. Computational Natural Language Learning Workshop
, 2007
Cited by 8 (1 self)
Figure 1: Left: shadow map reparametrization techniques (light-space perspective shadow maps are used here) alone cannot guarantee subpixel accuracy (leading to perspective aliasing in the lower right corner and projection aliasing on the slope in the middle of the scene), even with a 4096² shadow map. Right: Queried Virtual Shadow Maps prevent both types of undersampling artifacts. Shadowing scenes by shadow mapping has long suffered from the fundamental problem of undersampling artifacts caused by insufficient shadow map resolution, leading to so-called perspective and projection aliasing. In this paper we present a new real-time shadow mapping algorithm capable of shadowing large scenes by virtually increasing the resolution of the shadow map beyond the GPU hardware limit. We start with a brute force approach that uniformly increases the resolution of the whole shadow map. We then introduce a smarter version which greatly increases runtime performance while still being GPU-friendly. The algorithm contains an easy-to-use performance/quality tradeoff parameter, making it tunable to a wide range of graphics hardware.
Warping and Partitioning for Low Error Shadow Maps
- Eurographics Symposium on Rendering (2006), Tomas Akenine-Möller and Wolfgang Heidrich (Editors)
, 2006
Cited by 8 (0 self)
We evaluate several shadow map algorithms based on warping and partitioning using the maximum perspective aliasing error over the entire view frustum. With respect to our error metric, we show that a range of warping parameters corresponding to several previous warping algorithms have the same error. We also analyze several partitioning schemes to determine which produces the least maximum error using the least number of partitions. Finally, we show how warping and partitioning can be combined for interactive rendering of low error shadows in scenes with a high depth range.
Exponential Shadow Maps
Cited by 7 (1 self)
Figure 1: A backyard scene rendered with a 2k × 2k shadow map and 5 × 5 Gauss filtering using (from left to right, statistics include mip-map memory): CSMs
Ray-Specialized Acceleration Structures for Ray Tracing
Cited by 5 (1 self)
Figure 1: Ray tracing acceleration structures can be made more efficient by choosing split planes that are parallel or nearly-parallel to the rays being traced (subfigure d). For rays that share a common or near-common origin, this choice can be made most simply by building an acceleration structure that uses axis-aligned split planes specified in a space transformed by a perspective projection (subfigure b). The key to efficient ray tracing is the use of effective acceleration data structures. Traditionally, acceleration structures have been constructed under the assumption that rays approach from any direction with equal probability. However, we observe that for any particular frame the system has significant knowledge about the rays, especially eye rays and hard/soft shadow rays. In this paper we demonstrate that by using this information in conjunction with an appropriate acceleration structure (a set of one or more perspective grids), ray tracing performance can be significantly improved over prior approaches. This acceleration structure can easily be rebuilt per frame, and provides significantly improved performance for rays originating at or near particular points such as the eye point and the light source(s), without sacrificing the ability to trace arbitrary rays. We demonstrate true real-time frame rates on a game-like scene rendered on an eight-core desktop PC at 1920x1200 resolution for primary visibility and hard shadows, along with lower frame rates for Monte Carlo soft shadows. In particular, we demonstrate the fastest hard shadow ray-tracing results that we are aware of. We argue that the perspective grid acceleration structure provides insight into why the Z buffer algorithm is faster than traditional ray tracing and shows there is a useful continuum of visibility algorithms between the two traditional approaches.
Subdivided shadow maps
, 2005
Cited by 3 (0 self)
map, b) TSM with 4K×4K shadow map, and c) 1K×1K subdivided shadow map. This configuration with a small angle between the light and view directions is difficult for prior methods. Even with the largest shadow map that can be allocated on current hardware, TSMs are not able to match the quality of subdivided shadow maps for this view. We present a technique for reducing perspective aliasing error in shadow maps. From the viewpoint of the light, the scene is first split into subdivisions defined by the visible faces of the camera frustum. The frustum subdivisions may be further subdivided along their corresponding faces. We apply a separate shadow map warp to each resulting subdivision. This produces significantly less error than applying a single shadow map warp to the whole scene. We lay out the subdivisions in rectangular regions within a single shadow map, using the maximum error of each subdivision to assign larger regions to subdivisions with higher error. Our method runs well on commodity graphics hardware and is easy to integrate into existing shadow map systems. We are able to achieve interactive performance (8-25 fps) on a power plant model (12M triangles), a double eagle tanker model (82M triangles), and the St. Matthew model (370M triangles) running on a PC with a GeForce 7800 GTX. We observed significantly less aliasing compared to prior shadow map warping algorithms.
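The abstract says that subdivisions with higher maximum error receive larger regions of the shared shadow map, but it does not give the allocation rule. The sketch below assumes a simple proportional split of the texel budget, which is only one possible reading and is not taken from the paper; `allocate_texels` and its parameters are hypothetical names.

```python
def allocate_texels(max_errors, total_texels):
    """Split a texel budget across subdivisions in proportion to their error.

    Assumption: the share is linear in each subdivision's maximum error.
    The paper's actual layout rule is not stated in the abstract.
    """
    total_error = sum(max_errors)
    if total_error == 0:
        # No measured error anywhere: fall back to an even split.
        return [total_texels // len(max_errors)] * len(max_errors)
    return [int(total_texels * error / total_error) for error in max_errors]
```

For example, with errors of 1.0 and 3.0 and a 1K×1K map, the second subdivision would receive three quarters of the texels. A real layout would also need to pack the resulting areas into rectangles, which this sketch leaves out.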
Resolution-matched shadow maps
- ACM Transactions on Graphics
, 2007
Cited by 3 (1 self)
This paper presents resolution-matched shadow maps (RMSM), a modified adaptive shadow map (ASM) algorithm, that is practical for interactive rendering of dynamic scenes. Adaptive shadow maps, which build a quadtree of shadow samples to match the projected resolution of each shadow texel in eye space, offer a robust solution to projective and perspective aliasing in shadow maps. However, their use for interactive dynamic scenes is plagued by an expensive iterative edge-finding algorithm that takes a highly variable amount of time per frame and is not guaranteed to converge to a correct solution. This paper introduces a simplified algorithm that is up to ten times faster than ASMs, has more predictable performance, and delivers more accurate shadows. Our main contribution is the observation that it is more efficient to forgo the iterative refinement analysis in favor of generating all shadow texels requested by the pixels in the eye-space image. The practicality of this approach is based on the insight that, for surfaces continuously visible from the eye, adjacent eye-space pixels map to adjacent shadow texels in quadtree shadow space. This means that the number of contiguous regions of shadow texels (which can be efficiently generated with a rasterizer) is proportional to the number of continuously visible surfaces in the scene. Moreover, these regions can be coalesced to further reduce the number of render passes required to shadow an image. The secondary contribution of this paper is demonstrating the design and use of data-parallel algorithms inseparably mixed with traditional graphics programming to implement a novel interactive rendering algorithm. For the scenes described in this paper, we achieve 60–80 frames per second on static scenes and 20–60 frames per second on dynamic scenes for 512² and 1024² images with a maximum effective shadow resolution of 32,768² texels.
Categories and Subject Descriptors: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—Color, shading, shadowing, and texture

