Results 1 - 10 of 30
Larrabee: a many-core x86 architecture for visual computing
In SIGGRAPH ’08: ACM SIGGRAPH 2008 papers, 2008
Cited by 149 (8 self)
This paper presents a many-core visual computing architecture code-named Larrabee, a new software rendering pipeline, a many-core programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as some fixed-function logic blocks. This provides dramatically higher performance per watt and per unit of area than out-of-order CPUs on highly parallel workloads. It also greatly increases the flexibility and programmability of the architecture as compared to standard GPUs. A coherent on-die 2nd-level cache allows efficient inter-processor communication and high-bandwidth local data access by CPU cores. Task scheduling is performed entirely with software in Larrabee, rather than in fixed-function logic. The customizable software graphics rendering pipeline for this ...
RPU: A Programmable Ray Processing Unit for Realtime Ray Tracing
ACM Trans. Graph., 2005
Cited by 69 (3 self)
... with shadows and refractions), a Conference room (5.5 fps, without shadows), reflective and refractive Spheres-RT in an office (4.5 fps), and UT2003, a scene from a current computer game (7.5 fps, precomputed illumination). Recursive ray tracing is a simple yet powerful and general approach for accurately computing global light transport and rendering high-quality images. While recent algorithmic improvements and optimized parallel software implementations have increased ray tracing performance to realtime levels, no compact and programmable hardware solution has been available yet. This paper describes the architecture and a prototype implementation of a single-chip, fully programmable Ray Processing Unit (RPU). It combines the flexibility of general-purpose CPUs with the efficiency of current GPUs for data-parallel computations. This design allows for realtime ray tracing of dynamic scenes with programmable material, geometry, and illumination shaders. Although running at only 66 MHz, the prototype FPGA implementation already renders images at up to 20 frames per second, which in many cases beats the performance of highly optimized software running on multi-GHz desktop CPUs. The performance and efficiency of the proposed architecture are analyzed using a variety of benchmark scenes.
The irregular Z-buffer: Hardware acceleration for irregular data structures
2005
Cited by 22 (2 self)
The classical Z-buffer visibility algorithm samples a scene at regularly spaced points on an image plane. Previously, we introduced an extension of this algorithm called the irregular Z-buffer that permits sampling of the scene from arbitrary points on the image plane. These sample points are stored in a two-dimensional spatial data structure. Here we present a set of architectural enhancements to the classical Z-buffer acceleration hardware that support efficient execution of the irregular Z-buffer. These enhancements enable efficient parallel construction and query of certain irregular data structures, including the grid of linked lists used by our algorithm. The enhancements include flexible atomic read-modify-write units located near the memory controller, an internal routing network between these units and the fragment processors, and a MIMD fragment processor design. We simulate the performance of this new architecture and demonstrate that it can be used to render high-quality shadows in geometrically complex scenes at interactive frame rates. We also discuss other uses of the irregular Z-buffer algorithm and the implications of our architectural changes in the design of chip multiprocessors.
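The "grid of linked lists" mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' hardware design or code: the class name `IrregularSampleGrid`, the `cell_size` parameter, and the box query are hypothetical, and Python lists stand in for linked lists. The sketch only shows how sample points at arbitrary image-plane positions might be binned into a uniform 2D grid and retrieved later.

```python
from collections import defaultdict


class IrregularSampleGrid:
    """Uniform 2D grid whose cells hold lists of arbitrarily placed samples.

    Hypothetical illustration only; Python lists stand in for linked lists.
    """

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cell_x, cell_y) -> [(id, x, y), ...]

    def _cell_of(self, x, y):
        # Map a continuous image-plane position to integer cell coordinates.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, sample_id, x, y):
        # Append the sample to the list of the cell that contains it.
        self.cells[self._cell_of(x, y)].append((sample_id, x, y))

    def query_box(self, x0, y0, x1, y1):
        # Visit every cell overlapped by the box, then test each stored sample.
        cx0, cy0 = self._cell_of(x0, y0)
        cx1, cy1 = self._cell_of(x1, y1)
        hits = []
        for cy in range(cy0, cy1 + 1):
            for cx in range(cx0, cx1 + 1):
                for sample_id, x, y in self.cells.get((cx, cy), ()):
                    if x0 <= x <= x1 and y0 <= y <= y1:
                        hits.append(sample_id)
        return hits
```

In a parallel setting, the append in `insert` is the step that would need an atomic update, which is one reading of why the abstract calls for atomic read-modify-write units.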
Warping and Partitioning for Low Error Shadow Maps
Eurographics Symposium on Rendering (2006), Tomas Akenine-Möller and Wolfgang Heidrich (Editors), 2006
Cited by 10 (0 self)
We evaluate several shadow map algorithms based on warping and partitioning using the maximum perspective aliasing error over the entire view frustum. With respect to our error metric, we show that a range of warping parameters corresponding to several previous warping algorithms have the same error. We also analyze several partitioning schemes to determine which produces the lowest maximum error using the fewest partitions. Finally, we show how warping and partitioning can be combined for interactive rendering of low-error shadows in scenes with a high depth range.
Queried virtual shadow maps
In Proc. Computational Natural Language Learning Workshop, 2007
Cited by 9 (1 self)
Figure 1: Left: shadow map reparametrization techniques (light-space perspective shadow maps are used here) alone cannot guarantee subpixel accuracy (leading to perspective aliasing in the lower right corner and projection aliasing on the slope in the middle of the scene), even with a 4096² shadow map. Right: Queried Virtual Shadow Maps prevent both types of undersampling artifacts. Shadowing scenes by shadow mapping has long suffered from the fundamental problem of undersampling artifacts caused by insufficient shadow map resolution, leading to so-called perspective and projection aliasing. In this paper we present a new real-time shadow mapping algorithm capable of shadowing large scenes by virtually increasing the resolution of the shadow map beyond the GPU hardware limit. We start with a brute-force approach that uniformly increases the resolution of the whole shadow map. We then introduce a smarter version which greatly increases runtime performance while still being GPU-friendly. The algorithm contains an easy-to-use performance/quality tradeoff parameter, making it tunable to a wide range of graphics hardware.
Exponential Shadow Maps
Cited by 9 (1 self)
Figure 1: A backyard scene rendered with a 2k × 2k shadow map and 5 × 5 Gauss filtering using (from left to right, statistics include mipmap memory): CSMs
Soft irregular shadow mapping: fast, high-quality, and robust soft shadows
In I3D ’09: Proceedings of the 2009 Symposium on Interactive 3D Graphics and Games (New
Cited by 9 (0 self)
We introduce a straightforward, robust, and efficient algorithm for rendering high-quality soft shadows in dynamic scenes. Each frame, points in the scene visible from the eye are inserted into a spatial acceleration structure. Shadow umbrae are computed by sampling the scene from the light at the image plane coordinates given by the stored points. Penumbrae are computed at the same set of points, per silhouette edge, in two steps. First, the set of points affected by a given edge is estimated from the expected light-view screen-space bounds of the corresponding penumbra. Second, the actual overlap between these points and the penumbra is computed analytically directly from the occluding geometry. The umbral and penumbral sources of occlusion are then combined to determine the degree of shadow at the eye-view pixel corresponding to each sample point. An implementation of this algorithm for the Larrabee architecture yields from 27 to 33 frames per second in simulation for scenes from a modern game, and produces significantly higher image quality than other recent methods in the real-time domain.
Ray-Specialized Acceleration Structures for Ray Tracing
Cited by 9 (2 self)
Figure 1: Ray tracing acceleration structures can be made more efficient by choosing split planes that are parallel or nearly parallel to the rays being traced (subfigure d). For rays that share a common or near-common origin, this choice can be made most simply by building an acceleration structure that uses axis-aligned split planes specified in a space transformed by a perspective projection (subfigure b). The key to efficient ray tracing is the use of effective acceleration data structures. Traditionally, acceleration structures have been constructed under the assumption that rays approach from any direction with equal probability. However, we observe that for any particular frame the system has significant knowledge about the rays, especially eye rays and hard/soft shadow rays. In this paper we demonstrate that, by using this information in conjunction with an appropriate acceleration structure (a set of one or more perspective grids), ray tracing performance can be significantly improved over prior approaches. This acceleration structure can easily be rebuilt per frame, and provides significantly improved performance for rays originating at or near particular points such as the eye point and the light source(s), without sacrificing the ability to trace arbitrary rays. We demonstrate true real-time frame rates on a game-like scene rendered on an eight-core desktop PC at 1920x1200 resolution for primary visibility and hard shadows, along with lower frame rates for Monte Carlo soft shadows. In particular, we demonstrate the fastest hard shadow ray-tracing results that we are aware of. We argue that the perspective grid acceleration structure provides insight into why the Z-buffer algorithm is faster than traditional ray tracing and shows there is a useful continuum of visibility algorithms between the two traditional approaches.
Resolution-matched shadow maps
ACM Transactions on Graphics, 2007
Cited by 7 (2 self)
This paper presents resolution-matched shadow maps (RMSM), a modified adaptive shadow map (ASM) algorithm that is practical for interactive rendering of dynamic scenes. Adaptive shadow maps, which build a quadtree of shadow samples to match the projected resolution of each shadow texel in eye space, offer a robust solution to projective and perspective aliasing in shadow maps. However, their use for interactive dynamic scenes is plagued by an expensive iterative edge-finding algorithm that takes a highly variable amount of time per frame and is not guaranteed to converge to a correct solution. This paper introduces a simplified algorithm that is up to ten times faster than ASMs, has more predictable performance, and delivers more accurate shadows. Our main contribution is the observation that it is more efficient to forgo the iterative refinement analysis in favor of generating all shadow texels requested by the pixels in the eye-space image. The practicality of this approach is based on the insight that, for surfaces continuously visible from the eye, adjacent eye-space pixels map to adjacent shadow texels in quadtree shadow space. This means that the number of contiguous regions of shadow texels (which can be efficiently generated with a rasterizer) is proportional to the number of continuously visible surfaces in the scene. Moreover, these regions can be coalesced to further reduce the number of render passes required to shadow an image. The secondary contribution of this paper is demonstrating the design and use of data-parallel algorithms inseparably mixed with traditional graphics programming to implement a novel interactive rendering algorithm. For the scenes described in this paper, we achieve 60–80 frames per second on static scenes and 20–60 frames per second on dynamic scenes for 512² and 1024² images with a maximum effective shadow resolution of 32,768² texels.
Categories and Subject Descriptors: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—Color, shading, shadowing, and texture
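To put the resolution figures quoted in this abstract in perspective, the short calculation below compares a dense 32,768² shadow map with the number of pixels in a 1024² eye-space image. The 4 bytes per texel is an assumption made here for illustration (for example, a 32-bit depth value) and is not stated in the abstract; the function name is likewise hypothetical.

```python
def dense_shadow_map_bytes(side_texels, bytes_per_texel=4):
    """Memory for a fully populated square shadow map.

    bytes_per_texel=4 is an illustrative assumption, not a figure from the paper.
    """
    return side_texels * side_texels * bytes_per_texel


effective_side = 32_768            # maximum effective shadow resolution per side
image_side = 1_024                 # larger of the two image sizes quoted

dense_texels = effective_side ** 2                       # 2**30 texels
dense_gib = dense_shadow_map_bytes(effective_side) / 2**30
image_pixels = image_side ** 2                           # 2**20 pixels
texels_per_pixel = dense_texels // image_pixels          # dense texels per eye pixel

print(f"dense map: {dense_texels:,} texels, {dense_gib:.0f} GiB at 4 B/texel")
print(f"eye image: {image_pixels:,} pixels")
print(f"ratio:     {texels_per_pixel:,} dense texels per eye-space pixel")
```

Under that assumption a dense map at the maximum effective resolution would hold about a thousand times more texels than the image has pixels, which is consistent with the abstract's emphasis on generating only the shadow texels that eye-space pixels request.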