SIMD Ray Stream Tracing -- SIMD Ray Traversal with Generalized Ray Packets and On-the-fly Re-Ordering
, 2007
Cited by 7 (2 self)
Achieving high performance on modern CPUs requires efficient utilization of SIMD units. Doing so requires that algorithms are able to take full advantage of the SIMD width offered and to not waste SIMD instructions on low utilization cases. Ray tracers exploit SIMD extensions through packet tracing. This re-casts the ray tracing algorithm into a SIMD framework, but high SIMD efficiency is only achieved for moderately complex scenes, and highly coherent packets. In this paper, we present a stream programming oriented traversal algorithm that processes streams of rays in SIMD fashion; the algorithm is motivated by breadth-first ray traversal and implicitly re-orders streams of rays on the fly by removing deactivated rays after each traversal step using a stream compaction step. This improves SIMD efficiency in the presence of complex scenes and diverging packets, and is, in particular, designed for potential wider-than-four SIMD architectures with scatter/gather support.
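As a rough illustration of the stream compaction step this abstract describes, the sketch below filters a stream of ray ids after one traversal step so that only rays still active are carried forward. The slab-test helper, the ray tuple layout, and the function names are assumptions made for illustration, not the paper's implementation; plain Python lists stand in for SIMD gather/scatter.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Ray = Tuple[Vec3, Vec3]  # (origin, direction); hypothetical layout


def hits_box(ray: Ray, lo: Vec3, hi: Vec3) -> bool:
    """Slab test of one ray against an axis-aligned box (illustrative helper)."""
    origin, direction = ray
    t_near, t_far = 0.0, float("inf")
    for o, d, b_lo, b_hi in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < b_lo or o > b_hi:
                return False
            continue
        t0, t1 = (b_lo - o) / d, (b_hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def compact_stream(rays: List[Ray], active_ids: List[int], lo: Vec3, hi: Vec3) -> List[int]:
    """Keep only the ids of rays that still hit the box, preserving order."""
    return [i for i in active_ids if hits_box(rays[i], lo, hi)]


rays = [
    ((0.0, 0.0, -5.0), (0.0, 0.0, 1.0)),   # passes through the box
    ((5.0, 0.0, -5.0), (0.0, 0.0, 1.0)),   # offset in x, misses
    ((0.5, 0.5, -5.0), (0.0, 0.0, 1.0)),   # passes through the box
    ((0.0, 0.0, -5.0), (0.0, 0.0, -1.0)),  # points away from the box
]
active = compact_stream(rays, list(range(len(rays))),
                        (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
print(active)  # [0, 2]
```

After compaction the surviving ids are contiguous, so the next traversal step would operate on a dense stream rather than on packets with mostly idle lanes.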
Interactive Isosurface Ray Tracing of Time-Varying Tetrahedral Volumes
- SCI INSTITUTE, UNIVERSITY OF UTAH
, 2007
Cited by 6 (3 self)
We describe a system for interactively rendering isosurfaces of tetrahedral finite-element scalar fields using coherent ray tracing techniques on the CPU. By employing state-of-the-art methods in polygonal ray tracing, namely aggressive packet/frustum traversal of a bounding volume hierarchy, we can accommodate large and time-varying unstructured data. In conjunction with this efficiency structure, we introduce a novel technique for intersecting ray packets with tetrahedral primitives. Ray tracing is flexible, allowing for dynamic changes in isovalue and time step, visualization of multiple isosurfaces, shadows, and depth-peeling transparency effects. The resulting system offers the intuitive simplicity of isosurfacing, guaranteed-correct visual results, and ultimately a scalable, dynamic and consistently interactive solution for visualizing unstructured volumes.
StreamRay: A Stream Filtering Architecture for Coherent Ray Tracing
Cited by 3 (1 self)
The wide availability of commodity graphics processors has made real-time graphics an intrinsic component of the human/computer interface. These graphics cores accelerate the z-buffer algorithm and provide a highly interactive experience at a relatively low cost. However, many applications in entertainment, science, and industry require high quality lighting effects such as accurate shadows, reflection, and refraction. These effects can be difficult to achieve with z-buffer algorithms but are straightforward to implement using ray tracing. Although ray tracing is computationally more complex, the algorithm exhibits excellent scaling and parallelism properties. Nevertheless, ray tracing memory access patterns are difficult to predict and the parallelism speedup promise is therefore hard to achieve.
Adaptive Ray Packet Reordering
- In Proceedings of the 2008 IEEE/EG Symposium on Interactive Ray Tracing
, 2008
Cited by 3 (1 self)
Figure 1: Our three test scenes rendered with two bounces of diffuse path tracing and one light sample per bounce (six rays per path) at 64 paths per pixel. Our reordering method maintains high SIMD utilization even for these incoherent ray distributions, achieving 1.2M rays per second for the conference scene (far right) (1.5× faster than BVH packet traversal and single ray traversal for 4-wide SIMD). As SIMD width increases, our SIMD speedup increases as well, providing more than a 6× reduction in box tests compared to a single ray implementation for 16-wide SIMD.
Modern high-performance ray tracers use large ray packets and SIMD instruction sets to decrease both the computational and bandwidth cost compared to a single ray implementation. Current global illumination renderers, however, are still based around single ray implementations and interfaces. The presumption is that while packets have been shown to work well for highly coherent rays, in the presence of less coherent secondary ray distributions the gains of both packet and SIMD techniques dwindle rapidly. With low enough coherence, performance can be reduced to being as slow as reasonable single ray code – if not worse – so the benefit of packets for a global illumination system is assumed to be next to none. With SIMD width expanding in future architectures, leaving SIMD units underutilized means a massive loss in performance compared to the maximum performance achievable. In this paper, we present a method for recovering packet and SIMD coherence for incoherent secondary ray distributions through demand-driven reordering of rays into more coherent packets. We demonstrate that the reordering overhead is outweighed by the increased coherence within a prototypical implementation in the Manta realtime ray tracer among a wide variety of ray distributions, including diffuse path tracing.
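The arithmetic behind "SIMD utilization" can be made concrete with a small sketch. It assumes that every non-empty packet costs one SIMD operation regardless of how many lanes are active, and it models reordering as an idealized regrouping of all active rays into dense packets. The function names are hypothetical, the reordering overhead is ignored, and this is not the demand-driven method of the paper.

```python
from typing import List


def utilization(active_per_packet: List[int], width: int) -> float:
    """Fraction of SIMD lanes doing useful work, one op per non-empty packet."""
    ops = sum(1 for n in active_per_packet if n > 0)
    if ops == 0:
        return 0.0
    return sum(active_per_packet) / (ops * width)


def repack(active_per_packet: List[int], width: int) -> List[int]:
    """Idealized reordering: regroup all active rays into densely filled packets."""
    total = sum(active_per_packet)
    full, rest = divmod(total, width)
    return [width] * full + ([rest] if rest else [])


before = [1, 2, 1, 0, 3, 1]   # active rays left in six 4-wide packets
after = repack(before, 4)     # the same 8 rays in two full packets

print(utilization(before, 4))  # 0.4
print(utilization(after, 4))   # 1.0
```

With a wider SIMD unit the same sparse packets waste proportionally more lanes, which is the motivation the abstract gives for reordering as SIMD width grows.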
RTSL: a Ray Tracing Shading Language
Cited by 1 (1 self)
images were rendered with the Manta interactive ray tracer (left) and the batch Monte Carlo renderer Galileo (right).
We present a new domain-specific programming language suitable for extending both interactive and non-interactive ray tracing systems. This language, called "ray tracing shading language" (RTSL), builds on the GLSL language that is a part of the OpenGL specification and familiar to GPU programmers. This language allows a programmer to implement new cameras, primitives, textures, lights, and materials that can be used in multiple rendering systems. RTSL presents a single-ray interface that is easy to program for novice programmers. Through an advanced compiler, packet-based SIMD-optimized code can be generated that is performance competitive with hand-optimized code. This language and compiler combination allows sophisticated primitives, materials and textures to realize the performance gains possible by SIMD and ray packets without the low-level programming burden. In addition to the packet-based Manta system, the compiler targets two additional rendering systems to exercise this flexibility: the PBRT system and the batch Monte Carlo renderer Galileo.
BART Museum
Figure 1: Real-time Whitted ray tracing on a single, affordable workstation is now possible. All images were rendered at 1024 × 1024 on a dual quad-core system (8 cores total) with 2.0 GHz Intel Xeon processors.
In this paper, we explore large ray packet algorithms for acceleration structure traversal and frustum culling in the context of Whitted ray tracing, and examine how these methods respond to varying ray packet size, scene complexity, and ray recursion complexity. We offer a new algorithm for acceleration structure traversal which is robust to degrading coherence and a new method for generating frustum bounds around reflection and refraction ray packets. We compare, adjust, and finally compose the most effective algorithms into a real-time Whitted ray tracer. With the aid of multi-core CPU technology, our system renders complex scenes with reflections, refractions, and/or point-light shadows anywhere from 4–20 FPS.
Coupled Use of BSP and BVH Trees in Order to Exploit Ray Bundle Performance
The use of SIMD ray packets [24] has been an important step forward in ray tracing performance. It is the first significant acceleration process based on a strategy that deals with "small" ray bundles. In order to improve the outcome for wider bundles, very powerful data structures have been used. However, these new data structures are not the most powerful when dealing with single rays or SIMD rays. They are therefore unsuitable for such rays, and this incompatibility raises the problem of incoherent rays, such as secondary rays or global illumination rays (i.e., most of the rays traced in real-life ray tracing), which are impossible to process efficiently. An original coupled use of BSP and BVH trees is proposed to overcome this disadvantage. Each tree, used individually, can efficiently boost both small ray bundles and wide ray bundles. Used together in a coupled strategy, they can achieve a global speedup of 50%. Index Terms: I.3.7 [Computer Graphics]: Ray Tracing
Row Tracing using Hierarchical Occlusion Maps
A new rendering method that ray traces an entire row of the image at a time is introduced. This moves some of the ray tracing computations into a simplified 1D domain and reduces the memory requirements considerably. Visibility determination is performed efficiently using Hierarchical Occlusion Maps and provides faster renderings than packet ray tracing in general and OpenGL for large scenes. In addition, the algorithm shows near perfect scaling when multi-threaded and works very well with kd-trees and octrees, as implementations demonstrate. Finally, optimal rendering times are reached with trees that are an order of magnitude smaller than those required for regular ray tracing. Index Terms: I.3.6 [Computer Graphics]: Three-Dimensional Graphics and Realism—Raytracing, Hidden line/surface removal
STAR (State of the Art Report): State of the Art in Ray Tracing Animated Scenes
, 2007
Ray tracing has long been a method of choice for off-line rendering, but traditionally was too slow for interactive use. With faster hardware and algorithmic improvements this has recently changed, and real-time ray tracing is finally within reach. However, real-time capability also opens up new problems that do not exist in an off-line environment. In particular, real-time ray tracing offers the opportunity to interactively ray trace moving/animated scene content. This presents a challenge to the data structures that have been developed for ray tracing over the past few decades. Spatial data structures crucial for fast ray tracing must be rebuilt or updated as the scene changes, and this can become a bottleneck for the speed of ray tracing. This bottleneck has received much recent attention from researchers, resulting in a multitude of different algorithms, data structures, and strategies for handling animated scenes. The effectiveness of techniques for ray tracing dynamic scenes varies dramatically depending on details such as scene complexity, model structure, type of motion, and the coherency of the rays. Consequently, there is so far no approach that is best in all cases, and determining the best technique for a particular problem can be a challenge. In this STAR, we aim to survey the different approaches to ray tracing animated scenes, discussing their strengths and weaknesses, and their relationship to other approaches. The overall goal is to help the reader choose the best approach depending on the situation, and to expose promising areas where there is potential for algorithmic improvements.
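One commonly used way to "update" rather than rebuild a spatial structure is bounding volume hierarchy refitting: keep the tree topology and recompute the boxes bottom-up after the geometry moves. The sketch below is a minimal illustration under that assumption; the `Node` class and `refit` function are hypothetical names, inner nodes are assumed to have exactly two children, and nothing here is taken from the report itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner)


def bounds_of(points: List[Vec3]) -> Box:
    """Axis-aligned bounding box of a non-empty list of points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def union(a: Box, b: Box) -> Box:
    """Smallest box enclosing both a and b."""
    lo = tuple(min(p, q) for p, q in zip(a[0], b[0]))
    hi = tuple(max(p, q) for p, q in zip(a[1], b[1]))
    return lo, hi


@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    points: List[Vec3] = field(default_factory=list)  # leaf geometry (vertices)
    box: Optional[Box] = None


def refit(node: Node) -> Box:
    """Recompute boxes bottom-up; the tree topology is left unchanged."""
    if node.left is None and node.right is None:
        node.box = bounds_of(node.points)
    else:
        node.box = union(refit(node.left), refit(node.right))
    return node.box


leaf_a = Node(points=[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)])
leaf_b = Node(points=[(2.0, 0.0, 0.0), (3.0, 1.0, 1.0)])
root = Node(left=leaf_a, right=leaf_b)
refit(root)

# Animate: move leaf_b's vertices up by 5, then refit instead of rebuilding.
leaf_b.points = [(x, y + 5.0, z) for x, y, z in leaf_b.points]
refit(root)
print(root.box)  # ((0.0, 0.0, 0.0), (3.0, 6.0, 1.0))
```

Refitting touches each node once, but it cannot repair a topology that no longer matches the geometry, which is one reason the choice between updating and rebuilding depends on the type of motion.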
Efficient Ray Traced Soft Shadows using . . .
Ray tracing has long been considered to be superior to rasterization because of its ability to trace arbitrary rays, allowing it to simulate virtually any physical light transport effect by just tracing rays. Yet, to look plausible, effects such as soft shadows typically require extraordinary numbers of rays. This makes the prospects of real-time performance rather remote. Rasterization, in contrast, has a record of producing such effects in real-time through employing specialized and approximate solutions for individual effects. Though ray tracing may still be the right choice for effects like reflections and refractions, using specialized solutions for certain important effects also makes sense for a ray tracer. In this paper, we propose a special solution to ray trace soft shadows that is particularly targeted for Intel's Larrabee architecture. We use a specialized frustum tracing that traces multiple frusta of specialized "light-weight" shadow packets in parallel, while generating rays within each frustum on demand. The technique can easily be integrated into any packet ray tracer, and fits well into the wide SIMD and cache-size constraints of the Larrabee architecture. Our technique reaches rates of up to several dozen million rays per second per Larrabee core, outperforming traditional packet techniques by up to 6×. This high performance, combined with a simple light-weight illumination filtering step, makes it possible to achieve real-time soft shadows for game-like scenes.
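To see why soft shadows need so many rays, it may help to look at the brute-force baseline: trace one shadow ray per sample on an area light and average the visibility. The sketch below illustrates only that baseline, with a single sphere occluder and a stratified grid of light samples; it is not the frustum-based technique proposed in the paper, and all names and the scene setup are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def segment_hits_sphere(p: Vec3, q: Vec3, center: Vec3, radius: float) -> bool:
    """True if the segment from p to q (p != q) intersects the sphere."""
    d = tuple(b - a for a, b in zip(p, q))
    m = tuple(a - c for a, c in zip(p, center))
    dd = sum(x * x for x in d)
    # Parameter of the point on the segment closest to the sphere center.
    t = max(0.0, min(1.0, -sum(x * y for x, y in zip(m, d)) / dd))
    closest = tuple(a + t * x for a, x in zip(p, d))
    return sum((a - c) ** 2 for a, c in zip(closest, center)) <= radius * radius


def light_visibility(point: Vec3, light_corner: Vec3, edge_u: Vec3, edge_v: Vec3,
                     occluder: Vec3, radius: float, n: int = 4) -> float:
    """Fraction of an n x n grid of area-light samples visible from point."""
    visible = 0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) / n, (j + 0.5) / n
            sample = tuple(c + u * a + v * b
                           for c, a, b in zip(light_corner, edge_u, edge_v))
            if not segment_hits_sphere(point, sample, occluder, radius):
                visible += 1
    return visible / (n * n)


# A 2 x 2 square light at height y = 10 above a shading point at the origin.
light = ((-1.0, 10.0, -1.0), (2.0, 0.0, 0.0), (0.0, 0.0, 2.0))
origin = (0.0, 0.0, 0.0)

blocked = light_visibility(origin, *light, occluder=(0.0, 5.0, 0.0), radius=3.0)
clear = light_visibility(origin, *light, occluder=(100.0, 5.0, 0.0), radius=1.0)
print(blocked, clear)  # 0.0 1.0
```

The cost is n × n shadow rays per shading point, so smooth penumbrae at image resolution quickly reach the ray counts the abstract calls extraordinary.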

