Results 1 - 10 of 47
Chromium: A Stream-Processing Framework for Interactive Rendering on Clusters
2002. Cited by 308 (10 self).
We describe Chromium, a system for manipulating streams of graphics API commands on clusters of workstations. Chromium's stream filters can be arranged to create sort-first and sort-last parallel graphics architectures that, in many cases, support the same applications while using only commodity graphics accelerators. In addition, these stream filters can be extended programmatically, allowing the user to customize the stream transformations performed by nodes in a cluster. Because our stream processing mechanism is completely general, any cluster-parallel rendering algorithm can be either implemented on top of or embedded in Chromium. In this paper, we give examples of real-world applications that use Chromium to achieve good scalability on clusters of workstations, and describe other potential uses of this stream processing technology. By completely abstracting the underlying graphics architecture, network topology, and API command processing semantics, we allow a variety of applications to run in different environments.
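Chromium's central idea of programmable filters over a stream of graphics API commands can be illustrated with a short sketch. This is our own hypothetical Python illustration, not Chromium's actual SPU interface; the command representation and filter names are invented for the example.

```python
# Hypothetical sketch of stream filtering in the Chromium style: each
# filter is a generator that transforms a stream of graphics commands
# and passes the result downstream. (Invented names, not Chromium's API.)

from dataclasses import dataclass

@dataclass(frozen=True)
class Cmd:
    name: str          # e.g. "glClear"
    args: tuple = ()

def log_filter(stream):
    """Pass-through filter; a real filter node might pack each command
    into a network buffer here before forwarding it."""
    for cmd in stream:
        yield cmd

def cull_clear_filter(stream):
    """Example stream transformation: drop immediately repeated glClear
    calls, which are redundant."""
    prev_was_clear = False
    for cmd in stream:
        if cmd.name == "glClear":
            if prev_was_clear:
                continue
            prev_was_clear = True
        else:
            prev_was_clear = False
        yield cmd

def chain(stream, filters):
    """Arrange filters into a pipeline, as Chromium arranges stream filters."""
    for f in filters:
        stream = f(stream)
    return list(stream)

cmds = [Cmd("glClear", (0x4000,)), Cmd("glClear", (0x4000,)),
        Cmd("glBegin", ("GL_TRIANGLES",)), Cmd("glEnd")]
out = chain(cmds, [log_filter, cull_clear_filter])
```

Because each filter consumes and produces the same stream type, filters compose freely, which is the property that lets arbitrary cluster-rendering algorithms be layered on such a system.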
WireGL: A Scalable Graphics System for Clusters
- Computer Graphics (Proceedings of SIGGRAPH 01), 2001. Cited by 130 (3 self).
We describe WireGL, a system for scalable interactive rendering on a cluster of workstations. WireGL provides the familiar OpenGL API to each node in a cluster, virtualizing multiple graphics accelerators into a sort-first parallel renderer with a parallel interface. We also describe techniques for reassembling an output image from a set of tiles distributed over a cluster. Using flexible display management, WireGL can drive a variety of output devices, from standalone displays to tiled display walls. By combining the power of virtual graphics, the familiarity and ordered semantics of OpenGL, and the scalability of clusters, we are able to create time-varying visualizations that sustain rendering performance over 70,000,000 triangles per second at interactive refresh rates using 16 compute nodes and 16 rendering nodes.
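The sort-first distribution described above can be sketched in a few lines: each primitive is bucketed to the screen tiles its bounding box overlaps, and only those tiles' rendering servers receive it. This is a hedged simplification with an invented tile size, not WireGL's implementation.

```python
# Minimal sort-first bucketing sketch (ours, not WireGL's code):
# route each primitive to the tiles its screen-space bbox overlaps.

TILE_W, TILE_H = 256, 256   # assumed tile size for the example

def tiles_for_bbox(xmin, ymin, xmax, ymax):
    """Return the set of (tx, ty) tile coordinates a bbox overlaps."""
    tx0, ty0 = int(xmin) // TILE_W, int(ymin) // TILE_H
    tx1, ty1 = int(xmax) // TILE_W, int(ymax) // TILE_H
    return {(tx, ty) for tx in range(tx0, tx1 + 1)
                     for ty in range(ty0, ty1 + 1)}

def bucket(primitives):
    """Map each tile to the list of primitives it must render."""
    buckets = {}
    for prim_id, bbox in primitives:
        for tile in tiles_for_bbox(*bbox):
            buckets.setdefault(tile, []).append(prim_id)
    return buckets

prims = [("tri0", (10, 10, 50, 50)),       # falls entirely in tile (0, 0)
         ("tri1", (200, 200, 300, 300))]   # straddles four tiles
b = bucket(prims)
```

A primitive that straddles a tile boundary is sent to every overlapping tile, which is the source of the overlap overhead that sort-first systems must manage.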
Sort-last parallel rendering for viewing extremely large data sets on tile displays
- In PVG '01: Proceedings of the IEEE 2001 Symposium on Parallel and Large-Data Visualization and Graphics, 2001. Cited by 37 (1 self).
Due to the impressive price-performance of today's PC-based graphics accelerator cards, Sandia National Laboratories is attempting to use PC clusters to render extremely large data sets in interactive applications. This paper describes a sort-last parallel rendering system running on a PC cluster that is capable of rendering enormous amounts of geometry onto high-resolution tile displays by taking advantage of the spatial coherency that is inherent in our data. Furthermore, it is capable of scaling to larger sized input data or higher resolution displays by increasing the size of the cluster. Our prototype is now capable of rendering 120 million triangles per second on a 12 mega-pixel display.
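For opaque geometry, the per-pixel merge step at the heart of a sort-last system reduces to a depth comparison. A minimal sketch (our illustration, not Sandia's implementation):

```python
# Sort-last depth compositing for opaque geometry: each node renders a
# full-resolution (color, depth) image of its data partition; merging
# keeps, per pixel, the fragment with the smaller depth.

def z_composite(img_a, img_b):
    """Merge two images; each is a list of (rgb, z) pixels."""
    return [pa if pa[1] <= pb[1] else pb for pa, pb in zip(img_a, img_b)]

# two nodes' renderings of the same 3-pixel image
node0 = [((255, 0, 0), 0.2), ((0, 0, 0), 1.0), ((0, 255, 0), 0.7)]
node1 = [((0, 0, 255), 0.5), ((9, 9, 9), 0.3), ((0, 0, 0), 1.0)]
final = z_composite(node0, node1)
# per pixel, the nearer fragment survives
```

Because the operation is associative and commutative for opaque fragments, nodes can be merged pairwise in any order, which is what makes tree- and swap-based compositing schemes possible.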
Scalable Interactive Volume Rendering Using Off-the-Shelf Components
- In Proc. IEEE Symp. Parallel Large-Data Vis. Graphics (PVG), 2001. Cited by 31 (0 self).
This paper describes an application of a second-generation implementation of the Sepia architecture (Sepia-2) to interactive volumetric visualization of large rectilinear scalar fields. By employing pipelined associative blending operators in a sort-last configuration, a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels, and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and a variety of GL rendering techniques including surface rendering and shadow mapping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators.
CR Categories: C.2.4 [Computer Systems Organization]: Computer-Communication Networks---Distributed Systems; C.2.5 [Computer Systems Organization]: Computer-Communication Networks---Local and Wide Area Networks; C.5.1 [Computer System Implementation]: Large and Medium ("Mainframe") Computers---Super Computers; D.1.3 [Software]: Programming Techniques---Concurrent Programming; I.3.1 [Computing Methodologies]: Computer Graphics---Hardware Architecture; I.3.2 [Computing Methodologies]: Computer Graphics---Graphics Systems; I.3.3 [Computing Methodologies]: Computer Graphics---Picture/Image Generation; I.3.7 [Computing Methodologies]: Computer Graphics---Three-Dimensional Graphics and Realism
Keywords: sort-last, parallel, cluster, shear-warp, volume rendering, ray-c...
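The associativity of the blending operators mentioned above is what permits pipelined, order-preserving compositing. The Porter-Duff "over" operator in premultiplied-alpha form is the canonical example; a minimal sketch of our own:

```python
# Porter-Duff "over" on premultiplied (r, g, b, a) samples. Associativity
# (not commutativity) is the property pipelined sort-last compositing needs:
# partial composites can be formed anywhere, as long as depth order is kept.

def over(front, back):
    """Composite premultiplied front over premultiplied back."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    k = 1.0 - fa
    return (fr + k * br, fg + k * bg, fb + k * bb, fa + k * ba)

a = (0.4, 0.0, 0.0, 0.4)
b = (0.0, 0.3, 0.0, 0.3)
c = (0.0, 0.0, 0.2, 0.2)
left = over(over(a, b), c)    # composite front pair first...
right = over(a, over(b, c))   # ...or back pair first: same result
```

Expanding both groupings gives a + (1-a.alpha) b + (1-a.alpha)(1-b.alpha) c term by term, so the two orders agree up to floating-point rounding.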
COTS cluster-based sort-last rendering: performance evaluation and pipelined implementation
- In Proceedings of IEEE Visualization, 2005. Cited by 31 (3 self).
Figure 1: Views of the head section (512x512x209) of the visible female CT data with 16 nodes (a space has been left between the subvolumes to highlight their boundaries).
Using a three-year-old 32-node COTS cluster, a volume dataset can be rendered at a constant 13 frames per second on a 1024 × 768 rendering area using 5 nodes. On a 1.5-year-old, fully optimized, 5-node COTS cluster, the frame rate for the same rendering area reaches a constant 31 frames per second. We expect our future work, including further algorithm optimizations and hardware tuning on a modern PC cluster, to deliver higher frame rates for larger datasets (using more nodes) on larger rendering areas. Sort-last parallel rendering is an efficient technique for visualizing huge datasets on COTS clusters. The dataset is subdivided and distributed across the cluster nodes. For every frame, each node renders a full-resolution image of its data using its local GPU, and the images are composited together using a parallel image compositing algorithm. In this paper, we present a performance evaluation of standard sort-last parallel rendering methods and of the different improvements proposed in the literature. This evaluation is based on a detailed analysis of the different hardware and software components.
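One of the standard compositing schemes such evaluations cover is "direct send": each node owns a fixed strip of the final image, receives that strip's fragments from every other node, and blends them in depth order. A hedged sketch of our own (simulated in one process, assuming the pixel count divides evenly among nodes):

```python
# Direct-send compositing sketch (our simplified illustration): node i
# owns pixels [i*strip, (i+1)*strip) of the final image and composites
# the fragments every node contributes for that strip.

def direct_send(images):
    """images: per-node full-resolution images in front-to-back node
    order, each a list of premultiplied (color, alpha) samples.
    Returns the final composited image, assembled strip by strip."""
    n = len(images)
    n_pixels = len(images[0])            # assumed divisible by n
    strip = n_pixels // n
    final = [None] * n_pixels
    for owner in range(n):               # each owner composites its strip
        for p in range(owner * strip, (owner + 1) * strip):
            color, alpha = 0.0, 0.0
            for node in range(n):        # fragments "sent" by every node
                c, a = images[node][p]
                color += (1.0 - alpha) * c    # front-to-back "over"
                alpha += (1.0 - alpha) * a
            final[p] = (color, alpha)
    return final

imgs = [[(0.5, 0.5)] * 4, [(0.4, 0.4)] * 4]   # two nodes, four pixels
result = direct_send(imgs)
```

The scheme's drawback, noted in the compositing literature, is that every node exchanges a message with every other node per frame, so message count grows quadratically with node count.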
Massively parallel volume rendering using 2-3 swap image compositing
- In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, 2008. Cited by 26 (6 self).
The ever-increasing amounts of simulation data produced by scientists demand high-end parallel visualization capability. However, image compositing, which requires interprocessor communication, is often the bottleneck stage for parallel rendering of large volume data sets. Existing image compositing solutions either incur a large number of messages exchanged among processors (such as the direct send method), or limit the number of processors that can be effectively utilized (such as the binary swap method). We introduce a new image compositing algorithm, called 2-3 swap, which combines the flexibility of the direct send method and the optimality of the binary swap method. The 2-3 swap algorithm allows an arbitrary number of processors to be used for compositing, and fully utilizes all participating processors throughout the course of the compositing. We experiment with this image compositing solution on a supercomputer with thousands of processors, and demonstrate its great flexibility as well as scalability.
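For context, the binary-swap baseline that 2-3 swap generalizes can be sketched compactly: with a power-of-two processor count, each round halves every processor's image span, partners exchange halves, and everyone composites, so all processors stay busy every round. This is our own minimal simulation (processor index assumed equal to depth order), not the paper's 2-3 swap code:

```python
# Binary-swap compositing sketch (ours). Processors hold full images in
# front-to-back index order; after log2(n) exchange rounds, processor i
# owns a 1/n slice of the final image starting at offsets[i].

def over(front, back):
    """Porter-Duff over for premultiplied (color, alpha) samples."""
    fc, fa = front
    bc, ba = back
    return (fc + (1 - fa) * bc, fa + (1 - fa) * ba)

def binary_swap(images):
    n, span = len(images), len(images[0])   # n must be a power of two
    pieces = [list(img) for img in images]
    offsets = [0] * n
    dist = 1
    while dist < n:                         # one round per bit of n
        half = span // 2
        new = [None] * n
        for p in range(n):
            partner = p ^ dist              # exchange partner this round
            keep_front = (p & dist) == 0    # which half of the span p keeps
            lo = 0 if keep_front else half
            mine = pieces[p][lo:lo + half]
            theirs = pieces[partner][lo:lo + half]
            # lower index is nearer, so its piece goes in front
            front, back = (mine, theirs) if p < partner else (theirs, mine)
            new[p] = [over(f, b) for f, b in zip(front, back)]
            if not keep_front:
                offsets[p] += half
        pieces, span, dist = new, half, dist * 2
    return offsets, pieces

# four constant images, four pixels each, in depth order
imgs = [[(0.1 * (i + 1), 0.25)] * 4 for i in range(4)]
offsets, pieces = binary_swap(imgs)
```

The known limitation, which 2-3 swap removes, is the power-of-two restriction: with other processor counts some processors would idle.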
Parallel and Out-of-core View-dependent Isocontour Visualization Using Random Data Distribution
- Proceedings of the Joint Eurographics-IEEE TCVG Symposium on Visualization (VisSym-02), 2002. Cited by 18 (1 self).
In this paper we describe a parallel and out-of-core view-dependent isocontour visualization algorithm that efficiently extracts and renders the visible portions of an isosurface from large datasets. The algorithm first creates an occlusion map using ray-casting and nearest neighbors. With the occlusion map constructed, the visible portion of the isosurface is extracted and rendered. All steps are performed in a single pass with minimal communication overhead.
TeraVision: a Distributed, Scalable, High Resolution Graphics Streaming System
- In Proceedings of IEEE Cluster 2004. Cited by 14 (7 self).
In electronically mediated distance collaborations involving scientific data, there is often a need to stream the graphical output of individual computers or entire visualization clusters to remote displays. This paper presents TeraVision, a scalable, platform-independent solution capable of transmitting multiple synchronized high-resolution video streams between single workstations and/or clusters without requiring any modifications to the source or destination machines. Issues addressed include: how to synchronize individual video streams to form a single larger stream; how to scale and route streams generated by an array of MxN nodes to fit an XxY display; and how TeraVision exploits a variety of transport protocols. Results from experiments conducted over gigabit local-area networks and wide-area networks (between Chicago and Amsterdam) are presented. Finally, we propose the Scalable Adaptive Graphics Environment (SAGE), an architecture to support future collaborative visualization environments with potentially billions of pixels.
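The MxN-to-XxY routing problem mentioned above can be sketched as a coverage computation: scale each source tile's footprint into display-tile coordinates and record which display tiles it overlaps. This is our own illustration of the problem, not TeraVision's code:

```python
# Routing sketch (ours): map an M x N array of source tiles onto an
# X x Y array of display tiles, listing for each display tile the
# source tiles whose scaled footprint overlaps it.

import math

def routes(m, n, x, y):
    """Return {(dx, dy): [(sx, sy), ...]} routing table."""
    table = {(dx, dy): [] for dx in range(x) for dy in range(y)}
    for sx in range(m):
        for sy in range(n):
            # footprint of source tile (sx, sy) in display-tile units
            x0, x1 = sx * x / m, (sx + 1) * x / m
            y0, y1 = sy * y / n, (sy + 1) * y / n
            for dx in range(math.floor(x0), math.ceil(x1)):
                for dy in range(math.floor(y0), math.ceil(y1)):
                    table[(dx, dy)].append((sx, sy))
    return table

# 2x2 source array scaled up onto a 3x3 display wall
table = routes(2, 2, 3, 3)
```

Off-integer boundaries mean a display tile can need pixels from several sources (here the center display tile draws from all four source streams), which is why stream routing rather than one-to-one forwarding is needed.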
The SAGE Graphics Architecture
2002. Cited by 10 (0 self).
The Scalable, Advanced Graphics Environment (SAGE) is a new high-end, multi-chip rendering architecture. A single SAGE board can render in excess of 80 million fully lit, textured, antialiased triangles per second. SAGE brings high-quality antialiasing filters to video-rate hardware for the first time. To achieve this, the concept of a frame buffer is replaced by a fully double-buffered sample buffer of between 1 and 16 non-uniformly placed samples per final output pixel. The video output raster of samples is subject to convolution by a 5×5 programmable reconstruction and bandpass filter that replaces the traditional RAMDAC. The reconstruction filter processes up to 400 samples per output pixel, and supports any radially symmetric filter, including those with negative lobes (full Mitchell-Netravali filter). Each SAGE board comprises four parallel rendering sub-units, and supports up to two video output channels. Multiple SAGE systems can be tiled together to support even higher fill rates, resolutions, and performance.
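The kind of reconstruction SAGE performs in hardware can be illustrated in software: resample irregularly placed samples to an output pixel with a radially symmetric Mitchell-Netravali kernel, which has the negative lobes the abstract mentions. Our own sketch (B = C = 1/3, normalized weights; the hardware's exact normalization is not specified here):

```python
# Radially symmetric reconstruction sketch (ours): weight each nearby
# sample by the Mitchell-Netravali kernel of its distance to the output
# pixel, then normalize.

def mitchell(r, b=1/3, c=1/3):
    """Mitchell-Netravali kernel at radius r (support 2, negative lobes)."""
    r = abs(r)
    if r < 1:
        return ((12 - 9*b - 6*c) * r**3 + (-18 + 12*b + 6*c) * r**2
                + (6 - 2*b)) / 6
    if r < 2:
        return ((-b - 6*c) * r**3 + (6*b + 30*c) * r**2
                + (-12*b - 48*c) * r + (8*b + 24*c)) / 6
    return 0.0

def reconstruct(samples, px, py):
    """samples: (x, y, value) triples; weighted sum over the footprint."""
    num = den = 0.0
    for x, y, v in samples:
        w = mitchell(((x - px)**2 + (y - py)**2) ** 0.5)
        num += w * v
        den += w
    return num / den if den else 0.0

# three non-uniformly placed samples of a constant-5.0 signal
value = reconstruct([(0.1, 0.2, 5.0), (0.7, -0.3, 5.0), (-0.5, 0.4, 5.0)],
                    0.0, 0.0)
```

Note the arithmetic in the abstract is consistent with a 5×5-pixel footprint: 25 pixels × 16 samples per pixel gives the 400 samples per output pixel quoted.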
Real-Time Compression for dynamic 3D Environments
- In ACM Multimedia '03, 2003. Cited by 9 (2 self).
The goal of tele-immersion has long been to enable people at remote locations to share a sense of presence. A tele-immersion system acquires the 3D representation of a collaborator's environment remotely and sends it over the network, where it is rendered in the user's environment. Acquisition, reconstruction, transmission, and rendering all have to be done in real-time to create a sense of presence. With added commodity hardware resources, parallelism can increase the acquisition volume and reconstruction data quality while maintaining real-time performance. However, this is not as easy for rendering, since all of the data need to be combined into a single display. In this paper we present an algorithm to compress data from such 3D environments in real-time to address this imbalance. We expect the compression algorithm to scale comparably to the acquisition and reconstruction, reduce network transmission bandwidth, and reduce the rendering requirement for real-time performance. We have tested the algorithm using a synthetic office data set and have achieved 5:1 compression for 22 depth streams.
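A generic sketch of the temporal redundancy a real-time depth-stream compressor can exploit (this is not the paper's algorithm, just an illustration of the idea): transmit only the depth pixels that changed beyond a threshold since the previous frame.

```python
# Temporal delta coding sketch for a depth stream (ours, illustrative):
# the sender diffs against the previous frame; the receiver applies the
# sparse updates to its copy.

def delta_encode(prev, curr, threshold=1):
    """Return a sparse list of (index, new_depth) updates."""
    return [(i, d) for i, (p, d) in enumerate(zip(prev, curr))
            if abs(d - p) > threshold]

def delta_apply(prev, updates):
    """Reconstruct the current frame from the previous one plus updates."""
    frame = list(prev)
    for i, d in updates:
        frame[i] = d
    return frame

prev = [100, 100, 100, 100]
curr = [100, 103, 100, 100]          # one depth pixel moved
updates = delta_encode(prev, curr)
restored = delta_apply(prev, updates)
```

When most of a scene is static between frames, the update list is far smaller than the full frame, which is the kind of imbalance-reducing saving the paper targets for its 22 depth streams.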