Chromium: A Stream-Processing Framework for Interactive Rendering on Clusters
, 2002
Cited by 184 (9 self)
We describe Chromium, a system for manipulating streams of graphics API commands on clusters of workstations. Chromium's stream filters can be arranged to create sort-first and sort-last parallel graphics architectures that, in many cases, support the same applications while using only commodity graphics accelerators. In addition, these stream filters can be extended programmatically, allowing the user to customize the stream transformations performed by nodes in a cluster. Because our stream processing mechanism is completely general, any cluster-parallel rendering algorithm can be either implemented on top of or embedded in Chromium. In this paper, we give examples of real-world applications that use Chromium to achieve good scalability on clusters of workstations, and describe other potential uses of this stream processing technology. By completely abstracting the underlying graphics architecture, network topology, and API command processing semantics, we allow a variety of applications to run in different environments.
Load Balancing for Multi-Projector Rendering Systems
- in SIGGRAPH/Eurographics Workshop on Graphics Hardware
, 1999
Cited by 66 (6 self)
Multi-projector systems are increasingly being used to provide large-scale and high-resolution displays for next-generation interactive 3D graphics applications, including large-scale data visualization, immersive virtual environments, and collaborative design. These systems must include a very high-performance and scalable 3D rendering subsystem in order to generate high-resolution images at real-time frame rates. This paper describes a sort-first based parallel rendering system for a scalable display wall system built with a network of PCs, graphics accelerators, and portable projectors. The main challenge is to develop scalable algorithms to partition and assign rendering tasks effectively under the performance and functionality constraints of system area networks, PCs, and commodity 3D graphics accelerators. We have developed three coarse-grained partitioning algorithms and incorporated them into a working prototype system. This paper describes these algorithms and reports our init...
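As a rough illustration of the sort-first idea mentioned in this abstract, the sketch below assigns primitives to screen tiles by their screen-space bounding boxes. It assumes a fixed, uniform tile grid, which is the simplest possible partition and is not one of the three partitioning algorithms the paper describes. The function names and the (x0, y0, x1, y1) bounding-box layout are hypothetical.

```python
def tiles_overlapping(bbox, tile_w, tile_h, cols, rows):
    """Return (col, row) indices of the screen tiles touched by a bounding box."""
    x0, y0, x1, y1 = bbox
    # Clamp the tile range to the grid so partly off-screen boxes still work.
    c0 = max(0, int(x0 // tile_w))
    c1 = min(cols - 1, int(x1 // tile_w))
    r0 = max(0, int(y0 // tile_h))
    r1 = min(rows - 1, int(y1 // tile_h))
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def sort_first(bboxes, screen_w, screen_h, cols, rows):
    """Bucket primitive indices by tile; one renderer would own each tile."""
    tile_w = screen_w / cols
    tile_h = screen_h / rows
    buckets = {(c, r): [] for r in range(rows) for c in range(cols)}
    for prim_id, bbox in enumerate(bboxes):
        # A primitive that straddles a tile boundary is sent to every tile it touches.
        for tile in tiles_overlapping(bbox, tile_w, tile_h, cols, rows):
            buckets[tile].append(prim_id)
    return buckets
```

Primitives that straddle tile boundaries are duplicated across tiles, which is one source of the overhead that load-balancing schemes for sort-first systems try to control.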
Hybrid Sort-First and Sort-Last Parallel Rendering with a Cluster of PCs
, 2000
Cited by 48 (3 self)
We investigate a new hybrid of the sort-first and sort-last approaches for parallel polygon rendering, using as a target platform a cluster of PCs. Unlike previous methods that statically partition the 3D model and/or the 2D image, our approach performs dynamic, view-dependent and coordinated partitioning of both the 3D model and the 2D image. Using a specific algorithm that follows this approach, we show that it performs better than previous approaches and scales better with both processor count and screen resolution. Overall, our algorithm is able to achieve interactive frame rates with efficiencies of 55.0% to 70.5% during simulations of a system with 64 PCs. While it does have potential disadvantages in client-side processing and in dynamic data management, which also stem from its dynamic, view-dependent nature, these problems are likely to diminish with technology trends in the future.
Keywords: Parallel rendering, cluster computing.
1 Introduction
The objective of our research is ...
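To put the quoted efficiency figures in perspective, the short calculation below converts them into speedups. It assumes the usual definition of parallel efficiency as speedup divided by processor count; the excerpt does not state which definition the paper uses, so the results are only indicative.

```python
def speedup_from_efficiency(efficiency, processors):
    """Speedup implied by a parallel efficiency, assuming efficiency = speedup / processors."""
    return efficiency * processors


# Efficiencies and processor count as quoted in the abstract.
low = speedup_from_efficiency(0.550, 64)
high = speedup_from_efficiency(0.705, 64)
print(f"implied speedup on 64 PCs: {low:.1f}x to {high:.1f}x")
```

Under that assumption, the 64-PC simulations correspond to a speedup of roughly 35x to 45x over a single PC.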
Hierarchical and parallelizable direct volume rendering for irregular and multiple grids
- IEEE Visualization
, 1996
Cited by 38 (1 self)
A general volume rendering technique is described that efficiently produces images of excellent quality from data defined over irregular grids having a wide variety of formats. Rendering is done in software, eliminating the need for special graphics hardware, as well as any artifacts associated with graphics hardware. Images of volumes with about one million cells can be produced in one to several minutes on a workstation with a 150 MHz processor. A significant advantage of this method for applications such as computational fluid dynamics is that it can process multiple intersecting grids. Such grids present problems for most current volume rendering techniques. Also, the wide range of cell sizes (by a factor of 10,000 or more), which is typical of such applications, does not present difficulties, as it does for many techniques. A spatial hierarchical organization makes it possible to access data from a restricted region efficiently. The tree has greater depth in regions of greater detail, determined by the number of cells in the region. It also makes it possible to render useful "preview" images very quickly (about one second for one-million-cell grids) by displaying each region associated with a tree node as one cell. Previews show enough detail to navigate effectively in very large data sets. The algorithmic techniques include use of a k-d tree, with prefix-order partitioning of triangles, to reduce the number of primitives that must be processed for one rendering, coarse-grain parallelism for a shared-memory MIMD architecture, a new perspective transformation that achieves greater numerical accuracy, and a scanline algorithm with depth sorting and a new clipping technique.
An introduction to parallel rendering
- Parallel Computing
, 1997
Cited by 35 (2 self)
In computer graphics, rendering is the process by which an abstract description of a scene is converted to an image. When the scene is complex, or when high-quality images or high frame rates are required, the rendering process becomes computationally demanding. To provide the necessary levels of performance, parallel computing techniques must be brought to bear. Although parallelism has been exploited in computer graphics since the early days of the field, its initial use was primarily in specialized applications. The VLSI revolution of the late 1970's and the advent of scalable parallel computers during the late 1980's changed this situation. Today, parallel hardware is routinely used in graphics workstations, and numerous software-based rendering systems have been developed for general-purpose parallel architectures. This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering
A Scalable Parallel Cell-Projection Volume Rendering Algorithm for Three-Dimensional Unstructured Data
- IEEE Parallel Rendering Symposium
, 1997
Cited by 33 (12 self)
Visualizing three-dimensional unstructured data from aerodynamics calculations is challenging because the associated meshes are typically large in size and irregular in both shape and resolution. The goal of this research is to develop a fast, efficient parallel volume rendering algorithm for massively parallel distributed-memory supercomputers consisting of a large number of very powerful processors. We use cell-projection instead of ray-casting to provide maximum flexibility in the data distribution and rendering steps. Effective static load balancing is achieved with a round robin distribution of data cells among the processors. A spatial partitioning tree is used to guide the rendering, optimize the image compositing step, and reduce memory consumption. Communication cost is reduced by buffering messages and by overlapping communication with rendering calculations as much as possible. Tests on the IBM SP2 demonstrate that these strategies provide high rendering rates and ...
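The round-robin distribution mentioned in this abstract can be sketched in a few lines. This is only a minimal illustration of the general technique, assuming cells are identified by integer indices; the function name is hypothetical and nothing here reflects the paper's actual data structures.

```python
def round_robin(cell_ids, num_procs):
    """Deal cells out to processors one at a time, like dealing cards."""
    # Processor p receives cells p, p + num_procs, p + 2 * num_procs, ...
    return [cell_ids[p::num_procs] for p in range(num_procs)]


assignment = round_robin(list(range(10)), 3)
for proc, cells in enumerate(assignment):
    print(proc, cells)
```

Because neighbouring cells in the input order land on different processors, each processor tends to receive a similar mix of large and small cells from every region of the mesh, which is why a static scheme like this can balance load without any runtime redistribution.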
Irregular grid volume rendering with composition networks
- In Proceedings of IS&T/SPIE Visual Data Exploration and Analysis V
, 1998
Cited by 4 (2 self)
Keywords: tetrahedral cell, parallel rendering, parallel compositing, PixelFlow
Volumetric irregular grids are the next frontier to conquer in interactive 3D graphics. Visualization algorithms for rectilinear 256^3 data volumes have been optimized to achieve one frame/second to 15 frames/second depending on the workstation. With equivalent computational resources, irregular grids with millions of cells may take minutes to render for a new viewpoint. The state of the art for graphics rendering, PixelFlow, provides screen and object space parallelism for polygonal rendering. Unfortunately, volume rendering of irregular data is at odds with the sort-last architecture. I investigate parallel algorithms for direct volume rendering on PixelFlow that generalize to other compositing architectures. Experiments are performed on the NASA Langley fighter dataset, using the projected tetrahedra approach of Shirley and Tuchman. Tetrahedral sorting is done by the circumscribing sphere approach of Cignoni et al. Key approaches include sort-first on sort-last, world space subdivision by clipping, rearrangeable linear compositing for any view angle, and static load balancing. The new world space subdivision by clipping provides for efficient and correct rendering of unstructured data by using object space clipping planes. Research results include performance estimates on PixelFlow for irregular grid volume rendering. PixelFlow is estimated to achieve 30 frames/second on irregular grids of 300,000 tetrahedra, or 10 million tetrahedra per second.
Parallel Occlusion Culling for Interactive Walkthroughs using Multiple GPUs
, 2002
Cited by 3 (0 self)
We present a new parallel occlusion culling algorithm for interactive display of large environments. It uses a cluster of three graphics processing units (GPUs) to compute an occlusion representation, cull away occluded objects and render the visible primitives. Moreover, our parallel architecture reverses the role of two of the GPUs between successive frames to lower the communication overhead. We have combined the occlusion culling algorithm with pre-computed levels-of-detail and use it for interactive display of geometric datasets. The resulting system has been implemented and applied to large environments composed of tens of millions of primitives. In practice, it is able to render such models at interactive rates with little loss in image fidelity. The performance of the overall occlusion culling algorithm is based on the growth curve of graphics hardware computational power, which has recently outperformed Moore's Law for general CPU power growth.
Parallel Volume Rendering Unstructured Data: A Distributed Approach
Cited by 1 (0 self)
The development of effective parallel rendering algorithms for unstructured volume data is challenging due to the irregular and adaptive nature of the corresponding meshes. Most of the algorithms developed previously have been mainly for shared-memory architectures. Only a distributed approach can better meet the computational and memory requirements of the rendering calculations. This paper presents a volume rendering algorithm that distributes both the data and the rendering process among the processors. At each processor, ray-casting of local data is performed independent of the other processors. The global image compositing processes, which require inter-processor communication, are overlapped with the local ray-casting processes to achieve better parallel efficiency. In theory, this algorithm should attain high parallel efficiency but its implementation on the Intel Paragon shows otherwise. Besides the added ray-casting overhead, a critical factor is the imbalanced load due to the highly adaptive nature of typical unstructured meshes and the selection of transfer functions. The causes, effects and possible cures of the imbalanced load are studied.
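The global image compositing step that this abstract refers to can be illustrated with the standard "over" operator. The sketch below is a generic, serial version and not the paper's method: it assumes each processor produces a partial image of premultiplied (color, alpha) samples, that the partial images can be ordered front to back for every pixel, and it uses a single grey channel for brevity. All names are hypothetical.

```python
def over(front, back):
    """Composite one premultiplied (color, alpha) sample over another."""
    cf, af = front
    cb, ab = back
    # The back sample only shows through the part the front leaves uncovered.
    return (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)


def composite_pixel(samples_front_to_back):
    """Fold a front-to-back list of samples into one final sample."""
    result = (0.0, 0.0)  # start fully transparent
    for sample in samples_front_to_back:
        result = over(result, sample)
    return result


def composite_images(partials_front_to_back):
    """Composite equally sized partial images, given in front-to-back order."""
    return [composite_pixel(samples) for samples in zip(*partials_front_to_back)]
```

In a distributed setting the same operator would be applied as partial images arrive over the network, which is the communication the abstract says is overlapped with local ray-casting.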
A Distributed Memory Algorithm for Volume Rendering
Cited by 1 (0 self)
Three-dimensional arrays of digital data representing spatial volumes are generated from such diverse fields as the geosciences, space exploration and astrophysics, medical imaging, computational fluid dynamics, molecular modeling, microelectronic field modeling and computer simulation. With current advances in imaging devices and high performance computing, more and more applications will generate volumetric data in the near future. This paper presents a new distributed memory algorithm for volume rendering in a message-passing environment. The algorithm, which uses a slab technique for data partitioning, is a hybrid between the ray-casting and cell projection approaches for volumetric rendering. The results of some scaling experiments using ParaSoft Express on an Intel Paragon at the University of South Carolina are also presented.
1 Introduction
Volume rendering is used to show the characteristics of the interior of a solid region as a 2D image. Several approaches to volume renderi...

