Results 1 - 3 of 3
Multi-Level Texture Caching for 3D Graphics Hardware
- IN PROCEEDINGS OF THE 25TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 1998
Cited by 23 (0 self)
Traditional graphics hardware architectures implement what we call the push architecture for texture mapping. Local memory is dedicated to the accelerator for fast local retrieval of texture during rasterization, and the application is responsible for managing this memory. The push architecture has a bandwidth advantage, but it has the disadvantages of limited texture capacity, escalation of accelerator memory requirements (and therefore cost), and poor memory utilization. The push architecture also requires the programmer to solve the bin-packing problem of managing accelerator memory each frame. More recently, graphics hardware on PC-class machines has moved to an implementation of what we call the pull architecture. Texture is stored in system memory and downloaded by the accelerator as needed. The pull architecture offers greater texture capacity, stems the escalation of accelerator memory requirements, and has good memory utilization. It also frees the programmer from accelerator texture memory management. However, the pull architecture suffers escalating requirements for bandwidth from main memory to the accelerator. In this paper we propose multi-level texture caching to provide the accelerator with the bandwidth advantages of the push architecture combined with the capacity advantages of the pull architecture. We have studied the feasibility of 2-level caching and found the following: (1) significant re-use of texture between frames; (2) L2 caching requires significantly less memory than the push architecture; (3) L2 caching requires significantly less bandwidth from host memory than the pull architecture; (4) L2 caching enables implementation of smaller L1 caches that would otherwise bandwidth-limit accelerators on the workloads in this paper. Results suggest that an L2 ...
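The interplay between inter-frame texture re-use and an L2 cache, as described in the abstract above, can be illustrated with a toy model. The following is a minimal sketch, not the paper's design: the class names, the LRU replacement policy, the block granularity, and the cache sizes are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity set of texture block IDs with LRU eviction (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, block_id):
        """Return True on a hit and mark the block as most recently used."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)
            return True
        return False

    def insert(self, block_id):
        """Add a block, evicting the least recently used one when over capacity."""
        self.blocks[block_id] = True
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


class TwoLevelTextureCache:
    """Hypothetical accelerator-side hierarchy: small L1, larger L2, then host memory."""

    def __init__(self, l1_blocks, l2_blocks):
        self.l1 = LRUCache(l1_blocks)
        self.l2 = LRUCache(l2_blocks)
        self.host_fetches = 0  # blocks pulled over the host-to-accelerator bus

    def fetch(self, block_id):
        """Return the level that served the block: 'L1', 'L2', or 'host'."""
        if self.l1.lookup(block_id):
            return "L1"
        if self.l2.lookup(block_id):
            self.l1.insert(block_id)
            return "L2"
        self.host_fetches += 1
        self.l2.insert(block_id)
        self.l1.insert(block_id)
        return "host"


def render_frames(cache, frame, count):
    """Replay the same block access sequence `count` times; return host fetches per frame."""
    per_frame = []
    for _ in range(count):
        before = cache.host_fetches
        for block_id in frame:
            cache.fetch(block_id)
        per_frame.append(cache.host_fetches - before)
    return per_frame
```

In this toy model, replaying a 12-block frame twice through a 4-block L1 backed by a 16-block L2 pulls 12 blocks from the host on the first frame and none on the second. With the L2 capacity set to 0 (an L1-only pull configuration), both frames pull all 12 blocks. Real workloads, block sizes, and replacement policies will differ.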
Hardware for Superior Texture Performance
, 1996
Cited by 13 (1 self)
Mapping textures onto surfaces of computer-generated objects is a technique which greatly improves the realism of their appearance. Unfortunately, this imposes high computational demands and, even worse, tremendous memory bandwidth requirements on the graphics system. Tight cost constraints in the industry in conjunction with ever increasing user expectations make the design of a powerful texture mapping unit a difficult task. To meet these requirements we follow two different approaches. On the technology side, we observe a rapidly emerging technology which offers the combination of enormous transfer rates and computing power: logic-embedded memories. On the algorithmic side, a common way to reduce data traffic is image compression. Its application to texture mapping, however, is difficult since the decompression must be done at pixel frequency. In this work we will focus on the latter approach, describing the use of a specific compression scheme for texture mapping. It allows the use of very simple and fast decompression hardware, bringing high performance texture mapping to low-cost systems.
The Setup for Triangle Rasterization
, 1996
Cited by 6 (0 self)
Integrating the slope and setup calculations for triangles into the rasterizer offloads intensive calculations from the host processor and can significantly increase 3D system performance. The processing on the host is greatly reduced and much less data is passed from the host to the graphics subsystem. A setup architecture handling generalized triangle meshes and computing all necessary parameters for a high-end raster pipeline to generate Gouraud shaded, texture- and bump-mapped triangles is described and its benefits on the final bandwidth are shown. To efficiently compute the slopes and color gradients for each triangle, some implementation aspects on division and multiplication pipelines are discussed.
Anders Kugler, Universität Tübingen, Wilhelm-Schickard-Institut für Informatik, Graphisch-Interaktive Systeme (Computer Graphics Laboratory), Auf der Morgenstelle 10, D-72076 Tübingen, Germany. Email: kugler@gris.uni-t...
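As an illustration of the kind of per-triangle slope computation the abstract refers to, the sketch below derives the screen-space gradients of one linearly interpolated attribute (for example, a Gouraud color channel) from its three vertex values, using the standard plane-equation formulation. It is not taken from the paper: the function name and argument layout are hypothetical, and the paper's actual pipeline organization is not reproduced here.

```python
def attribute_gradients(v0, v1, v2):
    """Return (dc/dx, dc/dy) for an attribute c interpolated linearly over a triangle.

    Each vertex is a tuple (x, y, c) in screen space. Returns None for a
    degenerate (zero-area) triangle, which has no well-defined gradient.
    """
    x0, y0, c0 = v0
    x1, y1, c1 = v1
    x2, y2, c2 = v2

    # Edge vectors and attribute deltas relative to vertex 0.
    dx1, dy1, dc1 = x1 - x0, y1 - y0, c1 - c0
    dx2, dy2, dc2 = x2 - x0, y2 - y0, c2 - c0

    # Twice the signed triangle area; the same value serves every attribute.
    area2 = dx1 * dy2 - dx2 * dy1
    if area2 == 0:
        return None

    # One reciprocal per triangle, then only multiplications per attribute.
    inv_area2 = 1.0 / area2
    dcdx = (dc1 * dy2 - dc2 * dy1) * inv_area2
    dcdy = (dc2 * dx1 - dc1 * dx2) * inv_area2
    return dcdx, dcdy
```

Because the reciprocal of the doubled area is shared by all attributes of a triangle (colors, depth, texture coordinates), a hardware setup unit could plausibly amortize one division across many multiplications, which is one reason division and multiplication pipelines matter in this stage.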

