Results 1 - 10 of 4,847
The SPLASH-2 programs: Characterization and methodological considerations
International Symposium on Computer Architecture, 1995
Cited by 1399 (12 self)
The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important to understand them well. The properties we study include the computational load balance, communication to computation ratio and traffic needs, important working set sizes, and issues related to spatial locality, as well as how these properties ...
The Cost of Balancing Generalized Quadtrees
In Proceedings of the 3rd Symposium on Solid Modeling and Applications, 1995
Cited by 23 (0 self)
A balanced quadtree has no adjacent elements of vastly different size. Refining a quadtree to balance it is a preliminary step in many finite element, mesh generation and computer graphics rendering algorithms. A cost of balancing is the creation of a somewhat larger quadtree. The paper considers se ...
A survey of general-purpose computation on graphics hardware
2007
Cited by 545 (18 self)
The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, have made graphics hardware a compelling platform for computationally demanding tasks in a wide variety of application domains. In this report, we describe, summarize, and analyze the latest research in mapping general-purpose computation to graphics hardware. We begin with the technical motivations that underlie general-purpose computation on graphics processors (GPGPU) and describe the hardware and software developments that have led to the recent interest in this field. We then aim the main body of this report at two separate audiences. First, we describe the techniques used in mapping general-purpose computation to graphics hardware. We believe these techniques will be generally useful for researchers who plan to develop the next generation of GPGPU algorithms and techniques. Second, we survey and categorize the latest developments in general-purpose application development on graphics hardware.
Fast volume rendering using a shear-warp factorization of the viewing transformation
1995
Cited by 541 (2 self)
Volume rendering is a technique for visualizing 3D arrays of sampled data. It has applications in areas such as medical imaging and scientific visualization, but its use has been limited by its high computational expense. Early implementations of volume rendering used brute-force techniques that require on the order of 100 seconds to render typical data sets on a workstation. Algorithms with optimizations that exploit coherence in the data have reduced rendering times to the range of ten seconds but are still not fast enough for interactive visualization applications. In this thesis we present a family of volume rendering algorithms that reduces rendering times to one second. First we present a scanline-order volume rendering algorithm that exploits coherence in both the volume data and the image. We show that scanline-order algorithms are fundamentally more efficient than commonly-used ray casting algorithms because the latter must perform analytic geometry calculations (e.g. intersecting rays with axis-aligned boxes). The new scanline-order algorithm simply streams through the volume and the image in storage order. We describe variants of the algorithm for both parallel and perspective projections and ...
Image retrieval: Current techniques, promising directions and open issues
Journal of Visual Communication and Image Representation, 1999
Cited by 492 (14 self)
This paper provides a comprehensive survey of the technical achievements in the research area of image retrieval, especially content-based image retrieval, an area that has been so active and prosperous in the past few years. The survey includes 100+ papers covering the research aspects of image feature representation and extraction, multidimensional indexing, and system design, three of the fundamental bases of content-based image retrieval. Furthermore, based on the state-of-the-art technology available now and the demand from real-world applications, open research issues are identified and future promising research directions are suggested. © 1999 Academic Press
The implementation of the Cilk-5 multithreaded language
In PLDI ’98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation, 1998
Cited by 493 (30 self)
The fifth release of the multithreaded language Cilk uses a provably good "work-stealing" scheduling algorithm similar to the first system, but the language has been completely redesigned and the runtime system completely reengineered. The efficiency of the new implementation was aided by a clear strategy that arose from a theoretical analysis of the scheduling algorithm: concentrate on minimizing overheads that contribute to the work, even at the expense of overheads that contribute to the critical path. Although it may seem counterintuitive to move overheads onto the critical path, this "work-first" principle has led to a portable Cilk-5 implementation in which the typical cost of spawning a parallel thread is only between 2 and 6 times the cost of a C function call on a variety of contemporary machines. Many Cilk programs run on one processor with virtually no degradation compared to equivalent C programs. This paper describes how the work-first principle was exploited in the design of Cilk-5's compiler and its runtime system. In particular, we present Cilk-5's novel "two-clone" compilation strategy and its Dijkstra-like mutual-exclusion protocol for implementing the ready deque in the work-stealing scheduler.
Algorithms for coloring quadtrees
Algorithmica, 2002
Cited by 6 (0 self)
We describe simple linear time algorithms for coloring the squares of balanced and unbalanced quadtrees so that no two adjacent squares are given the same color. If squares sharing sides are defined as adjacent, we color balanced quadtrees with three colors, and unbalanced quadtrees with four colors ...
Searching in metric spaces
2001
Cited by 432 (38 self)
The problem of searching the elements of a set that are close to a given query element under some similarity criterion has a vast number of applications in many branches of computer science, from pattern recognition to textual and multimedia information retrieval. We are interested in the rather general case where the similarity criterion defines a metric space, instead of the more restricted case of a vector space. Many solutions have been proposed in different areas, in many cases without cross-knowledge. Because of this, the same ideas have been reconceived several times, and very different presentations have been given for the same approaches. We present some basic results that explain the intrinsic difficulty of the search problem. This includes a quantitative definition of the elusive concept of “intrinsic dimensionality.” We also present a unified ...
A rapid hierarchical radiosity algorithm
Computer Graphics, 1991
Cited by 412 (11 self)
This paper presents a rapid hierarchical radiosity algorithm for illuminating scenes containing large polygonal patches. The algorithm constructs a hierarchical representation of the form factor matrix by adaptively subdividing patches into subpatches according to a user-supplied error bound. The algorithm guarantees that all form factors are calculated to the same precision, removing many common image artifacts due to inaccurate form factors. More importantly, the algorithm decomposes the form factor matrix into at most O(n) blocks (where n is the number of elements). Previous radiosity algorithms represented the element-to-element transport interactions with n^2 form factors. Visibility algorithms are given that work well with this approach. Standard techniques for shooting and gathering can be used with the hierarchical representation to solve for equilibrium radiosities, but we also discuss using a brightness-weighted error criteria, in conjunction with multigridding, to even more rapidly progressively refine the image.