Results 1-10 of 16
Dynamic octree load balancing using space-filling curves
, 2003
Cited by 17 (6 self)
The Zoltan dynamic load balancing library provides applications with a reusable object oriented interface to several load balancing techniques, including coordinate bisection, octree/space filling curve methods, and multilevel graph partitioners. We describe enhancements to Zoltan’s octree load balancing procedure and its distributed structures that improve performance of the space filling curve (SFC) traversals by
Processor Allocation on Cplant: Achieving General Processor Locality Using One-Dimensional Allocation Strategies
 In Proc. 4th IEEE International Conference on Cluster Computing
, 2002
Cited by 10 (2 self)
The Computational Plant or Cplant is a commodity-based supercomputer under development at Sandia National Laboratories. This paper describes resource-allocation strategies to achieve processor locality for parallel jobs in Cplant and other supercomputers. Users of Cplant and other Sandia supercomputers submit parallel jobs to a job queue. When a job is scheduled to run, it is assigned to a set of processors. To obtain maximum throughput, jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs.
Tensor product formulation for Hilbert space-filling curves
 In Proceedings of the 2003 International Conference on Parallel Processing
, 2003
Cited by 6 (6 self)
We present a tensor product formulation for Hilbert space-filling curves. Both recursive and iterative formulas are expressed in the paper. We view a Hilbert space-filling curve as a permutation which maps two-dimensional $2^n \times 2^n$ data elements stored in row-major or column-major order to the order of traversing a Hilbert space-filling curve. The tensor product formula of Hilbert space-filling curves uses several permutation operations: stride permutation, radix-2 Gray permutation, transposition, and anti-diagonal transposition. The iterative tensor product formula can be manipulated to obtain the inverse Hilbert permutation. Also, the formulas are directly translated into computer programs which can be used in various applications including R-tree indexing, image processing, and process allocation. Key words: tensor product, block recursive algorithm, Hilbert space-filling curve, stride
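To make the "curve as a permutation" view in this abstract concrete, the following is a minimal Python sketch. It does not reproduce the paper's tensor product formulas; it swaps in the widely used bit-manipulation conversion from grid coordinates to Hilbert position, and it assumes row-major storage with index `y * n + x`. The function names (`hilbert_index`, `hilbert_permutation`, `inverse_permutation`) are hypothetical and chosen only for illustration.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a Hilbert curve on a 2^order x 2^order grid.

    Standard iterative bit-manipulation conversion, not the tensor product
    formulation described in the abstract.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hilbert_permutation(order):
    """perm[d] is the row-major index (y * n + x) of the d-th cell on the curve."""
    n = 1 << order
    perm = [0] * (n * n)
    for y in range(n):
        for x in range(n):
            perm[hilbert_index(order, x, y)] = y * n + x
    return perm


def inverse_permutation(perm):
    """inv[row_major_index] is the position of that cell along the curve."""
    inv = [0] * len(perm)
    for position, source in enumerate(perm):
        inv[source] = position
    return inv


if __name__ == "__main__":
    print(hilbert_permutation(1))  # [0, 2, 3, 1]
```

Reordering a row-major array `data` into Hilbert order is then `[data[i] for i in hilbert_permutation(order)]`, and the inverse permutation maps it back.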
Partitioning and Dynamic Load Balancing for the Numerical Solution of Partial Differential Equations
 Numerical Solution of Partial Differential Equations on Parallel Computers
, 2005
Cited by 5 (1 self)
lement methods, have workloads that are unpredictable or change during the computation, requiring dynamic load balancers that adjust the decomposition as the computation proceeds. Partitioning approaches attempt to distribute computational work equally, while minimizing interprocessor communication costs. Communication costs are governed by the amount of data to be shared by cooperating processes (communication volume) and the number of partitions sharing the data (number of messages). Dynamic load-balancing procedures should also operate in parallel on distributed data, execute quickly, and minimize data movement by making the new data distribution as similar as possible to the existing one. The partitioning problem is defined in more detail in Section 1. Numerous partitioning strategies have been developed. The various strategies are distinguished by tradeoffs between partition quality, amount of data movement, and partitioning speed. Characteristics of an application (e.g., computat
Algorithmic Support for Commodity-Based Parallel Computing Systems
, 2003
Cited by 2 (2 self)
The Computational Plant or Cplant is a commodity-based distributed-memory supercomputer under development at Sandia National Laboratories.
Efficient parallel algorithms for solvent accessible surface area of proteins
 IEEE Trans. Parallel Dist. Syst
Cited by 2 (1 self)
We present faster sequential and parallel algorithms for computing the solvent accessible surface area (ASA) of protein molecules. The ASA is computed by finding the exposed surface areas of the spheres obtained by increasing the van der Waals' radii of the atoms by the van der Waals' radius of the solvent. Using domain-specific knowledge, we show that the number of sphere intersections is only $O(n)$, where $n$ is the number of atoms in the protein molecule. For computing sphere intersections, we present hash-based algorithms that run in $O(n)$ expected sequential time and $O(n/p)$ expected parallel time, and sort-based algorithms that run in worst-case $O(n \log n)$ sequential time and $O((n \log n)/p)$ parallel time. These are significant improvements over previously known algorithms, which take $O(n^2)$ time sequentially and $O(n^2/p)$ time in parallel. We present a Monte Carlo algorithm for computing the solvent accessible surface area. The basic idea is to generate points uniformly at random on the surface of spheres obtained by increasing the van der Waals' radii of the atoms by the van der Waals' radius of the solvent molecule and to test the points for accessibility. We also provide error bounds as a function of the sample size. Experimental verification of the algorithms is carried out using an IBM SP2.
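The Monte Carlo idea in this abstract (sample points on each expanded sphere and test them for accessibility) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' algorithm: neighbours are found by brute force in quadratic time rather than with the hash-based or sort-based methods the abstract describes, the default probe radius of 1.4 (the conventional value for water, in ångströms) is an assumption, and the names `monte_carlo_asa` and `random_unit_vector` are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform random direction: normalise a 3-D Gaussian sample."""
    while True:
        x, y, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return x / norm, y / norm, z / norm


def monte_carlo_asa(atoms, probe_radius=1.4, samples=2000, seed=0):
    """Estimate the accessible surface area of atoms given as (x, y, z, vdw_radius).

    probe_radius=1.4 is an assumed solvent radius, not a value from the abstract.
    """
    rng = random.Random(seed)
    # Expand every atom by the solvent radius.
    spheres = [(x, y, z, r + probe_radius) for x, y, z, r in atoms]
    total = 0.0
    for i, (cx, cy, cz, cr) in enumerate(spheres):
        # Brute-force neighbour search (assumption: stands in for the faster
        # hash-based or sort-based intersection search).
        neighbours = [
            (x, y, z, r)
            for j, (x, y, z, r) in enumerate(spheres)
            if j != i and math.dist((cx, cy, cz), (x, y, z)) < cr + r
        ]
        exposed = 0
        for _ in range(samples):
            ux, uy, uz = random_unit_vector(rng)
            px, py, pz = cx + cr * ux, cy + cr * uy, cz + cr * uz
            # A point is accessible if it lies inside no neighbouring sphere.
            if all((px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 >= r * r
                   for x, y, z, r in neighbours):
                exposed += 1
        total += 4.0 * math.pi * cr * cr * exposed / samples
    return total
```

For an isolated atom every sample is accessible, so the estimate equals the full area of the expanded sphere; for overlapping atoms the estimate fluctuates with the sample size, which is where error bounds of the kind mentioned in the abstract become relevant.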
Load-Balancing Spatially Located Computations using Rectangular Partitions
 ARXIV
, 2011
Cited by 2 (1 self)
Distributing spatially located heterogeneous workloads is an important problem in parallel scientific computing. We investigate the problem of partitioning such workloads (represented as a matrix of nonnegative integers) into rectangles, such that the load of the most loaded rectangle (processor) is minimized. Since finding the optimal arbitrary rectangle-based partition is an NP-hard problem, we investigate particular classes of solutions: rectilinear, jagged and hierarchical. We present a new class of solutions called m-way jagged partitions, propose new optimal algorithms for m-way jagged partitions and hierarchical partitions, propose new heuristic algorithms, and provide worst-case performance analyses for some existing and new heuristics. Moreover, the algorithms are tested in simulation on a wide set of instances. Results show that two of the algorithms we introduce lead to a much better load balance than the state-of-the-art algorithms. We also show how to design a two-phase algorithm that reaches different time/quality tradeoffs.
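The objective stated in this abstract, the load of the most loaded rectangle, is cheap to evaluate once a 2-D prefix sum of the workload matrix is available. The Python sketch below shows this for a rectilinear partition only (rows and columns cut by shared lines). It is a minimal illustration, not one of the paper's algorithms, and the function names and cut representation are assumptions made for the example.

```python
def prefix_sums(load):
    """ps[i][j] is the total load of rows 0..i-1 and columns 0..j-1."""
    rows, cols = len(load), len(load[0])
    ps = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            ps[i + 1][j + 1] = load[i][j] + ps[i][j + 1] + ps[i + 1][j] - ps[i][j]
    return ps


def rect_load(ps, r0, r1, c0, c1):
    """Load of the rectangle covering rows r0..r1-1 and columns c0..c1-1."""
    return ps[r1][c1] - ps[r0][c1] - ps[r1][c0] + ps[r0][c0]


def rectilinear_max_load(load, row_cuts, col_cuts):
    """Maximum rectangle load for a rectilinear partition.

    row_cuts and col_cuts are sorted interior cut positions (hypothetical
    representation): a cut at k separates index k-1 from index k.
    """
    ps = prefix_sums(load)
    row_bounds = [0] + list(row_cuts) + [len(load)]
    col_bounds = [0] + list(col_cuts) + [len(load[0])]
    return max(
        rect_load(ps, row_bounds[i], row_bounds[i + 1], col_bounds[j], col_bounds[j + 1])
        for i in range(len(row_bounds) - 1)
        for j in range(len(col_bounds) - 1)
    )


if __name__ == "__main__":
    workload = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(rectilinear_max_load(workload, [2], [2]))  # 54
```

Each rectangle query takes constant time after the prefix sums are built, so a search over candidate cuts can compare many partitions without rescanning the matrix.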
Mapping with Space Filling Surfaces
, 2006
Cited by 1 (0 self)
The use of space filling curves for proximity-improving mappings is well known and has found many useful applications in parallel computing. Such curves permit a linear array to be mapped onto a 2-D (respectively, 3-D) structure such that points distance $d$ apart in the linear array are distance $O(d^{1/2})$ (respectively, $O(d^{1/3})$) apart in the 2-D (3-D) array and vice versa. We extend the concept of space filling curves to space filling surfaces and show how these surfaces lead to mappings from 2-D to 3-D so that points at distance $d^{1/2}$ on the 2-D surface are mapped to points at distance $O(d^{1/3})$ in the 3-D volume. Three classes of surfaces, associated respectively with the Peano curve, Sierpiński carpet, and the Hilbert curve, are presented. A methodology for using these surfaces to map from 2-D to 3-D is developed. These results permit efficient execution of 2-D computations on processors interconnected in a 3-D grid. The space filling surfaces proposed by us are the first such fractal objects to be formally defined and are thus also of intrinsic interest in the context of fractal geometry. Index terms: Fractals, Hilbert curve, proximity-improving mapping, parallel computing, Peano curve, Sierpiński carpet, space filling curves, space filling surfaces.
From Mesh Generation to Scientific Visualization: An End-to-End Approach to Parallel Supercomputing
Parallel supercomputing has traditionally focused on the inner kernel of scientific simulations: the solver. The front and back ends of the simulation pipeline (problem description and interpretation of the output) have taken a back seat to the solver when it comes to attention paid to scalability and performance, and are often relegated to offline, sequential computation. As the largest simulations move beyond the realm of the terascale and into the petascale, this decomposition in tasks and platforms becomes increasingly untenable. We propose an end-to-end approach in which all simulation components (meshing, partitioning, solver, and visualization) are tightly coupled and execute
From Mesh Generation to Scientific Visualization:
 in SC2006
, 2006
Parallel supercomputing has typically focused on the inner kernel of scientific simulations: the solver. The front and back ends of the simulation pipeline (problem description and interpretation of the output) have taken a back seat to the solver when it comes to attention paid to scalability and performance, and are often relegated to offline, sequential computation. As the largest simulations move beyond the realm of the terascale and into the petascale, this decomposition in tasks and platforms becomes increasingly untenable. We propose an end-to-end approach in which all simulation components (meshing, partitioning, solver, and visualization) are tightly coupled and execute in parallel with shared data structures and no intermediate I/O. We present our implementation of this new approach in the context of octree-based finite element simulation of earthquake ground motion. Performance evaluation on up to 2048 processors demonstrates the ability of the end-to-end approach to overcome the scalability bottlenecks of the traditional approach.