R.: From Mesh Generation to Scientific Visualization: An End-to-End Approach to Parallel Supercomputing
 In Supercomputing (SC), Proceedings of the ACM/IEEE Conference on (New
Processor Allocation on Cplant: Achieving General Processor Locality Using One-Dimensional Allocation Strategies
 In Proc. 4th IEEE International Conference on Cluster Computing
, 2002
Cited by 17 (5 self)
The Computational Plant or Cplant is a commodity-based supercomputer under development at Sandia National Laboratories. This paper describes resource-allocation strategies to achieve processor locality for parallel jobs in Cplant and other supercomputers. Users of Cplant and other Sandia supercomputers submit parallel jobs to a job queue. When a job is scheduled to run, it is assigned to a set of processors. To obtain maximum throughput, jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs.
Tensor product formulation for Hilbert space-filling curves
 In Proceedings of the 2003 International Conference on Parallel Processing
, 2003
Cited by 8 (6 self)
We present a tensor product formulation for Hilbert space-filling curves. Both recursive and iterative formulas are expressed in the paper. We view a Hilbert space-filling curve as a permutation which maps two-dimensional 2^n × 2^n data elements stored in row-major or column-major order to the order of traversing a Hilbert space-filling curve. The tensor product formula of Hilbert space-filling curves uses several permutation operations: stride permutation, radix-2 Gray permutation, transposition, and antidiagonal transposition. The iterative tensor product formula can be manipulated to obtain the inverse Hilbert permutation. Also, the formulas are directly translated into computer programs which can be used in various applications including R-tree indexing, image processing, and process allocation. Key words: tensor product, block recursive algorithm, Hilbert space-filling curve, stride
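The permutation view described in this abstract can be illustrated with a minimal sketch. It does not use the paper's tensor product formulation; it swaps in the widely known bit-manipulation conversion from curve position to grid coordinates, and the function names (`hilbert_d2xy`, `hilbert_permutation`, `invert_permutation`) are hypothetical. The curve orientation is one common convention and may differ from the paper's.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order x 2**order grid.

    Standard iterative conversion, NOT the tensor product formula from the paper.
    """
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or reflect the quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_permutation(order):
    """perm[d] is the row-major index of the element visited at step d of the curve."""
    n = 1 << order
    perm = []
    for d in range(n * n):
        x, y = hilbert_d2xy(order, d)
        perm.append(y * n + x)
    return perm


def invert_permutation(perm):
    """inv[i] is the curve position of the element stored at row-major index i."""
    inv = [0] * len(perm)
    for d, idx in enumerate(perm):
        inv[idx] = d
    return inv
```

For a 2 × 2 grid, `hilbert_permutation(1)` returns `[0, 2, 3, 1]`, and `invert_permutation` applied to that list returns `[0, 3, 1, 2]`.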
Partitioning and Dynamic Load Balancing for the Numerical Solution of Partial Differential Equations
 NUMERICAL SOLUTION OF PARTIAL DIFFERENTIAL EQUATIONS ON PARALLEL COMPUTERS
, 2005
Load-Balancing Spatially Located Computations using Rectangular Partitions
 ARXIV
, 2011
Cited by 4 (3 self)
Distributing spatially located heterogeneous workloads is an important problem in parallel scientific computing. We investigate the problem of partitioning such workloads (represented as a matrix of non-negative integers) into rectangles, such that the load of the most loaded rectangle (processor) is minimized. Since finding the optimal arbitrary rectangle-based partition is an NP-hard problem, we investigate particular classes of solutions: rectilinear, jagged and hierarchical. We present a new class of solutions called m-way jagged partitions, propose new optimal algorithms for m-way jagged partitions and hierarchical partitions, propose new heuristic algorithms, and provide worst-case performance analyses for some existing and new heuristics. Moreover, the algorithms are tested in simulation on a wide set of instances. Results show that two of the algorithms we introduce lead to a much better load balance than the state-of-the-art algorithms. We also show how to design a two-phase algorithm that reaches different time/quality tradeoffs.
Inverse Space-filling Curve Partitioning of a Global Ocean Model
 IEEE International Parallel & Distributed Processing Symposium, 26-30 March 2007, pp. 1-10
Cited by 4 (1 self)
In this paper, we describe how inverse space-filling curve partitioning is used to increase the simulation rate of a global ocean model. Space-filling curve partitioning allows for the elimination of load imbalance in the computational grid due to land points. Improved load balance combined with code modifications within the conjugate gradient solver significantly increases the simulation rate of the Parallel Ocean Program at high resolution. The simulation rate for a high-resolution model nearly doubled from 4.0 to 7.9 simulated years per day on 28,972 IBM Blue Gene/L processors. We also demonstrate that our techniques increase the simulation rate on 7545 Cray XT3 processors from 6.3 to 8.1 simulated years per day. Our results demonstrate how minor code modifications can have significant impact on resulting performance for very large processor counts.
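The idea of removing land points before splitting the grid along a curve can be sketched as follows. This is an illustration only, not the paper's method: it assumes a square grid of blocks, uses a Z-order (Morton) curve as a simple stand-in for whichever space-filling curve the paper uses, and the names `morton_key` and `sfc_partition` are hypothetical.

```python
def morton_key(x, y, bits):
    """Interleave the bits of x and y to get a position on a Z-order (Morton) curve."""
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (2 * b)
        key |= ((y >> b) & 1) << (2 * b + 1)
    return key


def sfc_partition(ocean_mask, num_parts):
    """Assign ocean blocks to parts as contiguous runs along the curve.

    ocean_mask[y][x] is True for blocks that contain ocean (assumed square grid).
    Land blocks are dropped before splitting, so they never count toward any
    part's load. Returns a dict mapping (x, y) to a part number.
    """
    size = len(ocean_mask)
    bits = max(1, (size - 1).bit_length())
    ocean = [(x, y) for y in range(size) for x in range(size) if ocean_mask[y][x]]
    ocean.sort(key=lambda p: morton_key(p[0], p[1], bits))
    owner = {}
    for i, block in enumerate(ocean):
        # Integer arithmetic keeps part sizes within one block of each other.
        owner[block] = i * num_parts // len(ocean)
    return owner
```

Because the split is applied to the one-dimensional sequence of ocean blocks only, each part receives nearly the same number of active blocks regardless of how land is distributed in the grid.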
Algorithmic Support for Commodity-Based Parallel Computing Systems
, 2003
Cited by 3 (3 self)
The Computational Plant or Cplant is a commodity-based distributed-memory supercomputer under development at Sandia National Laboratories.
Efficient parallel algorithms for solvent accessible surface area of proteins
 IEEE Trans. Parallel Dist. Syst
Cited by 2 (1 self)
We present faster sequential and parallel algorithms for computing the solvent accessible surface area (ASA) of protein molecules. The ASA is computed by finding the exposed surface areas of the spheres obtained by increasing the van der Waals' radii of the atoms with the van der Waals' radius of the solvent. Using domain-specific knowledge, we show that the number of sphere intersections is only O(n), where n is the number of atoms in the protein molecule. For computing sphere intersections, we present hash-based algorithms that run in O(n) expected sequential time and O(n/p) expected parallel time, and sort-based algorithms that run in worst-case O(n log n) sequential time and O((n log n)/p) parallel time. These are significant improvements over previously known algorithms, which take O(n^2) time sequentially and O(n^2/p) time in parallel. We present a Monte Carlo algorithm for computing the solvent accessible surface area. The basic idea is to generate points uniformly at random on the surface of spheres obtained by increasing the van der Waals' radii of the atoms with the van der Waals' radius of the solvent molecule and to test the points for accessibility. We also provide error bounds as a function of the sample size. Experimental verification of the algorithms is carried out using an IBM SP2.
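The Monte Carlo idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: atoms are given as hypothetical `(x, y, z, radius)` tuples, accessibility is tested by brute force against every other atom rather than with the hash-based or sort-based neighbor search the paper describes, and the function names and default sample count are made up for the example.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def monte_carlo_asa(atoms, probe_radius, samples_per_atom=2000, seed=0):
    """Estimate the total solvent accessible surface area.

    atoms: list of (x, y, z, van_der_waals_radius) tuples (assumed input format).
    Each sphere is expanded by probe_radius; a sample point is accessible if it
    lies strictly inside no other expanded sphere.
    """
    rng = random.Random(seed)
    expanded = [(x, y, z, r + probe_radius) for x, y, z, r in atoms]
    total = 0.0
    for i, (cx, cy, cz, big_r) in enumerate(expanded):
        exposed = 0
        for _ in range(samples_per_atom):
            ux, uy, uz = random_unit_vector(rng)
            px, py, pz = cx + big_r * ux, cy + big_r * uy, cz + big_r * uz
            # Brute-force accessibility test against all other expanded spheres.
            buried = any(
                (px - ox) ** 2 + (py - oy) ** 2 + (pz - oz) ** 2 < o_r * o_r
                for j, (ox, oy, oz, o_r) in enumerate(expanded)
                if j != i
            )
            if not buried:
                exposed += 1
        # Exposed fraction of samples times the full area of the expanded sphere.
        total += 4.0 * math.pi * big_r * big_r * exposed / samples_per_atom
    return total
```

For a single isolated atom every sample is accessible, so the estimate equals the full area of the expanded sphere; for overlapping atoms the estimate carries sampling error that shrinks as `samples_per_atom` grows.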
From physical model to scientific understanding: An end-to-end approach to parallel supercomputing. Working paper