Task Pool Teams for Implementing Irregular Algorithms on Clusters of SMPs (2002)

by J Hippold, G Rünger

Results 1 - 10 of 11

Clément-Type Interpolation on Spherical Domains -- Interpolation Error Estimates and Application to a Posteriori Error Estimates

by Thomas Apel, Cornelia Pester, 2003
Cited by 15 (1 self)
In this paper, a mixed boundary value problem for the Laplace-Beltrami operator is considered for spherical domains in R³, i.e. for domains on the unit sphere. These domains are parametrized by spherical coordinates (φ, θ), such that functions on the unit sphere are considered as functions in these coordinates. Careful investigation leads to the introduction of a proper finite element space corresponding to an isotropic triangulation of the underlying domain on the unit sphere. Error estimates are proven for a Clément-type interpolation operator, where appropriate weighted norms are used. The estimates are applied to the deduction of a reliable and efficient residual error estimator for the Laplace-Beltrami operator.

Transformation of Hexahedral Finite Element Meshes Into Tetrahedral Meshes . . .

by Thomas Apel, Nico Düvelmeyer, 2003
Cited by 11 (0 self)
The paper is concerned with algorithms for transforming hexahedral finite element meshes into tetrahedral meshes without introducing new nodes. Known algorithms use only the topological structure of the hexahedral mesh but no geometry information. The paper provides another algorithm, which is then extended such that quality criteria for the splitting of faces are respected.

Parallel Order Reduction via Balanced Truncation for . . .

by José M. Badía, Peter Benner, Rafael Mayo, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, Jens Saak, 2005
Cited by 9 (3 self)
We employ two efficient parallel approaches to reduce a model arising from a semi-discretization of a controlled heat transfer process for optimal cooling of a steel profile. Both algorithms are based on balanced truncation but differ in the numerical method that is used to solve two dual generalized Lyapunov equations, which is the major computational task. Experimental results on a cluster of Intel Xeon processors compare the efficacy of the parallel model reduction algorithms.

The inf-sup condition for the Bernardi-Fortin-Raugel element on anisotropic meshes

by Thomas Apel, Serge Nicaise, 2003
Cited by 9 (0 self)
On a large class of two-dimensional anisotropic meshes, the inf-sup condition (stability) is proved for the triangular and quadrilateral finite element pairs suggested by Bernardi/Raugel and Fortin. As a consequence the pairs P 2 P 0 , Q 2 P 0 , and Q 2 P 0 turn out to be stable independent of the aspect ratio of the elements.

The Robustness Of The Hierarchical A Posteriori Error Estimator For Reaction-Diffusion Equation On Anisotropic Meshes

by Serguei Grosman, 2004
Cited by 9 (0 self)
Singularly perturbed reaction-diffusion problems exhibit in general solutions with anisotropic features, e.g. strong boundary and/or interior layers. This anisotropy is reflected in the discretization by using meshes with anisotropic elements. The quality of the numerical solution rests on the robustness of the a posteriori error estimator with respect to both the perturbation parameters of the problem and the anisotropy of the mesh. The simplest

A Comparison of Task Pools for Dynamic Load Balancing of Irregular Algorithms

by Matthias Korch, et al., 2004
Cited by 9 (0 self)
Since a static work distribution does not allow for satisfactory speed-ups of parallel irregular algorithms, there is a need for a dynamic distribution of work and data that can be adapted to the runtime behavior of the algorithm. Task pools are data structures which can distribute tasks dynamically to different processors, where each task specifies computations to be performed and provides the data for these computations. This paper discusses the characteristics of task-based algorithms and describes the implementation of selected types of task pools for shared-memory multiprocessors. Several task pools have been implemented in C with POSIX threads and in Java. The task pools differ in the data structures used to store the tasks, the mechanism to achieve load balance, and the memory manager used to store the tasks. Runtime experiments have been performed on three different shared-memory systems using a synthetic algorithm, the hierarchical radiosity method, and a volume rendering algorithm.
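
The task-pool idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (theirs is in C with POSIX threads and in Java); it assumes the simplest variant, a single central queue protected by one lock, and all names (`CentralTaskPool`, `sum_range`, `parallel_sum`) are hypothetical. Each task carries a function and its data, and a running task may put new tasks into the pool, which is what makes the workload irregular.

```python
import threading
from collections import deque


class CentralTaskPool:
    """Hypothetical central task pool: one shared queue guarded by one lock."""

    def __init__(self):
        self._tasks = deque()
        self._cond = threading.Condition()
        self._active = 0  # number of tasks currently being executed

    def put(self, func, data):
        """Add a task; func is later called as func(pool, data)."""
        with self._cond:
            self._tasks.append((func, data))
            self._cond.notify()

    def _get(self):
        """Return the next task, or None once no work is left or in flight."""
        with self._cond:
            while not self._tasks:
                if self._active == 0:
                    # Queue empty and nobody can create new tasks: terminate.
                    self._cond.notify_all()
                    return None
                self._cond.wait()
            self._active += 1
            return self._tasks.popleft()

    def _done(self):
        with self._cond:
            self._active -= 1
            if self._active == 0 and not self._tasks:
                self._cond.notify_all()  # wake idle workers so they can exit

    def run(self, num_workers):
        """Process tasks with num_workers threads until the pool drains."""
        def worker():
            while True:
                task = self._get()
                if task is None:
                    return
                func, data = task
                try:
                    func(self, data)
                finally:
                    self._done()

        threads = [threading.Thread(target=worker) for _ in range(num_workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


def sum_range(pool, data):
    """Example task: sum the integers in [lo, hi), splitting unevenly."""
    lo, hi, results, lock = data
    if hi - lo <= 4:
        with lock:
            results.append(sum(range(lo, hi)))
    else:
        mid = lo + (hi - lo) // 3  # deliberately unbalanced split
        pool.put(sum_range, (lo, mid, results, lock))
        pool.put(sum_range, (mid, hi, results, lock))


def parallel_sum(n, num_workers=4):
    pool = CentralTaskPool()
    results, lock = [], threading.Lock()
    pool.put(sum_range, (0, n, results, lock))
    pool.run(num_workers)
    return sum(results)
```

Calling `parallel_sum(100)` sums the integers below 100 through dynamically created tasks. A central queue is the easiest design to reason about, but every access contends for the same lock; the abstract notes that the pools studied differ in exactly these choices (data structures, load-balancing mechanism, memory management), so this sketch represents only one point in that design space.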

Necessary and sufficient conditions for the regularity of a planar Coons map

by Maharavo Randrianarivony, Guido Brunnett, 2004
Cited by 8 (1 self)
Abstract not found

A Super-Programming Technique for Large Sparse Matrix Multiplication on PC Clusters

by Dejiang Jin, Sotirios G. Ziavras - IEICE Trans. Info. Systems E87-D, 2004
Cited by 4 (3 self)
The multiplication of large sparse matrices is a basic operation for many scientific and engineering applications. There exist some high-performance library routines for this operation. They are often optimized based on the target architecture. The PC cluster computing paradigm has recently emerged as a viable alternative for high-performance, low-cost computing. In this paper, we apply our super-programming approach [24] to study the load balance and runtime management overhead for implementing parallel large matrix multiplication on PC clusters. For a parallel environment, it is essential to partition the entire operation into tasks and assign them to individual processing elements. Most of the existing approaches partition the given sub-matrices based on some kind of workload estimation. For dense matrices on some architectures, the estimates may be accurate. For sparse matrices on PCs, however, the workloads of block operations may not necessarily depend on the size of the data, so the workloads may not be well estimated in advance. Any approach other than run-time dynamic partitioning may degrade performance. Moreover, in a heterogeneous environment, static partitioning is NP-complete. For embedded problems, it also introduces management overhead. In this paper, we adopt our super-programming approach, which partitions the entire task into medium-grain tasks that are implemented using super-instructions; the workload of super-instructions is easy to estimate. These tasks are dynamically assigned to member computer nodes. A node may execute more than one super-instruction. Our results prove the viability of our approach.

A general framework for parallel distributed processing. In: Parallel distributed processing: explorations

by Dirk Farin, Peter H. N. De - in the microstructure of cognition, Vol 1, Foundations (Rumelhart DE, McClelland JL, eds, 1986
Cited by 2 (0 self)
This paper presents a software framework providing a platform for parallel and distributed processing of video data on a cluster of SMP computers. Existing video-processing algorithms can be easily integrated into the framework by considering them as atomic processing tiles (PTs). PTs can be connected to form processing graphs that model the data flow. Parallelization of the tasks in this graph is carried out automatically using a pool-of-tasks scheme. The data format that can be processed by the framework is not restricted to image data, so that intermediate data, such as detected feature points, can also be transferred between PTs. Furthermore, the processing can be carried out efficiently on special-purpose processors with separate memory, since the framework minimizes the transfer of data. We also describe an example application for a multi-camera view-interpolation system that we successfully implemented on the proposed framework.

Constructing a Diffeomorphism Between a Trimmed Domain and the Unit Square

by Maharavo Randrianarivony, Guido Brunnett, Reinhold Schneider, 2003
Cited by 2 (2 self)
This document has two objectives: decomposition of a given trimmed surface into several four-sided subregions and creation of a diffeomorphism from the unit square onto each subregion. We aim at having a diffeomorphism which is easy and fast to evaluate. Throughout this paper one of our objectives is to keep the shape of the curves delineating the boundaries of the trimmed surfaces unchanged. The approach relies on transfinite interpolation. We describe an automatic way to specify internal cubic Bézier-spline curves that are subsequently interpolated by a Gordon patch. Some theoretical criterion...