Results 1 - 10 of 4,072
Parallelization of structured, hierarchical adaptive mesh refinement algorithms
Computing and Visualization in Science, 2000
"... We describe an approach to parallelization of structured adaptive mesh refinement algorithms. This type of adaptive methodology is based on the use of local grids superimposed on a coarse grid to achieve sufficient resolution in the solution. The key elements of the approach to parallelization are a ..."
Cited by 39 (10 self)
are a dynamic load-balancing technique to distribute work to processors and a software methodology for managing data distribution and communications. The methodology is based on a message-passing model that exploits the coarse-grained parallelism inherent in the algorithms. The approach is illustrated
Distributed partial evaluation
"... Partial evaluation is an automatic program transformation that optimizes programs by specialization. We speed up the specialization process by utilizing the natural coarse-grained parallelism inherent in the partial evaluation process. We have supplemented an existing partial evaluation system for t ..."
Cited by 7 (0 self)
Parallelization of an Adaptive Mesh Refinement Method for Low Mach Number Combustion
2001
"... We describe the parallelization of a computer program for the adaptive mesh refinement simulation of variable density, viscous, incompressible fluid flows for low Mach number combustion. The adaptive methodology is based on the use of local grids superimposed on a coarse grid to achieve sufficient r ..."
resolution in the solution. The key elements of the approach to parallelization are a dynamic load-balancing technique to distribute work to processors and a software methodology for managing data distribution and communications. The methodology is based on a message-passing model that exploits the coarse-grained
Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks
In EuroSys, 2007
"... Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational “vertices” with communication “channels” to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of availa ..."
Cited by 762 (27 self)
Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
ACM Transactions on Computer Systems, 1992
"... Threads are the vehicle for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expr ..."
Cited by 475 (21 self)
the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead. Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has
Decomposing Linear Programs for Parallel Solution
Lecture Notes in Computer Science, 1996
"... Coarse grain parallelism inherent in the solution of Linear Programming (LP) problems with block angular constraint matrices has been exploited in recent research works. However, these approaches suffer from unscalability and load imbalance since they exploit only the existing block angular st ..."
Cited by 10 (8 self)
The Case for a Single-Chip Multiprocessor
IEEE Computer, 1996
"... Advances in IC processing allow for more microprocessor design options. The increasing gate density and cost of wires in advanced integrated circuit technologies require that we look for new ways to use their capabilities effectively. This paper shows that in advanced technologies it is possible to ..."
Cited by 440 (6 self)
to implement a single-chip multiprocessor in the same area as a wide issue superscalar processor. We find that for applications with little parallelism the performance of the two microarchitectures is comparable. For applications with large amounts of parallelism at both the fine and coarse grained levels
Coarse-Grain Pipelining on Multiple FPGA Architectures
In Proceedings of the 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 02), 2002
"... Reconfigurable systems, and in particular, FPGA-based custom computing machines, offer a unique opportunity to define application-specific architectures. These architectures offer performance advantages for application domains such as image processing, where the use of customized pipelines exploits ..."
Cited by 10 (4 self)
the inherent coarse-grain parallelism. In this paper we describe a set of program analyses and an implementation that map a sequential and un-annotated C program into a pipelined implementation running on a set of FPGAs, each with multiple external memories. Based on well-known parallel computing analysis
Coarse-Grained Parallel Algorithms for Multi-Dimensional Wavelet Transforms
The Journal of Supercomputing, 1998
"... This paper presents parallel algorithms for computing multi-dimensional wavelet transforms on both shared memory and distributed memory machines. Traditional data partitioning methods for n-dimensional Discrete Wavelet Transforms (DWTs) call for data redistribution once a one dimensional wavelet t ..."
Cited by 14 (0 self)
Coarse-Grain Parallel Programming in Jade
1991
"... This paper presents Jade, a language which allows a programmer to easily express dynamic coarse-grain parallelism. Starting with a sequential program, a programmer augments those sections of code to be parallelized with abstract data usage information. The compiler and run-time system use this inf ..."
Cited by 51 (4 self)