Results 1 -
4 of
4
Implementation of a Portable Nested Data-Parallel Language
- Journal of Parallel and Distributed Computing
, 1994
"... This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well as nested dataparallel function calls. These features allow the concise description of parallel alg ..."
Abstract
-
Cited by 154 (26 self)
- Add to MetaCart
This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well as nested dataparallel function calls. These features allow the concise description of parallel algorithms on irregular data, such as sparse matrices and graphs. In addition, they maintain the advantages of data-parallel languages: a simple programming model and portability. The current Nesl implementation is based on an intermediate language called Vcode and a library of vector routines called Cvl. It runs on the Connection Machine CM-2, the Cray Y-MP C90, and serial machines. We compare initial benchmark results of Nesl with those of machine-specific code on these machines for three algorithms: least-squares line-fitting, median finding, and a sparse-matrix vector product. These results show that Nesl's performance is competitive with that of machine-specific codes for regular dense da...
Provably efficient scheduling for languages with fine-grained parallelism
- IN PROC. SYMPOSIUM ON PARALLEL ALGORITHMS AND ARCHITECTURES
, 1995
"... Many high-level parallel programming languages allow for fine-grained parallelism. As in the popular work-time framework for parallel algorithm design, programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. A ..."
Abstract
-
Cited by 68 (22 self)
- Add to MetaCart
Many high-level parallel programming languages allow for fine-grained parallelism. As in the popular work-time framework for parallel algorithm design, programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. A common concern in executing such programs is to schedule tasks to processors dynamically so as to minimize not only the execution time, but also the amount of space (memory) needed. Without careful scheduling, the parallel execution on p processors can use a factor of p or larger more space than a sequential implementation of the same program. This paper first identifies a class of parallel schedules that are provably efficient in both time and space. For any
Compiling Nested Data-Parallel Programs for Shared-Memory Multiprocessors
- ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 1993
"... While data parallelism is well-suited from algorithmic, architectural, and linguistic considerations to serve as a basis for portable parallel programming, its characteristic fine-grained parallelism makes the efficient implementation of data-parallel languages on MIMD machines a challenging task. ..."
Abstract
-
Cited by 17 (5 self)
- Add to MetaCart
While data parallelism is well-suited from algorithmic, architectural, and linguistic considerations to serve as a basis for portable parallel programming, its characteristic fine-grained parallelism makes the efficient implementation of data-parallel languages on MIMD machines a challenging task. The design, implementation, and evaluation of an optimizing compiler are presented for an applicative nested data-parallel language called VCODE targeted at the Encore Multimax, a shared-memory multiprocessor. The source language supports nested aggregate data types; aggregate operations including elementwise forms, scans, reductions, and permutations; and conditionals and recursion for control flow. A small set of graph-theoretic compile-time optimizations reduce the overheads on MIMDmachines in several ways: by increasing the grain size of the output program, by reducing synchronization and storage requirements, and by improving locality of reference. The two key ideas behind these...
Practical Parallel Divide-and-Conquer Algorithms
, 1997
"... Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested dataparallel languages have relied on a vector PRAM impleme ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested dataparallel languages have relied on a vector PRAM implementation layer that cannot be efficiently mapped to MPPs with high inter-processor latency. This thesis shows that by restricting the problem set to that of data-parallel divide-and-conquer algorithms I can maintain the expressibility of full nested data-parallel languages while achieving good efficiency on current distributed-memory machines. Specifically, I define

