Results 1 - 10
of
100
Models and Languages for Parallel Computation
- ACM COMPUTING SURVEYS
, 1998
"... We survey parallel programming models and languages using 6 criteria [:] should be easy to program, have a software development methodology, be architecture-independent, be easy to understand, guranatee performance, and provide info about the cost of programs. ... We consider programming models in ..."
Abstract
-
Cited by 121 (4 self)
- Add to MetaCart
We survey parallel programming models and languages using 6 criteria [:] should be easy to program, have a software development methodology, be architecture-independent, be easy to understand, guranatee performance, and provide info about the cost of programs. ... We consider programming models in 6 categories, depending on the level of abstraction they provide.
Region streams: functional macroprogramming for sensor networks
, 2004
"... Sensor networks present a number of novel programming challenges for application developers. Their inherent limitations of computational power, communication bandwidth, and energy demand new approaches to programming that shield the developer from low-level details of resource management, concurrenc ..."
Abstract
-
Cited by 85 (6 self)
- Add to MetaCart
Sensor networks present a number of novel programming challenges for application developers. Their inherent limitations of computational power, communication bandwidth, and energy demand new approaches to programming that shield the developer from low-level details of resource management, concurrency, and in-network processing. We argue that sensor networks should be programmed at the global level, allowing the compiler to automatically generate nodal behaviors from a high-level specification of the network’s global behavior. This paper presents the design of a functional macroprogramming language for sensor networks, called Regiment. The essential data model in Regiment is based on region streams, which represent spatially distributed, time-varying collections of node state. A region stream might represent the set of sensor values across all nodes in an area or the aggregation of sensor values within that area. Regiment is a purely functional language, which gives the compiler considerable leeway in terms of realizing region stream operations across sensor nodes and exploiting redundancy within the network. We describe the initial design and implementation of Regiment, including a compiler that transforms a macroprogram into an efficient nodal program based on a token machine. We present a progresssion of simple programs that illustrate the power of Regiment to succinctly represent robust, adaptive sensor network applications.
Accelerator: using data parallelism to program GPUs for general-purpose uses
- in Proceedings of the 12th international conference on Architectural
, 2006
"... GPUs are difficult to program for general-purpose uses. Programmers can either learn graphics APIs and convert their applications to use graphics pipeline operations or they can use stream programming abstractions of GPUs. We describe Accelerator, a system that uses data parallelism to program GPUs ..."
Abstract
-
Cited by 57 (0 self)
- Add to MetaCart
GPUs are difficult to program for general-purpose uses. Programmers can either learn graphics APIs and convert their applications to use graphics pipeline operations or they can use stream programming abstractions of GPUs. We describe Accelerator, a system that uses data parallelism to program GPUs for general-purpose uses instead. Programmers use a conventional imperative programming language and a library that provides only high-level data-parallel operations. No aspects of GPUs are exposed to programmers. The library implementation compiles the data-parallel operations on the fly to optimized GPU pixel shader code and API calls. We describe the compilation techniques used to do this. We evaluate the effectiveness of using data parallelism to program GPUs by providing results for a set of compute-intensive benchmarks. We compare the performance of Accelerator versions of the benchmarks against hand-written pixel shaders. The speeds of the Accelerator versions are typically within 50 % of the speeds of hand-written pixel shader code. Some benchmarks significantly outperform C versions on a CPU: they are up to 18 times faster than C code running on a CPU.
Powerlist: a structure for parallel recursion
- ACM Transactions on Programming Languages and Systems
, 1994
"... Many data parallel algorithms – Fast Fourier Transform, Batcher’s sorting schemes and prefixsum – exhibit recursive structure. We propose a data structure, powerlist, that permits succinct descriptions of such algorithms, highlighting the roles of both parallelism and recursion. Simple algebraic pro ..."
Abstract
-
Cited by 55 (2 self)
- Add to MetaCart
Many data parallel algorithms – Fast Fourier Transform, Batcher’s sorting schemes and prefixsum – exhibit recursive structure. We propose a data structure, powerlist, that permits succinct descriptions of such algorithms, highlighting the roles of both parallelism and recursion. Simple algebraic properties of this data structure can be exploited to derive properties of these algorithms and establish equivalence of different algorithms that solve the same problem.
Transforming High-Level Data-Parallel Programs into Vector Operations
- IN PROCEEDINGS OF THE FOURTH ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING
, 1993
"... Fully-parallel execution of a high-level data-parallel language based on nested sequences, higher order functions and generalized iterators can be realized in the vector model using a suitable representation of nested sequences and a small set of transformational rules to distribute iterators throug ..."
Abstract
-
Cited by 49 (19 self)
- Add to MetaCart
Fully-parallel execution of a high-level data-parallel language based on nested sequences, higher order functions and generalized iterators can be realized in the vector model using a suitable representation of nested sequences and a small set of transformational rules to distribute iterators through the constructs of the language.
Optimal Evaluation of Array Expressions on Massively Parallel Machines
- ACM TRANS. PROG. LANG. SYST
, 1992
"... ..."
C**: A Large-Grain, Object-Oriented, Data-Parallel Programming Language
- LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING (5TH INTERNATIONAL WORKSHOP
, 1992
"... C** is a new data-parallel programming language based on a new computation model called large-grain data parallelism. C** overcomes many disadvantages of existing data-parallel languages, yet retains their distinctive and advantageous programming style and deterministic behavior. This style makes da ..."
Abstract
-
Cited by 46 (3 self)
- Add to MetaCart
C** is a new data-parallel programming language based on a new computation model called large-grain data parallelism. C** overcomes many disadvantages of existing data-parallel languages, yet retains their distinctive and advantageous programming style and deterministic behavior. This style makes data parallelism well-suited for massively-parallel computation. Large-grain data parallelism enhances data parallelism by permitting a wider range of algorithms to be expressed naturally. C is an object-oriented programming language that inherits data abstraction features from C++. Existing scientific programming languages do not provide modern programming facilities such as operator extensibility, abstract datatypes, or object-oriented programming. C ---and its sequential subset C ++ ---support modern programming practices and enable a single language to be used for all parts of large, complex programs and libraries. This technical report consists of three parts. The body of t...
The Cilk System for Parallel Multithreaded Computing
, 1996
"... Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel programming. In particular, it is still diiticult to build eiticient implementations of parallel applications ..."
Abstract
-
Cited by 39 (1 self)
- Add to MetaCart
Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel programming. In particular, it is still diiticult to build eiticient implementations of parallel applications whose communication patterns are either highly irregular or dependent upon dynamic information. Multithreading has become an increasingly popular way to implement these dynamic, asynchronous, concurrent programs. Cilk (pronounced "silk") is our C-based multithreaded computing system that provides provably good performance guarantees. This thesis describes the evolution of the Cilk language and runtime system, and describes applications which affected the evolution of the system.
A Comparison of Data-Parallel Algorithms for Connected Components
- In Proc. 6th Ann. Symp. Parallel Algorithms and Architectures (SPAA-94
, 1994
"... This paper presents a pragmatic comparison of three parallel algorithms for finding connected components, together with optimizations on these algorithms. Those being compared are two similar algorithms by Awerbuch and Shiloach [2] and by Shiloach and Vishkin [19] and a randomized contraction algori ..."
Abstract
-
Cited by 30 (1 self)
- Add to MetaCart
This paper presents a pragmatic comparison of three parallel algorithms for finding connected components, together with optimizations on these algorithms. Those being compared are two similar algorithms by Awerbuch and Shiloach [2] and by Shiloach and Vishkin [19] and a randomized contraction algorithm by Blelloch [7], based on algorithms by Reif [18] and Phillips [17]. Major improvements are given for the first two which significantly reduces the super-linear component of their work complexity. An improvement is also given for randomized algorithm, and this algorithm is shown to be the fastest of those tested. These comparisons are presented with NESL data-parallel code as executed on a Connection Machine 2. This research was sponsored in part by the Defense Advanced Research Projects Agency, CSTO, under the title "The Fox Project: Advanced Development of Systems Software", ARPA Order No. 8313, issued by ESD/AVS under Contract No. F19628-91-C-0168, and in part by the ONR Graduate Fell...
ZPL: An Array Sublanguage
- PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING
, 1993
"... The notion of isolating the "common case" is a well known computer science principle. This paper describes ZPL, a language that treats data parallelism as a common case of MIMD parallelism. This separation of concerns has many benefits. It allows us to define a clean and concise language for describ ..."
Abstract
-
Cited by 30 (10 self)
- Add to MetaCart
The notion of isolating the "common case" is a well known computer science principle. This paper describes ZPL, a language that treats data parallelism as a common case of MIMD parallelism. This separation of concerns has many benefits. It allows us to define a clean and concise language for describing data parallel computations, and this in turn leads to efficient parallel execution. Our particular language also provides mechanisms for handling boundary conditions. We introduce the concepts, constructs and semantics of our new language, and give a simple example that contrasts ZPL with other data parallel languages.

