Results 1 -
3 of
3
Implementation of a Portable Nested Data-Parallel Language
- Journal of Parallel and Distributed Computing
, 1994
"... This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well as nested dataparallel function calls. These features allow the concise description of parallel alg ..."
Abstract
-
Cited by 154 (26 self)
- Add to MetaCart
This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well as nested dataparallel function calls. These features allow the concise description of parallel algorithms on irregular data, such as sparse matrices and graphs. In addition, they maintain the advantages of data-parallel languages: a simple programming model and portability. The current Nesl implementation is based on an intermediate language called Vcode and a library of vector routines called Cvl. It runs on the Connection Machine CM-2, the Cray Y-MP C90, and serial machines. We compare initial benchmark results of Nesl with those of machine-specific code on these machines for three algorithms: least-squares line-fitting, median finding, and a sparse-matrix vector product. These results show that Nesl's performance is competitive with that of machine-specific codes for regular dense da...
Buffered Banks in Multiprocessor Systems
- IEEE Transactions on Computers
, 1995
"... A memory design based on logical banks is analyzed for shared memory multiprocessor systems. In this design, each physical bank is replaced by a logical bank consisting of a fast register and subbanks of slower memory. The subbanks are buffered by input and output queues which substantially reduce t ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
A memory design based on logical banks is analyzed for shared memory multiprocessor systems. In this design, each physical bank is replaced by a logical bank consisting of a fast register and subbanks of slower memory. The subbanks are buffered by input and output queues which substantially reduce the effective cycle time when the reference rate is below saturation. The principal contribution of this work is the development of a simple analytical model which leads to scaling relationships among the efficiency, the bank cycle time, the number of processors, the size of the buffers, and the granularity of the banks. These scaling relationships imply that if the interconnection network has sufficient bandwidth to support efficient access using high-speed memory, then lower-speed memory can be substituted with little additional interconnection cost. The scaling relationships are shown to hold for a full datapath vector simulation based on the Cray Y-MP architecture. The model is used to de...

