Results 1 - 10 of 17
Parallel Matrix Multiplication on a Linear Array with a Reconfigurable Pipelined Bus System
 IEEE Transactions on Computers
, 1997
Cited by 22 (6 self)
The known fast sequential algorithms for multiplying two $N \times N$ matrices (over an arbitrary ring) have time complexity $O(N^\alpha)$, where $2 < \alpha < 3$. The current best value of $\alpha$ is less than 2.3755. We show that for all $1 \le p \le N^\alpha$, multiplying two $N \times N$ matrices can be performed on a $p$-processor linear array with a reconfigurable pipelined bus system (LARPBS) in $O\left(\frac{N^\alpha}{p} + \left(\frac{N^2}{p^{2/\alpha}}\right)\log p\right)$ time. This is currently the fastest parallelization of the best known sequential matrix multiplication algorithm on a distributed memory parallel system. In particular, for all $1 \le p \le N^{2.3755}$, multiplying two $N \times N$ matrices can be performed on a $p$-processor LARPBS in $O\left(\frac{N^{2.3755}}{p} + \left(\frac{N^2}{p^{0.8419}}\right)\log p\right)$ time, and linear speedup can be achieved for $p$ as large as $O(N^{2.3755}/(\log N)^{6.3262})$. Furthermore, multiplying two $N \times N$ matrices can be performed on an LARPBS with $O(N^\alpha)$ processors in $O(\log N)$ time. This compares favorably with...
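The two numeric exponents quoted in the abstract above (0.8419 and 6.3262) can be recovered from $\alpha = 2.3755$ alone. The following is a minimal sketch, assuming the first is $2/\alpha$ and the second is $\alpha/(\alpha - 2)$, which is what one gets by requiring the $N^\alpha/p$ term to dominate when $\log p = O(\log N)$. The function names are illustrative and do not come from the paper.

```python
import math

ALPHA = 2.3755  # exponent quoted in the abstract


def bus_exponent(alpha: float) -> float:
    """Exponent of p in the second term, N^2 / p^(2/alpha)."""
    return 2.0 / alpha


def speedup_log_exponent(alpha: float) -> float:
    """Exponent of log N in the assumed linear-speedup limit
    p = O(N^alpha / (log N)^(alpha / (alpha - 2)))."""
    return alpha / (alpha - 2.0)


def time_bound(n: float, p: float, alpha: float = ALPHA) -> float:
    """Evaluate N^alpha / p + (N^2 / p^(2/alpha)) * log2(p).

    Constant factors hidden by the O-notation are ignored, so the value
    is only meaningful for comparing different choices of p.
    """
    return n ** alpha / p + (n ** 2 / p ** (2.0 / alpha)) * math.log2(p)
```

With `p = 1` the logarithmic term vanishes and `time_bound` reduces to the sequential cost `N ** alpha`.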
Fast and processor efficient parallel matrix multiplication algorithms on a linear array with a reconfigurable pipelined bus system
 IEEE Trans. on Parallel and Distributed Systems
, 1998
Cited by 18 (9 self)
We present efficient parallel matrix multiplication algorithms for linear arrays with reconfigurable pipelined bus systems (LARPBS). Such systems are able to support a large volume of parallel communication of various patterns in constant time. An LARPBS can also be reconfigured into many independent subsystems and, thus, is able to support parallel implementations of divide-and-conquer computations like Strassen's algorithm. The main contributions of the paper are as follows: We develop five matrix multiplication algorithms with varying degrees of parallelism on the LARPBS computing model, namely, MM1, MM2, MM3, and compound algorithms $\mathcal{C}_1(\epsilon)$ and $\mathcal{C}_2(\delta)$. Algorithm $\mathcal{C}_1(\epsilon)$ has adjustable time complexity in sublinear level. Algorithm $\mathcal{C}_2(\delta)$ implies that it is feasible to achieve sublogarithmic time using $o(N^3)$ processors for matrix multiplication on a realistic system. Algorithms MM3, $\mathcal{C}_1(\epsilon)$, and $\mathcal{C}_2(\delta)$ all have $o(N^3)$ cost and, hence, are very processor efficient. Algorithms MM1, MM3, and $\mathcal{C}_1(\epsilon)$ are general-purpose matrix multiplication algorithms, where the array elements are in any ring. Algorithms MM2 and $\mathcal{C}_2(\delta)$ are applicable to array elements that are integers of bounded magnitude, or floating-point values of bounded precision and magnitude, or Boolean values. Extensions of algorithms MM2 and $\mathcal{C}_2(\delta)$ to unbounded integers and reals are also discussed.
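For readers unfamiliar with the divide-and-conquer structure mentioned above, here is a minimal sequential sketch of Strassen's algorithm in Python. It is not any of the paper's LARPBS algorithms. It assumes square matrices whose size is a power of two, stored as lists of rows, and the helper names are illustrative. The seven recursive products are mutually independent, which is the property a machine split into independent subsystems could exploit.

```python
def _add(a, b):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def _sub(a, b):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def strassen(a, b):
    """Multiply two n x n matrices (n a power of two) with 7 recursive products."""
    n = len(a)
    if n == 1:
        return [[a[0][0] * b[0][0]]]
    h = n // 2
    # Split each operand into four h x h quadrants.
    a11 = [r[:h] for r in a[:h]]; a12 = [r[h:] for r in a[:h]]
    a21 = [r[:h] for r in a[h:]]; a22 = [r[h:] for r in a[h:]]
    b11 = [r[:h] for r in b[:h]]; b12 = [r[h:] for r in b[:h]]
    b21 = [r[:h] for r in b[h:]]; b22 = [r[h:] for r in b[h:]]
    # The seven independent sub-products.
    m1 = strassen(_add(a11, a22), _add(b11, b22))
    m2 = strassen(_add(a21, a22), b11)
    m3 = strassen(a11, _sub(b12, b22))
    m4 = strassen(a22, _sub(b21, b11))
    m5 = strassen(_add(a11, a12), b22)
    m6 = strassen(_sub(a21, a11), _add(b11, b12))
    m7 = strassen(_sub(a12, a22), _add(b21, b22))
    # Recombine into the quadrants of the product.
    c11 = _add(_sub(_add(m1, m4), m5), m7)
    c12 = _add(m3, m5)
    c21 = _add(m2, m4)
    c22 = _add(_add(_sub(m1, m2), m3), m6)
    return [r1 + r2 for r1, r2 in zip(c11, c12)] + \
           [r1 + r2 for r1, r2 in zip(c21, c22)]
```

Only additions, subtractions, and multiplications are used, so the sketch works for entries from any ring that Python can represent, such as integers.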
Efficient Deterministic and Probabilistic Simulations of PRAMs on Linear Arrays with Reconfigurable Pipelined Bus Systems
 Journal of Supercomputing
, 2000
Cited by 15 (11 self)
In this paper, we present deterministic and probabilistic methods for simulating PRAM computations on linear arrays with reconfigurable pipelined bus systems (LARPBS). The following results are established in this paper. (1) Each step of a $p$-processor PRAM with $m = O(p)$ shared memory cells can be simulated by a $p$-processor LARPBS in $O(\log p)$ time, where the constant in the big-O notation is small. (2) Each step of a $p$-processor PRAM with $m = \Omega(p)$ shared memory cells can be simulated by a $p$-processor LARPBS in $O(\log m)$ time. (3) Each step of a $p$-processor PRAM can be simulated by a $p$-processor LARPBS in $O(\log p)$ time with probability larger than $1 - 1/p^c$ for all $c > 0$. (4) As an interesting byproduct, we show that a $p$-processor LARPBS can sort $p$ items in $O(\log p)$ time, with a small constant hidden in the big-O notation. Our results indicate that an LARPBS can simulate a PRAM very efficiently. Keywords: Concurrent read, concurrent write, deterministic simulation, linear array...
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
 Journal of Parallel and Distributed Computing
, 2001
Integer Sorting and Routing in Arrays with Reconfigurable Optical Buses
, 1996
Cited by 8 (0 self)
In this paper we present deterministic algorithms for integer sorting and online packet routing on arrays with reconfigurable optical buses. The main objective is to identify the mechanisms specific to this type of architecture that allow us to build efficient integer sorting, partial permutation routing, and h-relations algorithms. The consequences of these results on the PRAM simulation complexity are also investigated. Keywords: Optical pipelined buses, reconfigurable array, sorting, routing. 1. Introduction In large-scale general-purpose parallel machines based on connection networks, efficient communication capabilities are essential in order to solve most of the problems of interest in a timely manner. Interprocessor communication networks are often the main bottlenecks in parallel machines. One important limitation of these networks concerns the exclusive access to the bus resources, which limits throughput to a function of the end-to-end propagation time. Optical communicati...
Efficient Parallel Algorithms for Distance Maps of 2D Binary Images Using an Optical Bus
, 2002
Cited by 5 (3 self)
Computing a distance map (distance transform) is an operation that converts a two-dimensional (2D) image consisting of black and white pixels to an image where each pixel has a value or a pair of coordinates that represents the distance to or location of the nearest black pixel. It is a basic operation in image processing and computer vision fields, and is used for expanding, shrinking, thinning, segmentation, clustering, computing shape, object reconstruction, etc. This paper examines the possibility of implementing the problem of finding a distance map for an image efficiently using an optical bus. The computational model considered is the linear array with a reconfigurable pipelined bus system (LARPBS), which has been introduced recently based on current electronic and optical technologies. It is shown that the problem for an image can be implemented in (log log log ) bus cycles deterministically or in (log ) bus cycles with high probability on an LARPBS with processors. By high probability, we mean a probability of (1 ) for any constant 1. We also show that the problem can be solved in (log log ) bus cycles deterministically or in (1) bus cycles with high probability on an LARPBS with 3 processors. Scalability of the algorithms is also discussed briefly. The same problem can be solved using an LARPBS of processors in (( ) log log log ) time deterministically or in (( ) log ) time with high probability for any practical machine size of . For processor arrays with practical sizes, a bus cycle is roughly the time of an arithmetic operation. Hence, the algorithm compares favorably to the best known parallel algorithms for the same problem in the literature.
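The definition of a distance map given above can be illustrated with a brute-force sequential sketch in Python. This is not the paper's LARPBS algorithm. It assumes the Euclidean metric (the abstract does not name one) and an image encoded as a list of rows with 1 for black and 0 for white, and the function name is hypothetical.

```python
import math


def distance_map(image):
    """Return, for every pixel, the Euclidean distance to the nearest black pixel.

    `image` is a list of rows; 1 marks a black pixel and 0 a white one.
    Brute force: every pixel is compared against every black pixel.
    """
    blacks = [(r, c)
              for r, row in enumerate(image)
              for c, value in enumerate(row)
              if value == 1]
    if not blacks:
        raise ValueError("image contains no black pixel")
    return [[min(math.hypot(r - br, c - bc) for br, bc in blacks)
             for c in range(len(row))]
            for r, row in enumerate(image)]
```

For an image with n pixels of which b are black, this takes on the order of n times b distance evaluations, which is the kind of cost that parallel algorithms for the problem aim to reduce.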
Optimally scaling permutation routing on reconfigurable linear arrays with optical buses
 In Second Merged Symposium IPPS/SPDP, 13th International Parallel Processing Symposium & 10th Symposium on Parallel and Distributed Processing
, 2000
Cited by 4 (1 self)
We present an optimal and scalable permutation routing algorithm for three reconfigurable models based on linear arrays that allow pipelining of information through an optical bus. Specifically, for any $P \le N$, our algorithm routes any permutation of $N$ elements on a $P$-processor model optimally in $O(N/P)$ steps. This algorithm extends naturally to one for routing h-relations optimally in $O(h)$ steps. We also establish the equivalence of the three models: linear array with a reconfigurable pipelined bus system,
Sublogarithmic Deterministic Selection on Arrays with a Reconfigurable Optical Bus
 IEEE Trans. on Computers
, 2002
Cited by 4 (0 self)
The Linear Array with a Reconfigurable Pipelined Bus System (LARPBS) is a newly introduced parallel computational model, where processors are connected by a reconfigurable optical bus. In this paper, we show that the selection problem can be solved on the LARPBS model deterministically in $O((\log \log N)^2 / \log \log \log N)$ time. To our best knowledge, this is the best deterministic selection algorithm on any model with a reconfigurable optical bus.
An Improved Randomized Selection Algorithm With an Experimental Study
 In Proc. The 2nd Workshop on Algorithm Engineering and Experiments (ALENEX00)
, 2000
Cited by 3 (3 self)
This paper presents an efficient randomized high-level parallel algorithm for finding the median given a set of elements distributed across a parallel machine. In fact, our algorithm solves the general selection problem that requires the determination of the element of rank k, for an arbitrarily given integer k. Our general...
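The selection problem described above (find the element of rank k) can be sketched sequentially with randomized quickselect. This Python sketch is a stand-in for illustration, not the paper's parallel algorithm. It assumes ranks are 1-based, so that rank 1 is the minimum, and the function name is hypothetical.

```python
import random


def select(items, k, rng=random):
    """Return the element of rank k (1-based) using randomized quickselect."""
    if not 1 <= k <= len(items):
        raise ValueError("k must satisfy 1 <= k <= len(items)")
    pool = list(items)
    while True:
        pivot = rng.choice(pool)
        lower = [x for x in pool if x < pivot]
        equal_count = sum(1 for x in pool if x == pivot)
        if k <= len(lower):
            # The answer lies strictly below the pivot.
            pool = lower
        elif k <= len(lower) + equal_count:
            # The pivot itself has rank k.
            return pivot
        else:
            # Discard the pivot and everything below it, then adjust the rank.
            k -= len(lower) + equal_count
            pool = [x for x in pool if x > pivot]
```

The median of an odd-length list `xs` is then `select(xs, (len(xs) + 1) // 2)`.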
Reconfigurable architectures and algorithms: A research survey
 IJCSA
, 2009
Cited by 3 (0 self)
Ever since the introduction of dynamically reconfigurable buses, the architecture has gained considerable popularity among researchers for high-performance computing with general-purpose processors. It is a powerful model of computation in which the communication pattern between the processors can be changed during execution. In the following years, several new architectures and efficient algorithms for them were proposed, and their implementations using FPGAs have been shown. This paper presents a survey of the different architectures proposed and a few important algorithms presented for these specialized architectures over the last two decades. Keywords: PARBS, RMESH, RN, LARPBS, Polymorphic Torus Network, AROB.