Results 1–6 of 6
Parallel matrix multiplication on a linear array with a reconfigurable pipelined bus system
 Proceedings of IPPS/SPDP ’99 (2nd Merged Symposium of the 13th International Parallel Processing Symposium and the 10th Symposium on Parallel and Distributed Processing)
, 1999
"... The known fast sequential algorithms for multiplying two N N matrices (over an arbitrary ring) have time complexity O(N), where 2 < < 3. The current best value of is less than 2.3755. We show that for all 1 p N,multiplying two N N matrices can be performed on a pprocessor linear array with a ..."
Abstract

Cited by 22 (6 self)
The known fast sequential algorithms for multiplying two N × N matrices (over an arbitrary ring) have time complexity O(N^ω), where 2 < ω < 3. The current best value of ω is less than 2.3755. We show that for all 1 ≤ p ≤ N, multiplying two N × N matrices can be performed on a p-processor linear array with a reconfigurable pipelined bus system (LARPBS) in O(N …
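The row-block work partitioning that underlies such p-processor schemes can be sketched as follows. This is a plain sequential Python illustration of splitting the multiplication across p workers, not the paper's LARPBS algorithm (which additionally exploits the optical bus for communication); the function name and block split are illustrative assumptions.

```python
# Illustrative sketch: multiply two N x N matrices by splitting A's rows
# into p contiguous bands, one per (simulated) processor. Only the work
# partitioning is shown; in a real parallel system the bands would be
# computed concurrently.

def mat_mul_blocked(A, B, p):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    # Assign each "processor" a contiguous band of rows of A.
    bands = [range(k * n // p, (k + 1) * n // p) for k in range(p)]
    for band in bands:              # conceptually parallel across p workers
        for i in band:
            for j in range(n):
                C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_mul_blocked(A, B, 2))  # [[19, 22], [43, 50]]
```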
Efficient Parallel Algorithms for Distance Maps of 2D Binary Images Using an Optical Bus
, 2002
"... Computing a distance map (distance transform) is an operation that converts a twodimensional (2D) image consisting of black and white pixels to an image where each pixel has a value or a pair of coordinates that represents the distance to or location of the nearest black pixel. It is a basic opera ..."
Abstract

Cited by 7 (4 self)
Computing a distance map (distance transform) is an operation that converts a two-dimensional (2-D) image consisting of black and white pixels into an image where each pixel holds a value, or a pair of coordinates, representing the distance to or the location of the nearest black pixel. It is a basic operation in image processing and computer vision, used for expanding, shrinking, thinning, segmentation, clustering, shape computation, object reconstruction, etc. This paper examines how the distance map of an image can be computed efficiently using an optical bus. The computational model considered is the linear array with a reconfigurable pipelined bus system (LARPBS), which has been introduced recently based on current electronic and optical technologies. It is shown that the problem for an n × n image can be solved in O(log log log n) bus cycles deterministically, or in O(log n) bus cycles with high probability, on an LARPBS with n² processors. By high probability, we mean a probability of 1 − n^{−c} for any constant c ≥ 1. We also show that the problem can be solved in O(log log n) bus cycles deterministically, or in O(1) bus cycles with high probability, on an LARPBS with n³ processors. Scalability of the algorithms is also discussed briefly: the same problem can be solved on an LARPBS of p processors in O((n²/p) log log log n) time deterministically, or in O((n²/p) log n) time with high probability, for any practical machine size p. For processor arrays of practical size, a bus cycle is roughly the time of an arithmetic operation; hence the algorithms compare favorably with the best known parallel algorithms for this problem in the literature.
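The distance-map operation itself can be illustrated with a small sequential sketch, assuming the city-block (4-neighbour) metric; this is a plain multi-source BFS, not the paper's LARPBS algorithm.

```python
# Illustrative sketch (sequential, not the optical-bus algorithm):
# compute a distance map of a binary image by multi-source BFS from all
# black pixels, giving each pixel its 4-neighbour (city-block) distance
# to the nearest black pixel.
from collections import deque

def distance_map(img):
    h, w = len(img), len(img[0])
    dist = [[None] * w for _ in range(h)]
    q = deque()
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1:          # black pixel: distance 0
                dist[y][x] = 0
                q.append((y, x))
    while q:                             # BFS frontier expands by 1 per level
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                q.append((ny, nx))
    return dist

img = [[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]
print(distance_map(img))  # [[2, 1, 2], [1, 0, 1], [2, 1, 2]]
```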
Optimal Algorithms for the Channel-Assignment Problem on a Reconfigurable Array of Processors with Wider Bus Networks
 IEEE Transactions on Parallel and Distributed Systems
, 2002
"... The computation model on which the algorithms are developed is the reconfigurable array of processors with wider bus networks (abbreviated to RAPWBN). The main difference between the RAPWBN model and other existing reconfigurable parallel processing systems is that the bus width of each network is ..."
Abstract

Cited by 2 (0 self)
The computational model on which the algorithms are developed is the reconfigurable array of processors with wider bus networks (abbreviated RAPWBN). The main difference between the RAPWBN model and other existing reconfigurable parallel processing systems is that the bus width of each network is bounded within a given range.
Efficient Parallel Hierarchical Clustering
 In International Euro-Par Conference (Euro-Par ’04)
, 2004
"... Abstract. Hierarchical agglomerative clustering (HAC) is a common clustering method that outputs a dendrogram showing all N levels of agglomerations where N is the number of objects in the data set. High time and memory complexities are some of the major bottlenecks in its application to realworld ..."
Abstract

Cited by 1 (0 self)
Hierarchical agglomerative clustering (HAC) is a common clustering method that outputs a dendrogram showing all N levels of agglomeration, where N is the number of objects in the data set. High time and memory complexity is a major bottleneck in its application to real-world problems. Parallel algorithms have been proposed in the literature to overcome these limitations, but, as this paper shows, existing parallel HAC algorithms are inefficient due to ineffective partitioning of the data. We first show that HAC follows a rule whereby most agglomerations have very small dissimilarity and only a small portion toward the end have large dissimilarity. Partially overlapping partitioning (POP) exploits this principle to obtain efficient yet accurate HAC algorithms: the total number of dissimilarities is reduced by a factor close to the number of cells in the partition. We present pPOP, the parallel version of POP, implemented on a shared-memory multiprocessor architecture. Extensive theoretical analysis and experimental results show that pPOP gives close to linear speedup and significantly outperforms existing parallel algorithms in both CPU time and memory requirements.
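The "small dissimilarities first" rule that POP exploits can be seen in a tiny sequential sketch of average-linkage HAC (this is a naive O(N³) illustration on 1-D points, not pPOP; the function name and data are assumptions for the example).

```python
# Illustrative sketch (sequential, not pPOP): naive average-linkage HAC.
# It records the dissimilarity of each of the N-1 merges; on clustered
# data most early merges have small dissimilarity and only the last few
# are large -- the property that POP's overlapping partitioning exploits.

def hac_average(points):
    clusters = [[p] for p in points]
    merge_dissims = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # group-average dissimilarity between clusters i and j
                d = sum(abs(a - b) for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merge_dissims.append(d)
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return merge_dissims

# Two tight groups far apart: small merges first, one large merge last.
print(hac_average([0.0, 0.1, 0.2, 10.0, 10.1]))
```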
Linear Arrays with Optical Buses Summary
"... The parentheses matching problem is to determine the index of the mate for each parenthesis, and plays an important role in the design of parallel algorithms. In this paper, we consider two problems: reconstructing an original binary from encoded bit strings and transforming an infix expression into ..."
Abstract
The parentheses matching problem is to determine, for each parenthesis, the index of its mate; it plays an important role in the design of parallel algorithms. In this paper we consider two problems: reconstructing an original binary tree from encoded bit strings, and transforming an infix expression into a postfix one. We propose optimal parallel algorithms for these problems, based on parentheses matching, using linear arrays with optical buses. The proposed algorithms run in a constant number of communication cycles using a number of processors equal to the input size. The main contribution of this paper is the design of cost-optimal, constant-time algorithms for these problems.
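For readers unfamiliar with the two building blocks, here is a sequential stack-based sketch of parentheses matching and of infix-to-postfix conversion (shunting-yard). These are standard sequential versions for illustration only, not the constant-time optical-bus algorithms of the paper; both assume well-formed input.

```python
# Illustrative sketch (sequential stack versions, not the optical-bus
# algorithms): parentheses matching returns, for each parenthesis, the
# index of its mate; the same stack discipline underlies infix-to-postfix
# conversion, one of the two applications considered in the paper.

def match_parens(s):
    mate = [None] * len(s)
    stack = []
    for i, c in enumerate(s):
        if c == '(':
            stack.append(i)
        elif c == ')':
            j = stack.pop()          # most recent unmatched '(' is the mate
            mate[i], mate[j] = j, i
    return mate

def infix_to_postfix(expr):
    prec = {'+': 1, '-': 1, '*': 2, '/': 2}
    out, ops = [], []
    for tok in expr:
        if tok.isalnum():
            out.append(tok)
        elif tok == '(':
            ops.append(tok)
        elif tok == ')':
            while ops[-1] != '(':
                out.append(ops.pop())
            ops.pop()                # discard the '('
        else:                        # binary operator
            while ops and ops[-1] != '(' and prec[ops[-1]] >= prec[tok]:
                out.append(ops.pop())
            ops.append(tok)
    while ops:
        out.append(ops.pop())
    return ''.join(out)

print(match_parens("(()())"))       # -> [5, 2, 1, 4, 3, 0]
print(infix_to_postfix("a*(b+c)"))  # -> "abc+*"
```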
Exploiting Parallelism to Support Scalable Hierarchical Clustering ∗
"... A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message passing operations reduces interprocess communication while maintaining efficient load ..."
Abstract
A distributed-memory parallel version of the group-average hierarchical agglomerative clustering algorithm is proposed to scale the document clustering problem to large collections. Using standard message-passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments on a subset of a standard TREC test collection, our parallel hierarchical clustering algorithm is shown to be scalable in terms of processors efficiently used and collection size. Results show that our algorithm performs close to the expected O(n²/p) time on p processors, rather than the worst-case O(n³/p) time. Furthermore, the O(n²/p) memory complexity per node allows larger collections to be clustered as the number of nodes increases. While partitioning algorithms such as k-means are trivially parallelizable, our results confirm those of other studies showing that hierarchical algorithms produce significantly tighter clusters in the document clustering task. Finally, we show how our parallel hierarchical agglomerative clustering algorithm can be used as the clustering subroutine for a parallel version of the Buckshot algorithm, clustering the complete TREC collection at near theoretical runtime expectations.