Results 1 - 10 of 26
Scalable Load Balancing Techniques for Parallel Computers
, 1994
Cited by 89 (16 self)
In this paper we analyze the scalability of a number of load balancing algorithms that can be applied to problems with the following characteristics: the work done by a processor can be partitioned into independent work pieces; the work pieces are of highly variable sizes; and it is not possible (or very difficult) to estimate the size of total work at a given processor. Such problems require a load balancing scheme that distributes the work dynamically among different processors. Our goal here is to determine the most scalable load balancing schemes for different architectures such as hypercube, mesh and network of workstations. For each of these architectures, we establish lower bounds on the scalability of any possible load balancing scheme. We present the scalability analysis of a number of load balancing schemes that have not been analyzed before. This gives us valuable insights into their relative performance for different problem and architectural characteristi...
Parallel Formulations of Decision-Tree Classification Algorithms
- DATA MINING AND KNOWLEDGE DISCOVERY: AN INTERNATIONAL JOURNAL
, 1998
Cited by 30 (1 self)
Classification decision tree algorithms are used extensively for data mining in many domains such as retail target marketing, fraud detection, etc. Highly parallel algorithms for constructing classification decision trees are desirable for dealing with large data sets in a reasonable amount of time. Algorithms for building classification decision trees have a natural concurrency, but are difficult to parallelize due to the inherent dynamic nature of the computation. In this paper, we present parallel formulations of classification decision tree learning algorithms based on induction. We describe two basic parallel formulations. One is based on the Synchronous Tree Construction Approach and the other is based on the Partitioned Tree Construction Approach. We discuss the advantages and disadvantages of using these methods and propose a hybrid method that employs the good features of these methods. We also provide the analysis of the cost of computation and communication of the proposed hybr...
Parallel Processing of Discrete Optimization Problems
- IN ENCYCLOPEDIA OF MICROCOMPUTERS
, 1993
Cited by 19 (6 self)
Discrete optimization problems (DOPs) arise in various applications such as planning, scheduling, computer aided design, robotics, game playing and constraint directed reasoning. Often, a DOP is formulated in terms of finding a (minimum cost) solution path in a graph from an initial node to a goal node and solved by graph/tree search methods such as branch-and-bound and dynamic programming. Availability of parallel computers has created substantial interest in exploring the use of parallel processing for solving discrete optimization problems. This article provides an overview of parallel search algorithms for solving discrete optimization problems.
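As an illustration of the formulation described in this abstract, the following is a minimal serial sketch of finding a minimum-cost path from an initial node to a goal node with best-first (uniform-cost) search. It is not taken from the article: the function name `min_cost_path`, the adjacency-list layout and the toy graph `GRAPH` are all assumptions made for this example.

```python
import heapq

def min_cost_path(graph, start, goal):
    """Best-first search: always expand the cheapest open node first.

    graph maps a node to a list of (neighbor, edge_cost) pairs
    (hypothetical layout chosen for this sketch).
    Returns (cost, path) or None if the goal is unreachable.
    """
    frontier = [(0, start, [start])]   # priority queue ordered by path cost
    best = {start: 0}                  # cheapest known cost to reach each node
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue                   # stale queue entry, a cheaper route exists
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return None

# Hypothetical toy instance: initial node "s", goal node "g".
GRAPH = {
    "s": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("g", 7)],
    "b": [("g", 2)],
}

if __name__ == "__main__":
    print(min_cost_path(GRAPH, "s", "g"))
```

A parallel formulation would have to share or partition the priority queue among processors, which is the kind of design question the surveyed algorithms address.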
Data-Parallel Load Balancing Strategies
- Parallel Computing
, 1996
Cited by 16 (0 self)
Programming irregular and dynamic data-parallel algorithms requires taking data distribution into account. Implementing a load balancing algorithm is a difficult task for the programmer. However, a load balancing strategy may be developed independently of the application. The integration of such a strategy in the data-parallel algorithm may be relevant to a library or a data-parallel compiler run-time. We propose load distribution data-parallel algorithms for a class of irregular data-parallel algorithms called stack algorithms. Our algorithms allow the use of regular and/or irregular communication patterns to exchange work between processors. The results of theoretical analysis of these algorithms are presented. They allow a comparison of the different load balancing algorithms and the identification of criteria for the choice of a load balancing algorithm.
Fast Priority Queues for Parallel Branch-and-Bound
- In Workshop on Algorithms for Irregularly Structured Problems, number 980 in LNCS
, 1995
Cited by 15 (2 self)
Currently used parallel best first branch-and-bound algorithms either suffer from contention at a centralized priority queue or can only approximate the best first strategy. Bottleneck free algorithms for parallel priority queues are known but they cannot be implemented very efficiently on contemporary machines. We present quite simple randomized algorithms for parallel priority queues on distributed memory machines. For branch-and-bound they are asymptotically as efficient as previously known PRAM algorithms with high probability. The simplest versions require not much more communication than the approximated branch-and-bound algorithm of Karp and Zhang. Keywords: Analysis of randomized algorithms, distributed memory, load balancing, median selection, parallel best first branch-and-bound, parallel priority queue. 1 Introduction Branch-and-bound search is an important technique for many combinatorial optimization problems. Since it can be a quite time consuming technique, paralleli...
An Investigation of Scalable SIMD I/O Techniques with Application to Parallel JPEG Compression
, 1996
Cited by 8 (1 self)
The problem inherent with any digital image or digital video system is the large amount of bandwidth required for transmission or storage. This has driven the research area of image compression to develop more complex algorithms that compress images to lower data rates with better fidelity. One approach that can be used to increase the execution speed of these complex algorithms is through the use of parallel processing. In this paper we address the parallel implementation of the JPEG still image compression standard on the MasPar MP-1, a massively parallel SIMD computer. We develop two novel byte alignment algorithms which are used to efficiently input and output compressed data from the parallel system, and present results which show real-time performance is possible. We also discuss several applications, such as motion JPEG, that can be used in multimedia systems.
Analysis of Synchronous Dynamic Load Balancing Algorithms
, 1995
Cited by 8 (2 self)
This paper presents the theoretical analysis of these load balancing strategies.
Adaptive Parallel Iterative Deepening Search
- Journal of Artificial Intelligence Research
, 1998
Cited by 7 (0 self)
Many of the artificial intelligence techniques developed to date rely on heuristic search through large spaces. Unfortunately, the size of these spaces and the corresponding computational effort reduce the applicability of otherwise novel and effective algorithms. A number of parallel and distributed approaches to search have considerably improved the performance of the search process. Our goal is to develop an architecture that automatically selects parallel search strategies for optimal performance on a variety of search problems. In this paper we describe one such architecture realized in the Eureka system, which combines the benefits of many different approaches to parallel heuristic search. Through empirical and theoretical analyses we observe that features of the problem space directly affect the choice of optimal parallel search strategy. We then employ machine learning techniques to select the optimal parallel search strategy for a given problem space. When a new search task is...
A Hybrid Approach to Improving the Performance of Parallel Search
- Parallel Processing for Artificial Intelligence
, 1996
Cited by 6 (2 self)
This paper describes HyPS, a hybrid parallel window / distributed tree search algorithm. Using this algorithm, the set of available processors is divided into clusters. Each cluster searches simultaneously through the same search space, but to a unique cost threshold. Within each cluster, the search space is divided so that an individual processor will search a fraction of the total search space. Operator ordering and load balancing techniques are used to further improve the performance of HyPS. Results on two real-world and one artificial domain show a substantial performance improvement over serial search algorithms, and indicate an improvement over existing parallel search approaches. In this paper we also describe a mechanism for automatically selecting the optimal number of clusters to use.
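The two-level division described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HyPS implementation: the function `assign_work`, the equal-size clusters, the caller-supplied threshold list and the round-robin split of subtrees are all hypothetical choices, since the abstract does not say how thresholds are chosen or how the search space is divided.

```python
def assign_work(num_procs, num_clusters, thresholds, subtrees):
    """Map each processor to a cost threshold and a share of the search space.

    Assumptions for this sketch (not from the paper):
      - clusters have equal size, so num_procs must divide evenly;
      - thresholds[i] is the unique cost threshold of cluster i;
      - subtrees are dealt out round-robin inside each cluster.
    """
    if num_procs % num_clusters != 0:
        raise ValueError("num_procs must be a multiple of num_clusters")
    if len(thresholds) != num_clusters:
        raise ValueError("need exactly one threshold per cluster")
    if len(set(thresholds)) != num_clusters:
        raise ValueError("each cluster needs a unique threshold")

    cluster_size = num_procs // num_clusters
    plan = {}
    for proc in range(num_procs):
        cluster = proc // cluster_size   # which cluster this processor joins
        rank = proc % cluster_size       # position of the processor in its cluster
        plan[proc] = {
            "cluster": cluster,
            "threshold": thresholds[cluster],
            # every cluster covers the whole space; each member takes a slice
            "subtrees": subtrees[rank::cluster_size],
        }
    return plan

if __name__ == "__main__":
    # Hypothetical example: 4 processors, 2 clusters, 3 top-level subtrees.
    for proc, work in assign_work(4, 2, [10, 12], ["t0", "t1", "t2"]).items():
        print(proc, work)
```

In this sketch every cluster covers all subtrees but with a different threshold, while the processors inside a cluster each take a fraction of the space.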
A Scalable Parallel Tree Search Library
- 2nd Workshop on Solving Irregular Problems on Distributed Memory Machines
, 1996
Cited by 5 (4 self)
This paper reports design and implementation experiences with the portable and reusable library PIGSeL for parallel tree search. It discusses how efficiency, flexibility and usability of the library can be reconciled. Two sample applications demonstrate its effectiveness for the case of parallel depth-first search. On a mesh of 1024 Transputers, near optimal speedup is achieved even for small instances of the Golomb ruler problem. The 0/1 knapsack problem is more challenging but the library achieves good speedups for quite irregular problem instances. From the algorithmic point of view, this is due to the random polling load balancing algorithm which turns out to perform well even on high-diameter networks, and also due to a fast initialization scheme, a bottleneck free implementation of the branch-and-bound heuristics and an adaptation of the tree based double-counting termination detection algorithm. 1 Introduction Many applications are based on the traversal of large implicitly defi...

