Results 1 – 8 of 8
Solving the Game of Awari Using Parallel Retrograde Analysis
 IEEE Computer
, 2003
"... We have solved the game of awari, an ancient African board game that is played worldwide now. The game is a draw when both players play optimally. To solve awari, we computed several databases that can be used jointly to select the best move from any position that can occur in a game. The largest da ..."
Abstract

Cited by 22 (2 self)
We have solved the game of awari, an ancient African board game that is now played worldwide. The game is a draw when both players play optimally. To solve awari, we computed several databases that can be used jointly to select the best move from any position that can occur in a game. The largest database contains 204 billion entries (178 gigabytes), and is much larger than the largest (endgame) database for any game computed so far. In total, we determined the results for 889 billion positions. We solved the game on a large computer cluster, using a new parallel search algorithm that optimally uses the available resources (processors, memories, disks, and network).
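The backward-induction core of retrograde analysis can be illustrated on a toy game graph. The sketch below is a minimal sequential version under illustrative names (`retrograde_solve`, the four-position graph); the paper's actual algorithm distributes the positions of a 889-billion-state space across a cluster:

```python
from collections import deque

def retrograde_solve(successors, terminal_loss):
    """Label every position WIN/LOSS/DRAW for the player to move,
    working backward from terminal positions (retrograde analysis)."""
    # Build predecessor lists and count each position's unresolved moves.
    preds = {p: [] for p in successors}
    remaining = {p: len(successors[p]) for p in successors}
    for p, succs in successors.items():
        for s in succs:
            preds[s].append(p)

    value = {p: "DRAW" for p in successors}   # default: never resolved
    queue = deque()
    for p in terminal_loss:                   # player to move has lost
        value[p] = "LOSS"
        queue.append(p)

    while queue:
        p = queue.popleft()
        for q in preds[p]:
            if value[q] != "DRAW":
                continue
            if value[p] == "LOSS":            # q can move into a lost position
                value[q] = "WIN"
                queue.append(q)
            else:                             # value[p] == "WIN"
                remaining[q] -= 1
                if remaining[q] == 0:         # every move from q leads to a win
                    value[q] = "LOSS"
                    queue.append(q)
    return value

# Toy 4-position game: A -> {B, C}, B -> {D}, C -> {D}, D is a terminal loss.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(retrograde_solve(succ, terminal_loss=["D"]))
```

Positions never reached by the backward sweep keep the default DRAW label, which is exactly how draws emerge in a finite game graph with cycles.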
Scalable, Parallel Best-First Search for Optimal Sequential Planning
, 2009
"... Largescale, parallel clusters composed of commodity processors are increasingly available, enabling the use of vast processing capabilities and distributed RAM to solve hard search problems. We investigate parallel algorithms for optimal sequential planning, with an emphasis on exploiting distribut ..."
Abstract

Cited by 15 (3 self)
Large-scale, parallel clusters composed of commodity processors are increasingly available, enabling the use of vast processing capabilities and distributed RAM to solve hard search problems. We investigate parallel algorithms for optimal sequential planning, with an emphasis on exploiting distributed-memory computing clusters. In particular, we focus on an approach which distributes and schedules work among processors based on a hash function of the search state. We use this approach to parallelize the A* algorithm in the optimal sequential version of the Fast Downward planner. The scaling behavior of the algorithm is evaluated experimentally on clusters using up to 128 processors, a significant increase compared to previous work in parallelizing planners. We show that this approach scales well, allowing us to effectively utilize the large amount of distributed memory to optimally solve problems that require hundreds of gigabytes of RAM. We also show that this approach scales well on a single, shared-memory multicore machine.
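The hash-based work distribution the abstract describes (later named HDA*) can be sketched in a few lines: each state hashes to exactly one owning processor, so duplicate detection needs only that processor's local closed list. The table sizes and state encoding below are illustrative assumptions, not the planner's actual encoding:

```python
import random

random.seed(42)
N_PROCS = 8
# Zobrist table: one random 64-bit key per (variable, value) pair of a state.
# The 16x16 size is illustrative; a planner derives this from its own encoding.
ZOBRIST = [[random.getrandbits(64) for _ in range(16)] for _ in range(16)]

def zobrist_hash(state):
    """XOR the keys of every (variable, value) pair; order-independent."""
    h = 0
    for var, val in enumerate(state):
        h ^= ZOBRIST[var][val]
    return h

def owner(state):
    """The unique processor responsible for expanding `state`."""
    return zobrist_hash(state) % N_PROCS

# The same state always maps to the same owner, so any duplicate generated
# elsewhere is sent there and caught against one local closed list.
assert owner((0, 1, 2, 3)) == owner((0, 1, 2, 3))
print(owner((0, 1, 2, 3)), owner((3, 2, 1, 0)))
```

Because Zobrist hashing is nearly uniform, this also doubles as the load-balancing mechanism: states, and hence work, spread evenly across processors without a central scheduler.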
Satin: a High-Level and Efficient Grid Programming Model
"... Computational grids have an enormous potential to provide compute power. However, this power remains largely unexploited today for most applications, except trivially parallel programs. Developing parallel grid applications simply is too difficult. Grids introduce several problems not encountered be ..."
Abstract

Cited by 6 (5 self)
Computational grids have an enormous potential to provide compute power. However, this power remains largely unexploited today for most applications, except trivially parallel programs. Developing parallel grid applications is simply too difficult. Grids introduce several problems not encountered before, mainly due to the highly heterogeneous and dynamic computing and networking environment. Furthermore, failures occur frequently, and resources may be claimed by higher-priority jobs at any time. In this paper, we solve these problems for an important class of applications: divide-and-conquer. We introduce a system called Satin that simplifies the development of parallel grid applications by providing a rich high-level programming model that completely hides communication. All grid issues are transparently handled in the runtime system, not by the programmer. Satin’s programming model is based on Java, features spawn-sync primitives and shared objects, and uses asynchronous exceptions and an abort mechanism to support speculative parallelism. To allow an efficient implementation, Satin consistently exploits the idea that grids are hierarchically structured. Dynamic load balancing is done with a novel cluster-aware scheduling algorithm that hides the long wide-area latencies by overlapping them with useful local work.
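Satin itself is Java-based, but the spawn-sync idiom it provides can be sketched in Python: `spawn` starts a recursive call asynchronously, `sync` blocks until its result is available. The `Spawn` class below is a hypothetical stand-in (one thread per task; Satin's runtime instead steals work across clusters):

```python
import threading

class Spawn:
    """Minimal spawn/sync in the style of Satin's primitives:
    Spawn(f, args) starts work asynchronously, .sync() waits for the result."""
    def __init__(self, fn, *args):
        self.result = None
        self._t = threading.Thread(target=self._run, args=(fn, args))
        self._t.start()

    def _run(self, fn, args):
        self.result = fn(*args)

    def sync(self):
        self._t.join()
        return self.result

def fib(n):
    """Classic divide-and-conquer example used in the Satin papers."""
    if n < 2:
        return n
    a = Spawn(fib, n - 1)        # spawn: may execute on another worker
    b = Spawn(fib, n - 2)
    return a.sync() + b.sync()   # sync: wait for both children

print(fib(10))  # 55
```

The point of the model is that the program above contains no communication code at all; distribution, load balancing, and fault tolerance live entirely in the runtime.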
Awari Is Solved
 Journal of the ICGA
, 2002
"... to the random accesses of database entries during construction. The choice or design of the algorithm to create awari (endgame) databases is mainly determined by the amount of main memory available (Lincke, 2002), trading memory for additional computational effort and storing intermediate results on ..."
Abstract

Cited by 2 (0 self)
... to the random accesses of database entries during construction. The choice or design of the algorithm to create awari (endgame) databases is mainly determined by the amount of main memory available (Lincke, 2002), trading memory for additional computational effort and storing intermediate results on disk. Our system contains 144 Pentium III processors at 1.0 GHz, 72 GB of distributed main memory, a total disk space of 1.4 TB, and a Myrinet interconnect: a fast, switched network. One of the challenges was to handle the relatively "small" amount of memory. The parallel retrograde search algorithm described by Bal and Allis (1995) is efficient, but would have required more than 350 GB of memory, much more than we had. Sequential memory-limited search algorithms for awari endgame databases exist (cf. Lincke and Marzetta, 2000), but solving awari entirely on a single machine would take decades, if not centuries, since these algorithms still require much more memory than a single computer pro ...
Evaluations of Hash Distributed A * in Optimal Sequence Alignment
 Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
"... Hash Distributed A* (HDA*) is a parallel A* algorithm that is proven to be effective in optimal sequential planning with unit edge costs. HDA* leverages the Zobrist function to almost uniformly distribute and schedule work among processors. This paper evaluates the performance of HDA * in optimal se ..."
Abstract

Cited by 1 (0 self)
Hash Distributed A* (HDA*) is a parallel A* algorithm that is proven to be effective in optimal sequential planning with unit edge costs. HDA* leverages the Zobrist function to distribute and schedule work almost uniformly among processors. This paper evaluates the performance of HDA* in optimal sequence alignment. We observe that with a large number of CPU cores, HDA* suffers from an increase in search overhead caused by re-expansions of states in the closed list due to non-uniform edge costs in this domain. We therefore present a new work distribution strategy that limits the set of processors among which work is distributed, thus increasing the likelihood of detecting such duplicate search effort. We evaluate the performance of this approach on a cluster of multicore machines and show that the approach scales well up to 384 CPU cores.
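One hypothetical way to realize "limiting the set of processors among which work is distributed" is two-level hashing: a coarse hash over a subset of state variables picks a small processor group, and the full hash picks an owner inside that group. The grouping scheme, sizes, and variable choice below are assumptions for illustration, not the paper's exact method:

```python
import random

random.seed(7)
KEYS = [[random.getrandbits(64) for _ in range(8)] for _ in range(8)]
N_PROCS = 384   # matches the cluster size in the abstract
GROUP = 16      # work for related states stays inside a 16-processor group

def full_hash(state):
    h = 0
    for var, val in enumerate(state):
        h ^= KEYS[var][val]
    return h

def abstract_hash(state, kept=(0, 1)):
    # Hash only a few variables: states differing in the dropped variables
    # (e.g. nearby alignment cells) fall into the same group.
    h = 0
    for var in kept:
        h ^= KEYS[var][state[var]]
    return h

def owner(state):
    group = abstract_hash(state) % (N_PROCS // GROUP)
    return group * GROUP + full_hash(state) % GROUP

# Two states that differ only in dropped variables share a group, so a
# duplicate or re-expansion is more likely to be caught while still queued.
s, t = (0, 1, 2, 3), (0, 1, 3, 2)
assert owner(s) // GROUP == owner(t) // GROUP
```

The trade-off is load balance versus duplicate detection: smaller groups catch more duplicates locally but distribute work less uniformly.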
A Million-Fold Speed Improvement in Genomic Repeats Detection
 In SuperComputing’03
, 2003
"... This paper presents a novel, parallel algorithm for generating top alignments. Top alignments are used for finding internal repeats in biological sequences like proteins and genes. Our algorithm replaces an older, sequential algorithm (Repro), which was prohibitively slow for sequence lengths highe ..."
Abstract
This paper presents a novel, parallel algorithm for generating top alignments. Top alignments are used for finding internal repeats in biological sequences such as proteins and genes. Our algorithm replaces an older, sequential algorithm (Repro), which was prohibitively slow for sequence lengths greater than 2000. The new algorithm is an order of magnitude faster (O(n ) rather than O(n )).
Iterative Resource Allocation for Memory Intensive Parallel Search Algorithms on Clouds, Grids, and Shared Clusters
"... The increasing availability of “utility computing ” resources such as clouds, grids, and massively parallel shared clusters can provide practically unlimited processing and memory capacity on demand, at some cost per unit of resource usage. This requires a new perspective in the design and evaluatio ..."
Abstract
The increasing availability of “utility computing” resources such as clouds, grids, and massively parallel shared clusters can provide practically unlimited processing and memory capacity on demand, at some cost per unit of resource usage. This requires a new perspective in the design and evaluation of parallel search algorithms. Previous work in parallel search implicitly assumed ownership of a cluster with a static amount of CPU cores and RAM, and emphasized wall-clock runtime. With utility computing resources, trade-offs between performance and monetary costs must be considered. This paper considers dynamically increasing the usage of utility computing resources until a problem is solved. Efficient resource allocation policies are analyzed in comparison with an optimal allocation strategy. We evaluate our iterative allocation strategy by applying it to the HDA* parallel search algorithm. The experimental results validate our theoretical predictions. They show that, in practice, the costs incurred by iterative allocation are reasonably close to an optimal (but a priori unknown) policy, and are significantly better than the worst-case analytical bounds.
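An iterative allocation policy of this kind can be sketched as a geometric ramp-up: request some resources, retry with more if the problem does not fit, and stop as soon as it does. The function and the `solves_with` predicate below are hypothetical stand-ins (in the paper, "solving" means running a memory-intensive search like HDA* on the allocated nodes):

```python
def iterative_allocation(solves_with, start=1, growth=2.0):
    """Request resources in geometrically growing amounts until the problem
    is solved. `solves_with(n)` is a hypothetical predicate: True if n
    resource units suffice. Returns (units_that_worked, total_units_paid)."""
    n, total = start, 0
    while True:
        total += n                         # pay for this attempt
        if solves_with(n):
            return n, total
        n = max(n + 1, int(n * growth))    # ramp up and retry

# Suppose the problem secretly needs 100 units; the optimal cost is 100.
units, paid = iterative_allocation(lambda k: k >= 100)
print(units, paid)  # 128 units finally suffice; total paid 1+2+...+128 = 255
```

With a doubling schedule, the total cost is bounded by a constant factor of the (a priori unknown) optimal allocation, which is the kind of worst-case bound the abstract compares against.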