Parallel Decomposition of Unstructured FEM-Meshes
Concurrency: Practice & Experience, 1995
Cited by 38 (14 self)
We present a massively parallel algorithm for static and dynamic partitioning of unstructured FEM-meshes. The method consists of two parts. First, a fast but inaccurate sequential clustering is determined, which is used, together with a simple mapping heuristic, to map the mesh initially onto the processors of a massively parallel system. The second part of the method uses a massively parallel algorithm to remap and optimize the mesh decomposition, taking several cost functions into account. It first calculates the number of nodes that have to be migrated between pairs of clusters in order to obtain an optimal load balancing. In a second step, the nodes to be migrated are chosen according to cost functions optimizing the amount of necessary communication and other measures that are important for the numerical solution method (for example, the aspect ratio of the resulting domains). The parallel parts of the method are implemented in C under Parix to run on the Parsytec GCel systems. R...
Local Divergence of Markov Chains and the Analysis of Iterative Load-Balancing Schemes
In Proceedings of the 39th IEEE Symposium on Foundations of Computer Science (FOCS ’98), 1998
Cited by 34 (0 self)
We develop a general technique for the quantitative analysis of iterative distributed load balancing schemes. We illustrate the technique by studying two simple, intuitively appealing models that are prevalent in the literature: the diffusive paradigm, and periodic balancing circuits (or the dimension exchange paradigm). It is well known that such load balancing schemes can be roughly modeled by Markov chains, but also that this approximation can be quite inaccurate. Our main contribution is an effective way of characterizing the deviation between the actual loads and the distribution generated by a related Markov chain, in terms of a natural quantity which we call the local divergence. We apply this technique to obtain bounds on the number of rounds required to achieve coarse balancing in general networks, cycles and meshes in these models. For balancing circuits, we also present bounds for the stronger requirement of perfect balancing, or counting.
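The gap between real loads and their Markov-chain idealization can be seen in a small experiment. The following is a minimal sketch, not the paper's analysis: it assumes a cycle of four nodes, a diffusion parameter `alpha = 1/3`, and a rounding rule in which each edge carries `floor(alpha * difference)` tokens. The function names are hypothetical. It compares divisible load (the Markov-chain view) with indivisible tokens and reports the largest per-node gap; it does not compute the local divergence itself.

```python
from fractions import Fraction
from math import floor


def diffuse_ideal(load, alpha):
    """One synchronous diffusion round on a cycle with arbitrarily divisible load."""
    n = len(load)
    return [
        load[i]
        + alpha * (load[(i - 1) % n] - load[i])
        + alpha * (load[(i + 1) % n] - load[i])
        for i in range(n)
    ]


def diffuse_tokens(load, alpha):
    """The same round with indivisible tokens: each edge carries
    floor(alpha * difference) tokens from its heavier to its lighter endpoint."""
    n = len(load)
    new = list(load)
    for i in range(n):  # edge (i, i+1); assumes n >= 3 so every edge is distinct
        j = (i + 1) % n
        moved = floor(alpha * abs(load[i] - load[j]))
        src, dst = (i, j) if load[i] > load[j] else (j, i)
        new[src] -= moved
        new[dst] += moved
    return new


def run(step, load, alpha, rounds):
    """Apply `step` repeatedly and return the final load vector."""
    for _ in range(rounds):
        load = step(load, alpha)
    return load


alpha = Fraction(1, 3)  # assumed diffusion parameter, exact arithmetic
start = [12, 0, 0, 0]
ideal = run(diffuse_ideal, start, alpha, 2)    # [4, 8/3, 8/3, 8/3]
tokens = run(diffuse_tokens, start, alpha, 2)  # [4, 3, 2, 3]
gap = max(abs(a - b) for a, b in zip(ideal, tokens))  # 2/3
```

Both variants conserve the total load; only the token variant is restricted to integer transfers, which is where the deviation comes from.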
Dynamic load distribution in the Borealis stream processor
In ICDE, 2005
Cited by 31 (4 self)
Distributed and parallel computing environments are becoming cheap and commonplace. The availability of large numbers of CPUs makes it possible to process more data at higher speeds. Stream-processing systems are also becoming more important, as broad classes of applications require results in real time. Since load can vary in unpredictable ways, exploiting the abundant processor cycles requires effective dynamic load distribution techniques. Although load distribution has been extensively studied for traditional pull-based systems, it has not yet been fully studied in the context of push-based continuous query processing. In this paper, we present a correlation-based load distribution algorithm that aims at avoiding overload and minimizing end-to-end latency by minimizing load variance and maximizing load correlation. While finding the optimal solution for such a problem is NP-hard, our greedy algorithm can find reasonable solutions in polynomial time. We present both a global algorithm for initial load distribution and a pair-wise algorithm for dynamic load migration.
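The two statistics named in this abstract can be illustrated on toy data. The sketch below is an assumption-laden illustration, not the authors' greedy algorithm: the operator names, the load time series, and the two placement plans are all invented. It only shows that co-locating operators whose loads move in opposite directions lowers each node's load variance and makes the node loads positively correlated.

```python
from statistics import mean, pvariance


def node_load(series_by_operator, operators):
    """Sum the load time series of all operators placed on one node."""
    return [sum(vals) for vals in zip(*(series_by_operator[o] for o in operators))]


def correlation(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical per-operator load samples over four time steps.
series = {
    "a": [1, 3, 1, 3],
    "b": [4, 1, 4, 1],
    "c": [1, 3, 1, 3],
    "d": [4, 1, 4, 1],
}

# Plan 1 groups operators that peak together; plan 2 mixes opposite peaks.
plan1 = [node_load(series, ["a", "c"]), node_load(series, ["b", "d"])]
plan2 = [node_load(series, ["a", "b"]), node_load(series, ["c", "d"])]

var1 = [pvariance(s) for s in plan1]  # variances 4 and 9
var2 = [pvariance(s) for s in plan2]  # variance 0.25 on both nodes
corr1 = correlation(*plan1)           # node loads move in opposite directions
corr2 = correlation(*plan2)           # node loads move together
```

Under these invented numbers, plan 2 has both the smaller per-node variance and the larger inter-node correlation, which is the direction the abstract's objective points in.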
A scalable P2P platform for the Knowledge Grid
IEEE Transactions on Knowledge and Data Engineering, 2005
Cited by 17 (7 self)
The Knowledge Grid needs to operate with a scalable platform to provide large-scale intelligent services. A key function of such a platform is to efficiently support various complex queries in a dynamic large-scale network environment. This paper proposes a platform to support index-based path queries by incorporating a semantic overlay with an underlying structured P2P network that provides object location and management services. Various distributed indexing structures can be dynamically formed by publishing semantic objects as indexing nodes. Queries are forwarded along the chains of semantic object pointers to search for objects. We investigate the deployment of a scalable distributed trie index for broadcast queries on key strings, propose a decentralized load balancing method for solving the problem of uneven load distribution incurred by heterogeneity of loads and node capacities and by the distributed trie index, and give an approach for improving the availability of the semantic overlay and its trie index. Experiments demonstrate the scalability of the proposed platform.
Index Terms: Peer-to-peer, semantic overlay, knowledge grid, path query, distributed trie index, load balancing, replication.
Problems Of Computing On The Web
1997
Cited by 13 (5 self)
We discuss the concept of computing on the Web. We show that the heterogeneous and dynamic nature of the Web makes it impossible to define a fixed set of operating system functions, usable for all services. Rather, we propose that generalized software configuration techniques, based on a demand-driven technique called eduction, can be used to define versions of a Web Operating System (WOS) that can be built in an incremental manner. We illustrate this problem by examining the question of load balancing.
Decentralized Remapping of Data Parallel Applications in Distributed Memory Multiprocessors
Concurrency: Practice and Experience, 1997
Cited by 5 (3 self)
In this paper we present a decentralized remapping method for data parallel applications on distributed memory multiprocessors. The method uses a generalized dimension-exchange (GDE) algorithm periodically during the execution of an application to balance (remap) the system's workload. We implemented this remapping method in parallel WaTor simulations and parallel image thinning applications, and found it to be effective in reducing the computation time. The average performance gain is about 20% in the WaTor simulation of a 256 × 256 ocean grid on 16 processors, and up to 8% in the thinning of a typical image of size 128 × 128 on 8 processors. The performance gains due to remapping in the image thinning case are reasonably substantial given the fact that the application by its very nature does not necessarily favor remapping. We also implemented this remapping method, using up to 32 processors, for partitioning and re-partitioning of grids in computational fluid dynamics. It w...
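A dimension-exchange sweep is easy to sketch for one specific topology. The following assumes a hypercube (the abstract does not state the network used), divisible load, and an exchange parameter named `lam` here for illustration. In each dimension, every node averages with its neighbor across that dimension according to `w_i = (1 - lam) * w_i + lam * w_j`; with `lam = 0.5` a hypercube is balanced after one sweep over all dimensions.

```python
def gde_sweep(load, lam=0.5):
    """One sweep of generalized dimension exchange over a hypercube.

    `load[i]` is the workload of node i; the number of nodes must be a
    power of two. `lam` is the exchange parameter (an assumed name).
    """
    n = len(load)
    dims = n.bit_length() - 1
    if n < 1 or 1 << dims != n:
        raise ValueError("number of nodes must be a power of two")
    load = list(load)
    for d in range(dims):
        for i in range(n):
            j = i ^ (1 << d)  # neighbor of i across dimension d
            if i < j:         # handle each pair once
                wi, wj = load[i], load[j]
                load[i] = (1 - lam) * wi + lam * wj
                load[j] = (1 - lam) * wj + lam * wi
    return load


balanced = gde_sweep([8, 0, 0, 0, 0, 0, 0, 0])  # every node ends with 1.0
```

Other values of `lam` leave a residual imbalance after one sweep on a hypercube; on other topologies a different parameter may converge faster, which is the motivation for generalizing the plain averaging rule.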
Theoretical Analysis of the Heterogeneous Dynamic Load Balancing Problem Using a Hydro-Dynamic Approach
Journal of Parallel and Distributed Computing, Volume 43, 1996
Cited by 2 (0 self)
This paper presents a hydro-dynamic framework for solving the dynamic load balancing problem on a network of heterogeneous computers. In this approach, each processor is viewed as a liquid cylinder whose cross-sectional area corresponds to the capacity of the processor, the communication links are modeled as liquid channels between the cylinders, the workload is represented as liquid, and the load balancing algorithm describes the flow of the liquid. It is proved that all algorithms under this framework converge geometrically to the state of equilibrium, in which the heights of the liquid columns are the same in all the cylinders. In this way, each processor obtains an amount of workload proportional to its capacity. The parameters that affect the convergence rate of the algorithms are also identified and discussed.
1 Introduction
It is useful to explore remote computing power in local area networks (LANs) as processors get more and more powerful and the availability of high spee...
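The cylinder model in the abstract can be tried out numerically. This is a minimal sketch under stated assumptions, not one of the paper's algorithms: the linear flow rule (flow proportional to the height difference), the `rate` constant, the three-node path network, and all function names are invented for illustration. Height is load divided by capacity, and at equilibrium every node holds load proportional to its capacity.

```python
def hydro_round(load, capacity, links, rate):
    """One synchronous round: liquid flows along each link in proportion
    to the height difference of its two cylinders (assumed flow rule)."""
    height = [w / c for w, c in zip(load, capacity)]
    new = list(load)
    for i, j in links:
        flow = rate * (height[i] - height[j])  # positive: i -> j
        new[i] -= flow
        new[j] += flow
    return new


def equilibrium(load, capacity):
    """Equal-height state: each node holds load proportional to its capacity."""
    h = sum(load) / sum(capacity)
    return [h * c for c in capacity]


capacity = [1.0, 2.0, 3.0]   # hypothetical processor capacities
links = [(0, 1), (1, 2)]     # hypothetical path network
load = [12.0, 0.0, 0.0]
for _ in range(200):
    load = hydro_round(load, capacity, links, rate=0.3)

target = equilibrium([12.0, 0.0, 0.0], capacity)  # [2.0, 4.0, 6.0]
```

The `rate` value must be small enough for the iteration to be stable; 0.3 was chosen by hand for this network, and a larger network or smaller capacities would generally need a smaller value.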
GEMS: Gossip-Enabled Monitoring Service for Scalable Heterogeneous Distributed Systems
Cluster Comput.
Cited by 2 (0 self)
Gossip protocols have proven to be effective means by which failures can be detected in large, distributed systems in an asynchronous manner without the limitations associated with reliable multicasting for group communications. In this paper, we discuss the development and features of a Gossip-Enabled Monitoring Service (GEMS), a highly responsive and scalable resource monitoring service, to monitor health and performance information in heterogeneous distributed systems. GEMS has many novel and essential features such as detection of network partitions and dynamic insertion of new nodes into the service. Easily extensible, GEMS also incorporates facilities for distributing arbitrary system and application-specific data. We present experiments and analytical projections demonstrating scalability, fast response times and low resource utilization requirements, making GEMS a potent solution for resource monitoring in distributed computing.
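For readers unfamiliar with gossip-style failure detection, the basic heartbeat scheme can be sketched as follows. This is a generic illustration and not the GEMS implementation: the class `GossipNode`, its methods, the timeout value, and the in-process "network" are all hypothetical. Each node increments its own heartbeat, merges tables received from peers by keeping the larger counter, and suspects any peer whose counter has not advanced within a timeout.

```python
class GossipNode:
    """Hypothetical node keeping a heartbeat table: peer -> (counter, last_seen)."""

    def __init__(self, name, peers):
        self.name = name
        self.table = {p: (0, 0) for p in peers}
        self.table[name] = (0, 0)

    def tick(self, now):
        """Advance this node's own heartbeat counter."""
        counter, _ = self.table[self.name]
        self.table[self.name] = (counter + 1, now)

    def merge(self, remote_table, now):
        """Adopt any heartbeat that is newer than the local entry."""
        for peer, (counter, _) in remote_table.items():
            if counter > self.table.get(peer, (-1, 0))[0]:
                self.table[peer] = (counter, now)

    def suspected(self, now, timeout):
        """Peers whose heartbeat has not advanced for more than `timeout`."""
        return sorted(
            peer
            for peer, (_, seen) in self.table.items()
            if peer != self.name and now - seen > timeout
        )


# Nodes a and b gossip with each other; node c has crashed and never ticks.
a = GossipNode("a", ["b", "c"])
b = GossipNode("b", ["a", "c"])
for now in range(1, 6):
    a.tick(now)
    b.tick(now)
    a.merge(b.table, now)
    b.merge(a.table, now)

failed = a.suspected(now=5, timeout=3)  # ['c']
```

A real service would pick gossip partners at random and exchange tables over the network; the direct method calls here stand in for those messages.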
Dimension-exchange algorithms for token distribution on tree-connected architectures
Journal of Parallel and Distributed Computing, 2004
Cited by 2 (0 self)
Load balancing on a multi-processor system involves redistributing tasks among processors so that each processor has roughly the same amount of work to perform. The token-distribution problem is a static variant of the load balancing problem for the case in which the workloads in the system cannot be divided arbitrarily; that is, where each token represents an atomic element of work. A scalable method for distributing tokens over a parallel architecture is the so-called dimension-exchange approach. Our results include improved analysis of two existing dimension-exchange algorithms for token distribution on arbitrary graphs and on arbitrary trees, respectively. In particular, we establish a logarithmic upper bound on the discrepancy of the resulting distribution when the second algorithm is applied to an arbitrary initial distribution on a tree. We then present a new dimension-exchange algorithm for token distribution on trees which, assuming each node knows the number of nodes in the tree, determines a ‘perfectly balanced’ distribution. Furthermore, the rate of convergence is worst-case optimal for trees of bounded degree. Note that an algorithm for token distribution on trees is applicable to arbitrary architectures, since the algorithm can be applied on a spanning tree of any given connected graph.
GEMS: Gossip-Enabled Monitoring Service for Heterogeneous Distributed Systems
http://www.hcs.ufl.edu/pubs/GEMS2002.pdf, submitted to Journal of Network and Systems Management
Cited by 1 (1 self)
Gossip protocols provide a scalable means for detecting failures in heterogeneous distributed systems in an asynchronous manner without the limits associated with group communication. In this paper, we discuss the development and features of a hierarchical Gossip-Enabled Monitoring Service (GEMS), which extends the gossip-style failure detection service to support resource monitoring. By dividing the system into groups of nodes and layers of communication, the GEMS paradigm scales well. Easily extensible, GEMS incorporates facilities for distributing arbitrary system and application-specific data. In this paper we present experiments and analytical projections demonstrating fast response times and low resource utilization requirements, making GEMS a superior solution for resource monitoring issues in distributed computing. Also, we demonstrate the utility of GEMS through the development of a simple dynamic load balancing service for which GEMS forms the information base.

