The design of the Borealis stream processing engine
In CIDR, 2005
Cited by 132 (8 self)
Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both systems in non-trivial and critical ways to provide advanced capabilities that are commonly required by newly-emerging stream processing applications. In this paper, we outline the basic design and functionality of Borealis. Through sample real-world applications, we motivate the need for dynamically revising query results and modifying query specifications. We then describe how Borealis addresses these challenges through an innovative set of features, including revision records, time travel, and control lines. Finally, we present a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.
Simgrid: a Toolkit for the Simulation of Application Scheduling
Proceedings of the First IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001
Cited by 99 (6 self)
Advances in hardware and software technologies have made it possible to deploy parallel applications over increasingly large sets of distributed resources. Consequently, the study of scheduling algorithms for such applications has been an active area of research. Given the nature of most scheduling problems, one must resort to simulation to effectively evaluate and compare their efficacy over a wide range of scenarios. It has thus become necessary to simulate those algorithms for increasingly complex distributed, dynamic, heterogeneous environments. In this paper we present Simgrid, a simulation toolkit for the study of scheduling algorithms for distributed applications. This paper gives the main concepts and models behind Simgrid, describes its API and highlights current implementation issues. We also give some experimental results and describe work that builds on Simgrid's functionalities.
A Practical Approach to Dynamic Load Balancing
1995
Cited by 64 (7 self)
algorithm for load balancing. The following sections elaborate on each step in the above algorithm, presenting various design decisions that one encounters. 2.1 Load Evaluation The efficacy of any load balancing scheme is directly dependent on the quality of load evaluation. Good load measurement is necessary both to determine that a load imbalance exists and to calculate how much work should be transferred to alleviate that imbalance. One can determine the load associated with a given task analytically, empirically or by a combination of those two methods. 2.1.1 Analytic Load Evaluation The load for a task is estimated based on knowledge of the time complexity of the algorithm(s) that task is executing along with the data structures on which it is operating. For example, if one knew that a task involved merge sorting a list of 64 elements, one might estimate the load to be 384, since merge sort is an O(N log_2 N) sorting algorithm, and since 64 log_2(64) ...
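The arithmetic behind the merge sort example in this excerpt is 64 × log_2(64) = 64 × 6 = 384. A minimal sketch of that analytic estimate follows; the function name `merge_sort_load` is hypothetical and is not taken from the cited work, which may define load differently in the truncated text.

```python
import math


def merge_sort_load(n: int) -> float:
    """Analytic load estimate N * log2(N) for merge sorting n elements.

    Hypothetical helper for illustration only; it encodes the
    O(N log_2 N) cost model quoted in the excerpt with a constant of 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * math.log2(n)


if __name__ == "__main__":
    # The 64-element list from the excerpt: 64 * 6 = 384.
    print(merge_sort_load(64))
```

An estimate of this kind only ranks tasks relative to one another; it says nothing about absolute running time.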
SWEB: Towards a Scalable World Wide Web Server on Multicomputers
1996
Cited by 63 (2 self)
We investigate the issues involved in developing a scalable World Wide Web (WWW) server on a cluster of workstations and parallel machines, using the Hypertext Transport Protocol (HTTP). The main objective is to strengthen the processing capabilities of such a server by utilizing the power of multicomputers to match huge demands in simultaneous access requests from the Internet. We have implemented a system called SWEB on a distributed memory machine, the Meiko CS-2, and on networked SUN and DEC workstations. The scheduling component of the system actively monitors the usage of CPU, I/O channels and the interconnection network to effectively distribute HTTP requests across processing units to exploit task and I/O parallelism. We present the experimental results on the performance of this system. Our results indicate that the system delivers good performance on multicomputers and obtains significant improvements over other approaches. 1 Motivation The Scalable Web server (SWEB) project ...
Thread Migration and its Applications in Distributed Shared Memory Systems
Journal of Systems and Software, 1997
Cited by 46 (4 self)
In this paper we describe the way thread migration can be carried out in distributed shared memory (DSM) systems. We discuss the advantages of multi-threading in DSM systems and the importance of preempted dynamic thread migration. The proposed solution is implemented in millipede: an environment for parallel programming over a network of (personal) computers. millipede implements a transparent computation migration mechanism: a mobile computation thread in a millipede application can be suspended at almost every point during its lifetime and be resumed on another host. This mechanism can be used to better utilize system resources and improve performance by balancing the load and solving ping-pong situations of memory objects, and to preserve a user's ownership of his workstation. We describe how some of these are implemented in the millipede system. millipede, including its thread migration module, is fully implemented in user-mode (currently on Windows-NT) using the standard operating system...
Increasing Application Performance in Virtual Environments through Run-time Inference and Adaptation
In Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing (HPDC), 2005
Cited by 33 (13 self)
Virtual machine distributed computing greatly simplifies the use of widespread computing resources by lowering the level of abstraction, benefiting both resource providers and users. Towards that end, our Virtuoso middleware closely emulates the existing process of buying, configuring and using physical machines. Virtuoso's VNET component is a simple and efficient layer two virtual network tool that makes these virtual machines (VMs) appear to be physically connected to the home network of the user while simultaneously supporting arbitrary topologies and routing among them. Virtuoso's VTTIF component continually infers the communication behavior of the application running in a collection of VMs. The combination of overlays like VNET and inference frameworks like VTTIF has great potential to increase the performance, with no user or developer involvement, of existing, unmodified applications by adapting their virtual environments to the underlying computing infrastructure to best suit the applications. We show here how to use the continually inferred application topology and traffic to dynamically control three mechanisms of adaptation (VM migration, overlay topology, and forwarding) to significantly increase the performance of two classes of applications: bulk synchronous parallel applications and transactional web ecommerce applications.
Dynamic Load Balancing in Computational Mechanics
Computer Methods in Applied Mechanics and Engineering
Cited by 31 (2 self)
In many important computational mechanics applications, the computation adapts dynamically during the simulation. Examples include adaptive mesh refinement, particle simulations and transient dynamics calculations. When running these kinds of simulations on a parallel computer, the work must be assigned to processors in a dynamic fashion to keep the computational load balanced. A number of approaches have been proposed for this dynamic load balancing problem. This paper reviews the major classes of algorithms, and discusses their relative merits on problems from computational mechanics. Shortcomings in the state-of-the-art are identified and suggestions are made for future research directions. Key words. dynamic load balancing, parallel computer, adaptive mesh refinement 1. Introduction. The efficient use of a parallel computer requires two, often competing, objectives to be achieved. First, the processors must be kept busy doing useful work. And second, the amount of interprocess...
On Runtime Parallel Scheduling for Processor Load Balancing
IEEE Trans. Parallel and Distributed Systems, 1997
Cited by 22 (0 self)
Parallel scheduling is a new approach for load balancing. In parallel scheduling, all processors cooperate to schedule work. Parallel scheduling is able to accurately balance the load by using global load information at compile-time or runtime. It provides high-quality load balancing. This paper presents an overview of the parallel scheduling technique. Scheduling algorithms for tree, hypercube, and mesh networks are presented. These algorithms can fully balance the load and maximize locality. 1. Introduction Static scheduling balances the workload before runtime and can be applied to problems with a predictable structure, which are called static problems. Dynamic scheduling performs scheduling activities concurrently at runtime, which applies to problems with an unpredictable structure, which are called dynamic problems. Static scheduling utilizes the knowledge of problem characteristics to reach a well-balanced load [1, 2, 3, 4]. However, it is not able to balance the load for dynami...
Automated Parallelization of Discrete State-space Generation
Journal of Parallel and Distributed Computing, 1997
Cited by 21 (2 self)
We consider the problem of generating a large state-space in a distributed fashion. Unlike previously proposed solutions that partition the set of reachable states according to a hashing function provided by the user, we explore heuristic methods that completely automate the process. The first step is an initial random walk through the state space to initialize a search tree, duplicated in each processor. Then, the reachability graph is built in a distributed way, using the search tree to assign each newly found state to classes assigned to the available processors. Furthermore, we explore two remapping criteria that attempt to balance memory usage or future workload, respectively. We show how the cost of computing the global snapshot required for remapping will scale up for system sizes in the foreseeable future. An extensive set of results is presented to support our conclusions that remapping is extremely beneficial. 1 Introduction Discrete systems are frequently analyzed by genera...
Dynamic load balancing of SAMR applications on distributed systems
In Proceedings of the ACM/IEEE Symposium on Supercomputing (SC’01). IEEE Computer, 2001
Cited by 19 (0 self)
Dynamic load balancing (DLB) for parallel systems has been studied extensively; however, DLB for distributed systems is relatively new. To efficiently utilize computing resources provided by distributed systems, an underlying DLB scheme must address both heterogeneous and dynamic features of distributed systems. In this paper, we propose a DLB scheme for Structured Adaptive Mesh Refinement (SAMR) applications on distributed systems. While the proposed scheme can take into consideration (1) the heterogeneity of processors and (2) the heterogeneity and dynamic load of the networks, the focus of this paper is on the latter. The load-balancing process is divided into two phases: global load balancing and local load balancing. We also provide a heuristic method to evaluate the computational gain and redistribution cost for global redistribution. Experiments show that by using our distributed DLB scheme, the execution time can be reduced by 9%-46% compared to using a parallel DLB scheme that does not consider the heterogeneous and dynamic features of distributed systems. [Keywords] dynamic load balancing, distributed systems, adaptive mesh refinement, heterogeneity, dynamic network loads

