Results 1 - 10 of 16
Utopia: a Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems
, 1993
Legion: The Next Logical Step Toward a Nationwide Virtual Computer
, 1994
Cited by 100 (8 self)
The coming of giga-bit networks makes possible the realization of a single nationwide virtual computer comprised of a variety of geographically distributed high-performance machines and workstations. To realize the potential that the physical infrastructure provides, software must be developed that is easy to use, supports large degrees of parallelism in applications code, and manages the complexity of the underlying physical system for the user. This paper describes our approach to constructing and exploiting such "metasystems". Our approach inherits features of earlier work on parallel processing systems and heterogeneous distributed computing systems. In particular, we are building on Mentat, an object-oriented parallel processing system developed at the University of Virginia. This report is a preliminary document. We expect changes to occur as the architecture and design of the system mature.
Managing Multiple Communication Methods in High-Performance Networked Computing Systems
- Journal of Parallel and Distributed Computing
, 1997
Cited by 79 (13 self)
Modern networked computing environments and applications often require---or can benefit from---the use of multiple communication substrates, transport mechanisms, and protocols, chosen according to where communication is directed, what is communicated, or when communication is performed. We propose techniques that allow multiple communication methods to be supported transparently in a single application, with either automatic or user-specified selection criteria guiding the methods used for each communication. We explain how communication link and remote service request mechanisms facilitate the specification and implementation of multimethod communication. These mechanisms have been implemented in the Nexus multithreaded runtime system, and we use this system to illustrate solutions to various problems that arise when implementing multimethod communication. We also illustrate the application of our techniques by describing a multimethod, multithreaded implementation of the Message Pas...
Metasystems: an approach combining parallel processing and heterogeneous distributed computing systems
- J. PARALLEL & DISTRIBUTED COMPUT
Cited by 59 (16 self)
A metasystem is a single computing resource composed of a heterogeneous group of autonomous computers linked together by a network. The interconnection network needed to construct large metasystems will soon be in place. To fully exploit these new systems, software that is easy to use, supports large degrees of parallelism, and hides the complexity of the underlying physical architecture must be developed. In this paper we describe our metasystem vision, our approach to constructing a metasystem testbed, and early experimental results. Our approach combines features from earlier work on both parallel processing systems and heterogeneous distributed computing systems. Using the testbed we have found that data coercion costs are not a serious obstacle to high performance, but that load imbalance induced by differing processor capabilities can limit performance. We then present a mechanism to overcome load imbalance that utilizes user-provided callbacks.
Scale in Distributed Systems
- Readings in Distributed Computing Systems
, 1994
Cited by 46 (1 self)
In recent years, scale has become a factor of increasing importance in the design of distributed systems. The scale of a system has three dimensions: numerical, geographical, and administrative. The numerical dimension consists of the number of users of the system, and the number of objects and services encompassed. The geographical dimension consists of the distance over which the system is scattered. The administrative dimension consists of the number of organizations that exert control over pieces of the system. The three dimensions of scale affect distributed systems in many ways. Among the affected components are naming, authentication, authorization, accounting, communication, the use of remote resources, and the mechanisms by which users view the system. Scale affects reliability: as a system scales numerically, the likelihood that some host will be down increases; as it scales geographically, the likelihood that all hosts can communicate will decrease. Scale also affects perfor...
Campus-Wide Computing: Early Results Using Legion at the University of Virginia
, 1995
Cited by 29 (11 self)
The Legion project at the University of Virginia is an architecture for designing and building system services that provide the illusion of a single virtual machine to users, a virtual machine that provides both improved response time via parallel execution and greater throughput. Legion targets workstation clusters and larger wide area assemblies of workstations, supercomputers, and parallel supercomputers. We have built a working Legion prototype, called the Campus-Wide Virtual Computer (CWVC). The CWVC extends an existing object-oriented parallel processing system by aggressively incorporating lessons learned in the last twenty years of heterogeneous distributed computing. In this paper, we describe the challenges that we overcame to realize a working CWVC, and we characterize the performance of a production biochemistry application. 1. Introduction Computationally demanding applications challenge organizations to provide powerful computing resources. Traditionally, expens...
Multimethod Communication for High-Performance Metacomputing Applications
, 1996
Cited by 20 (8 self)
Metacomputing systems use high-speed networks to connect supercomputers, mass storage systems, scientific instruments, and display devices with the objective of enabling parallel applications to access geographically distributed computing resources. However, experience shows that high performance often can be achieved only if applications can integrate diverse communication substrates, transport mechanisms, and protocols, chosen according to where communication is directed, what is communicated, or when communication is performed. In this article, we describe a software architecture that addresses this requirement. This architecture allows multiple communication methods to be supported transparently in a single application, with either automatic or user-specified selection criteria guiding the methods used for each communication. We describe an implementation of this architecture, based on the Nexus communication library, and use this implementation to evaluate performance i...
A Synopsis of the Legion Project
, 1994
Cited by 20 (0 self)
The coming of giga-bit networks makes possible the realization of a single nationwide virtual computer comprised of a variety of geographically distributed high-performance machines and workstations. To realize the potential that the physical infrastructure provides, software must be developed that is easy to use, supports large degrees of parallelism in applications code, and manages the complexity of the underlying physical system for the user. This short paper briefly describes our approach to constructing and exploiting such "metasystems". Our approach inherits features of earlier work on parallel processing systems and heterogeneous distributed computing systems. In particular, we are building on Mentat, an object-oriented parallel processing system developed at the University of Virginia. A more detailed presentation can be found in technical report CS 94-21, "Legion: The Next Logical Step Towards a Nationwide Virtual Computer".
Process Migration for Heterogeneous Distributed Systems
, 1995
Cited by 12 (0 self)
The policies and mechanisms for migrating processes in a distributed system become more complicated in a heterogeneous environment, where the hosts may differ in their architecture and operating systems. These distributed systems include a large quantity and great diversity of resources which may not be fully utilized without the means to migrate processes to the idle resources. In this paper, we present a graph model for single process migration which can be used for load balancing as well as other non-traditional scenarios such as migration during the graceful degradation of a host. The graph model provides the basis for a layered approach to implementing the mechanisms for process migration in a Heterogeneous Migration Facility (HMF). HMF provides the user with a library to automatically migrate processes and checkpoint data.
Adaptive Utilization of Communication and Computational Resources in High-Performance Distributed Systems: The EMOP Approach
- The 7th International Symposium on High Performance Distributed Computing
, 1998
Cited by 2 (1 self)
Development of high-performance distributed applications can be extremely challenging because of their complex runtime environment coupled with their requirement of high-performance. Such applications typically run on a set of heterogeneous machines with dynamically varying loads, connected by heterogeneous networks possibly supporting a wide variety of communication protocols. In spite of the size and complexity of such applications, they must provide the required high-performance mandated by their users. In order to achieve this goal, they need to adaptively utilize their computational and communication resources. This paper describes EMOP, a programming environment for building high-performance distributed systems. EMOP is designed on the lines of CORBA and uses an Object Request Broker (ORB) to support seamless communication between distributed application components. In order to provide adaptive utilization of communication resources, it uses the principle of Open Implementation t...

