The Grid Protocol: A High Performance Scheme for Maintaining Replicated Data
IEEE Transactions on Knowledge and Data Engineering, 1990
Cited by 108 (4 self)
We present a new protocol for maintaining replicated data that can provide both high data availability and low response time. In the protocol, the nodes are organized in a logical grid. Existing protocols are designed primarily to achieve high availability by updating a large fraction of the copies, which provides some (although not significant) load sharing. In the new protocol, transaction processing is shared effectively among nodes storing copies of the data, and both the response time experienced by transactions and the system throughput are improved significantly. We present an analysis of the availability of the new protocol and use simulation to study the effect of load sharing on the response time of transactions. We also compare the new protocol with a voting-based scheme. This work was supported in part by NSF grants NCR-8604850 and CCR-8806358, and by the University Research Committee of Emory University. 1 Introduction A distributed system consists of cooperating process...
Weak-Consistency Group Communication and Membership
1992
Cited by 92 (7 self)
Many distributed systems for wide-area networks can be built conveniently, and operate efficiently and correctly, using a weak consistency group communication mechanism. This mechanism organizes a set of principals into a single logical entity, and provides methods to multicast messages to the members. A weak consistency distributed system allows the principals in the group to differ on the value of shared state at any given instant, as long as they will eventually converge to a single, consistent value. A group containing many principals and using weak consistency can provide the reliability, performance, and scalability necessary for wide-area systems. I have developed a framework for constructing group communication systems, for classifying existing distributed system tools, and for constructing and reasoning about a particular group communication model. It has four components: message delivery, message ordering, group membership, and the application. Each component may have a different implementation, so that the group mechanism can be tailored to application requirements. The framework supports a new message delivery protocol, called timestamped anti-entropy, which provides reliable, eventual message delivery; is efficient; and tolerates most transient processor and network failures. It can be combined with message ordering implementations that provide ordering guarantees ranging from unordered to total, causal delivery. A new group membership protocol completes the set, providing temporarily inconsistent membership views resilient to up to k simultaneous principal failures. The Refdbms distributed bibliographic database system, which has been constructed using this framework, is used as an example. Refdbms databases can be replicated on many different sites, using the group communication system described here.
Replication Using Group Communication Over a Partitioned Network
1995
Cited by 81 (19 self)
In systems based on the client-server model, a single server may serve many clients, and the heavy load on the server may adversely affect the response time. In such circumstances, replicating data or servers may improve performance. Replication may also improve the availability of information when processors crash or the network partitions. Existing replication methods are often needlessly expensive. They sometimes use point-to-point communication when multicast communication is available; they typically pay the full price of end-to-end acknowledgments for all of the participants for every update; they may claim locks and, therefore, may be vulnerable to faults that can unnecessarily block the system for long periods of time. This thesis presents a new architecture and algorithms for replication over a partitioned network. The architecture is structured into two layers: a replication server and a group communication layer. Each of the replication servers maintains a priva...
Ficus: A Very Large Scale Reliable Distributed File System
University of California, Los Angeles, 1991
Cited by 45 (7 self)
The dissertation presents the issues addressed in the design of Ficus, a large scale wide area distributed file system currently operational on a modest scale at UCLA. Key aspects of providing such a service include toleration of partial operation in virtually all areas; support for large scale, optimistic data replication; and a flexible, extensible modular design. Ficus incorporates a "stackable layers" modular architecture and full support for optimistic replication. Replication is provided by a pair of layers operating in concert above a traditional filing service. A "volume" abstraction and on-the-fly volume "grafting" mechanism are used to manage the large scale file name space. The replication service uses a f...
Dynamic Voting for Consistent Primary Components
1996
Cited by 43 (7 self)
Distributed applications often use quorums in order to guarantee consistency. With emerging world-wide communication technology, many new applications (e.g. conferencing applications and interactive games) wish to allow users to freely join and leave, without restarting the entire system. The dynamic voting paradigm allows such systems to define quorums adaptively, accounting for the changes in the set of participants. Furthermore, dynamic voting was proven to be the most available paradigm for maintaining quorums in unreliable networks. However, the subtleties of implementing dynamic voting were not well understood; in fact, many of the suggested protocols may lead to inconsistencies in case of failures. Other protocols severely limit availability when failures occur during the protocol. In this paper we present a robust and efficient dynamic voting protocol for unreliable asynchronous networks. The protocol consistently maintains the primary component in a distributed system. O...
An Algorithm for Data Replication
Digital Systems Research Center Tech. Rep., 1989
Cited by 29 (4 self)
Replication is an important technique for increasing computer system availability. In this paper, we present an algorithm for replicating stored data on multiple server machines. The algorithm organizes the replicated servers in a master/slaves scheme, with one master election being performed at the beginning of each service period. The status of each replica is summarized by a set of monotonically increasing epoch variables. Examining the epoch variables of a majority of the replicas reveals which replicas have up-to-date data. The set of replicas can be changed dynamically. Replicas that have been off-line can be brought up to date in the background, and witness replicas, which store the epoch variables but not the data, can participate in the majority voting. The algorithm does not require distributed atomic transactions. The algorithm also permits client machines to cache copies of data, with strict cache consistency being ensured by having the replicated servers keep track of which cl...
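The majority check mentioned in this abstract can be illustrated with a small Python sketch. It is a deliberate simplification, not the report's algorithm: it assumes each replica reports a single integer epoch (the abstract speaks of a set of epoch variables) and that every epoch change was recorded at a majority of replicas. The function name `up_to_date_replicas` and the replica names are hypothetical.

```python
from typing import Dict, Optional, Set


def up_to_date_replicas(reports: Dict[str, int],
                        total_replicas: int) -> Optional[Set[str]]:
    """Return the replicas reporting the highest epoch.

    `reports` maps a replica name to the single epoch number it reported
    (an assumed simplification). Returns None when fewer than a strict
    majority of the `total_replicas` replicas answered, because the newest
    epoch might then be missing from the sample.
    """
    if 2 * len(reports) <= total_replicas:
        return None  # no majority: cannot decide safely
    newest = max(reports.values())
    return {name for name, epoch in reports.items() if epoch == newest}


if __name__ == "__main__":
    # Three of five replicas answered; "b" and "c" hold the newest epoch.
    print(up_to_date_replicas({"a": 3, "b": 5, "c": 5}, total_replicas=5))
```

Under the stated assumption, any majority overlaps the majority that recorded the latest epoch, so the highest epoch seen in the sample is the current one.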
Generic Broadcast
In 13th Intl. Symposium on Distributed Computing (DISC’99), 1999
Cited by 26 (8 self)
Message ordering is a fundamental abstraction in distributed systems. However, usual ordering guarantees are purely "syntactic", that is, message "semantics" is not taken into consideration, despite the fact that in several cases, semantic information about messages leads to more efficient message ordering protocols. In this paper we define the Generic Broadcast problem, which orders the delivery of messages only if needed, based on the semantics of the messages. Semantic information about the messages is introduced in the system by a conflict relation defined over messages. We show that Reliable and Atomic Broadcast are special cases of Generic Broadcast, and propose an algorithm that solves Generic Broadcast efficiently. In order to assess efficiency, we introduce the concept of delivery latency.
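To make the role of the conflict relation concrete, here is a minimal Python sketch. It does not implement the paper's algorithm; it only checks whether two delivery sequences agree on the relative order of conflicting messages. The assumptions are that messages are unique strings, that an empty relation corresponds to Reliable Broadcast and an all-pairs relation to Atomic Broadcast, and that `writes_conflict` is a hypothetical application-level relation. All names are illustrative.

```python
from itertools import combinations
from typing import Callable, Sequence

Conflict = Callable[[str, str], bool]


def never_conflict(a: str, b: str) -> bool:
    """Empty relation: no pair needs ordering (assumed Reliable Broadcast case)."""
    return False


def always_conflict(a: str, b: str) -> bool:
    """Every pair conflicts (assumed Atomic Broadcast case)."""
    return True


def writes_conflict(a: str, b: str) -> bool:
    """Hypothetical relation: two reads commute, any pair with a write conflicts."""
    return a.startswith("write") or b.startswith("write")


def orders_agree(seq1: Sequence[str], seq2: Sequence[str],
                 conflict: Conflict) -> bool:
    """True if both sequences deliver each conflicting pair in the same order."""
    pos1 = {m: i for i, m in enumerate(seq1)}
    pos2 = {m: i for i, m in enumerate(seq2)}
    common = [m for m in seq1 if m in pos2]
    for a, b in combinations(common, 2):
        if conflict(a, b) and (pos1[a] < pos1[b]) != (pos2[a] < pos2[b]):
            return False
    return True


if __name__ == "__main__":
    p = ["read:x", "read:y", "write:x"]
    q = ["read:y", "read:x", "write:x"]
    print(orders_agree(p, q, writes_conflict))   # reads may be reordered
    print(orders_agree(p, q, always_conflict))   # total order is violated
```

In this example the two processes deliver the reads in different orders, which is acceptable under `writes_conflict` but not when every pair of messages conflicts.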
Replicated Data Management in Mobile Environments: Anything New Under the Sun?
In Proceedings of the IFIP Conference on Applications in Parallel and Distributed Computing, 1994
Cited by 24 (1 self)
The mobile wireless computing environment of the future will contain large numbers of low-powered palmtop machines. Replication will be an essential technique in this environment, providing data availability to the system. In a mobile environment it is important to have dynamic replicated data management algorithms that allow, for instance, copies to migrate from one site to another or new copies to be generated. In this paper we show that such dynamic algorithms can be obtained simply by letting transactions update the directory that specifies sites holding copies. Thus we argue that no fundamentally new algorithms are needed to cope with mobility. However, existing algorithms may have to be "tuned" for a mobile environment, and we discuss what this may entail. As an illustration, we present a variation of the primary copy algorithm, Primary By Row, that is well suited for migrating copies. Keywords: Distributed Data Bases, replication, mobility, availability. 1 Intr...
Optimizing Vote and Quorum Assignments for Reading and Writing Replicated Data
IEEE Transactions on Knowledge and Data Engineering, 1989
Cited by 18 (1 self)
In the weighted voting protocol, which is used to maintain the consistency of replicated data, the availability of the data to read and write operations depends not only on the availability of the nodes storing the data but also on the vote and quorum assignments used. We consider the problem of determining the vote and quorum assignments that yield the best performance in a distributed system where node availabilities can be different and the mix of the read and write operations is arbitrary. The optimal vote and quorum assignments depend not only on the system parameters such as node availability and operation mix, but also on the performance measure. We present an enumeration algorithm that can be used to find the vote and quorum assignments that need to be considered for achieving optimal performance. When the performance measure is data availability, an analytical method is derived to evaluate it for any vote and quorum assignment. This method and the enumeration algorithm is used ...
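The dependence of availability on the vote and quorum assignment can be illustrated with a brute-force Python sketch. This is not the analytical method or the enumeration algorithm of the paper; it simply sums over all up/down node states. It assumes independent node failures, ignores network partitions, and uses the usual weighted-voting constraints r + w > V and 2w > V (V being the total number of votes), which the abstract itself does not state. The function names are hypothetical.

```python
from itertools import product
from typing import Sequence


def quorum_availability(votes: Sequence[int], up_prob: Sequence[float],
                        quorum: int) -> float:
    """Probability that the nodes that are up hold at least `quorum` votes."""
    total = 0.0
    for state in product([True, False], repeat=len(votes)):
        prob = 1.0
        collected = 0
        for v, a, up in zip(votes, up_prob, state):
            prob *= a if up else (1.0 - a)
            if up:
                collected += v
        if collected >= quorum:
            total += prob
    return total


def overall_availability(votes: Sequence[int], up_prob: Sequence[float],
                         r: int, w: int, read_fraction: float) -> float:
    """Availability weighted by the read/write mix, for read quorum r and write quorum w."""
    total_votes = sum(votes)
    if r + w <= total_votes or 2 * w <= total_votes:
        raise ValueError("quorums must satisfy r + w > V and 2w > V")
    return (read_fraction * quorum_availability(votes, up_prob, r)
            + (1.0 - read_fraction) * quorum_availability(votes, up_prob, w))


if __name__ == "__main__":
    votes, up = [1, 1, 1], [0.9, 0.9, 0.9]
    # Majority quorums versus read-one/write-all for a read-heavy mix.
    print(overall_availability(votes, up, r=2, w=2, read_fraction=0.9))
    print(overall_availability(votes, up, r=1, w=3, read_fraction=0.9))
```

With three equally weighted nodes that are each up 90% of the time, a quorum of 2 is available with probability about 0.972, a quorum of 1 with about 0.999, and a quorum of 3 with about 0.729, so the better assignment depends on the operation mix.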
Accessing Replicated Data in a Large-Scale Distributed System
International Journal in Computer Simulation, 1991
Cited by 18 (8 self)
Replicating a data object improves the availability of the data, and can improve access latency by locating copies of the object near to their use. When accessing replicated objects across an internetwork, the time to access different replicas is non-uniform. Further, the probability that a particular replica is inaccessible is much higher in an internetwork than in a local-area network (LAN) because of partitions and the many intermediate hosts and networks that can fail. We report three replica-accessing algorithms which can be tuned to minimize either access latency or the number of messages sent. These algorithms assume only an unreliable datagram mechanism for communicating with replicas. Our work extends previous investigations into the performance of replication algorithms by assuming unreliable communication. We have investigated the performance of these algorithms by measuring the communication behavior of the Internet, and by building discrete-event simulations based on our m...

