Results 1 - 10 of 562
Fast Crash Recovery in RAMCloud
- In Proc. of SOSP’11, 2011
Abstract - Cited by 83 (3 self)
RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel
Crash Recovery in FAST FTL
"... Abstract. NAND flash memory is one of the non-volatile memories and has been replacing hard disk in various storage markets from mobile devices, PC/Laptop computers, even to enterprise servers. However, flash memory does not allow in-place-update, and thus a block should be erased before overwritin ..."
Abstract
metadata (including address mapping information) as well as data from the crash. In general, the FTL layer is responsible for the crash recovery. In this paper, we propose a novel crash recovery scheme for FAST, a hybrid address mapping FTL. It writes periodically newly generated address mapping
Crash recovery with little overhead
- In Proceedings of the 11th International Conference on Distributed Computing Systems (ICDCS-11), 1991
Abstract - Cited by 48 (0 self)
Recovering from processor failures in distributed systems is an important problem in the design and development of reliable systems. Several solutions to this problem have been presented in the literature. Most of them recover from failures by storing sufficient extra information in stable storage and using this information when there are failures. In this paper, we present two solutions to this problem which involve very little overhead. Without appending any information to the messages of the application program, we show that it is possible to recover from failures using O(|V||E|) messages, where |V| is the number of processors and |E| is the number of communication links in the system. The second algorithm can be used to recover from processor failures without forcing non-faulty processors to roll back under certain conditions. With a small modification, the second algorithm can also be used to recover from processor failures even if no stable storage is available.
Crash Recovery in Client-Server EXODUS
- In Proceedings of ACM-SIGMOD 1992 International Conference on Management of Data, 1992
Abstract - Cited by 32 (7 self)
In this paper, we address the correctness and performance issues that arise when implementing logging and crash recovery in a page-server environment. The issues result from two characteristics of page-server systems: 1) the fact that data is modified and cached in client database buffers
Failure Detection and Consensus in the Crash-Recovery Model
- 1999
Abstract - Cited by 123 (9 self)
We study the problems of failure detection and consensus in asynchronous systems in which processes may crash and recover, and links may lose messages. We first propose new failure detectors that are particularly suitable to the crash-recovery model. We next determine under what conditions stable
Easy consensus algorithms for the crash-recovery model
- 2008
Abstract - Cited by 1 (1 self)
In the crash-recovery failure model of asynchronous distributed systems, processes can temporarily stop to execute steps and later restart their computation from a predefined local state. The crash-recovery model is much more realistic than the crash-stop failure model in which processes
Modular Consensus Algorithms for the Crash-Recovery Model
Abstract
In the crash-recovery failure model of asynchronous distributed systems, processes can temporarily stop to execute steps and later restart their computation from a predefined local state. The crash-recovery model is much more realistic than the crash-stop failure model in which processes
Quality of Service of Crash-Recovery Failure Detectors
- 2007
"... This thesis presents the results of an investigation into the failure detection problem. We consider the specific case of the Quality of Service (QoS) of crash failure detection. In contrast to previous work, we address the crash failure detection problem when the monitored target is resilient and r ..."
Abstract - Cited by 1 (0 self)
and recovers after failure. To the best of our knowledge, this is the first work to provide an analysis of crash-recovery failure detection from the QoS perspective. We develop a probabilistic model of the behavior of a crash-recovery target, i.e. one which has the ability to recover from the crash state. We
Optimistic crash recovery without changing application messages
- In IEEE Transactions on Parallel and Distributed Systems, 1997
Abstract - Cited by 13 (1 self)
We present an optimistic crash recovery technique without any communication overhead during normal operations of the distributed system. Our technique does not append any information to the application messages, it does not suffer from the domino effect, and each processor rolls back