Results 1 - 10 of 3,639
A fault tolerant implementation of Multi-Level Monte Carlo Methods
, 2014
"... The theory behind fault tolerant multi-level Monte Carlo (FT-MLMC) methods was recently developed and tested. These tests were made without a real fault tolerant implementation. We implemented an MPI-parallelized fault tolerant MLMC version of an existing parallel MLMC code (ALSVID-UQ). It is based ..."
Cited by 2 (1 self)
An adaptive, fault-tolerant implementation of BSP for Java-based volunteer computing systems
- In Proceedings of IPPS Workshop on Java for Parallel and Distributed Computing, volume 1586 of Lecture Notes in Computer Science
, 1999
"... Abstract. In recent years, there has been a surge of interest in Java-based volunteer computing systems, which aim to make it possible to build very large parallel computing networks very quickly by enabling users to join a parallel computation by simply visiting a web page and running a Java applet ..."
Cited by 10 (0 self)
to be used in dynamic environments. We show how we have implemented this model using the Bayanihan software framework to enable programmers to port the growing base of BSP-based parallel applications to Java while achieving adaptive parallelism and protection against both the random faults and intentional
Fault-Tolerant Implementation of Finite-State Automata in Recurrent Neural Networks
, 1995
"... Recently, we have proven that the dynamics of any deterministic finite-state automata (DFA) with n states and m input symbols can be implemented in a sparse second-order recurrent neural network (SORNN) with n + 1 state neurons and O(mn) second-order weights and sigmoidal discriminant functions [5] ..."
Cited by 2 (2 self)
We investigate how that constructive algorithm can be extended to fault-tolerant neural DFA implementations where faults in an analog implementation of neurons or weights do not affect the desired network performance. We show that tolerance to weight perturbation can be achieved easily; tolerance
Fault-tolerant Implementations of regular Registers by safe Registers in Link Model
, 2008
"... A network that uses locally shared registers can be modelled by a graph where nodes represent processors and there is an edge between two nodes if and only if the corresponding processors communicate directly by reading or writing registers shared between them. Two variants are defined by A variant ..."
Cited by 3 (0 self)
conditions and with what fault-tolerance guarantees it is possible to transform a solution under one of these models into a solution under models. The fault tolerant properties we consider are self-stabilization and wait-freedom. Our principal result is a wait-free and self-stabilizing compiler from
Fault-tolerant implementations of atomic-state communication model for distributed computing
- In DISC’07, the 21st International Symposium on Distributed Computing, Springer LNCS:4731
, 2007
"... There is a proliferation of models for distributed computing, consisting of both shared memory and message passing paradigms. Different communities adopt different variants as the "standard" model for their research setting. Since subtle changes in the communication model can result in significant ..."
Cited by 1 (1 self)
significant changes to the solvability/unsolvability or to the complexity of various problems, it becomes imperative to understand the relationships between the many models. The situation becomes even more complicated when additional requirements such as fault-tolerance are added to the mix. This motivates us
High Performance Linpack Benchmark: A Fault Tolerant Implementation without Checkpointing
- In Proceedings of the 25th ACM International Conference on Supercomputing (ICS 2011). ACM
"... The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For long running applications using a large number of processors, it is essential that fault tolerance be used to prevent a to ..."
Cited by 23 (1 self)
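The scaling claim in this abstract can be illustrated with a short worked formula. This is a sketch under assumptions that are not taken from the paper: each of the $N$ processors fails independently, with an exponentially distributed time to failure of rate $\lambda$, and the computation runs for time $T$.

```latex
% Assumed model (not from the paper): N independent processors,
% each with exponential failure rate \lambda, job length T.
\begin{align*}
P(\text{one given processor survives until } T) &= e^{-\lambda T} \\
P(\text{all } N \text{ processors survive until } T) &= \left(e^{-\lambda T}\right)^{N} = e^{-N \lambda T} \\
P(\text{at least one failure before } T) &= 1 - e^{-N \lambda T}
\end{align*}
```

Under this model the failure probability grows monotonically with both $N$ and $T$ and approaches 1 for large $N \lambda T$, which matches the qualitative statement in the abstract.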
Analytically Redundant Controllers for Fault Tolerance: Implementation with Separation of Concerns
"... Abstract-Diversity or redundancy based software fault tolerance encompasses the development of application domain specific variants and error detection mechanisms. In this regard, this paper presents an analytical design strategy to develop the variants for a fault tolerant real-time control system ..."
system. This work also presents a generalized error detection mechanism based on the stability performance of a designed controller using the Lyapunov Stability Criterion. The diverse redundant fault tolerance is implemented with an aspect oriented compiler to separate and thus reduce this additional
Fault Tolerant Implementation of Peer-to-Peer Distributed Iterative Algorithms
"... Abstract—Fault tolerance issues related to the implementation of distributed iterative algorithms via the P2PDC peer-to-peer distributed computing environment are considered. P2PDC is a decentralized environment dedicated to task parallel applications. It has been designed more particularly for the ..."
Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial
- ACM Computing Surveys
, 1990
"... The state machine approach is a general method for implementing fault-tolerant services in distributed systems. This paper reviews the approach and describes protocols for two different failure models--Byzantine and fail-stop. System reconfiguration techniques for removing faulty components and i ..."
Cited by 975 (9 self)
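The core idea of the state machine approach can be sketched in a few lines of Python. This is only an illustration, not code from the tutorial: the `Replica` class, the key-value commands, and the `run` helper are hypothetical, and the sketch assumes that all replicas already receive the same commands in the same order (providing that agreement and ordering is the hard part the protocols address). With 2t + 1 deterministic replicas, a majority vote on the outputs masks up to t arbitrarily faulty ones.

```python
from collections import Counter


class Replica:
    """Deterministic state machine: a tiny key-value store (hypothetical example)."""

    def __init__(self, faulty=False):
        self.state = {}
        self.faulty = faulty

    def apply(self, command):
        op, key, value = command
        if op == "set":
            self.state[key] = value
        result = self.state.get(key)
        # A faulty replica is modelled as returning an arbitrary answer.
        return "garbage" if self.faulty else result


def run(replicas, commands):
    """Feed the same ordered commands to every replica and vote on each output."""
    outputs = []
    for command in commands:
        votes = Counter(replica.apply(command) for replica in replicas)
        value, count = votes.most_common(1)[0]
        if count <= len(replicas) // 2:
            raise RuntimeError("no majority: too many faulty replicas")
        outputs.append(value)
    return outputs


# Three replicas tolerate one faulty replica (2t + 1 with t = 1).
replicas = [Replica(), Replica(faulty=True), Replica()]
results = run(replicas, [("set", "x", 1), ("get", "x", None), ("set", "x", 5)])
```

Because every correct replica starts in the same state and applies the same deterministic commands, the correct replicas always agree, so the voted output is the correct one as long as they form a majority.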
Pregel: A system for large-scale graph processing
- In SIGMOD
, 2010
"... Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs—in some cases billions of vertices, trillions of edges—poses challenges to their efficient processing. In this paper we present a computational model ..."
Cited by 496 (0 self)
is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind
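The synchronous, vertex-centric style of computation that this abstract alludes to can be sketched as a single-process Python loop. This is a simplified illustration under stated assumptions, not Pregel's actual API: the function `pregel_max` and its dictionary-based graph format are hypothetical, and distribution, checkpointing, and fault tolerance are left out entirely. The example propagates the maximum vertex value through the graph, one superstep at a time.

```python
def pregel_max(graph, values):
    """Propagate the maximum value to every reachable vertex.

    graph: vertex -> list of out-neighbours (every vertex must be a key).
    values: vertex -> initial value.
    """
    values = dict(values)
    # Superstep 0: every vertex is active and sends its value to its neighbours.
    inbox = {v: [] for v in graph}
    for v, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[v])
    # Later supersteps: run until no messages are in flight.
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v, messages in inbox.items():
            if not messages:
                continue  # vertex stays halted when it receives nothing
            best = max(messages)
            if best > values[v]:
                values[v] = best
                for n in graph[v]:
                    outbox[n].append(best)
        inbox = outbox  # barrier: messages become visible in the next superstep
    return values


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
result = pregel_max(graph, {"a": 3, "b": 6, "c": 2})
```

Each vertex only reads messages sent in the previous superstep, which is the synchronicity that makes such programs easier to reason about.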