Failure Mode Assumptions and Assumption Coverage
, 1995
Cited by 114 (4 self)
A method is proposed for the formal analysis of failure mode assumptions and for the evaluation of the dependability of systems whose design correctness is conditioned on the validity of such assumptions. Formal definitions are given for the types of errors that can affect items of service delivered by a system or component. Failure mode assumptions are then formalized as assertions on the types of errors that a component may induce in its enclosing system. The concept of assumption coverage is introduced to relate the notion of partially ordered assumption assertions to the quantification of system dependability. Assumption coverage is shown to be extremely important in systems requiring very high dependability. It is also shown that the need to increase system redundancy to accommodate more severe modes of component failure can sometimes result in a decrease in dependability.
1 Introduction and Overview
The definition of assumptions about the types of faults, the rate at which comp...
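As a rough illustration of why assumption coverage matters at very high dependability targets, the relationship can be sketched with the law of total probability. This is a simplification and not the paper's own formulation: the symbols A (the event that the failure mode assumption holds) and p_A (its probability, standing in informally for coverage) are chosen here for illustration.

```latex
\[
\Pr(\text{correct})
  = \Pr(\text{correct} \mid A)\, p_A
  + \Pr(\text{correct} \mid \bar{A})\,(1 - p_A)
  \;\ge\; \Pr(\text{correct} \mid A)\, p_A .
\]
```

If the design is only shown to be correct when A holds, the right-hand side is the part of the dependability that can actually be argued for, and it can never exceed p_A. Under this reading, a system aiming at a failure probability of, say, 10^{-9} needs an assumption whose probability of being violated is of the same order or smaller.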
Closure and Convergence: A Foundation of Fault-Tolerant Computing
- IEEE Transactions on Software Engineering
, 1993
Cited by 103 (28 self)
We give a formal definition of what it means for a system to "tolerate" a class of "faults". The definition consists of two conditions: One, if a fault occurs when the system state is within a set of "legal" states, the resulting state is within some larger set and, if faults continue occurring, the system state remains within that larger set (Closure). And two, if faults stop occurring, the system eventually reaches a state within the legal set (Convergence). We demonstrate the applicability of our definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems. Further, using the definition, we obtain a simple classification of fault-tolerant systems and discuss methods for their systematic design.
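The two conditions in the abstract above can be checked mechanically on a small finite-state model. The following is a minimal sketch under stated assumptions, not the authors' formalism: the toy system (a counter that the program drains towards 0 and that faults push upwards, capped at 3), the names `program_step`, `fault_step`, `LEGAL`, `FAULT_SPAN`, and the helper functions are all hypothetical. It also assumes, as an extra condition not stated in the abstract, that the legal set is closed under program steps.

```python
from typing import Callable, Dict, Hashable, Iterable, Set

State = Hashable
Step = Callable[[State], Iterable[State]]

# Hypothetical toy system: the program decrements a counter towards 0,
# a fault increments it (capped at 3).
def program_step(s: int) -> Iterable[int]:
    return [max(s - 1, 0)]

def fault_step(s: int) -> Iterable[int]:
    return [min(s + 1, 3)]

LEGAL: Set[int] = {0}                # the set of "legal" states
FAULT_SPAN: Set[int] = {0, 1, 2, 3}  # the "larger set" reachable under faults

def is_closed(states: Set[State], *steps: Step) -> bool:
    """Closure: no given step leads from inside `states` to outside it."""
    return all(nxt in states for s in states for step in steps for nxt in step(s))

def converges(start: Set[State], target: Set[State], step: Step) -> bool:
    """Convergence: every execution of `step` from `start` reaches `target`.

    On a finite system this fails only if some path outside `target`
    hits a cycle or a state with no successor.
    """
    status: Dict[State, str] = {}

    def visit(s: State) -> bool:
        if s in target or status.get(s) == "done":
            return True
        if status.get(s) == "active":   # cycle that avoids the target
            return False
        status[s] = "active"
        successors = list(step(s))
        ok = bool(successors) and all(visit(n) for n in successors)
        if ok:
            status[s] = "done"
        return ok

    return all(visit(s) for s in start)

tolerates = (
    is_closed(LEGAL, program_step)                      # legal set is stable without faults
    and is_closed(FAULT_SPAN, program_step, fault_step) # Closure under program and faults
    and converges(FAULT_SPAN, LEGAL, program_step)      # Convergence once faults stop
)
print("tolerates faults:", tolerates)
```

In this toy model all three checks succeed, so the counter "tolerates" the increment fault in the sense sketched above. Shrinking `FAULT_SPAN` so that a fault can leave it makes the closure check fail.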
Safety Tactics for Software Architecture Design
- in Proceedings of the 28th Annual International Computer Software and Applications Conference, (Hong Kong, 2004), IEEE Computer Society
, 2004
Cited by 7 (2 self)
The influence of architecture in assurance of system safety is being increasingly recognised in mission-critical software applications. Nevertheless, most architectural strategies have not been developed to the extent necessary to ensure safety of these systems. Moreover, many software safety standards fail to discuss the rationale behind the adoption of alternative architectural mechanisms. Safety has not been explicitly considered by existing software architecture design methodologies. As a result, there is little practical guidance on how to address safety concerns in ‘shaping’ a ‘safe’ software architecture. This paper presents a method for software architecture design within the context of safety. This method is centred upon extending the existing notion of architectural tactics to include safety as a consideration. The approach extends existing software architecture design methodologies and demonstrates the true value of deployment of specific protection mechanisms. The feasibility of this method is demonstrated by an example.
Safety-Directed System Monitoring Using Safety Cases
, 2000
Cited by 5 (0 self)
Currently, the safety studies of the system (which are also collectively known as the safety case) cease or reduce in their utility after system certification, and with that, a vast amount of knowledge about the failure (or safe) behaviour of the system is usually rendered useless. In this thesis, we argue that this knowledge could be usefully exploited in the context of an appropriate on-line safety monitoring scheme. As a practical application of our approach, we propose a safety monitor that operates on safety cases to support the on-line detection and control of hazardous failures in safety critical systems. Firstly,
New Directions In Software Safety: Causal Modelling As An Aid To Integration
Cited by 3 (2 self)
Analysis of software safety can provide us with much interesting data on potential failure modes of individual software components and of the effects of these failures on the system as a whole. In this paper we describe our approach to software safety analysis, based around integrating notations with diverse causal models, and how we believe it can be used as an aid to the design process. We introduce the Failure Propagation and Transformation Notation (FPTN) as a modelling tool which can be used throughout the software lifecycle as an aid to design and implementation of safe systems. This paper expands upon material presented at the November 1992 IEE Colloquium on Hazard Analysis [Fenelon92].
Footnote: The SSAP Project on which Peter Fenelon was employed during the preparation of this paper was funded by British Aerospace Defence (Military Aircraft Division), Warton, Lancs. He is now employed on the ASAM-II Project at York.
Integrating Safety Analysis Techniques, Supporting Identification of Common Cause Failures
, 2000
Cited by 3 (0 self)
When we apply safety analysis techniques on a new design, our primary objective is to malfunctions. The ultimate aim is to identify weak areas of the design and stimulate design iterations that improve the safety of the system under examination. Unfortunately, current industrial practice shows that this aim is seriously hindered by the lack of appropriate techniques for the analysis of complex hierarchical designs.
Replication for Fault Tolerant Software Using a Functional and Attribute Grammar Based Computation Model
- PhD thesis, School of Information Science, Japan Advanced Institute of Science and Technology
, 1998
Cited by 1 (0 self)
As people's reliance on computer systems increases, it is of primary importance for these systems to be dependable. This new dependability requirement increases the need for the development of fault tolerant software. Designing and implementing fault tolerant software is a difficult task, especially when implementing fault tolerant parallel software. Only a few programming languages support fault tolerance and parallel programming. These languages are based on an imperative language paradigm. Most fault tolerance techniques are developed for such a language paradigm. The imperative language paradigm increases system complexity. Novel fault tolerance techniques for the implementation of fault tolerant software based on a different language paradigm have to be developed in order to decrease system complexity and increase its performance. This dissertation presents a novel replication technique for implementing fault tolerant parallel software based on a declarative language paradigm. The repli...
Towards Integrated Safety Analysis and Design
- ACM Applied Computing Review
, 1994
Cited by 1 (0 self)
There are currently many problems with the development and assessment of software intensive safety-critical systems. In this paper we describe the problems, and introduce a novel approach to their solution, based around goal-structuring concepts, which we believe will ameliorate some of the difficulties. We discuss the use of modified and new forms of safety assessment notations to provide evidence of safety, and the use of data derived from such notations as a means of providing quantified input into the design assessment process. We then show how the design assessment can be partially automated, and from this develop some ideas on how we might move from analytical to synthetic approaches, using safety criteria and evidence as a fitness function for comparing alternative automatically generated designs.
Keywords: safety assessment, architectural design, goal structures, method integration, automated design
Introduction
Much current industrial practice in the design and assessment of...
Algorithms for Building Fault-Tolerant Distributed Systems
, 1997
Table of contents (excerpt):
Notation
Chapter 1 Introduction
1.1 Active replication and checkpointing
1.2 Issues in fault tolerance
1.3 Contributions
1.3.1 Failure detection
1.3.2 Predicate detection
1.3.3 Recovery
1.3.4 Optimistic coordination
1.4 Architecture
1.4.1 Operation
1.4.2 Layer descriptions
1.4.3 Example of use
1.5 Outline
Chapter 2 Model
2.1...

