Modular Certification, 2002
Cited by 12 (2 self)
Airplanes are certified as a whole: there is no established basis for separately certifying some components, particularly software-intensive ones, independently of their specific application in a given airplane. The absence of separate certification inhibits the development of modular components that could be largely "precertified" and used in several different contexts within a single airplane, or across many different airplanes.
Reliability Simulation of Component-Based Software Systems, 1998
Cited by 12 (2 self)
Prevalent Markovian and semi-Markovian methods to predict the reliability and performance of component-based heterogeneous systems suffer from several limitations: they are subject to an intractably large state-space for more involved scenarios, and they cannot take into account, in a single model, the influence of various parameters such as reliability growth of the individual components, dependencies among the components, etc. Discrete-event simulation, on the other hand, offers an attractive alternative to analytical models, as it can capture a detailed system structure and can be used to study the influence of different factors on dependability measures, both separately and in combination. In this paper we demonstrate the flexibility offered by discrete-event simulation to analyze such complex systems through two case studies, one of a terminating application, and the other of a real-time application with feedback control. We simulate the failure behavior of the terminating a...
Analysis of Faults Detected in a Large-Scale Multi-Version Software Development Experiment
- Proc. DASC '90
Cited by 8 (3 self)
Twenty programs were built to the same specification of an inertial navigation problem. The programs were then subjected to a three-phase testing and debugging process: an acceptance test, a certification test, and an operational test. Fewer than 20% of the faults discovered during the certification and operational testing were non-unique, i.e., the same or very similar faults were found in more than one program. However, some of these "common" faults spanned as many as half of the versions. Faults discovered during the certification testing were due to specification errors and ambiguities, inadequate programmer background knowledge, insufficient programming experience, incomplete analysis, and insufficient acceptance testing. Faults discovered during the operational testing were of a more subtle nature, and were mostly due to various programmer knowledge defects and incomplete analysis errors. Techniques that may be used to avoid the observed fault types are discussed.
An empirical study on testing and fault tolerance for software reliability engineering
- In Proceedings 14th IEEE International Symposium on Software Reliability Engineering (ISSRE’2003), 2003
Cited by 7 (4 self)
Software testing and software fault tolerance are two major techniques for developing reliable software systems, yet limited empirical data are available in the literature to evaluate their effectiveness. We conducted a major experiment in which 34 programming teams independently developed multiple software versions for an industry-scale critical flight application, and we collected the faults detected in these program versions. To evaluate the effectiveness of software testing and software fault tolerance, mutants were created by injecting real faults that occurred in the development stage. The nature, manifestation, detection, and correlation of these faults were carefully investigated. The results show that coverage testing is generally an effective means of detecting software faults, but the effectiveness of testing coverage is not equivalent to that of mutation coverage, which is a more truthful indicator of testing quality. We also found that the number of identical faults found across versions is very limited. This result supports software fault tolerance by design diversity as a creditable approach for software reliability engineering. Finally, we applied a domain analysis approach to test case generation and concluded that it is a promising technique for software testing.
Software Verification and System Assurance, 2009
Cited by 5 (2 self)
Littlewood [1] introduced the idea that software may be possibly perfect and that we can contemplate its probability of (im)perfection. We review this idea and show how it provides a bridge between correctness, which is the goal of software verification (and especially formal verification), and the probabilistic properties such as reliability that are the targets for system-level assurance. We enumerate the hazards to formal verification, consider how each of these may be countered, and propose relative weightings that an assessor may employ in assigning a probability of perfection.
An Empirical Evaluation of Consensus Voting and Consensus Recovery Block Reliability in the Presence of Failure Correlation
- Journal of Computer and Software Engineering, 1993
Cited by 3 (3 self)
The reliability of fault-tolerant software system implementations, based on Consensus Voting and Consensus Recovery Block strategies, is evaluated using a set of independently developed functionally equivalent versions of an avionics application. The strategies are studied under conditions of high inter-version failure correlation, and with program versions of medium-to-high reliability. Comparisons are made with classical N-Version Programming that uses Majority Voting, and with Recovery Block strategies. The empirical behavior of the three schemes is found to be in good agreement with theoretical analyses and expectations. In this study Consensus Voting and Consensus Recovery Block based systems were found to perform better, and more uniformly, than corresponding traditional strategies, that is, Recovery Block and N-Version Programming that use Majority Voting. This is the first experimental evaluation of the system reliability provided by Consensus Voting, and the first experimental...
Modeling Execution Time of Multi-Stage N-Version Fault-Tolerant Software
- Fault-Tolerant Software Systems: Techniques and Applications (Hoang Pham, ed), IEEE Computer Society Press, Los Alamitos, 1992
Cited by 2 (0 self)
An N-version system can be subdivided into stages for the purpose of forward error recovery through voting after each stage. In the simplest case at each stage the whole system waits for the slowest version to finish before a vote is taken. A better solution is to use a scheme we call Expedient Voting in which the voting takes place as soon as an adequate number of components have finished in a stage. The concept of a "runahead" is introduced: the faster versions are allowed to run ahead of the rest of the slower versions by one or more stages, with synchronized re-start in the event of a failure. If the versions are highly reliable, inter-version failure dependence is small, and the difference between the fastest and the slowest successful components in each stage is large, then the execution speed-up through Expedient Voting may be substantial. Runaheads exceeding 3 stages offer diminishing returns. Speed-up deteriorates with reduction in the version reliability and independence. Th...
Reasoning about the Reliability of Diverse Two-Channel Systems in which One Channel is “Possibly Perfect”, 2009
Cited by 1 (0 self)
This report refines and extends an earlier paper by the first author [25]. It considers the problem of reasoning about the reliability of fault-tolerant systems with two “channels” (i.e., components) of which one, A, because it is conventionally engineered and presumed to contain faults, supports only a claim of reliability, while the other, B, by virtue of extreme simplicity and extensive analysis, supports a plausible claim of “perfection.” We begin with the case where either channel can bring the system to a safe state. The reasoning about system probability of failure on demand (pfd) is divided into two steps. The first concerns aleatory uncertainty about (i) whether channel A will fail on a randomly selected demand and (ii) whether channel B is imperfect. It is shown that, conditional upon knowing pA (the probability that A fails on a randomly selected demand) and pB (the probability that channel B is imperfect), a conservative bound on the probability that the system fails on a randomly selected demand is simply pA × pB. That is, there is conditional independence between the events “A fails” and “B is imperfect.” The second
1. Preliminary Report on Consensus, 1989
"... 2. Modeling Execution Time Fault-Tolerant Software ..."
A Practical Implementation of Maximum Likelihood Voting
The Maximum Likelihood Voting (MLV) strategy was recently proposed as one of the most reliable voting methods. The strategy determines the most likely correct result based on the reliability history of each software version. In this paper we first discuss the issues that arise in practical implementation of MLV, such as the question of unrealized outputs, the handling of voting ties, and the issue of inter-version failure correlation. We then present an extended MLV algorithm that a) uses a dynamic voting strategy which automatically adapts to the number of realized output space categories, and b) uses component reliability estimates to break voting ties. We also present an empirical evaluation of the implemented MLV strategy, and we compare it with Recovery Block (RB), N-Version Programming (NVP) and Consensus Recovery Block (CRB) approaches. Our results show that, even under high inter-version failure correlation conditions, our implementation of MLV performs well. In fact, statist...

