Results 1 - 10
of
264
PIE: A dynamic failure-based technique
- IEEE Transactions on Software Engineering
, 1992
"... This paper presents a dynamic technique for statistically estimating three program characteristics that affect a program's computational behavior: (1) the probability that a particular section of a program is executed, (2) the probability that the particular section affects the data state, and (3) t ..."
Abstract
-
Cited by 108 (22 self)
- Add to MetaCart
This paper presents a dynamic technique for statistically estimating three program characteristics that affect a program's computational behavior: (1) the probability that a particular section of a program is executed, (2) the probability that the particular section affects the data state, and (3) the probability that a data state produced by that section has an effect on program output. These three characteristics can be used to predict whether faults are likely to be uncovered by software testing.
Using model checking to generate tests from specifications
- In Proceedings of the Second IEEE International Conference on Formal Engineering Methods (ICFEM’98
, 1998
"... Abstract We apply a model checker to the problem of test generation using a new application of mutation analysis. We define syntactic operators, each of which produces a slight variation on a given model. The operators define a form of mutation analysis at the level of the model checker specificatio ..."
Abstract
-
Cited by 102 (12 self)
- Add to MetaCart
Abstract We apply a model checker to the problem of test generation using a new application of mutation analysis. We define syntactic operators, each of which produces a slight variation on a given model. The operators define a form of mutation analysis at the level of the model checker specification. A model checker generates counterexamples which distinguish the variations from the original specification. The counterexamples can easily be turned into complete test cases, that is, with inputs and expected output. We define two classes of operators: those that produce test cases from which a correct implementation must differ, and those that produce test cases with which it must agree. There are substantial advantages to combining a model checker with mutation analysis. First, the generation of test cases is automatic; each counterexample serves as a complete test case. Second, in sharp contrast to program-based mutation analysis, the identification of equivalent mutants is also automatic; the model checker simply reports that the mutant satisfies the constraints, and hence no counterexample is produced. We apply our method to an example specification and evaluate the resulting test sets with coverage metrics on a corresponding implementation in Java. 1 Introduction The use of formal methods has been widely advocated to reduce the likelihood of errors in the early stages of system development. Some of the chief drawbacks to applying formal methods is the difficulty of conducting formal analysis [5] and the perceived or actual payoff in project budget. Testing is an expensive part of the software budget, and formal methods offer an opportunity to significantly reduce the testing costs. We have developed an innovative combination of mutation analysis, model checking, and test generation which solves some problems previously plaguing these approaches and automatically produces good sets of tests from formal specifications. This section reviews the formal methods and approaches we use.
Is Mutation an Appropriate Tool for Testing Experiments
- in Proceedings of the 27th International Conference on Software Engineering (ICSE’05), St Louis
, 2005
"... The empirical assessment of test techniques plays an important role in software testing research. One common practice is to instrument faults, either manually or by using mutation operators. The latter allows the systematic, repeatable seeding of large numbers of faults; however, we do not know whet ..."
Abstract
-
Cited by 93 (1 self)
- Add to MetaCart
The empirical assessment of test techniques plays an important role in software testing research. One common practice is to instrument faults, either manually or by using mutation operators. The latter allows the systematic, repeatable seeding of large numbers of faults; however, we do not know whether empirical results obtained this way lead to valid, representative conclusions. This paper investigates this important question based on a number of programs with comprehensive pools of test cases and known faults. It is concluded that, based on the data available thus far, the use of mutation operators is yielding trustworthy results (generated mutants are similar to real faults). Mutants appear however to be different from hand-seeded faults that seem to be harder to detect than real faults.
An Experimental Comparison of the Effectiveness of Branch Testing and Data Flow Testing
- IEEE Transactions on Software Engineering
, 1993
"... An experiment comparing the effectiveness of the all-uses and all-edges test data adequacy criteria was performed. The experiment was designed so as to overcome some of the deficiencies of previous software testing experiments. A large number of test sets was randomly generated for each of nine subj ..."
Abstract
-
Cited by 92 (4 self)
- Add to MetaCart
An experiment comparing the effectiveness of the all-uses and all-edges test data adequacy criteria was performed. The experiment was designed so as to overcome some of the deficiencies of previous software testing experiments. A large number of test sets was randomly generated for each of nine subject programs with subtle errors. For each test set, the percentages of executable edges and definition-use associations covered were measured and it was determined whether the test set exposed an error. Hypothesis testing was used to investigate whether all-uses adequate test sets are more likely to expose errors than are all-edges adequate test sets. All-uses was significantly more effective than all-edges for five of the subjects, and appeared guaranteed to detect the error in four of them. Further analysis showed that in four of these subjects, all-uses-adequate test sets were more effective than all-edges-adequate test sets of similar size. Logistic regression analysis was used to investigate whether the probability that a test set exposes an error increases as the percentage of definition-use associations or edges covered by it increases. The evidence did not strongly support this conjecture. Error exposing ability was shown to be strongly positively correlated to percentage of covered definition-use associations in only four of the nine subjects. Error exposing ability was also shown to be positively correlated to the percentage of covered edges in four (different) subjects, but the relationship was weaker. Author's address: Computer Science Dept., Polytechnic University, 6 Metrotech Center, Brooklyn, N.Y. 11201. E-mail: pfrankl@poly.edu. Supported in part by NSF Grants CCR-8810287 and CCR9206910 and by the New York State Science and Technology Founda...
A Fortran Language System for Mutation-based Software Testing
, 1991
"... this paper, we discuss some of the unique requirements of a language system used in a mutation-based testing environment. We then describe how these requirements affected the design and implementation of the Fortran 77 version of the Mothra system. We also describe the intermediate language used by ..."
Abstract
-
Cited by 88 (18 self)
- Add to MetaCart
this paper, we discuss some of the unique requirements of a language system used in a mutation-based testing environment. We then describe how these requirements affected the design and implementation of the Fortran 77 version of the Mothra system. We also describe the intermediate language used by Mothra and the features of the language system that are needed for software testing. The appendices contain a full description of the intermediate language and the mutation operators used by Mothra. The design and implementation techniques that were developed for Mothra are applicable for constructing not just software testing systems, but any type of program analysis system or language system for a special-purpose application. In particular, we discuss decisions made and techniques developed by the Mothra team that can be useful in such applications as debuggers, program measurement tools, software development environments and other types of program analysis systems
Test case prioritization: A family of empirical studies
- IEEE Transactions on Software Engineering
, 2002
"... Gregg Rothermel z To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One potential goal of such prioritization is to increase a test suite's rate of fau ..."
Abstract
-
Cited by 77 (19 self)
- Add to MetaCart
Gregg Rothermel z To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One potential goal of such prioritization is to increase a test suite's rate of fault detection. Previous work reported the results of studies that showed that prioritization techniques can signi cantly improve rate of fault detection. Those studies, however, raised several additional questions: (1) can prioritization techniques be e ective when targeted at speci c modi ed versions; (2) what tradeo s exist between ne granularity and coarse granularity prioritization techniques; (3) can the incorporation of measures of fault proneness into prioritization techniques improve their e ectiveness? To address these questions, we have performed several new studies, in which we empirically compared prioritization techniques using both controlled experiments and case studies. The results of these studies show that each of the prioritization techniques considered can improve the rate of fault detection of test suites overall. Finegranularity techniques typically outperformed coarse-granularity techniques, but only by a relatively small margin overall; in other words, the relative imprecision in coarse-granularity analysis did not dramatically reduce their ability to improve rate of fault detection. Incorporation of fault-proneness techniques produced relatively small improvements over other techniques in terms rate of fault detection, a result which ran contrary to our expectations. Our studies also show that the relative e ectiveness of various techniques can vary signi cantly across target programs. Furthermore, our analysis shows that whether the e ectiveness di erences observed will result in savings in practice varies substantially with the cost factors associated with particular testing processes. Further work to understand the sources of this variance, and to incorporate such understanding into prioritization techniques and the choice of techniques, would be bene cial. 1
An experimental determination of sufficient mutant operators
- ACM Transactions on Software Engineering Methodology
, 1996
"... Mutation testing is a technique for unit testing software that, although powerful, is computationally expensive. The principal expense of mutation is that many variants of the test program, called mutants, must be repeatedly executed. This paper quantifies the expense of mutation in terms of the num ..."
Abstract
-
Cited by 75 (2 self)
- Add to MetaCart
Mutation testing is a technique for unit testing software that, although powerful, is computationally expensive. The principal expense of mutation is that many variants of the test program, called mutants, must be repeatedly executed. This paper quantifies the expense of mutation in terms of the number of mutants that are created, then proposes and evaluates a technique that reduces the number of mutants by an order of magnitude. Selective mutation reduces the cost of mutation testing by reducing the number of mutants. This paper reports experimental results that compare selective mutation testing with standard, or non-selective, mutation testing, and results that quantify the savings achieved by selective mutation testing. The results support the hypothesis that selective mutation is almost as strong as non-selective mutation; in experimental trials selective mutation provides almost the same coverage as non-selective mutation, with a four-fold or more reduction in the number of mutants.
Feedback-directed random test generation
- In ICSE
, 2007
"... We present a technique that improves random test generation by incorporating feedback obtained from executing test inputs as they are created. Our technique builds inputs incrementally by randomly selecting a method call to apply and finding arguments from among previously-constructed inputs. As soo ..."
Abstract
-
Cited by 74 (14 self)
- Add to MetaCart
We present a technique that improves random test generation by incorporating feedback obtained from executing test inputs as they are created. Our technique builds inputs incrementally by randomly selecting a method call to apply and finding arguments from among previously-constructed inputs. As soon as an input is built, it is executed and checked against a set of contracts and filters. The result of the execution determines whether the input is redundant, illegal, contract-violating, or useful for generating more inputs. The technique outputs a test suite consisting of unit tests for the classes under test. Passing tests can be used to ensure that code contracts are preserved across program changes; failing tests (that violate one or more contract) point to potential errors that should be corrected. Our experimental results indicate that feedback-directed random test generation can outperform systematic and undirected random test generation, in terms of coverage and error detection. On four small but nontrivial data structures (used previously in the literature), our technique achieves higher or equal block and predicate coverage than model checking (with and without abstraction) and undirected random generation. On 14 large, widely-used libraries (comprising 780KLOC), feedback-directed random test generation finds many previously-unknown errors, not found by either model checking or undirected random generation. 1
Investigations of the Software Testing Coupling Effect
"... Fault-based testing strategies test software by focusing on specific, common types of faults. The coupling effect hypothesizes that test data sets that detect simple types of faults are sensitive enough to detect more complex types of faults. This paper describes empirical investigations into the co ..."
Abstract
-
Cited by 73 (15 self)
- Add to MetaCart
Fault-based testing strategies test software by focusing on specific, common types of faults. The coupling effect hypothesizes that test data sets that detect simple types of faults are sensitive enough to detect more complex types of faults. This paper describes empirical investigations into the coupling effect over a specific class of software faults. All of the results from this investigation support the validity of the coupling effect. The major conclusion from this investigation is the fact that by explicitly testing for simple faults, we are also implicitly testing for more complicated faults, giving us con dence that fault-based testing is an effective way to test software.
Incremental Regression Testing
, 1993
"... The purpose of regression testing is to ensure that bug fixes and new functionality introduced in a new version of a software do not adversely affect the correct functionality inherited from the previous version. This paper explores efficient methods of selecting small subsets of regression test set ..."
Abstract
-
Cited by 70 (2 self)
- Add to MetaCart
The purpose of regression testing is to ensure that bug fixes and new functionality introduced in a new version of a software do not adversely affect the correct functionality inherited from the previous version. This paper explores efficient methods of selecting small subsets of regression test sets that may be used to establish the same.

