Results 1 - 10
of
26
A Comprehensive Survey of Trends in Oracles for Software Testing
, 2013
"... Testing involves examining the behaviour of a system in order to discover potential faults. Determining the desired correct behaviour for a given input is called the “oracle problem”. Oracle automation is important to remove a current bottleneck which inhibits greater overall test automation; witho ..."
Abstract
-
Cited by 20 (4 self)
- Add to MetaCart
Testing involves examining the behaviour of a system in order to discover potential faults. Determining the desired correct behaviour for a given input is called the “oracle problem”. Oracle automation is important to remove a current bottleneck which inhibits greater overall test automation; without oracle automation, the human has to determine whether observed behaviour is correct. The literature on oracles has introduced techniques for oracle automation, including modelling, specifications, contract-driven development and metamorphic testing. When none of these is completely adequate, the final source of oracle information remains the human, who may be aware of informal specifications, expectations, norms and domain specific information that provide informal oracle guidance. All forms of oracle, even the humble human, involve challenges of reducing cost and increasing benefit. This paper provides a comprehensive survey of current approaches to the oracle problem and an analysis of trends in this important area of software testing research and practice.
Does Automated White-Box Test Generation Really Help Software Testers?
"... Automated test generation techniques can efficiently produce test data that systematically cover structural aspects of a program. In the absence of a specification, a common assumption is that these tests relieve a developer of most of the work, as the act of testing is reduced to checking the resul ..."
Abstract
-
Cited by 10 (3 self)
- Add to MetaCart
(Show Context)
Automated test generation techniques can efficiently produce test data that systematically cover structural aspects of a program. In the absence of a specification, a common assumption is that these tests relieve a developer of most of the work, as the act of testing is reduced to checking the results of the tests. Although this as-sumption has persisted for decades, there has been no conclusive evidence to date confirming it. However, the fact that the approach has only seen a limited uptake in industry suggests the contrary, and calls into question its practical usefulness. To investigate this issue, we performed a controlled experiment comparing a total of 49 sub-jects split between writing tests manually and writing tests with the aid of an automated unit test generation tool, EVOSUITE. We found that, on one hand, tool support leads to clear improvements in com-monly applied quality metrics such as code coverage (up to 300% increase). However, on the other hand, there was no measurable improvement in the number of bugs actually found by developers. Our results not only cast some doubt on how the research commu-nity evaluates test generation tools, but also point to improvements and future work necessary before automated test generation tools will be widely adopted by practitioners.
Oracle-Centric Test Case Prioritization
- Proceedings of the International Symposium on Software Reliability Engineering
, 2012
"... Abstract—Recent work in testing has demonstrated the benefits of considering test oracles in the testing process. Unfortunately, this work has focused primarily on developing techniques for gen-erating test oracles, in particular techniques based on mutation testing. While effective for test case ge ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
(Show Context)
Abstract—Recent work in testing has demonstrated the benefits of considering test oracles in the testing process. Unfortunately, this work has focused primarily on developing techniques for gen-erating test oracles, in particular techniques based on mutation testing. While effective for test case generation, existing research has not considered the impact of test oracles in the context of regression testing tasks. Of interest here is the problem of test case prioritization, in which a set of test cases are ordered to attempt to detect faults earlier and to improve the effectiveness of testing when the entire set cannot be executed. In this work, we propose a technique for prioritizing test cases that explicitly takes into account the impact of test oracles on the effectiveness of testing. Our technique operates by first capturing the flow of information from variable assignments to test oracles for each test case, and then prioritizing to “cover ” variables using the shortest paths possible to a test oracle. As a result, we favor test orderings in which many variables impact the test oracle’s result early in test execution. Our results demonstrate improvements in rate of fault detection relative to both random and structural coverage based prioritization techniques when applied to faulty versions of three synchronous reactive systems. Keywords-Software testing, software metrics. I.
Observable modified condition/decision coverage
- In Proceedings of the 2013 International Conference on Software Engineering. ACM
, 2013
"... Abstract—In many critical systems domains, test suite ade-quacy is currently measured using structural coverage metrics over the source code. Of particular interest is the modified condition/decision coverage (MC/DC) criterion required for, e.g., critical avionics systems. In previous investigations ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
Abstract—In many critical systems domains, test suite ade-quacy is currently measured using structural coverage metrics over the source code. Of particular interest is the modified condition/decision coverage (MC/DC) criterion required for, e.g., critical avionics systems. In previous investigations we have found that the efficacy of such test suites is highly dependent on the structure of the program under test and the choice of variables monitored by the oracle. MC/DC adequate tests would frequently exercise faulty code, but the effects of the faults would not propagate to the monitored oracle variables. In this report, we combine the MC/DC coverage metric with a notion of observability that helps ensure that the result of a fault encountered when covering a structural obligation propagates to a monitored variable; we term this new coverage criterion Observable MC/DC (OMC/DC). We hypothesize this path requirement will make structural coverage metrics 1.) more effective at revealing faults, 2.) more robust to changes in program structure, and 3.) more robust to the choice of variables monitored. We assess the efficacy and sensitivity to program structure of OMC/DC as compared to masking MC/DC using four subsystems from the civil avionics domain and the control logic of a microwave. We have found that test suites satisfying OMC/DC are significantly more effective than test suites satisfying MC/DC, revealing up to 88 % more faults, and are less sensitive to program structure and the choice of monitored variables. I.
Operator-Based and Random Mutant Selection: Better Together
"... Abstract—Mutation testing is a powerful methodology for evaluating the quality of a test suite. However, the methodology is also very costly, as the test suite may have to be executed for each mutant. Selective mutation testing is a well-studied technique to reduce this cost by selecting a subset of ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Abstract—Mutation testing is a powerful methodology for evaluating the quality of a test suite. However, the methodology is also very costly, as the test suite may have to be executed for each mutant. Selective mutation testing is a well-studied technique to reduce this cost by selecting a subset of all mutants, which would otherwise have to be considered in their entirety. Two common approaches are operator-based mutant selection, which only generates mutants using a subset of mutation operators, and random mutant selection, which selects a subset of mutants generated using all mutation operators. While each of the two approaches provides some reduction in the number of mutants to execute, applying either of the two to medium-sized, real-world programs can still generate a huge number of mutants, which makes their execution too expensive. This paper presents eight random sampling strategies defined on top of operator-based mutant selection, and empirically validates that operator-based selection and random selection can be applied in tandem to further reduce the cost of mutation testing. The experimental results show that even sampling only 5 % of mutants generated by operator-based selection can still provide precise mutation testing results, while reducing the average mutation testing time to 6.54 % (i.e., on average less than 5 minutes for this study). I.
Angels and Monsters: An Empirical Investigation of Potential Test Effectiveness and Efficiency Improvement from Strongly Subsuming Higher Order Mutation
"... We study the simultaneous test effectiveness and efficiency improvement achievable by Strongly Subsuming Higher Or-der Mutants (SSHOMs), constructed from 15,792 first order mutants in four Java programs. Using SSHOMs in place of the first order mutants they subsume yielded a 35%-45% reduction in the ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
(Show Context)
We study the simultaneous test effectiveness and efficiency improvement achievable by Strongly Subsuming Higher Or-der Mutants (SSHOMs), constructed from 15,792 first order mutants in four Java programs. Using SSHOMs in place of the first order mutants they subsume yielded a 35%-45% reduction in the number of mutants required, while simulta-neously improving test efficiency by 15 % and effectiveness by between 5.6 % and 12%. Trivial first order faults often com-bine to form exceptionally non-trivial higher order faults; apparently innocuous angels can combine to breed monsters. Nevertheless, these same monsters can be recruited to im-prove automated test effectiveness and efficiency. 1.
Improving the Accuracy of Oracle Verdicts Through Automated Model Steering∗
"... The oracle—a judge of the correctness of the system under test (SUT)—is a major component of the testing process. Specifying test oracles is challenging for some domains, such as real-time em-bedded systems, where small changes in timing or sensory input may cause large behavioral differences. Model ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
(Show Context)
The oracle—a judge of the correctness of the system under test (SUT)—is a major component of the testing process. Specifying test oracles is challenging for some domains, such as real-time em-bedded systems, where small changes in timing or sensory input may cause large behavioral differences. Models of such systems, often built for analysis and simulation, are appealing for reuse as oracles. These models, however, typically represent an idealized system, abstracting away certain issues such as non-deterministic timing behavior and sensor noise. Thus, even with the same inputs, the model’s behavior may fail to match an acceptable behavior of the SUT, leading to many false positives reported by the oracle. We propose an automated steering framework that can adjust the behavior of the model to better match the behavior of the SUT to re-duce the rate of false positives. This model steering is limited by a set of constraints (defining acceptable differences in behavior) and is based on a search process attempting to minimize a dissimilar-ity metric. This framework allows non-deterministic, but bounded, behavior differences, while preventing future mismatches, by guid-ing the oracle—within limits—to match the execution of the SUT. Results show that steering significantly increases SUT-oracle con-formance with minimal masking of real faults and, thus, has signif-icant potential for reducing false positives and, consequently, de-velopment costs.
A Survey on Unit Testing Practices and Problems
"... Abstract—Unit testing is a common practice where developers write test cases together with regular code. Automation frame-works such as JUnit for Java have popularised this approach, allowing frequent and automatic execution of unit test suites. Despite the appraisals of unit testing in practice, so ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Abstract—Unit testing is a common practice where developers write test cases together with regular code. Automation frame-works such as JUnit for Java have popularised this approach, allowing frequent and automatic execution of unit test suites. Despite the appraisals of unit testing in practice, software engi-neering researchers see potential for improvement and investigate advanced techniques such as automated unit test generation. To align such research with the needs of practitioners, we conducted a survey amongst 225 software developers, covering different programming languages and 29 countries, using a global online marketing research platform. The survey responses confirm that unit testing is an important factor in software development, and suggest that there is indeed potential and need for research on automation of unit testing. The results help us to identify areas of importance on which further research will be necessary (e.g., maintenance of unit tests), and also provide insights into the suitability of online marketing research platforms for software engineering surveys. I.
JSEFT: Automated JavaScript Unit Test Generation
"... Abstract—The event-driven and highly dynamic nature of JavaScript, as well as its runtime interaction with the Document Object Model (DOM) make it challenging to test JavaScript-based applications. Current web test automation techniques target the generation of event sequences, but they ignore testi ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Abstract—The event-driven and highly dynamic nature of JavaScript, as well as its runtime interaction with the Document Object Model (DOM) make it challenging to test JavaScript-based applications. Current web test automation techniques target the generation of event sequences, but they ignore testing the JavaScript code at the unit level. Further they either ignore the oracle problem completely or simplify it through generic soft oracles such as HTML validation and runtime exceptions. We present a framework to automatically generate test cases for Java-Script applications at two complementary levels, namely events and individual JavaScript functions. Our approach employs a combination of function coverage maximization and function state abstraction algorithms to efficiently generate test cases. In addition, these test cases are strengthened by automatically generated mutation-based oracles. We empirically evaluate the implementation of our approach, called JSEFT, to assess its efficacy. The results, on 13 JavaScript-based applications, show that the generated test cases achieve a coverage of 68 % and that JSEFT can detect injected JavaScript and DOM faults with a high accuracy (100 % precision, 70 % recall). We also find that JSEFT outperforms an existing JavaScript test automation framework both in terms of coverage and detected faults. Keywords—Test generation; oracles; JavaScript; DOM I.
Dodona: Automated Oracle Data Set Selection
"... Software complexity has increased the need for automated software testing. Most research on automating testing, how-ever, has focused on creating test input data. While careful selection of input data is necessary to reach faulty states in a system under test, test oracles are needed to actually det ..."
Abstract
- Add to MetaCart
(Show Context)
Software complexity has increased the need for automated software testing. Most research on automating testing, how-ever, has focused on creating test input data. While careful selection of input data is necessary to reach faulty states in a system under test, test oracles are needed to actually detect failures. In this work, we describe Dodona, a sys-tem that supports the generation of test oracles. Dodona ranks program variables based on the interactions and de-pendencies observed between them during program execu-tion. Using this ranking, Dodona proposes a set of variables to be monitored, that can be used by engineers to construct assertion-based oracles. Our empirical study of Dodona re-veals that it is more effective and efficient than the current state-of-the-art approach for generating oracle data sets, and can often yield oracles that are almost as effective as oracles hand-crafted by engineers without support.