MuCheck: An extensible tool for mutation testing of Haskell programs. Online Draft; Under Submission, 2014.
- Oregon State University, 2014
Cited by 2 (0 self)
This paper presents MuCheck, a mutation testing tool for Haskell programs. MuCheck is a counterpart to the widely used QuickCheck random testing tool for functional programs, and can be used to evaluate the efficacy of QuickCheck property definitions. The tool implements mutation operators that are specifically designed for functional programs, and makes use of the type system of Haskell to achieve a more relevant set of mutants than otherwise possible. Mutation coverage is particularly valuable for functional programs due to highly compact code, referential transparency, and clean semantics; these make augmenting a test suite or specification based on surviving mutants a practical method for improved testing.
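The idea of judging property definitions by the mutants they fail to kill can be sketched in a few lines. This is a language-neutral illustration in Python, not MuCheck's API or its mutation operators: the function `make_max`, the two properties, and the two hand-written mutants are all hypothetical names chosen for the example.

```python
import operator

def make_max(compare):
    """Build a two-argument max; `compare` is the mutation point."""
    def my_max(a, b):
        return a if compare(a, b) else b
    return my_max

# Original program and two hand-written mutants (assumed for illustration).
original = make_max(operator.ge)
mutants = {
    "ge->le": make_max(operator.le),  # behaves like min
    "ge->gt": make_max(operator.gt),  # equivalent mutant: same result on all inputs
}

def prop_is_argument(f, a, b):
    """The result is one of the two arguments."""
    return f(a, b) in (a, b)

def prop_upper_bound(f, a, b):
    """The result is at least as large as both arguments."""
    return f(a, b) >= a and f(a, b) >= b

def survives(f, props, inputs):
    """A candidate survives if every property holds on every input."""
    return all(p(f, a, b) for p in props for (a, b) in inputs)

# Small exhaustive input grid standing in for randomly generated test data.
inputs = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

weak_suite = [prop_is_argument]
strong_suite = [prop_is_argument, prop_upper_bound]

survivors_weak = [n for n, m in mutants.items() if survives(m, weak_suite, inputs)]
survivors_strong = [n for n, m in mutants.items() if survives(m, strong_suite, inputs)]

print("weak suite survivors:  ", survivors_weak)
print("strong suite survivors:", survivors_strong)
```

In this sketch the surviving `ge->le` mutant is what prompts adding the upper-bound property; the remaining survivor is an equivalent mutant that no property can kill.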
An Empirical Study on the Scalability of Selective Mutation Testing
Cited by 1 (0 self)
Abstract: Software testing plays an important role in ensuring software quality by running a program with test suites. Mutation testing is designed to evaluate whether a test suite is adequate in detecting faults. Due to the high cost of mutation testing, selective mutation testing was proposed to select a subset of mutants whose effectiveness is similar to that of the whole set of generated mutants. Although selective mutation testing has been widely investigated in recent years, many people still doubt whether it is well suited to large programs. To study the scalability of selective mutation testing, we systematically explore how program size impacts selective mutation testing through four projects (12 versions in total). Based on the empirical study, for programs smaller than 16 KLOC, selective mutation testing has surprisingly good scalability. In particular, for a program whose number of lines of executable code is E, the number of mutants used in selective mutation testing is proportional to E^c, where c is a constant whose value is between 0.05 and 0.25.
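To get a feel for what growth proportional to E^c means at the reported range of c, the following sketch computes how much the selected mutant count would grow between two program sizes. The proportionality constant `k` is not given in the abstract and is a placeholder here, as are the two program sizes; only the exponent range comes from the text.

```python
def selected_mutants(executable_lines, c, k=1.0):
    """Mutant count under the model k * E**c (k is a hypothetical constant)."""
    return k * executable_lines ** c

def growth_factor(small_e, large_e, c):
    """Ratio of selected mutants between two program sizes; k cancels out."""
    return selected_mutants(large_e, c) / selected_mutants(small_e, c)

# Illustrative sizes only: a 1 KLOC program versus a 16 KLOC program.
for c in (0.05, 0.25):
    factor = growth_factor(1_000, 16_000, c)
    print(f"c = {c}: 16x more code -> {factor:.2f}x more selected mutants")
```

Under this model a 16-fold increase in executable lines raises the selected mutant count by at most a factor of 2 (at c = 0.25), which is the sense in which the growth is sublinear.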
Do Mutation Reduction Strategies Matter?
Abstract: Mutation analysis is a well-known, but computationally intensive, method for measuring test suite quality. While multiple strategies have been proposed to reduce the number of mutants, there is inconclusive evidence for their utility due to the limited number and size of programs used for validation, and a lack of comprehensive comparative studies. Traditional evaluation criteria for mutation reduction also rely on mutation-adequate suites, which are rare in practice. We propose novel criteria for evaluating reduction strategies for non-mutation-adequate test suites, directly linked to the actual use of mutation analysis during development: ensuring that tests check for many different possible faults. We evaluate using both these criteria and the traditional criteria with 201 real-world projects, and show that the popular strategies, namely operator selection and stratified sampling (on operators or program elements), are at best marginally better than random sampling, and are often worse.
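The random-sampling baseline mentioned above can be sketched as follows: estimate the mutation score from a uniformly drawn subset of mutants rather than from all of them. The kill data here is synthetic and made up for the illustration, and the function names are hypothetical; none of it is taken from the paper's 201 projects or its evaluation criteria.

```python
import random

def mutation_score(killed_flags):
    """Fraction of mutants killed by the test suite."""
    return sum(killed_flags) / len(killed_flags)

def sampled_score(killed_flags, sample_size, rng):
    """Estimate the mutation score from a uniform random sample of mutants."""
    sample = rng.sample(killed_flags, sample_size)
    return mutation_score(sample)

# Synthetic data (an assumption): 1000 mutants, 600 of them killed.
killed_flags = [True] * 600 + [False] * 400

rng = random.Random(0)  # fixed seed so the run is repeatable
full = mutation_score(killed_flags)
estimate = sampled_score(killed_flags, 100, rng)

print(f"full mutation score:   {full:.3f}")
print(f"estimate from 100/1000: {estimate:.3f}")
```

A reduction strategy such as operator selection would replace the uniform `rng.sample` call with a rule for choosing which mutants to keep; the abstract's claim is that such rules rarely beat this simple baseline by much.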