Results 1 - 10 of 131
QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs
- ACM SIGPLAN Notices, 2000
- Cited by 252 (10 self)
QuickCheck is a tool which aids the Haskell programmer in formulating and testing properties of programs. Properties are described as Haskell functions, and can be automatically tested on random input, but it is also possible to define custom test data generators. We present a number of case studies, in which the tool was successfully used, and also point out some pitfalls to avoid. Random testing is especially suitable for functional programs because properties can be stated at a fine grain. When a function is built from separately tested components, then random testing suffices to obtain good coverage of the definition under test.
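The idea of stating a property as an ordinary function and checking it on random input can be sketched in a few lines. The sketch below is in Python rather than Haskell and does not use or imitate QuickCheck's API; the names `check`, `random_int_list`, and the two example properties are hypothetical, chosen only for illustration.

```python
import random

def random_int_list(rng, max_len=20, lo=-100, hi=100):
    """Hypothetical custom generator: a random list of small integers."""
    return [rng.randint(lo, hi) for _ in range(rng.randint(0, max_len))]

def prop_reverse_involution(xs):
    """Property: reversing a list twice gives back the original list."""
    return list(reversed(list(reversed(xs)))) == xs

def prop_reverse_is_identity(xs):
    """Deliberately false property, included to show a counterexample being found."""
    return list(reversed(xs)) == xs

def check(prop, gen, trials=100, seed=0):
    """Run prop on `trials` random inputs; return the first counterexample, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        case = gen(rng)
        if not prop(case):
            return case
    return None

if __name__ == "__main__":
    print(check(prop_reverse_involution, random_int_list))   # no counterexample expected
    print(check(prop_reverse_is_identity, random_int_list))  # some non-palindromic list
```

Passing a different `gen` function is this sketch's stand-in for the custom test data generators the abstract mentions.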
Software unit test coverage and adequacy
- ACM Computing Surveys, 1997
- Cited by 226 (6 self)
Objective measurement of test quality is one of the key issues in software testing. It has been a major research focus for the last two decades. Many test criteria have been proposed and studied for this purpose. Various kinds of rationales have been presented in support of one criterion or another. We survey the research work in
The AETG System: An Approach to Testing Based on Combinatorial Design
- IEEE Transactions on Software Engineering, 1997
- Cited by 116 (3 self)
This paper describes a new approach to testing that uses combinatorial designs to generate tests that cover the pairwise, triple, or n-way combinations of a system's test parameters. These are the parameters that determine the system's test scenarios. Examples are system configuration parameters, user inputs and other external events. We implemented this new method in the AETG system. The AETG system uses new combinatorial algorithms to generate test sets that cover all valid n-way parameter combinations. The size of an AETG test set grows logarithmically in the number of test parameters. This allows testers to define test models with dozens of parameters. The AETG system is used in a variety of applications for unit, system, and interoperability testing. It has generated both high-level test plans and detailed test cases. In several applications, it greatly reduced the cost of test plan development.
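Pairwise coverage can be illustrated with a naive greedy selection. This is a sketch only, and not the AETG algorithm: it enumerates the full Cartesian product of parameter values, which is feasible only for small models. The function names and the example parameters are hypothetical.

```python
from itertools import combinations, product

def pairs_of(test):
    """All ((param index, value), (param index, value)) pairs covered by one test."""
    return {((i, test[i]), (j, test[j]))
            for i, j in combinations(range(len(test)), 2)}

def greedy_pairwise(domains):
    """Greedily pick tests until every value pair of every two parameters is covered."""
    candidates = list(product(*domains))
    uncovered = set()
    for t in candidates:
        uncovered |= pairs_of(t)
    suite = []
    while uncovered:
        # Choose the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: len(pairs_of(t) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite

if __name__ == "__main__":
    # Hypothetical configuration model: 3 * 2 * 2 * 2 = 24 exhaustive combinations.
    domains = [["linux", "windows", "macos"],
               ["chrome", "firefox"],
               ["ipv4", "ipv6"],
               ["on", "off"]]
    suite = greedy_pairwise(domains)
    print(len(suite), "tests instead of 24")
    for t in suite:
        print(t)
```

Every pair of parameter values appears in at least one selected test, yet the suite is smaller than the exhaustive set; that gap widens quickly as parameters are added.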
An Experimental Comparison of the Effectiveness of Branch Testing and Data Flow Testing
- IEEE Transactions on Software Engineering, 1993
- Cited by 92 (4 self)
An experiment comparing the effectiveness of the all-uses and all-edges test data adequacy criteria was performed. The experiment was designed so as to overcome some of the deficiencies of previous software testing experiments. A large number of test sets was randomly generated for each of nine subject programs with subtle errors. For each test set, the percentages of executable edges and definition-use associations covered were measured and it was determined whether the test set exposed an error. Hypothesis testing was used to investigate whether all-uses adequate test sets are more likely to expose errors than are all-edges adequate test sets. All-uses was significantly more effective than all-edges for five of the subjects, and appeared guaranteed to detect the error in four of them. Further analysis showed that in four of these subjects, all-uses-adequate test sets were more effective than all-edges-adequate test sets of similar size. Logistic regression analysis was used to investigate whether the probability that a test set exposes an error increases as the percentage of definition-use associations or edges covered by it increases. The evidence did not strongly support this conjecture. Error exposing ability was shown to be strongly positively correlated to percentage of covered definition-use associations in only four of the nine subjects. Error exposing ability was also shown to be positively correlated to the percentage of covered edges in four (different) subjects, but the relationship was weaker.
Author's address: Computer Science Dept., Polytechnic University, 6 Metrotech Center, Brooklyn, N.Y. 11201. E-mail: pfrankl@poly.edu. Supported in part by NSF Grants CCR-8810287 and CCR9206910 and by the New York State Science and Technology Founda...
Eclat: Automatic generation and classification of test inputs
- In 19th European Conference Object-Oriented Programming, 2005
- Cited by 91 (12 self)
This paper describes a technique that selects, from a large set of test inputs, a small subset likely to reveal faults in the software under test. The technique takes a program or software component, plus a set of correct executions (say, from observations of the software running properly, or from an existing test suite that a user wishes to enhance). The technique first infers an operational model of the software’s operation. Then, inputs whose operational pattern of execution differs from the model in specific ways are suggestive of faults. These inputs are further reduced by selecting only one input per operational pattern. The result is a small portion of the original inputs, deemed by the technique as most likely to reveal faults. Thus, the technique can also be seen as an error-detection technique. The paper describes two additional techniques that complement test input selection. One is a technique for automatically producing an oracle (a set of assertions) for a test input from the operational model, thus transforming the test input into a test case. The other is a classification-guided test input generation technique that also makes use of operational models and patterns. When generating inputs, it filters out code sequences that are unlikely to contribute to legal inputs, improving the efficiency of its search for fault-revealing inputs. We have implemented these techniques in the Eclat tool, which generates unit tests for Java classes. Eclat’s input is a set of classes to test and an example program execution (say, a passing test suite). Eclat’s output is a set of JUnit test cases, each containing a potentially fault-revealing input and a set of assertions at least one of which fails. In our experiments, Eclat successfully generated inputs that exposed fault-revealing behavior; we have used Eclat to reveal real errors in programs. The inputs it selects as fault-revealing are an order of magnitude as likely to reveal a fault as all generated inputs.
Improving Test Suites via Operational Abstraction
- In Proceedings of the 25th International Conference on Software Engineering, 2003
- Cited by 75 (12 self)
This paper presents the operational difference technique for generating, augmenting, and minimizing test suites. The technique is analogous to structural code coverage techniques, but it operates in the semantic domain of program properties rather than the syntactic domain of program text. The operational difference technique automatically selects test cases; it assumes only the existence of a source of test cases. The technique dynamically generates operational abstractions (which describe observed behavior and are syntactically identical to formal specifications) from test suite executions. Test suites can be generated by adding cases until the operational abstraction stops changing. The resulting test suites are as small, and detect as many faults, as suites with 100% branch coverage, and are better at detecting certain common faults.
Feedback-directed random test generation
- In ICSE, 2007
- Cited by 74 (14 self)
We present a technique that improves random test generation by incorporating feedback obtained from executing test inputs as they are created. Our technique builds inputs incrementally by randomly selecting a method call to apply and finding arguments from among previously-constructed inputs. As soon as an input is built, it is executed and checked against a set of contracts and filters. The result of the execution determines whether the input is redundant, illegal, contract-violating, or useful for generating more inputs. The technique outputs a test suite consisting of unit tests for the classes under test. Passing tests can be used to ensure that code contracts are preserved across program changes; failing tests (that violate one or more contracts) point to potential errors that should be corrected. Our experimental results indicate that feedback-directed random test generation can outperform systematic and undirected random test generation, in terms of coverage and error detection. On four small but nontrivial data structures (used previously in the literature), our technique achieves higher or equal block and predicate coverage than model checking (with and without abstraction) and undirected random generation. On 14 large, widely-used libraries (comprising 780KLOC), feedback-directed random test generation finds many previously-unknown errors, not found by either model checking or undirected random generation.
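The generate-execute-classify loop described above can be sketched on a toy example. This is a minimal illustration in Python under stated assumptions, not the authors' tool: the operations `insert` and `remove_min`, the contract `contract_ok`, and the driver `generate` are all hypothetical, and `insert` contains a deliberate bug so that the loop has something to find.

```python
import random

def insert(s, x):
    """Toy operation under test: insert x into a sorted tuple (bug: keeps duplicates)."""
    i = 0
    while i < len(s) and s[i] < x:
        i += 1
    return s[:i] + (x,) + s[i:]

def remove_min(s):
    """Toy operation under test: drop the smallest element; rejects the empty tuple."""
    if not s:
        raise ValueError("empty")
    return s[1:]

def contract_ok(s):
    """Contract: a set representation must be strictly increasing."""
    return all(a < b for a, b in zip(s, s[1:]))

def generate(steps=200, seed=0):
    """Build inputs incrementally, using execution feedback to classify each one."""
    rng = random.Random(seed)
    pool = {(): "empty()"}   # value -> call sequence that built it
    failing = []             # call sequences whose result violates the contract
    for _ in range(steps):
        base = rng.choice(list(pool))   # argument drawn from earlier inputs
        try:
            if rng.random() < 0.5:
                x = rng.randint(0, 5)
                call = f"insert({pool[base]}, {x})"
                result = insert(base, x)
            else:
                call = f"remove_min({pool[base]})"
                result = remove_min(base)
        except ValueError:
            continue                    # illegal input: discard
        if not contract_ok(result):
            failing.append(call)        # contract-violating: report, do not extend
        elif result not in pool:
            pool[result] = call         # new and legal: reusable as an argument
        # otherwise redundant: equal to an existing input, so dropped
    return pool, failing

if __name__ == "__main__":
    pool, failing = generate()
    print(len(pool), "distinct legal inputs;", len(failing), "failing call sequences")
    if failing:
        print("example failing test:", failing[0])
```

Because contract-violating and illegal results never enter the pool, later calls are only ever built on top of legal values, which is the feedback step that distinguishes this loop from undirected random generation.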
An empirical study of the robustness of Windows NT applications using random testing
- In Proceedings of the 4th USENIX Windows System Symposium, 2000
- Cited by 72 (0 self)
We report on the third in a series of studies on the reliability of application programs in the face of random input. In 1990 and 1995, we studied the reliability of UNIX application programs, both command line and X-Window based (GUI). In this study, we apply our testing techniques to applications running on the Windows NT operating system. Our testing is simple black-box random input testing; by any measure, it is a crude technique, but it seems to be effective at locating bugs in real programs. We tested over 30 GUI-based applications by subjecting them to two kinds of random input: (1) streams of valid keyboard and mouse events and (2) streams of random Win32 messages. We have built a tool that helps automate the testing of Windows NT applications. With a few simple parameters, any application can be tested. Using our random testing techniques, our previous UNIX-based studies showed that we could crash a wide variety of command-line and X-window based applications on several UNIX platforms. The test results are similar for NT-based applications. When subjected to random valid input that could be produced by using the mouse and keyboard, we crashed 21% of applications that we tested and hung an additional 24% of applications. When subjected to raw random Win32 messages, we crashed or hung all the applications that we tested. We report which applications failed under which tests, and provide some analysis of the failures.
1 INTRODUCTION
We report on the third in a series of studies on the reliability of application programs in the face of random input. In 1990 and 1995, we studied the reliability of UNIX command line and X-Window based (GUI) application programs [8,9]. In this study, we apply our techniques to applications running on the Windows NT operating system. Our testing, called fuzz testing, uses simple black-box random input; no knowledge of the application is used in generating the random input. Our 1990 study evaluated the reliability of standard UNIX command line utilities. It showed that 25-33% of such applications crashed or hung when reading random input. The 1995 study evaluated a larger collection of
Finding failures by cluster analysis of execution profiles
- In ICSE, 2001
- Cited by 70 (4 self)
We experimentally evaluate the effectiveness of using cluster analysis of execution profiles to find failures among the executions induced by a set of potential test cases. We compare several filtering procedures for selecting executions to evaluate for conformance to requirements. Each filtering procedure involves a choice of a sampling strategy and a clustering metric. The results suggest that filtering procedures based on clustering are more effective than simple random sampling for identifying failures in populations of operational executions, with adaptive sampling from clusters being the most effective sampling strategy. The results also suggest that clustering metrics that give extra weight to unusual profile features are most effective. Scatter plots of execution populations, produced by multidimensional scaling, are used to provide intuition for these results.
Keywords: Observation-based testing, software testing, operational testing, beta testing, cluster analysis, multidimensional scaling
A Formal Analysis of the Fault-detecting Ability of Testing Methods
- IEEE Transactions on Software Engineering, 1993
- Cited by 63 (5 self)
This paper examines several relations between software testing criteria, exploring whether for each relation R and each pair of criteria, C1 and C2, R(C1, C2) guarantees that C1 is better at detecting faults than C2 according to various probabilistic measures of fault-detecting ability. It is shown that the fact that C1 subsumes C2 does not guarantee that C1 is better at detecting faults. Relations that strengthen the subsumption relation and that have more bearing on fault-detecting ability are introduced.

