Results 1 - 10
of
592
Extended Static Checking for Java
, 2002
"... Software development and maintenance are costly endeavors. The cost can be reduced if more software defects are detected earlier in the development cycle. This paper introduces the Extended Static Checker for Java (ESC/Java), an experimental compile-time program checker that finds common programming ..."
Abstract
-
Cited by 638 (24 self)
- Add to MetaCart
(Show Context)
Software development and maintenance are costly endeavors. The cost can be reduced if more software defects are detected earlier in the development cycle. This paper introduces the Extended Static Checker for Java (ESC/Java), an experimental compile-time program checker that finds common programming errors. The checker is powered by verification-condition generation and automatic theoremproving techniques. It provides programmers with a simple annotation language with which programmer design decisions can be expressed formally. ESC/Java examines the annotated software and warns of inconsistencies between the design decisions recorded in the annotations and the actual code, and also warns of potential runtime errors in the code. This paper gives an overview of the checker architecture and annotation language and describes our experience applying the checker to tens of thousands of lines of Java programs.
Korat: Automated testing based on Java predicates
- IN PROC. INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS (ISSTA
, 2002
"... This paper presents Korat, a novel framework for automated testing of Java programs. Given a formal specification for a method, Korat uses the method precondition to automatically generate all nonisomorphic test cases bounded by a given size. Korat then executes the method on each of these test case ..."
Abstract
-
Cited by 331 (53 self)
- Add to MetaCart
This paper presents Korat, a novel framework for automated testing of Java programs. Given a formal specification for a method, Korat uses the method precondition to automatically generate all nonisomorphic test cases bounded by a given size. Korat then executes the method on each of these test cases, and uses the method postcondition as a test oracle to check the correctness of each output. To generate test cases for a method, Korat constructs a Java predicate (i.e., a method that returns a boolean) from the method’s precondition. The heart of Korat is a technique for automatic test case generation: given a predicate and a bound on the size of its inputs, Korat generates all nonisomorphic inputs for which the predicate returns true. Korat exhaustively explores the input space of the predicate but does so efficiently by monitoring the predicate’s executions and pruning large portions of the search space. This paper illustrates the use of Korat for testing several data structures, including some from the Java Collections Framework. The experimental results show that it is feasible to generate test cases from Java predicates, even when the search space for inputs is very large. This paper also compares Korat with a testing framework based on declarative specifications. Contrary to our initial expectation, the experiments show that Korat generates test cases much faster than the declarative framework.
Generalized Symbolic Execution for Model Checking and Testing
, 2003
"... Modern software systems, which often are concurrent and manipulate complex data structures must be extremely reliable. We present a novel framework based on symbolic execution, for automated checking of such systems. We provide a two-fold generalization of traditional symbolic execution based ap ..."
Abstract
-
Cited by 232 (52 self)
- Add to MetaCart
Modern software systems, which often are concurrent and manipulate complex data structures must be extremely reliable. We present a novel framework based on symbolic execution, for automated checking of such systems. We provide a two-fold generalization of traditional symbolic execution based approaches. First, we de ne a source to source translation to instrument a program, which enables standard model checkers to perform symbolic execution of the program. Second, we give a novel symbolic execution algorithm that handles dynamically allocated structures (e.g., lists and trees), method preconditions (e.g., acyclicity), data (e.g., integers and strings) and concurrency. The program instrumentation enables a model checker to automatically explore dierent program heap con gurations and manipulate logical formulae on program data (using a decision procedure). We illustrate two applications of our framework: checking correctness of multi-threaded programs that take inputs from unbounded domains with complex structure and generation of non-isomorphic test inputs that satisfy a testing criterion.
Cmc: A pragmatic approach to model checking real code
- In Proceedings of the Fifth Symposium on Operating Systems Design and Implementation
, 2002
"... Permission is granted for noncommercial reproduction of the work for educational or research purposes. ..."
Abstract
-
Cited by 225 (12 self)
- Add to MetaCart
(Show Context)
Permission is granted for noncommercial reproduction of the work for educational or research purposes.
Feedback-directed random test generation
- In ICSE
, 2007
"... We present a technique that improves random test generation by incorporating feedback obtained from executing test inputs as they are created. Our technique builds inputs incrementally by randomly selecting a method call to apply and finding arguments from among previously-constructed inputs. As soo ..."
Abstract
-
Cited by 188 (17 self)
- Add to MetaCart
(Show Context)
We present a technique that improves random test generation by incorporating feedback obtained from executing test inputs as they are created. Our technique builds inputs incrementally by randomly selecting a method call to apply and finding arguments from among previously-constructed inputs. As soon as an input is built, it is executed and checked against a set of contracts and filters. The result of the execution determines whether the input is redundant, illegal, contract-violating, or useful for generating more inputs. The technique outputs a test suite consisting of unit tests for the classes under test. Passing tests can be used to ensure that code contracts are preserved across program changes; failing tests (that violate one or more contract) point to potential errors that should be corrected. Our experimental results indicate that feedback-directed random test generation can outperform systematic and undirected random test generation, in terms of coverage and error detection. On four small but nontrivial data structures (used previously in the literature), our technique achieves higher or equal block and predicate coverage than model checking (with and without abstraction) and undirected random generation. On 14 large, widely-used libraries (comprising 780KLOC), feedback-directed random test generation finds many previously-unknown errors, not found by either model checking or undirected random generation. 1
Data flow analysis for verifying properties of concurrent programs
- In Proceedings of the Second ACM SIGSOFT Symposium on Foundations of Software Engineering
, 1994
"... Classification D.2.4 Software/Program Verification, D.1.3 Concurrent Programming This paper describes FLAVERS, a finite-state verification approach that analyzes whether concurrent systems satisfy user-defined, behavioral properties. FLAVERS automatically creates a compact, event-based model of the ..."
Abstract
-
Cited by 176 (61 self)
- Add to MetaCart
(Show Context)
Classification D.2.4 Software/Program Verification, D.1.3 Concurrent Programming This paper describes FLAVERS, a finite-state verification approach that analyzes whether concurrent systems satisfy user-defined, behavioral properties. FLAVERS automatically creates a compact, event-based model of the system that supports efficient data-flow analysis. FLAVERS achieves this efficiency at the cost of precision. Analysts, however, can improve the precision of analysis results by selectively and judiciously incorporating additional semantic information into an analysis. We report on an empirical study of the performance of the FLAVERS/Ada toolset applied to a collection of multitasking Ada systems. This study indicates that sufficient precision for proving system properties can usually be
Perracotta: mining temporal API rules from imperfect traces
- Ohio University
, 2006
"... Dynamic inference techniques have been demonstrated to provide useful support for various software engineering tasks including bug finding, test suite evaluation and improvement, and specification generation. To date, however, dynamic inference has only been used effectively on small programs under ..."
Abstract
-
Cited by 175 (3 self)
- Add to MetaCart
(Show Context)
Dynamic inference techniques have been demonstrated to provide useful support for various software engineering tasks including bug finding, test suite evaluation and improvement, and specification generation. To date, however, dynamic inference has only been used effectively on small programs under controlled conditions. In this paper, we identify reasons why scaling dynamic inference techniques has proven difficult, and introduce solutions that enable a dynamic inference technique to scale to large programs and work effectively with the imperfect traces typically available in industrial scenarios. We describe our approximate inference algorithm, present and evaluate heuristics for winnowing the large number of inferred properties to a manageable set of interesting properties, and report on experiments using inferred properties. We evaluate our techniques on JBoss and the Windows kernel. Our tool is able to infer many of the properties checked by the Static Driver Verifier and leads us to discover a previously unknown bug in Windows.
Finding and Reproducing Heisenbugs in Concurrent Programs
"... Concurrency is pervasive in large systems. Unexpected interference among threads often results in “Heisenbugs” that are extremely difficult to reproduce and eliminate. We have implemented a tool called CHESS for finding and reproducing such bugs. When attached to a program, CHESS takes control of th ..."
Abstract
-
Cited by 154 (12 self)
- Add to MetaCart
(Show Context)
Concurrency is pervasive in large systems. Unexpected interference among threads often results in “Heisenbugs” that are extremely difficult to reproduce and eliminate. We have implemented a tool called CHESS for finding and reproducing such bugs. When attached to a program, CHESS takes control of thread scheduling and uses efficient search techniques to drive the program through possible thread interleavings. This systematic exploration of program behavior enables CHESS to quickly uncover bugs that might otherwise have remained hidden for a long time. For each bug, CHESS consistently reproduces an erroneous execution manifesting the bug, thereby making it significantly easier to debug the problem. CHESS scales to large concurrent programs and has found numerous bugs in existing systems that had been tested extensively prior to being tested by CHESS. CHESS has been integrated into the test frameworks of many code bases inside Microsoft and is used by testers on a daily basis. 1
Sober: statistical model-based bug localization
- In Proc. ESEC/FSE’05
, 2005
"... Automated localization of software bugs is one of the es-sential issues in debugging aids. Previous studies indicated that the evaluation history of program predicates may dis-close important clues about underlying bugs. In this paper, we propose a new statistical model-based approach, called SOBER, ..."
Abstract
-
Cited by 144 (12 self)
- Add to MetaCart
(Show Context)
Automated localization of software bugs is one of the es-sential issues in debugging aids. Previous studies indicated that the evaluation history of program predicates may dis-close important clues about underlying bugs. In this paper, we propose a new statistical model-based approach, called SOBER, which localizes software bugs without any prior knowledge of program semantics. Unlike existing statisti-cal debugging approaches that select predicates correlated with program failures, SOBER models evaluation patterns of predicates in both correct and incorrect runs respectively and regards a predicate as bug-relevant if its evaluation pat-tern in incorrect runs differs significantly from that in correct ones. SOBER features a principled quantification of the pat-tern difference that measures the bug-relevance of program predicates. We systematically evaluated our approach under the same setting as previous studies. The result demonstrated the power of our approach in bug localization: SOBER can help programmers locate 68 out of 130 bugs in the Siemens suite when programmers are expected to examine no more than 10 % of the code, whereas the best previously reported is 52 out of 130. Moreover, with the assistance of SOBER, we found two bugs in bc 1.06 (an arbitrary precision calcula-tor on UNIX/Linux), one of which has never been reported before.