Results 1  10
of
13
Sledgehammer: Judgement Day
"... Sledgehammer, a component of the interactive theorem prover Isabelle, finds proofs in higherorder logic by calling the automated provers for firstorder logic E, SPASS and Vampire. This paper is the largest and most detailed empirical evaluation of such a link to date. Our test data consists of 12 ..."
Abstract

Cited by 22 (3 self)
 Add to MetaCart
Sledgehammer, a component of the interactive theorem prover Isabelle, finds proofs in higherorder logic by calling the automated provers for firstorder logic E, SPASS and Vampire. This paper is the largest and most detailed empirical evaluation of such a link to date. Our test data consists of 1240 proof goals arising in 7 diverse Isabelle theories, thus representing typical Isabelle proof obligations. We measure the effectiveness of Sledgehammer and many other parameters such as run time and complexity of proofs. A facility for minimizing the number of facts needed to prove a goal is presented and analyzed.
Extending Sledgehammer with SMT Solvers
"... Abstract. Sledgehammer is a component of Isabelle/HOL that employs firstorder automatic theorem provers (ATPs) to discharge goals arising in interactive proofs. It heuristically selects relevant facts and, if an ATP is successful, produces a snippet that replays the proof in Isabelle. We extended Sl ..."
Abstract

Cited by 15 (6 self)
 Add to MetaCart
Abstract. Sledgehammer is a component of Isabelle/HOL that employs firstorder automatic theorem provers (ATPs) to discharge goals arising in interactive proofs. It heuristically selects relevant facts and, if an ATP is successful, produces a snippet that replays the proof in Isabelle. We extended Sledgehammer to invoke satisfiability modulo theories (SMT) solvers as well, exploiting its relevance filter and parallel architecture. Isabelle users are now pleasantly surprised by SMT proofs for problems beyond the ATPs ’ reach. Remarkably, the best SMT solver performs better than the best ATP on most of our benchmarks. 1
Encoding Monomorphic and Polymorphic Types
"... Abstract. Most automatic theorem provers are restricted to untyped or monomorphic logics, and existing translations from polymorphic logics are bulky or unsound. Recent research shows how to exploit monotonicity to encode ground types efficiently: monotonic types can be safely erased, while nonmonot ..."
Abstract

Cited by 11 (8 self)
 Add to MetaCart
Abstract. Most automatic theorem provers are restricted to untyped or monomorphic logics, and existing translations from polymorphic logics are bulky or unsound. Recent research shows how to exploit monotonicity to encode ground types efficiently: monotonic types can be safely erased, while nonmonotonic types must generally be encoded. We extend this work to rank1 polymorphism and show how to eliminate even more clutter. We also present alternative schemes that lighten the translation of polymorphic symbols, based on the novel notion of “cover”. The new encodings are implemented, and partly proved correct, in Isabelle/HOL. Our evaluation finds them vastly superior to previous schemes. 1
Modular SMT Proofs for Fast Reflexive Checking inside Coq ⋆
"... Abstract. We present a new methodology for exchanging unsatisfiability proofs between an untrusted SMT solver and a sceptical proof assistant with computation capabilities like Coq. We advocate modular SMT proofs that separate boolean reasoning and theory reasoning; and structure the communication b ..."
Abstract

Cited by 8 (2 self)
 Add to MetaCart
Abstract. We present a new methodology for exchanging unsatisfiability proofs between an untrusted SMT solver and a sceptical proof assistant with computation capabilities like Coq. We advocate modular SMT proofs that separate boolean reasoning and theory reasoning; and structure the communication between theories using NelsonOppen combination scheme. We present the design and implementation of a Coq reflexive verifier that is modular and allows for finetuned theoryspecific verifiers. The current verifier is able to verify proofs for quantifierfree formulae mixing linear arithmetic and uninterpreted functions. Our proof generation scheme benefits from the efficiency of stateoftheart SMT solvers while being independent from a specific SMT solver proof format. Our only requirement for the SMT solver is the ability to extract unsat cores and generate boolean models. In practice, unsat cores are relatively small and their proof is obtained with a modest overhead by our proofproducing prover. We present experiments assessing the feasibility of the approach for benchmarks obtained from the SMT competition. 1
Faster and More Complete Extended Static Checking for the Java Modeling Language
 J AUTOM REASONING
"... Extended Static Checking (ESC) is a fully automated formal verification technique. Verification in ESC is achieved by translating programs and their specifications into verification conditions (VCs). Proof of a VC establishes the correctness of the program. The implementations of many seemingly simp ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
Extended Static Checking (ESC) is a fully automated formal verification technique. Verification in ESC is achieved by translating programs and their specifications into verification conditions (VCs). Proof of a VC establishes the correctness of the program. The implementations of many seemingly simple algorithms are beyond the ability of traditional Extended Static Checking (ESC) tools to verify. Not being able to verify toy examples is often enough to turn users off of the idea of using formal methods. ESC4, the ESC component of the JML4 project, is able to verify many more kinds of methods in part because of its use of novel techniques which apply multiple theorem provers. In particular, we present Offline UserAssisted ESC (OUAESC), a new form of verification that lies between ESC and Full Static Program Verification (FSPV). ESC is generally quite efficient, as far as verification tools go, but it is still orders of magnitude slower than simple compilation. As can be imagined, proving VCs is computationally expensive: While small classes can be verified in seconds, verifying larger programs of 50 KLOC can take hours. To help address the added cost of using multiple provers and this lack of scalability, we present the multithreaded version of ESC4 and its distributed prover backend.
Automatic Proof and Disproof in Isabelle/HOL
, 2011
"... Isabelle/HOL is a popular interactive theorem prover based on higherorder logic. It owes its success to its ease of use and powerful automation. Much of the automation is performed by external tools: The metaprover Sledgehammer relies on resolution provers and SMT solvers for its proof search, the c ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
Isabelle/HOL is a popular interactive theorem prover based on higherorder logic. It owes its success to its ease of use and powerful automation. Much of the automation is performed by external tools: The metaprover Sledgehammer relies on resolution provers and SMT solvers for its proof search, the counterexample generator Quickcheck uses the ML compiler as a fast evaluator for ground formulas, and its rival Nitpick is based on the model finder Kodkod, which performs a reduction to SAT. Together with the Isar structured proof format and a new asynchronous user interface, these tools have radically transformed the Isabelle user experience. This paper provides an overview of the main automatic proof and disproof tools.
Redirecting proofs by contradiction
"... This paper presents an algorithm that redirects proofs by contradiction. The input is a refutation graph, as produced by an automatic theorem prover (e.g., E, SPASS, Vampire, Z3); the output is a direct proof expressed in natural deduction extended with case analyses and nested subproofs. The algori ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
This paper presents an algorithm that redirects proofs by contradiction. The input is a refutation graph, as produced by an automatic theorem prover (e.g., E, SPASS, Vampire, Z3); the output is a direct proof expressed in natural deduction extended with case analyses and nested subproofs. The algorithm is implemented in Isabelle’s Sledgehammer, where it enhances the legibility of machinegenerated proofs. 1
More SPASS with Isabelle Superposition with Hard Sorts and Configurable Simplification
"... Abstract. Sledgehammer for Isabelle/HOL integrates automatic theorem provers to discharge interactive proof obligations. This paper considers a tighter integration of the superposition prover SPASS to increase Sledgehammer’s success rate. The main enhancements are native support for hard sorts (simp ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
Abstract. Sledgehammer for Isabelle/HOL integrates automatic theorem provers to discharge interactive proof obligations. This paper considers a tighter integration of the superposition prover SPASS to increase Sledgehammer’s success rate. The main enhancements are native support for hard sorts (simple types) in SPASS, simplification that honors the orientation of Isabelle simp rules, and a pair of clauseselection strategies targeted at large lemma libraries. The usefulness of this integration is confirmed by an evaluation on a vast benchmark suite and by a case study featuring a formalization of languagebased security. 1
Integrating Automated and Interactive Protocol Verification (Extended Version)
"... interpretation techniques and other overapproximations of the set of reachable states or traces. The protocol models that these tools employ are shaped by the needs of automated verification and require subtle assumptions. Also, a complex verification tool may suffer from implementation bugs so tha ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
interpretation techniques and other overapproximations of the set of reachable states or traces. The protocol models that these tools employ are shaped by the needs of automated verification and require subtle assumptions. Also, a complex verification tool may suffer from implementation bugs so that in the worst case the tool could accept some incorrect protocols as being correct. These risks of errors are also present, but considerably smaller, when using an LCFstyle theorem prover like Isabelle. The interactive security proof, however, requires a lot of expertise and time. We combine the advantages of both worlds by using the representation of the overapproximated search space computed by the automated tools as a “proof idea ” in Isabelle. Thus, we devise proof tactics for Isabelle that generate the correctness proof of the protocol from the output of the automated tools. In the worst case, these tactics fail to construct a proof, namely when the representation of the search space is for some reason incorrect. However, when they succeed, the correctness only relies on the basic model and the Isabelle core. 1
Robust, SemiIntelligible Isabelle Proofs from ATP Proofs
"... Sledgehammer integrates external automatic theorem provers (ATPs) in the Isabelle/HOL proof assistant. To guard against bugs, ATP proofs must be reconstructed in Isabelle. Reconstructing complex proofs involves translating them to detailed Isabelle proof texts, using suitable proof methods to justif ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
Sledgehammer integrates external automatic theorem provers (ATPs) in the Isabelle/HOL proof assistant. To guard against bugs, ATP proofs must be reconstructed in Isabelle. Reconstructing complex proofs involves translating them to detailed Isabelle proof texts, using suitable proof methods to justify the inferences. This has been attempted before with little success, but we have addressed the main issues: Sledgehammer now transforms the proofs by contradiction into direct proofs (as described in a companion paper [3]); it reconstructs skolemization inferences correctly; it provides the right amount of type annotations to ensure formulas are parsed correctly without marring them with types; and it iteratively tests and compresses the output, resulting in simpler and faster working proofs.