Results 1  10
of
17
Sledgehammer: Judgement Day
"... Sledgehammer, a component of the interactive theorem prover Isabelle, finds proofs in higherorder logic by calling the automated provers for firstorder logic E, SPASS and Vampire. This paper is the largest and most detailed empirical evaluation of such a link to date. Our test data consists of 12 ..."
Abstract

Cited by 25 (3 self)
 Add to MetaCart
Sledgehammer, a component of the interactive theorem prover Isabelle, finds proofs in higherorder logic by calling the automated provers for firstorder logic E, SPASS and Vampire. This paper is the largest and most detailed empirical evaluation of such a link to date. Our test data consists of 1240 proof goals arising in 7 diverse Isabelle theories, thus representing typical Isabelle proof obligations. We measure the effectiveness of Sledgehammer and many other parameters such as run time and complexity of proofs. A facility for minimizing the number of facts needed to prove a goal is presented and analyzed.
Extending Sledgehammer with SMT Solvers
"... Abstract. Sledgehammer is a component of Isabelle/HOL that employs firstorder automatic theorem provers (ATPs) to discharge goals arising in interactive proofs. It heuristically selects relevant facts and, if an ATP is successful, produces a snippet that replays the proof in Isabelle. We extended Sl ..."
Abstract

Cited by 16 (7 self)
 Add to MetaCart
Abstract. Sledgehammer is a component of Isabelle/HOL that employs firstorder automatic theorem provers (ATPs) to discharge goals arising in interactive proofs. It heuristically selects relevant facts and, if an ATP is successful, produces a snippet that replays the proof in Isabelle. We extended Sledgehammer to invoke satisfiability modulo theories (SMT) solvers as well, exploiting its relevance filter and parallel architecture. Isabelle users are now pleasantly surprised by SMT proofs for problems beyond the ATPs ’ reach. Remarkably, the best SMT solver performs better than the best ATP on most of our benchmarks. 1
Encoding Monomorphic and Polymorphic Types
"... Abstract. Most automatic theorem provers are restricted to untyped or monomorphic logics, and existing translations from polymorphic logics are bulky or unsound. Recent research shows how to exploit monotonicity to encode ground types efficiently: monotonic types can be safely erased, while nonmonot ..."
Abstract

Cited by 12 (9 self)
 Add to MetaCart
Abstract. Most automatic theorem provers are restricted to untyped or monomorphic logics, and existing translations from polymorphic logics are bulky or unsound. Recent research shows how to exploit monotonicity to encode ground types efficiently: monotonic types can be safely erased, while nonmonotonic types must generally be encoded. We extend this work to rank1 polymorphism and show how to eliminate even more clutter. We also present alternative schemes that lighten the translation of polymorphic symbols, based on the novel notion of “cover”. The new encodings are implemented, and partly proved correct, in Isabelle/HOL. Our evaluation finds them vastly superior to previous schemes. 1
Modular SMT Proofs for Fast Reflexive Checking inside Coq
 FIRST INTERNATIONAL CONFERENCE ON CERTIFIED PROGRAMS AND PROOFS
, 2011
"... We present a new methodology for exchanging unsatisfiability proofs between an untrusted SMT solver and a sceptical proof assistant with computation capabilities like Coq. We advocate modular SMT proofs that separate boolean reasoning and theory reasoning; and structure the communication between th ..."
Abstract

Cited by 8 (2 self)
 Add to MetaCart
We present a new methodology for exchanging unsatisfiability proofs between an untrusted SMT solver and a sceptical proof assistant with computation capabilities like Coq. We advocate modular SMT proofs that separate boolean reasoning and theory reasoning; and structure the communication between theories using NelsonOppen combination scheme. We present the design and implementation of a Coq reflexive verifier that is modular and allows for finetuned theoryspecific verifiers. The current verifier is able to verify proofs for quantifierfree formulae mixing linear arithmetic and uninterpreted functions. Our proof generation scheme benefits from the efficiency of stateoftheart SMT solvers while being independent from a specific SMT solver proof format. Our only requirement for the SMT solver is the ability to extract unsat cores and generate boolean models. In practice, unsat cores are relatively small and their proof is obtained with a modest overhead by our proofproducing prover. We present experiments assessing the feasibility of the approach for benchmarks obtained from the SMT competition.
Premise Selection for Mathematics by Corpus Analysis and Kernel Methods
"... Smart premise selection is essential when using automated reasoning as a tool for largetheory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learningbased premise ..."
Abstract

Cited by 4 (3 self)
 Add to MetaCart
Smart premise selection is essential when using automated reasoning as a tool for largetheory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learningbased premise selection in two ways. First, a newly available minimal dependency analysis of existing highlevel formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATPbased reverification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 largetheory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE stateoftheart system for automated reasoning in large theories.
Faster and More Complete Extended Static Checking for the Java Modeling Language
 J AUTOM REASONING
"... Extended Static Checking (ESC) is a fully automated formal verification technique. Verification in ESC is achieved by translating programs and their specifications into verification conditions (VCs). Proof of a VC establishes the correctness of the program. The implementations of many seemingly simp ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
Extended Static Checking (ESC) is a fully automated formal verification technique. Verification in ESC is achieved by translating programs and their specifications into verification conditions (VCs). Proof of a VC establishes the correctness of the program. The implementations of many seemingly simple algorithms are beyond the ability of traditional Extended Static Checking (ESC) tools to verify. Not being able to verify toy examples is often enough to turn users off of the idea of using formal methods. ESC4, the ESC component of the JML4 project, is able to verify many more kinds of methods in part because of its use of novel techniques which apply multiple theorem provers. In particular, we present Offline UserAssisted ESC (OUAESC), a new form of verification that lies between ESC and Full Static Program Verification (FSPV). ESC is generally quite efficient, as far as verification tools go, but it is still orders of magnitude slower than simple compilation. As can be imagined, proving VCs is computationally expensive: While small classes can be verified in seconds, verifying larger programs of 50 KLOC can take hours. To help address the added cost of using multiple provers and this lack of scalability, we present the multithreaded version of ESC4 and its distributed prover backend.
Automatic Proof and Disproof in Isabelle/HOL
, 2011
"... Isabelle/HOL is a popular interactive theorem prover based on higherorder logic. It owes its success to its ease of use and powerful automation. Much of the automation is performed by external tools: The metaprover Sledgehammer relies on resolution provers and SMT solvers for its proof search, the c ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
Isabelle/HOL is a popular interactive theorem prover based on higherorder logic. It owes its success to its ease of use and powerful automation. Much of the automation is performed by external tools: The metaprover Sledgehammer relies on resolution provers and SMT solvers for its proof search, the counterexample generator Quickcheck uses the ML compiler as a fast evaluator for ground formulas, and its rival Nitpick is based on the model finder Kodkod, which performs a reduction to SAT. Together with the Isar structured proof format and a new asynchronous user interface, these tools have radically transformed the Isabelle user experience. This paper provides an overview of the main automatic proof and disproof tools.
More SPASS with Isabelle Superposition with Hard Sorts and Configurable Simplification
"... Abstract. Sledgehammer for Isabelle/HOL integrates automatic theorem provers to discharge interactive proof obligations. This paper considers a tighter integration of the superposition prover SPASS to increase Sledgehammer’s success rate. The main enhancements are native support for hard sorts (simp ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
Abstract. Sledgehammer for Isabelle/HOL integrates automatic theorem provers to discharge interactive proof obligations. This paper considers a tighter integration of the superposition prover SPASS to increase Sledgehammer’s success rate. The main enhancements are native support for hard sorts (simple types) in SPASS, simplification that honors the orientation of Isabelle simp rules, and a pair of clauseselection strategies targeted at large lemma libraries. The usefulness of this integration is confirmed by an evaluation on a vast benchmark suite and by a case study featuring a formalization of languagebased security. 1
Redirecting proofs by contradiction
"... This paper presents an algorithm that redirects proofs by contradiction. The input is a refutation graph, as produced by an automatic theorem prover (e.g., E, SPASS, Vampire, Z3); the output is a direct proof expressed in natural deduction extended with case analyses and nested subproofs. The algori ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
This paper presents an algorithm that redirects proofs by contradiction. The input is a refutation graph, as produced by an automatic theorem prover (e.g., E, SPASS, Vampire, Z3); the output is a direct proof expressed in natural deduction extended with case analyses and nested subproofs. The algorithm is implemented in Isabelle’s Sledgehammer, where it enhances the legibility of machinegenerated proofs.
Integrating Automated and Interactive Protocol Verification (Extended Version)
"... interpretation techniques and other overapproximations of the set of reachable states or traces. The protocol models that these tools employ are shaped by the needs of automated verification and require subtle assumptions. Also, a complex verification tool may suffer from implementation bugs so tha ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
interpretation techniques and other overapproximations of the set of reachable states or traces. The protocol models that these tools employ are shaped by the needs of automated verification and require subtle assumptions. Also, a complex verification tool may suffer from implementation bugs so that in the worst case the tool could accept some incorrect protocols as being correct. These risks of errors are also present, but considerably smaller, when using an LCFstyle theorem prover like Isabelle. The interactive security proof, however, requires a lot of expertise and time. We combine the advantages of both worlds by using the representation of the overapproximated search space computed by the automated tools as a “proof idea ” in Isabelle. Thus, we devise proof tactics for Isabelle that generate the correctness proof of the protocol from the output of the automated tools. In the worst case, these tactics fail to construct a proof, namely when the representation of the search space is for some reason incorrect. However, when they succeed, the correctness only relies on the basic model and the Isabelle core. 1