## Three Years of Experience with Sledgehammer, a Practical Link between Automatic and Interactive Theorem Provers

Citations: 44 (7 self)

### Citations

1053 | Isabelle/HOL: A Proof Assistant for Higher-Order Logic
- Nipkow, Paulson, et al.
Citation Context ...consumed many thousands of hours of processor time. 2.1 Translation into First-Order Logic. Most interactive theorem provers support a language much richer than that of first-order logic. Isabelle/HOL [16] supports polymorphic higher-order logic, augmented with axiomatic type classes [32]. Many user problems contain no higher-order features, and might be imagined to lie within first-order logic; howev... |

995 | A Formulation of the Simple Theory of Types - Church - 1940 |

576 | Introduction to HOL: A Theorem Proving Environment for Higher-Order Logic - Gordon, Melham - 1993 |

471 | Isabelle: A Generic Theorem Prover.
- Paulson
- 1994
Citation Context ...ns taking varying numbers of arguments [12]. We [footnote 1: Note that Isabelle/HOL is the instantiation of Isabelle [20] to higher-order logic. Isabelle is a generic theorem prover, based on a logical framework [19].] [page 2 running header: Three Years of Experience with Sledgehammer, L. C. Paulson] eventually adopted a translation based on the one that we used for first-order logic, modified to introduce higher-order mechanisms (such ... |

368 | An Introduction to Mathematical Logic and Type Theory: To Truth through Proof - Andrews - 1986 |

108 | Computing small clause normal forms.
- Nonnengart, Weidenbach
- 2001
Citation Context ...der format. Sledgehammer nevertheless translates problems into clause form itself, and using a naive application of distributive laws rather than a polynomial time algorithm based on formula renaming [17]. Moreover, the translation to clauses is performed using Isabelle’s internal proof engine; this was thought to be essential to allow proof reconstruction within Isabelle. Sledgehammer’s naive transla... |

107 | Spass: Combining superposition, sorts and splitting. Handbook of automated reasoning
- Weidenbach
- 1999
Citation Context ...rt of the Ωmega system [4, 26]. The parallel invocation of different theorem provers is invaluable. Böhme and Nipkow [6] have demonstrated that running three different theorem provers (E [25], SPASS [30] and Vampire [22]) for five seconds solves as many problems as running the best theorem prover (Vampire) for two full minutes. It would be better to utilise even more theorem provers. I have undertake... |

74 | Type classes and overloading in higher-order logic.
- Wenzel
- 1997
Citation Context ... Logic. Most interactive theorem provers support a language much richer than that of first-order logic. Isabelle/HOL [16] supports polymorphic higher-order logic, augmented with axiomatic type classes [32]. Many user problems contain no higher-order features, and might be imagined to lie within first-order logic; however, even these problems are full of typing information. Type information can take qu... |

71 | First-order proof tactics in higher-order logic theorem provers.
- Hurd
- 2003
Citation Context ...r, even these problems are full of typing information. Type information can take quadratic space [12] because every term must be labelled with its type, recursively, right down to the variables. Hurd [8] observed that omitting type information greatly improved the success rate of his theorem prover, Metis. This is hardly surprising, since the type information virtually buries the terms themselves. Hu... |

58 | LEO-II - A cooperative automatic theorem prover for classical higher-order logic.
- Benzmuller, Paulson, et al.
- 2008
Citation Context ...perience suggests, unfortunately, that Sledgehammer is seldom successful on problems containing higher-order elements. Integration with a genuine higher-order automatic theorem prover, such as LEO-II [3], seems necessary. This would pose interesting problems for proof reconstruction: LEO-II’s approach is to reduce higher-order problems to first-order ones by repeatedly applying specialised inference ... |

51 | Set theory for verification: I. From foundations to functions - Paulson - 1993 |

50 | Lightweight relevance filtering for machine-generated resolution problems.
- Meng, Paulson
- 2009
Citation Context ...were two libraries, one consisting of facts useful for forward and backward chaining, the other consisting of rewriting rules for simplification. Each library contained hundreds of lemmas. Meng and I [13] discovered that automatic theorem provers could solve only trivial problems in the presence of so many extraneous facts; by developing a lightweight, symbol-based relevance filter, we greatly improved t... |

49 | Integrating Gandalf and HOL.
- Hurd
- 1999
Citation Context ...g long, incomprehensible chains of deduction. Many researchers have attempted to use them to support interactive theorem proving; particularly pertinent are Ahrendt et al. [1], Bezem et al. [5], Hurd [7] and Siekmann et al. [26]. But the only one to pass the test of time is Sledgehammer [14, 21], which links Isabelle/HOL to the automatic provers E, SPASS and Vampire. Isabelle users invoke Sledgehamme... |

45 | System description: E 0.81.
- Schulz
- 2004
Citation Context ...time been part of the Ωmega system [4, 26]. The parallel invocation of different theorem provers is invaluable. Böhme and Nipkow [6] have demonstrated that running three different theorem provers (E [25], SPASS [30] and Vampire [22]) for five seconds solves as many problems as running the best theorem prover (Vampire) for two full minutes. It would be better to utilise even more theorem provers. I ha... |

45 | Set theory for verification: II. Induction and recursion - Paulson - 1995 |

44 | Translating higher-order clauses to first-order clauses.
- Meng, Paulson
- 2008
Citation Context ...roblems contain no higher-order features, and might be imagined to lie within first-order logic; however, even these problems are full of typing information. Type information can take quadratic space [12] because every term must be labelled with its type, recursively, right down to the variables. Hurd [8] observed that omitting type information greatly improved the success rate of his theorem prover, ... |

43 | Oants – an open approach at combining interactive and automated theorem proving
- Benzmüller, Sorge
- 2000
Citation Context ...sing power to support several ATP executions without becoming sluggish. An agent-based implementation of similar ideas, using a blackboard architecture, has for some time been part of the Ωmega system [4, 26]. The parallel invocation of different theorem provers is invaluable. Böhme and Nipkow [6] have demonstrated that running three different theorem provers (E [25], SPASS [30] and Vampire [22]) for fiv... |

41 | Ontic: A Knowledge Representation System for Mathematics
- McAllester
- 1989
Citation Context ...the enticing prospect that any relevant existing theorem, however obscure, could be located. I thought this goal to be unrealistic; it seemed to have too much in common with McAllester’s Ontic system [9]. Ontic was intended to be able to prove mathematical results using known results that it identified automatically, and it seems fair to say that this objective was too ambitious. But Sledgehammer wou... |

41 | TRAMP: Transformation of machine-found proofs into natural deduction proofs at the assertion level.
- Meier
- 2000
Citation Context ... Otterfier proof transformation service [33]. Resolution proofs should ideally be translated to natural, intuitive Isabelle proofs. The best-known prior work on translating resolution proofs is TRAMP [11]; its applicability to Sledgehammer is unexplored. Preliminary work has commenced at Munich to see to what extent resolution proofs can be transformed into intelligible proofs. 2.4.2 One-Line Reconstr... |

40 | Vampire 1.1 (system description)
- Riazanov, Voronkov
- 2001
Citation Context ...ystem [4, 26]. The parallel invocation of different theorem provers is invaluable. Böhme and Nipkow [6] have demonstrated that running three different theorem provers (E [25], SPASS [30] and Vampire [22]) for five seconds solves as many problems as running the best theorem prover (Vampire) for two full minutes. It would be better to utilise even more theorem provers. I have undertaken informal, unpub... |

38 | Sledgehammer: Judgement Day.
- Böhme, Nipkow
- 2010
Citation Context ...ertaking difficult proofs. In a recent study involving older Isabelle proof scripts, Böhme and Nipkow demonstrated that Sledgehammer could prove 34% of the nontrivial goals contained in those proofs [6]. Sledgehammer was first released in February 2007 to users daring enough to download an Isabelle nightly build. It was announced in November 2007 as a component of Isabelle2007. This paper outlines t... |

31 | Integrating automated and interactive theorem proving
- Ahrendt, Beckert, et al.
- 1998
Citation Context ...ATPs) are capable of creating long, incomprehensible chains of deduction. Many researchers have attempted to use them to support interactive theorem proving; particularly pertinent are Ahrendt et al. [1], Bezem et al. [5], Hurd [7] and Siekmann et al. [26]. But the only one to pass the test of time is Sledgehammer [14, 21], which links Isabelle/HOL to the automatic provers E, SPASS and Vampire. Isabe... |

30 | Automation for interactive proof: First prototype.
- Meng, Quigley, et al.
- 2006
Citation Context ...m to support interactive theorem proving; particularly pertinent are Ahrendt et al. [1], Bezem et al. [5], Hurd [7] and Siekmann et al. [26]. But the only one to pass the test of time is Sledgehammer [14, 21], which links Isabelle/HOL to the automatic provers E, SPASS and Vampire. Isabelle users invoke Sledgehammer routinely when undertaking difficult proofs. In a recent study involving older Isabelle pro... |

25 | Analytic Tableaux for Higher-Order Logic with Choice. - Backes, Brown - 2010 |

22 | Isabelle/Isar—A generic framework for human-readable proof documents.
- Wenzel
- 2007
Citation Context ...s into theorems. In Isabelle, the simple combination of structured proofs and Sledgehammer takes the user surprisingly far. This is not the place to give a detailed tutorial on Isar structured proofs [15, 31]. In brief, they support natural deduction through local scopes that can introduce assumptions (using the keyword assume) as well as local variables and definitions. Moreover, while traditional tactic... |

19 | Calculational Reasoning Revisited – An Isabelle/Isar experience. In: 14th TPHOLs
- Bauer, Wenzel
- 2001
Citation Context ...ammer. Nor need we restrict ourselves to a linear progression of facts. Because proofs are structured, you can nest the proofs of these lemmas to any depth. Isar also supports calculational reasoning [2]. A chain of reasoning steps, connected by familiar relations such as =, ≤ and <, can be written with separate proofs for each step of the calculation. Once again, if you can see the intermediate stag... |

19 | Limited resource strategy in resolution theorem.
- Riazanov, Voronkov
- 2003
Citation Context ...superfluous clutter from the proof scripts. ATPs themselves could return proofs using a minimum of axioms, or alternatively, proofs of a minimum length. Vampire’s well-known limited resource strategy [23], although designed to cope with limited processor time, could probably be modified to minimise proofs efficiently. 3 Sledgehammer and Teaching. Sledgehammer was not designed specifically as an aid to ... |

16 | Source-level proof reconstruction for interactive theorem proving
- Paulson, Susanto
- 2007
Citation Context ...m to support interactive theorem proving; particularly pertinent are Ahrendt et al. [1], Bezem et al. [5], Hurd [7] and Siekmann et al. [26]. But the only one to pass the test of time is Sledgehammer [14, 21], which links Isabelle/HOL to the automatic provers E, SPASS and Vampire. Isabelle users invoke Sledgehammer routinely when undertaking difficult proofs. In a recent study involving older Isabelle pro... |

12 | System description: SystemOnTPTP - Sutcliffe - 2000 |

11 | A tutorial introduction to structured Isar proofs. http://isabelle.in.tum.de/dist/Isabelle/doc/isar-overview.pdf
- Nipkow
Citation Context ...s into theorems. In Isabelle, the simple combination of structured proofs and Sledgehammer takes the user surprisingly far. This is not the place to give a detailed tutorial on Isar structured proofs [15, 31]. In brief, they support natural deduction through local scopes that can introduce assumptions (using the keyword assume) as well as local variables and definitions. Moreover, while traditional tactic... |

11 | MaLARea: A metasystem for automated reasoning in large theories.
- Urban
- 2007
Citation Context ...d obviously be preferable for the automatic theorem provers themselves to perform relevance filtering. Or we should use a sophisticated system based on machine learning, such as Josef Urban’s MaLARea [28], where successful proofs provide information to guide other proofs. Unfortunately, any such approach will fail given Sledgehammer’s use of unsound translations. In unpublished work by Urban, MaLARea ... |

10 | Automatic proof construction in type theory using resolution
- Bezem, Hendriks, de Nivelle
- 2002
Citation Context ...of creating long, incomprehensible chains of deduction. Many researchers have attempted to use them to support interactive theorem proving; particularly pertinent are Ahrendt et al. [1], Bezem et al. [5], Hurd [7] and Siekmann et al. [26]. But the only one to pass the test of time is Sledgehammer [14, 21], which links Isabelle/HOL to the automatic provers E, SPASS and Vampire. Isabelle users invoke S... |

6 | Tool support for logics of programs
- Paulson
- 1997
Citation Context ...c, allowing for at least truth values to be used as the values of terms and for curried functions taking varying numbers of arguments [12]. We [footnote 1: Note that Isabelle/HOL is the instantiation of Isabelle [20] to higher-order logic. Isabelle is a generic theorem prover, based on a logical framework [19].] eventually adopted a translation based on t... |

6 | Integrated proof transformation services
- Zimmer, Meier, et al.
- 2004
Citation Context ...reconstructing proof steps easily. The output of Sledgehammer was now a list of calls to Metis, each of which proved a clause. This approach was inspired by the Otterfier proof transformation service [33]. Resolution proofs should ideally be translated to natural, intuitive Isabelle proofs. The best-known prior work on translating resolution proofs is TRAMP [11]; its applicability to Sledgehammer is u... |

1 | Interactive formal verification. http://www.cl.cam.ac.uk/teaching/0910/L21/. Lecture course materials
- Paulson
Citation Context ...isk of discovering that a lemma is useless only after spending weeks proving it. In January 2010, as part of its new MPhil. programme, the University of Cambridge offered a lecture course on Isabelle [18]. The course materials included almost no information about the [footnote 3: The existence of sorry does not compromise Isabelle’s soundness, because it is only permitted during interactive sessions. A theory fil...] |

1 | Bounded model generation for Isabelle/HOL. Electron
- Weber
- 2005
Citation Context ...andom testing is an obvious way to do this, but counterexample finding can also make use of automated deduction technology. An early example is refute, which uses a SAT solver to find counterexamples [29]; it is also a component of Isabelle. Sledgehammer has a number of limitations, most of which open up suggestions for future work. The relevance filter is primitive, but an improved one will have to b... |