## Practical Proof Checking for Program Certification (2005)

Venue: | Proceedings of the CADE-20 Workshop on Empirically Successful Classical Automated Reasoning (ESCAR’05) |

Citations: | 5 - 4 self |

### BibTeX

@INPROCEEDINGS{Sutcliffe05practicalproof,

author = {Geoff Sutcliffe and Ewen Denney and Bernd Fischer},

title = {Practical Proof Checking for Program Certification},

booktitle = {Proceedings of the CADE-20 Workshop on Empirically Successful Classical Automated Reasoning (ESCAR’05)},

year = {2005}

}

### Abstract

Program certification aims to provide explicit evidence that a program meets a specified level of safety. This evidence must be independently reproducible and verifiable. We have developed a system, based on theorem proving, that generates proofs that auto-generated aerospace code adheres to a number of safety policies. For certification purposes, these proofs need to be verified by a proof checker. Here, we describe and evaluate a semantic derivation verification approach to proof checking. The evaluation is based on 109 safety obligations that are attempted by EP and SPASS. Our system is able to verify 129 out of the 131 proofs found by the two provers. The majority of the proofs are checked completely in less than 15 seconds wall clock time. This shows that the proof checking task arising from a substantial prover application is practically tractable.
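The idea of semantic derivation verification mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' checker: it assumes propositional clauses rather than first-order formulae, and it uses a brute-force truth-table test as a stand-in for the trusted ATP that would discharge each proof check obligation. The names `entails`, `verify_derivation`, and the example derivation are hypothetical, introduced only for illustration.

```python
from itertools import product

# Assumption: a clause is a frozenset of non-zero ints; 1 means p1, -1 means "not p1".
# The empty clause frozenset() stands for a contradiction.

def satisfies(assignment, clause):
    """True if the truth assignment makes at least one literal of the clause true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def entails(premises, conclusion):
    """Brute-force stand-in for a trusted prover: do the premises entail the conclusion?"""
    atoms = sorted({abs(l) for c in list(premises) + [conclusion] for l in c})
    for values in product([False, True], repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        if all(satisfies(assignment, c) for c in premises) and not satisfies(assignment, conclusion):
            return False  # found a countermodel
    return True

def verify_derivation(steps, axioms):
    """steps maps a step name to (clause, parent names); returns the names of failed steps."""
    failed = []
    for name, (clause, parents) in steps.items():
        if not parents:
            # Leaf obligation: the leaf must follow from the input clauses.
            ok = entails(list(axioms), clause)
        else:
            # Theorem obligation: the inferred clause must follow from its parents.
            ok = all(p in steps for p in parents) and entails(
                [steps[p][0] for p in parents], clause)
        if not ok:
            failed.append(name)
    return failed

axioms = {frozenset({1, 2}), frozenset({-1}), frozenset({-2})}
good = {
    "a1": (frozenset({1, 2}), []),
    "a2": (frozenset({-1}), []),
    "a3": (frozenset({-2}), []),
    "s1": (frozenset({2}), ["a1", "a2"]),   # resolve a1 with a2
    "s2": (frozenset(), ["s1", "a3"]),      # empty clause: refutation found
}
print(verify_derivation(good, axioms))  # prints []
```

Each step is checked independently of how the prover produced it, which is why this style of checking does not need to know the prover's calculus; the cost is one entailment check per step.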

### Citations

571 | A transformation system for developing recursive programs - Burstall, Darlington - 1977 |

492 |
Simple word problems in universal algebras
- Knuth, Bendix
- 1970
Citation Context ... the default settings [BH96]. Even so, for some proof tasks nearly 50 % are needed by the orderings. The aim of this work is the development of an efficient version of the Knuth-Bendix Ordering (KBO) =-=[KB70]-=-, one of the orderings in widespread use. It is widely believed that an implementation of KBO is asymptotically optimal if it shows quadratic worst-case behavior. In the following, however, we will sho... |

472 |
Interactive Theorem Proving and Program Development. Coq’Art: The Calculus of Inductive Constructions
- Bertot, Castéran
- 2004
Citation Context ...ion of Otter proof steps by Ivy [MSM00], higher-order proof term reconstruction in Isabelle [BN00], higher-order proof step checking in HOL [Won99], reducing proof checking to type checking as in Coq =-=[BC04]-=-, and semantic derivation verification [SB05]. Semantic derivation verification has been used in this work. In semantic derivation verification, the required semantic properties of each proof step are... |

457 | Comprehending monads
- Wadler
- 1992
Citation Context ... and update are standard functions to read and modify entries of an array. We assume that they need constant time. Therefore, functions inc and dec need constant time as well. (Footnote 2: The cognoscenti will recognize a state monad =-=[Wad92]-=-.) Note the explicit threading of the array, even read returns it. This facilitates the aforementioned linearity analysis. The following function noNeg tests whether for all va... |

422 | Isabelle: a Generic Theorem Prover
- Paulson
- 1994
Citation Context ...ments involved more than 8000 inference steps. Consequently, simple “correct-by-inspection” theorem provers like leanTAP [BP95], or tactic-based provers built on top of a trusted kernel like Isabelle =-=[Pau89]-=-, are not powerful enough. Instead, we need to employ high-performance ATPs, which use complicated calculi, elaborate data structures, and optimized implementations. This makes formal verification o... |

294 | Otter 3.0 Reference Manual and Guide
- McCune
- 1993
Citation Context ...hree alternative techniques, described here in order of preference, may be used to show satisfiability. First, a finite model of the axioms may be found using a model generation system such as MACE =-=[McC03a]-=- or Paradox [CS03]. Second, a saturation of the axioms may be found using a saturating ATP system such as SPASS or EP [Sch02b]. Third, an attempt to show the axioms to be contradictory can be made usi... |

267 | The design and implementation of a certifying compiler
- Necula, Lee
- 1998
Citation Context ...[Figure 1: Certifiable program synthesis: System architecture] Similar to proof carrying code =-=[NL98]-=-, the architecture distinguishes between trusted and untrusted components, shown in Figure 1 in red (dark grey) and blue (light grey), respectively. Components are called trusted—and must thus be corr... |

174 |
The Design and Implementation of Vampire
- Riazanov, Voronkov
- 2002
Citation Context ...lemented in the system TeMP [HKRV04]. A distinctive feature of our calculus is the possibility to implement its inference rules using first-order ordered resolution. For TeMP we currently use Vampire =-=[RV02]-=- for this purpose. In this paper we describe our experiments with TeMP on a class of verification problems. While these verification efforts are interesting per se, they also allow us to study the beh... |

128 | E: a brainiac theorem prover
- Schulz
Citation Context ...of the axioms may be found using a model generation system such as MACE [McC03a] or Paradox [CS03]. Second, a saturation of the axioms may be found using a saturating ATP system such as SPASS or EP =-=[Sch02b]-=-. Third, an attempt to show the axioms to be contradictory can be made using a refutation system. If that succeeds then the satisfiability obligation cannot be discharged. If it fails it provides an i... |

127 |
The TPTP Problem Library: CNF Release v1.2.1
- Sutcliffe, Suttner
- 1998
Citation Context ...1998 without finishing the Mizar-to-ATP export. Thanks to ILF, hundreds of ATP problems extracted from several untyped Mizar articles have been for several years already included in the standard TPTP =-=[SS98]-=- library. 1.3 First MPTP version The first version of MPTP has already been used for initial exploration of the usability of ATP systems on the Mizar Mathematical Library (MML), and of the benefits of... |

92 | Combining superposition, sorts and splitting
- Weidenbach
- 2001
Citation Context .... Many shortcuts and simplifications were therefore taken in the first MPTP version, naming at least the following: • Mizar formulas were directly exported to the DFG [HKW96] syntax used by the SPASS =-=[Wei01]-=- system. SPASS seemed to perform best on MPTP problems, probably because of its handling of sort theories. SPASS also has a built-in efficient clausifier [NW01], which the other efficient provers like... |

78 | leanTAP: Lean tableau-based deduction
- Beckert, Posegga
- 1995
Citation Context ...t requires substantial “deductive power”: the longest proof found during experiments involved more than 8000 inference steps. Consequently, simple “correct-by-inspection” theorem provers like leanTAP =-=[BP95]-=-, or tactic-based provers built on top of a trusted kernel like Isabelle [Pau89], are not powerful enough. Instead, we need to employ high-performance ATPs, which use complicated calculi, elaborate ... |

62 | New techniques that improve MACE-style finite model finding
- Claessen, Sorensson
- 2003
Citation Context ...hniques, described here in order of preference, may be used to show satisfiability. First, a finite model of the axioms may be found using a model generation system such as MACE [McC03a] or Paradox =-=[CS03]-=-. Second, a saturation of the axioms may be found using a saturating ATP system such as SPASS or EP [Sch02b]. Third, an attempt to show the axioms to be contradictory can be made using a refutation sy... |

59 |
Autobayes: a system for generating data analysis programs from statistical models
- Fischer, Schumann
Citation Context ...allel with the code fragments. We have implemented this approach in two synthesis systems, AUTOFILTER [WS04], which generates state estimation code based on the Kalman filter algorithm, and AUTOBAYES =-=[FS03]-=-, which generates statistical data analysis code. Figure 1 shows the overall architecture of a certifiable program synthesis system. At its core is the original synthesis system that generates code fo... |

45 | SPASS version 2.0 - Weidenbach, Brahm, et al. - 2002 |

41 | Otter 3.3 reference manual
- McCune
- 2003
Citation Context ...pproach. Section 6 concludes, and discusses directions for future work. 1 See http://www.cl.cam.ac.uk/users/jeh1004/software/metis/performance.html for benchmark data. 2 The notable exception is Otter =-=[McC03b]-=-, which has been essentially unchanged since 1996. However, previous experiments have shown that its performance is not sufficient for discharging the safety obligations we generate [DFS05]. 2 Forma... |

37 | Splitting without backtracking
- Riazanov, Voronkov
- 2001
Citation Context ...ses. There are several variants of splitting that have been implemented in specific ATPs, including explicit splitting as implemented in SPASS, and forms of pseudo-splitting as implemented in Vampire =-=[RV01]-=- and E. Verification of splitting inferences requires several theorem obligations to be discharged. Explicit splitting takes a CNF problem S ∪ {L ∨ R}, in which L and R do not share any variables, and... |

34 | Proof terms for simply typed higher order logic
- Berghofer, Nipkow
- 2000
Citation Context ...n the logical system in use. There are several approaches to proof checking, including the syntactic validation of Otter proof steps by Ivy [MSM00], higher-order proof term reconstruction in Isabelle =-=[BN00]-=-, higher-order proof step checking in HOL [Won99], reducing proof checking to type checking as in Coq [BC04], and semantic derivation verification [SB05]. Semantic derivation verification has been use... |

32 | Correctness of Source-level Safety Policies
- Denney, Fischer
- 2003
Citation Context ...t “go wrong”, i.e., does not violate certain conditions. A safety policy is defined by a set of Hoare-style inference rules and auxiliary definitions. The formal basis of this approach is explored in =-=[DF03]-=-. Safety policies exist at two levels of granularity. Language-specific policies can be expressed in terms of the constructs of the underlying programming language itself. They are sensible for any gi... |

30 | TSTP Data-Exchange Formats for Automated Theorem Proving Tools
- Sutcliffe, Zimmer, et al.
- 2004
Citation Context ...equence of its parents, but in other cases, e.g., Skolemization and splitting, the inferred formula has a weaker relation to its parents. A comprehensive list of inferred formula statuses is given in =-=[SZS04]-=-. Consequently, there are different forms of proof check obligations; currently GDV distinguishes between theorem obligations, satisfiability obligations, and leaf theorem obligations, which are expla... |

29 | Ivy: A preprocessor and proof checker for first-order logic
- McCune, Shumsky
- 2000
Citation Context ...stead, we need to employ high-performance ATPs, which use complicated calculi, elaborate data structures, and optimized implementations. This makes formal verification of their correctness infeasible =-=[MSM00]-=-. One could argue that these provers have been extensively validated by the theorem proving community (e.g., the soundness checks required for participation in the CADE ATP System Competition (CASC), ... |

29 |
Automating the Implementation of Kalman Filter Algorithms
- Whittle, Schumann
Citation Context ...ake the annotations part of the code templates so that they can be instantiated and refined in parallel with the code fragments. We have implemented this approach in two synthesis systems, AUTOFILTER =-=[WS04]-=-, which generates state estimation code based on the Kalman filter algorithm, and AUTOBAYES [FS03], which generates statistical data analysis code. Figure 1 shows the overall architecture of a certifi... |

29 | Synthesizing certified code - Whalen, Schumann, et al. - 2002 |

28 | The CADE-17 ATP System Competition - Sutcliffe |

25 | Evaluating General Purpose Automated Theorem Proving Systems
- Sutcliffe, Suttner
Citation Context ...ected based on the results of evaluating several state-of-the-art ATPs against the problems, and were selected so as to be “difficult”, i.e., with TPTP difficulty ratings strictly between 0.0 and 1.0 =-=[SS01]-=-. As a practical test and evaluation of the proof checking approach described in this paper, we scrutinized the proofs generated for these 109 problems by the ATPs EP (Version 0.82) [Sch02b] 3 and SPA... |

25 |
Entailment: The Logic of Relevance and
- Anderson, Belnap
- 1975
Citation Context ... solves the problem (i.e., finds a proof), the obligation has been discharged. This verification of logical consequences ensures the soundness of the inference steps, but does not check for relevance =-=[AB75]-=-. As a contradiction in first order logic entails everything, an inference step with contradictory parents can soundly infer anything. If such inferences should be rejected, a satisfiability obliga... |

22 | Using Automated Theorem Provers to Certify Auto-generated Aerospace Software
- Denney, Fischer, et al.
- 2004
Citation Context ...d to check the proofs that are found by ATPs for the safety obligations generated in the program certification process. The success of ATPs in discharging the safety obligations has been described in =-=[DFS04a]-=-. The success of (trusted) ATPs in verifying the resultant proofs is demonstrated here. Section 2 provides the necessary background on the program certification process, and Section 3 describes the se... |

14 | WALDMEISTER: Development of a High Performance Completion-Based Theorem
- Buch, Hillenbrand
- 1996
Citation Context ...ith up to 80 % for the hardest problems. For Waldmeister [LH02] we observe a more modest figure of typically 5–10 % as rewriting with unorientable equations is very restricted in the default settings =-=[BH96]-=-. Even so, for some proof tasks nearly 50 % are needed by the orderings. The aim of this work is the development of an efficient version of the Knuth-Bendix Ordering (KBO) [KB70], one of the orderings... |

13 | A Comparison of Different Techniques for Grounding Near-Propositional CNF Formulae - Schulz - 2002 |

12 |
A Phytography of Waldmeister
- Loechner, Hillenbrand
Citation Context ...04], Riazanov and Voronkov give a figure of about 40 % on the average for the prover Vampire [RV02] and their straightforward implementation, with up to 80 % for the hardest problems. For Waldmeister =-=[LH02]-=- we observe a more modest figure of typically 5–10 % as rewriting with unorientable equations is very restricted in the default settings [BH96]. Even so, for some proof tasks nearly 50 % are needed by... |

11 | Towards Efficient Subsumption
- Tammet
- 1998
Citation Context ... be discharged. An advantage of the semantic technique for verifying leaf formulae is that it is robust to some of the preprocessing inferences that are performed by ATP systems. For example, Gandalf =-=[Tam98]-=- may factor and simplify input clauses before storing them in its clause data structure. The leaves of refutations output by Gandalf may thus be derived from input clauses, rather than directly being ... |

10 | Semantic Derivation Verification
- Sutcliffe
Citation Context ...er-order proof term reconstruction in Isabelle [BN00], higher-order proof step checking in HOL [Won99], reducing proof checking to type checking as in Coq [BC04], and semantic derivation verification =-=[SB05]-=-. Semantic derivation verification has been used in this work. In semantic derivation verification, the required semantic properties of each proof step are encoded in one or more proof check obligatio... |

10 |
Validation of HOL proofs by proof checking
- Wong
- 1999
Citation Context ...pproaches to proof checking, including the syntactic validation of Otter proof steps by Ivy [MSM00], higher-order proof term reconstruction in Isabelle [BN00], higher-order proof step checking in HOL =-=[Won99]-=-, reducing proof checking to type checking as in Coq [BC04], and semantic derivation verification [SB05]. Semantic derivation verification has been used in this work. In semantic derivation verificati... |

10 |
Rules and Strategies for Program Transformation
- Pettorossi, Proietti
- 1993
Citation Context ...icted in Figures 1 and 2 which indicates the different dependency on K of the two implementations. 3.3 Deriving a linear version The tupling strategy is a standard approach in program transformations =-=[PP93]-=-. The key insight in optimizing kbo2 is that the tupling strategy allows us not only to combine the calculation of the variable balances and the weights, which is straightforward, but also to include ... |

9 | The CADE-20 automated theorem proving competition - Sutcliffe |

6 | AutoBayes/CC — combining program synthesis with automatic code certification (system description) - Whalen, Schumann, et al. - 2002 |

5 |
An Empirical Evaluation of Automated Theorem
- Denney, Fischer, et al.
- 2004
Citation Context ... is Otter [McC03b], which has been essentially unchanged since 1996. However, previous experiments have shown that its performance is not sufficient for discharging the safety obligations we generate =-=[DFS05]-=-. 2 Formal Program Certification Formal program certification is based on the idea that the mathematical proof of some program property can be regarded as an externally verifiable certificate of thi... |

5 |
Things to Know When Implementing LPO
- Löchner
- 2006
Citation Context ...llowing, however, we will show the derivation of a variant that needs only linear time. Similar to previous work, where we investigated the efficient implementation of the Lexicographic Path Ordering =-=[Löc04]-=-, our approach is based on program transformations [BD77, PP93]: Using a language that is close to functional programming or algebraic specification, we start with some “obviously correct” implementat... |

5 | Termination of Rewriting - Steinbach - 1994 |

4 | A program certification assistant based on fully automated theorem provers
- Denney, Fischer
- 2005
Citation Context ...r, in order to convince users of the validity of the overall certification process, there needs to be some explicit linking or tracing between the logical entities and the program being certified. In =-=[DF05]-=-, we describe a browser which enables a two-way linking between the verification conditions and the individual statements of the annotated program. We are also developing an extension to the VCG which... |

4 | Efficient checking of term ordering constraints
- Riazanov, Voronkov
- 2004
Citation Context ...is spent on determining ordering relations can amount to a significant part of the overall running time. Schulz gives an estimation of up to 35 % for the prover E [Sch02] (personal communication). In =-=[RV04]-=-, Riazanov and Voronkov give a figure of about 40 % on the average for the prover Vampire [RV02] and their straightforward implementation, with up to 80 % for the hardest problems. For Waldmeister [LH... |

1 |
The TPTP Problem Library. http://www.TPTP.org
- Sutcliffe, Suttner
Citation Context ...ns generated from the certification of programs generated by the AUTOBAYES and AUTOFILTER program synthesis systems. Of those 366 problems, 109 were selected for inclusion in the TPTP problem library =-=[SS05]-=-, the standard library of test problems for testing and evaluating ATPs. The 109 problems were selected based on the results of evaluating several state-of-the-art ATPs against the problems, and were s... |