Commit 8daee6f4 authored by Adam Chlipala's avatar Adam Chlipala

PC comments on Reflection and Large

parent 8649f747
(* Copyright (c) 2009, Adam Chlipala (* Copyright (c) 2009-2010, Adam Chlipala
* *
* This work is licensed under a * This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
...@@ -18,14 +18,14 @@ Set Implicit Arguments. ...@@ -18,14 +18,14 @@ Set Implicit Arguments.
(** %\chapter{Proving in the Large}% *) (** %\chapter{Proving in the Large}% *)
(** It is somewhat unfortunate that the term "theorem-proving" looks so much like the word "theory." Most researchers and practitioners in software assume that mechanized theorem-proving is profoundly impractical. Indeed, until recently, most advances in theorem-proving for higher-order logics have been largely theoretical. However, starting around the beginning of the 21st century, there was a surge in the use of proof assistants in serious verification efforts. That line of work is still quite new, but I believe it is not too soon to distill some lessons on how to work effectively with large formal proofs. (** It is somewhat unfortunate that the term %``%#"#theorem-proving#"#%''% looks so much like the word %``%#"#theory.#"#%''% Most researchers and practitioners in software assume that mechanized theorem-proving is profoundly impractical. Indeed, until recently, most advances in theorem-proving for higher-order logics have been largely theoretical. However, starting around the beginning of the 21st century, there was a surge in the use of proof assistants in serious verification efforts. That line of work is still quite new, but I believe it is not too soon to distill some lessons on how to work effectively with large formal proofs.
Thus, this chapter gives some tips for structuring and maintaining large Coq developments. *) Thus, this chapter gives some tips for structuring and maintaining large Coq developments. *)
(** * Ltac Anti-Patterns *) (** * Ltac Anti-Patterns *)
(** In this book, I have been following an unusual style, where proofs are not considered finished until they are "fully automated," in a certain sense. SEach such theorem is proved by a single tactic. Since Ltac is a Turing-complete programming language, it is not hard to squeeze arbitrary heuristics into single tactics, using operators like the semicolon to combine steps. In contrast, most Ltac proofs "in the wild" consist of many steps, performed by individual tactics followed by periods. Is it really worth drawing a distinction between proof steps terminated by semicolons and steps terminated by periods? (** In this book, I have been following an unusual style, where proofs are not considered finished until they are %``%#"#fully automated,#"#%''% in a certain sense. Each such theorem is proved by a single tactic. Since Ltac is a Turing-complete programming language, it is not hard to squeeze arbitrary heuristics into single tactics, using operators like the semicolon to combine steps. In contrast, most Ltac proofs %``%#"#in the wild#"#%''% consist of many steps, performed by individual tactics followed by periods. Is it really worth drawing a distinction between proof steps terminated by semicolons and steps terminated by periods?
I argue that this is, in fact, a very important distinction, with serious consequences for a majority of important verification domains. The more uninteresting drudge work a proof domain involves, the more important it is to work to prove theorems with single tactics. From an automation standpoint, single-tactic proofs can be extremely effective, and automation becomes more and more critical as proofs are populated by more uninteresting detail. In this section, I will give some examples of the consequences of more common proof styles. I argue that this is, in fact, a very important distinction, with serious consequences for a majority of important verification domains. The more uninteresting drudge work a proof domain involves, the more important it is to work to prove theorems with single tactics. From an automation standpoint, single-tactic proofs can be extremely effective, and automation becomes more and more critical as proofs are populated by more uninteresting detail. In this section, I will give some examples of the consequences of more common proof styles.
...@@ -129,7 +129,7 @@ Abort. ...@@ -129,7 +129,7 @@ Abort.
(** Can you spot what went wrong, without stepping through the script step-by-step? The problem is that [trivial] never fails. Originally, [trivial] had been succeeding in proving an equality that follows by reflexivity. Our change to [times] leads to a case where that equality is no longer true. [trivial] happily leaves the false equality in place, and we continue on to the span of tactics intended for the second inductive case. Unfortunately, those tactics end up being applied to the %\textit{%#<i>#first#</i>#%}% case instead. (** Can you spot what went wrong, without stepping through the script step-by-step? The problem is that [trivial] never fails. Originally, [trivial] had been succeeding in proving an equality that follows by reflexivity. Our change to [times] leads to a case where that equality is no longer true. [trivial] happily leaves the false equality in place, and we continue on to the span of tactics intended for the second inductive case. Unfortunately, those tactics end up being applied to the %\textit{%#<i>#first#</i>#%}% case instead.
The problem with [trivial] could be "solved" by writing [solve [trivial]] instead, so that an error is signaled early on if something unexpected happens. However, the root problem is that the syntax of a tactic invocation does not imply how many subgoals it produces. Much more confusing instances of this problem are possible. For example, if a lemma [L] is modified to take an extra hypothesis, then uses of [apply L] will general more subgoals than before. Old unstructured proof scripts will become hopelessly jumbled, with tactics applied to inappropriate subgoals. Because of the lack of structure, there is usually relatively little to be gleaned from knowledge of the precise point in a proof script where an error is raised. *) The problem with [trivial] could be %``%#"#solved#"#%''% by writing [solve [trivial]] instead, so that an error is signaled early on if something unexpected happens. However, the root problem is that the syntax of a tactic invocation does not imply how many subgoals it produces. Much more confusing instances of this problem are possible. For example, if a lemma [L] is modified to take an extra hypothesis, then uses of [apply L] will general more subgoals than before. Old unstructured proof scripts will become hopelessly jumbled, with tactics applied to inappropriate subgoals. Because of the lack of structure, there is usually relatively little to be gleaned from knowledge of the precise point in a proof script where an error is raised. *)
Reset times. Reset times.
......
(* Copyright (c) 2008-2009, Adam Chlipala (* Copyright (c) 2008-2010, Adam Chlipala
* *
* This work is licensed under a * This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
...@@ -47,7 +47,7 @@ Even_SS ...@@ -47,7 +47,7 @@ Even_SS
]] ]]
%\noindent%...and so on. This procedure always works (at least on machines with infinite resources), but it has a serious drawback, which we see when we print the proof it generates that 256 is even. The final proof term has length linear in the input value. This seems like a shame, since we could write a trivial and trustworthy program to verify evenness of constants. The proof checker could simply call our program where needed. %\noindent%...and so on. This procedure always works (at least on machines with infinite resources), but it has a serious drawback, which we see when we print the proof it generates that 256 is even. The final proof term has length super-linear in the input value. (Coq's implicit arguments mechanism is hiding the values given for parameter [n] of [Even_SS], which is why the proof term only appears linear here.) This seems like a shame, since we could write a trivial and trustworthy program to verify evenness of constants. The proof checker could simply call our program where needed.
It is also unfortunate not to have static typing guarantees that our tactic always behaves appropriately. Other invocations of similar tactics might fail with dynamic type errors, and we would not know about the bugs behind these errors until we happened to attempt to prove complex enough goals. It is also unfortunate not to have static typing guarantees that our tactic always behaves appropriately. Other invocations of similar tactics might fail with dynamic type errors, and we would not know about the bugs behind these errors until we happened to attempt to prove complex enough goals.
...@@ -90,7 +90,7 @@ Definition partialOut (P : Prop) (x : [P]) := ...@@ -90,7 +90,7 @@ Definition partialOut (P : Prop) (x : [P]) :=
| Uncertain => I | Uncertain => I
end. end.
(** It may seem strange to define a function like this. However, it turns out to be very useful in writing a reflective verison of our earlier [prove_even] tactic: *) (** It may seem strange to define a function like this. However, it turns out to be very useful in writing a reflective version of our earlier [prove_even] tactic: *)
Ltac prove_even_reflective := Ltac prove_even_reflective :=
match goal with match goal with
...@@ -98,7 +98,7 @@ Ltac prove_even_reflective := ...@@ -98,7 +98,7 @@ Ltac prove_even_reflective :=
end. end.
(* end thide *) (* end thide *)
(** We identify which natural number we are considering, and we "prove" its evenness by pulling the proof out of the appropriate [check_even] call. *) (** We identify which natural number we are considering, and we %``%#"#prove#"#%''% its evenness by pulling the proof out of the appropriate [check_even] call. *)
Theorem even_256' : isEven 256. Theorem even_256' : isEven 256.
prove_even_reflective. prove_even_reflective.
...@@ -111,7 +111,9 @@ even_256' = partialOut (check_even 256) ...@@ -111,7 +111,9 @@ even_256' = partialOut (check_even 256)
]] ]]
We can see a constant wrapper around the object of the proof. For any even number, this form of proof will suffice. What happens if we try the tactic with an odd number? *) We can see a constant wrapper around the object of the proof. For any even number, this form of proof will suffice. The size of the proof term is now linear in the number being checked, containing two repetitions of the unary form of that number, one of which is hidden above within the implicit argument to [partialOut].
What happens if we try the tactic with an odd number? *)
Theorem even_255 : isEven 255. Theorem even_255 : isEven 255.
(** [[ (** [[
...@@ -158,7 +160,7 @@ and_ind (fun _ _ : True => or_introl (True /\ (True -> True)) I) H ...@@ -158,7 +160,7 @@ and_ind (fun _ _ : True => or_introl (True /\ (True -> True)) I) H
As we might expect, the proof that [tauto] builds contains explicit applications of natural deduction rules. For large formulas, this can add a linear amount of proof size overhead, beyond the size of the input. As we might expect, the proof that [tauto] builds contains explicit applications of natural deduction rules. For large formulas, this can add a linear amount of proof size overhead, beyond the size of the input.
To write a reflective procedure for this class of goals, we will need to get into the actual "reflection" part of "proof by reflection." It is impossible to case-analyze a [Prop] in any way in Gallina. We must %\textit{%#<i>#reflect#</i>#%}% [Prop] into some type that we %\textit{%#<i>#can#</i>#%}% analyze. This inductive type is a good candidate: *) To write a reflective procedure for this class of goals, we will need to get into the actual %``%#"#reflection#"#%''% part of %``%#"#proof by reflection.#"#%''% It is impossible to case-analyze a [Prop] in any way in Gallina. We must %\textit{%#<i>#reflect#</i>#%}% [Prop] into some type that we %\textit{%#<i>#can#</i>#%}% analyze. This inductive type is a good candidate: *)
(* begin thide *) (* begin thide *)
Inductive taut : Set := Inductive taut : Set :=
...@@ -167,7 +169,7 @@ Inductive taut : Set := ...@@ -167,7 +169,7 @@ Inductive taut : Set :=
| TautOr : taut -> taut -> taut | TautOr : taut -> taut -> taut
| TautImp : taut -> taut -> taut. | TautImp : taut -> taut -> taut.
(** We write a recursive function to "unreflect" this syntax back to [Prop]. *) (** We write a recursive function to %``%#"#unreflect#"#%''% this syntax back to [Prop]. *)
Fixpoint tautDenote (t : taut) : Prop := Fixpoint tautDenote (t : taut) : Prop :=
match t with match t with
...@@ -229,12 +231,12 @@ tautTrue ...@@ -229,12 +231,12 @@ tautTrue
]] ]]
It is worth considering how the reflective tactic improves on a pure-Ltac implementation. The formula reflection process is just as ad-hoc as before, so we gain little there. In general, proofs will be more complicated than formula translation, and the "generic proof rule" that we apply here %\textit{%#<i>#is#</i>#%}% on much better formal footing than a recursive Ltac function. The dependent type of the proof guarantees that it "works" on any input formula. This is all in addition to the proof-size improvement that we have already seen. *) It is worth considering how the reflective tactic improves on a pure-Ltac implementation. The formula reflection process is just as ad-hoc as before, so we gain little there. In general, proofs will be more complicated than formula translation, and the %``%#"#generic proof rule#"#%''% that we apply here %\textit{%#<i>#is#</i>#%}% on much better formal footing than a recursive Ltac function. The dependent type of the proof guarantees that it %``%#"#works#"#%''% on any input formula. This is all in addition to the proof-size improvement that we have already seen. *)
(** * A Monoid Expression Simplifier *) (** * A Monoid Expression Simplifier *)
(** Proof by reflection does not require encoding of all of the syntax in a goal. We can insert "variables" in our syntax types to allow injection of arbitrary pieces, even if we cannot apply specialized reasoning to them. In this section, we explore that possibility by writing a tactic for normalizing monoid equations. *) (** Proof by reflection does not require encoding of all of the syntax in a goal. We can insert %``%#"#variables#"#%''% in our syntax types to allow injection of arbitrary pieces, even if we cannot apply specialized reasoning to them. In this section, we explore that possibility by writing a tactic for normalizing monoid equations. *)
Section monoid. Section monoid.
Variable A : Set. Variable A : Set.
...@@ -249,7 +251,7 @@ Section monoid. ...@@ -249,7 +251,7 @@ Section monoid.
(** We add variables and hypotheses characterizing an arbitrary instance of the algebraic structure of monoids. We have an associative binary operator and an identity element for it. (** We add variables and hypotheses characterizing an arbitrary instance of the algebraic structure of monoids. We have an associative binary operator and an identity element for it.
It is easy to define an expression tree type for monoid expressions. A [Var] constructor is a "catch-all" case for subexpressions that we cannot model. These subexpressions could be actual Gallina variables, or they could just use functions that our tactic is unable to understand. *) It is easy to define an expression tree type for monoid expressions. A [Var] constructor is a %``%#"#catch-all#"#%''% case for subexpressions that we cannot model. These subexpressions could be actual Gallina variables, or they could just use functions that our tactic is unable to understand. *)
(* begin thide *) (* begin thide *)
Inductive mexp : Set := Inductive mexp : Set :=
...@@ -257,7 +259,7 @@ Section monoid. ...@@ -257,7 +259,7 @@ Section monoid.
| Var : A -> mexp | Var : A -> mexp
| Op : mexp -> mexp -> mexp. | Op : mexp -> mexp -> mexp.
(** Next, we write an "un-reflect" function. *) (** Next, we write an %``%#"#un-reflect#"#%''% function. *)
Fixpoint mdenote (me : mexp) : A := Fixpoint mdenote (me : mexp) : A :=
match me with match me with
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment