Commit c025fcf4 authored by Adam Chlipala's avatar Adam Chlipala

Prose for OpSem

parent 9f264135
...@@ -18,22 +18,75 @@ Set Implicit Arguments. ...@@ -18,22 +18,75 @@ Set Implicit Arguments.
(** %\chapter{Higher-Order Operational Semantics}% *) (** %\chapter{Higher-Order Operational Semantics}% *)
(** The last few chapters have shown how PHOAS can make it relatively painless to reason about program transformations. Each of our example languages so far has had a semantics that is easy to implement with an interpreter in Gallina. Since Gallina is designed to rule out non-termination, we cannot hope to give interpreter-based semantics to Turing-complete programming languages. Falling back on standard operational semantics leaves us with the old bureaucratic concerns about capture-avoiding substitution. Can we encode Turing-complete, higher-order languages in Coq without sacrificing the advantages of higher-order encoding?
Any approach that applies to basic untyped lambda calculus is likely to extend to most object languages of interest. We can attempt the "obvious" way of equipping a PHOAS definition for use in an operational semantics, without mentioning substitution explicitly. Specifically, we try to work with expressions with [var] instantiated with a type of values. *)
Section exp.
Variable var : Type.
Inductive exp : Type :=
| Var : var -> exp
| App : exp -> exp -> exp
| Abs : (var -> exp) -> exp.
End exp.
(** [[
Inductive val : Type :=
| VAbs : (val -> exp val) -> val.
Error: Non strictly positive occurrence of "val" in
"(val -> exp val) -> val".
]]
We would like to represent values (which are all function abstractions) as functions from variables to expressions, where we represent variables as the same value type that we are defining. That way, a value can be substituted in a function body simply by applying the body to the value. Unfortunately, the positivity restriction rejects this definition, for much the same reason that we could not use the classical HOAS encoding.
We can try an alternate approach based on defining [val] like a usual class of syntax. *)
Section val.
Variable var : Type.
Inductive val : Type :=
| VAbs : (var -> exp var) -> val.
End val.
(** Now the puzzle is how to write the type of an expression whose variables are represented as values. We would like to be able to write a recursive definition like this one:
[[
Fixpoint expV := exp (val expV).
]]
Of course, this kind of definition is not structurally recursive, so Coq will not allow it. Getting "substitution for free" seems to require some similar kind of self-reference.
In this chapter, we will consider an alternate take on the problem. We add a level of indirection, introducing more explicit syntax to break the cycle in type definitions. Specifically, we represent function values as numbers that index into a %\emph{%#<i>#closure heap#</i>#%}% that our operational semantics maintains alongside the expression being evaluated. *)
(** * Closure Heaps *) (** * Closure Heaps *)
(** The essence of the technique is to store function bodies in lists that are extended monotonically as function abstractions are evaluated. We can define a set of functions and theorems that implement the core functionality generically. *)
Section lookup. Section lookup.
Variable A : Type. Variable A : Type.
(** We start with a [lookup] function that generalizes last chapter's function of the same name. It selects the element at a particular position in a list, where we number the elements starting from the end of the list, so that prepending new elements does not change the indices of old elements. *)
Fixpoint lookup (ls : list A) (n : nat) : option A := Fixpoint lookup (ls : list A) (n : nat) : option A :=
match ls with match ls with
| nil => None | nil => None
| v :: ls' => if eq_nat_dec n (length ls') then Some v else lookup ls' n | v :: ls' => if eq_nat_dec n (length ls') then Some v else lookup ls' n
end. end.
Definition extends (ls1 ls2 : list A) := exists ls, ls2 = ls ++ ls1.
Infix "##" := lookup (left associativity, at level 1). Infix "##" := lookup (left associativity, at level 1).
(** The second of our two definitions expresses when one list extends another. We will write [ls1 ~> ls2] to indicate that [ls1] could evolve into [ls2]; that is, [ls1] is a suffix of [ls2]. *)
Definition extends (ls1 ls2 : list A) := exists ls, ls2 = ls ++ ls1.
Infix "~>" := extends (no associativity, at level 80). Infix "~>" := extends (no associativity, at level 80).
(** We prove and add as hints a few basic theorems about [lookup] and [extends]. *)
Theorem lookup1 : forall x ls, Theorem lookup1 : forall x ls,
(x :: ls) ## (length ls) = Some x. (x :: ls) ## (length ls) = Some x.
crush; match goal with crush; match goal with
...@@ -84,10 +137,16 @@ Infix "~>" := extends (no associativity, at level 80). ...@@ -84,10 +137,16 @@ Infix "~>" := extends (no associativity, at level 80).
Hint Resolve lookup1 extends_refl extends1 extends_trans extends_lookup. Hint Resolve lookup1 extends_refl extends1 extends_trans extends_lookup.
(** We are dealing explicitly with the nitty-gritty of closure heaps. Why is this better than dealing with the nitty-gritty of variables? The inconvenience of modeling lambda calculus-style binders comes from the presence of nested scopes. Program evaluation will only involve one %\emph{%#<i>#global#</i>#%}% closure heap. Also, the short development that we just finished can be reused for many different object languages. None of these definitions or theorems needs to be redone to handle specific object language features. By adding the theorems as hints, no per-object-language effort is required to apply the critical facts as needed. *)
(** * Languages and Translation *) (** * Languages and Translation *)
(** For the rest of this chapter, we will consider the example of CPS translation for untyped lambda calculus with boolean constants. It is convenient to include these constants, because their presence makes it easy to state a final translation correctness theorem. *)
Module Source. Module Source.
(** We define the syntax of source expressions in our usual way. *)
Section exp. Section exp.
Variable var : Type. Variable var : Type.
...@@ -102,13 +161,19 @@ Module Source. ...@@ -102,13 +161,19 @@ Module Source.
Definition Exp := forall var, exp var. Definition Exp := forall var, exp var.
(** We will implement a big-step operational semantics, where expressions are mapped to values. A value is either a function or a boolean. We represent a function as a number that will be interpreted as an index into the global closure heap. *)
Inductive val : Set := Inductive val : Set :=
| VFun : nat -> val | VFun : nat -> val
| VBool : bool -> val. | VBool : bool -> val.
(** A closure, then, follows the usual representation of function abstraction bodies, where we represent variables as values. *)
Definition closure := val -> exp val. Definition closure := val -> exp val.
Definition closures := list closure. Definition closures := list closure.
(** Our evaluation relation has four places. We map an initial closure heap and an expression into a final closure heap and a value. The interesting cases are for [Abs], where we push the body onto the closure heap; and for [App], where we perform a lookup in a closure heap, to find the proper function body to execute next. *)
Inductive eval : closures -> exp val -> closures -> val -> Prop := Inductive eval : closures -> exp val -> closures -> val -> Prop :=
| EvVar : forall cs v, | EvVar : forall cs v,
eval cs (Var v) cs v eval cs (Var v) cs v
...@@ -126,7 +191,12 @@ Module Source. ...@@ -126,7 +191,12 @@ Module Source.
| EvBool : forall cs b, | EvBool : forall cs b,
eval cs (Bool b) cs (VBool b). eval cs (Bool b) cs (VBool b).
Definition Eval (cs1 : closures) (E : Exp) (cs2 : closures) (v : val) := eval cs1 (E _) cs2 v. (** A simple wrapper produces an evaluation relation suitable for use on the main expression type [Exp]. *)
Definition Eval (cs1 : closures) (E : Exp) (cs2 : closures) (v : val) :=
eval cs1 (E _) cs2 v.
(** To prove our translation's correctness, we will need the usual notions of expression equivalence and well-formedness. *)
Section exp_equiv. Section exp_equiv.
Variables var1 var2 : Type. Variables var1 var2 : Type.
...@@ -151,6 +221,8 @@ Module Source. ...@@ -151,6 +221,8 @@ Module Source.
Definition Wf (E : Exp) := forall var1 var2, exp_equiv nil (E var1) (E var2). Definition Wf (E : Exp) := forall var1 var2, exp_equiv nil (E var1) (E var2).
End Source. End Source.
(** Our target language can be defined without introducing any additional tricks. *)
Module CPS. Module CPS.
Section exp. Section exp.
Variable var : Type. Variable var : Type.
...@@ -218,6 +290,8 @@ End CPS. ...@@ -218,6 +290,8 @@ End CPS.
Import Source CPS. Import Source CPS.
(** Finally, we define a CPS translation in the same way as in our previous example for simply-typed lambda calculus. *)
Reserved Notation "x <-- e1 ; e2" (right associativity, at level 76, e1 at next level). Reserved Notation "x <-- e1 ; e2" (right associativity, at level 76, e1 at next level).
Section cpsExp. Section cpsExp.
...@@ -259,6 +333,8 @@ Definition CpsExp (E : Exp) : Prog := fun _ => cpsExp (E _) (Halt (var := _)). ...@@ -259,6 +333,8 @@ Definition CpsExp (E : Exp) : Prog := fun _ => cpsExp (E _) (Halt (var := _)).
(** * Correctness Proof *) (** * Correctness Proof *)
(** Our proof for simply-typed lambda calculus relied on a logical relation to state the key induction hypothesis. Since logical relations proceed by recursion on type structure, we cannot apply them directly in an untyped setting. Instead, we will use an inductive judgment to relate source-level and CPS-level values. First, it is helpful to define an abbreviation for the compiled version of a function body. *)
Definition cpsFunc var (e' : var -> Source.exp var) := Definition cpsFunc var (e' : var -> Source.exp var) :=
fun p : var => fun p : var =>
x <- Fst p; x <- Fst p;
...@@ -266,34 +342,42 @@ Definition cpsFunc var (e' : var -> Source.exp var) := ...@@ -266,34 +342,42 @@ Definition cpsFunc var (e' : var -> Source.exp var) :=
r <-- e' x; r <-- e' x;
CPS.App kf r. CPS.App kf r.
(** Now we can define our correctness relation [cr], which is parameterized by source-level and CPS-level closure heaps. *)
Section cr. Section cr.
Variable s1 : Source.closures. Variable s1 : Source.closures.
Variable s2 : CPS.closures. Variable s2 : CPS.closures.
Import Source. Import Source.
(** Only equal booleans are related. For two function addresses [l1] and [l2] to be related, they must point to valid functions in their respective closure heaps. The address [l1] must point to a function [f1], and [l2] must point to the result of compiling function [f2]. Further, [f1] and [f2] must be equivalent syntactically in some variable environment [G], and every variable pair in [G] must itself belong to the relation we are defining. *)
Inductive cr : Source.val -> CPS.val -> Prop := Inductive cr : Source.val -> CPS.val -> Prop :=
| CrBool : forall b,
cr (Source.VBool b) (CPS.VBool b)
| CrFun : forall l1 l2 G f1 f2, | CrFun : forall l1 l2 G f1 f2,
(forall x1 x2, exp_equiv ((x1, x2) :: G) (f1 x1) (f2 x2)) (forall x1 x2, exp_equiv ((x1, x2) :: G) (f1 x1) (f2 x2))
-> (forall x1 x2, In (x1, x2) G -> cr x1 x2) -> (forall x1 x2, In (x1, x2) G -> cr x1 x2)
-> s1 ## l1 = Some f1 -> s1 ## l1 = Some f1
-> s2 ## l2 = Some (cpsFunc f2) -> s2 ## l2 = Some (cpsFunc f2)
-> cr (Source.VFun l1) (CPS.VFun l2) -> cr (Source.VFun l1) (CPS.VFun l2).
| CrBool : forall b,
cr (Source.VBool b) (CPS.VBool b).
End cr. End cr.
Notation "s1 & s2 |-- v1 ~~ v2" := (cr s1 s2 v1 v2) (no associativity, at level 70). Notation "s1 & s2 |-- v1 ~~ v2" := (cr s1 s2 v1 v2) (no associativity, at level 70).
Hint Constructors cr. Hint Constructors cr.
(** To prove our main lemma, it will be useful to know that source-level evaluation never removes old closures from a closure heap. *)
Lemma eval_monotone : forall cs1 e cs2 v, Lemma eval_monotone : forall cs1 e cs2 v,
Source.eval cs1 e cs2 v Source.eval cs1 e cs2 v
-> cs1 ~> cs2. -> cs1 ~> cs2.
induction 1; crush; eauto. induction 1; crush; eauto.
Qed. Qed.
(** Further, [cr] continues to hold when its closure heap arguments are evolved in legal ways. *)
Lemma cr_monotone : forall cs1 cs2 cs1' cs2', Lemma cr_monotone : forall cs1 cs2 cs1' cs2',
cs1 ~> cs1' cs1 ~> cs1'
-> cs2 ~> cs2' -> cs2 ~> cs2'
...@@ -304,6 +388,8 @@ Qed. ...@@ -304,6 +388,8 @@ Qed.
Hint Resolve eval_monotone cr_monotone. Hint Resolve eval_monotone cr_monotone.
(** We state a trivial fact about the validity of variable environments, so that we may add this fact as a hint that [eauto] will apply. *)
Lemma push : forall G s1 s2 v1' v2', Lemma push : forall G s1 s2 v1' v2',
(forall v1 v2, In (v1, v2) G -> s1 & s2 |-- v1 ~~ v2) (forall v1 v2, In (v1, v2) G -> s1 & s2 |-- v1 ~~ v2)
-> s1 & s2 |-- v1' ~~ v2' -> s1 & s2 |-- v1' ~~ v2'
...@@ -313,6 +399,8 @@ Qed. ...@@ -313,6 +399,8 @@ Qed.
Hint Resolve push. Hint Resolve push.
(** Our final preparation for the main lemma involves adding effective hints about the CPS language's operational semantics. The following tactic performs one step of evaluation. It uses the Ltac code [eval hnf in e] to compute the %\emph{%#<i>#head normal form#</i>#%}% of [e], where the head normal form of an expression in an inductive type is an application of one of that inductive type's constructors. The final line below uses [solve] to ensure that we only take a [Bind] step if a full evaluation derivation for the associated primop may be found before proceeding. *)
Ltac evalOne := Ltac evalOne :=
match goal with match goal with
| [ |- CPS.eval ?cs ?e ?v ] => | [ |- CPS.eval ?cs ?e ?v ] =>
...@@ -321,9 +409,15 @@ Ltac evalOne := ...@@ -321,9 +409,15 @@ Ltac evalOne :=
econstructor; [ solve [ eauto ] | ] econstructor; [ solve [ eauto ] | ]
end. end.
(** For primops, we rely on [eauto]'s usual approach. For goals that evaluate programs, we instead ask to treat one or more applications of [evalOne] as a single step, which helps us avoid passing [eauto] an excessively large bound on proof tree depth. *)
Hint Constructors evalP. Hint Constructors evalP.
Hint Extern 1 (CPS.eval _ _ _) => evalOne; repeat evalOne. Hint Extern 1 (CPS.eval _ _ _) => evalOne; repeat evalOne.
(** The final lemma proceeds by induction on an evaluation derivation for an expression [e1] that is equivalent to some [e2] in some environment [G]. An initial closure heap for each language is quantified over, such that all variable pairs in [G] are compatible. The lemma's conclusion applies to an arbitrary continuation [k], asserting that a final CPS-level closure heap [s2] and a CPS-level program result value [r2] exist.
Three conditions establish that [s2] and [r2] are chosen properly: Evaluation of [e2]'s compilation with continuation [k] must be equivalent to evaluation of [k r2]. The original program result [r1] must be compatible with [r2] in the final closure heaps. Finally, [s2'] must be a proper evolution of the original CPS-level heap [s2]. *)
Lemma cpsExp_correct : forall s1 e1 s1' r1, Lemma cpsExp_correct : forall s1 e1 s1' r1,
Source.eval s1 e1 s1' r1 Source.eval s1 e1 s1' r1
-> forall G (e2 : exp CPS.val), -> forall G (e2 : exp CPS.val),
...@@ -335,6 +429,13 @@ Lemma cpsExp_correct : forall s1 e1 s1' r1, ...@@ -335,6 +429,13 @@ Lemma cpsExp_correct : forall s1 e1 s1' r1,
-> CPS.eval s2 (cpsExp e2 k) r) -> CPS.eval s2 (cpsExp e2 k) r)
/\ s1' & s2' |-- r1 ~~ r2 /\ s1' & s2' |-- r1 ~~ r2
/\ s2 ~> s2'. /\ s2 ~> s2'.
(** The proof script follows our standard approach. Its main loop applies three hints. First, we perform inversion on any derivation of equivalence between a source-level function value and some other value. Second, we eliminate redundant equality hypotheses. Finally, we look for opportunities to instantiate inductive hypotheses.
We identify an IH by its syntactic form, noting the expression [E] that it applies to. It is important to instantiate IHes in the right order, since existentially-quantified variables in the conclusion of one IH may need to be used in instantiating the universal quantifiers of a different IH. Thus, we perform a quick check to [fail 1] if the IH we found applies to an expression that was evaluated after another expression [E'] whose IH we did not yet instantiate. The flow of closure heaps through source-level evaluation is used to implement the check.
If the hypothesis [H] is indeed the right IH to handle next, we use the [guess] tactic to guess values for its universal quantifiers and prove its hypotheses with [eauto]. This tactic is very similar to [inster] from Chapter 12. It takes two arguments: the first is a value to use for any properly-typed universal quantifier, and the second is the hypothesis to instantiate. The final inner [match] deduces if we are at the point of executing the body of a called function. If so, we help [guess] by saying that the initial closure heap will be the current closure heap [cs] extended with the current continuation [k]. In all other cases, [guess] is smart enough to operate alone. *)
induction 1; inversion 1; crush; induction 1; inversion 1; crush;
repeat (match goal with repeat (match goal with
| [ H : _ & _ |-- Source.VFun _ ~~ _ |- _ ] => inversion H; clear H | [ H : _ & _ |-- Source.VFun _ ~~ _ |- _ ] => inversion H; clear H
...@@ -344,7 +445,8 @@ Lemma cpsExp_correct : forall s1 e1 s1' r1, ...@@ -344,7 +445,8 @@ Lemma cpsExp_correct : forall s1 e1 s1' r1,
| [ _ : Source.eval ?CS E _ _, _ : Source.eval _ ?E' ?CS _, | [ _ : Source.eval ?CS E _ _, _ : Source.eval _ ?E' ?CS _,
_ : forall G e2, exp_equiv G ?E' e2 -> _ |- _ ] => fail 1 _ : forall G e2, exp_equiv G ?E' e2 -> _ |- _ ] => fail 1
| _ => match goal with | _ => match goal with
| [ k : val -> prog val, _ : _ & ?cs |-- _ ~~ _, _ : context[VFun] |- _ ] => | [ k : val -> prog val, _ : _ & ?cs |-- _ ~~ _,
_ : context[VFun] |- _ ] =>
guess (k :: cs) H guess (k :: cs) H
| _ => guess tt H | _ => guess tt H
end end
...@@ -352,7 +454,9 @@ Lemma cpsExp_correct : forall s1 e1 s1' r1, ...@@ -352,7 +454,9 @@ Lemma cpsExp_correct : forall s1 e1 s1' r1,
end; crush); eauto 13. end; crush); eauto 13.
Qed. Qed.
Lemma CpsExp_correct : forall E cs b, (** The final theorem follows easily from this lemma. *)
Theorem CpsExp_correct : forall E cs b,
Source.Eval nil E cs (Source.VBool b) Source.Eval nil E cs (Source.VBool b)
-> Wf E -> Wf E
-> CPS.Eval nil (CpsExp E) (CPS.VBool b). -> CPS.Eval nil (CpsExp E) (CPS.VBool b).
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment