Commit 69d15854 authored by Adam Chlipala

Remove Part IV

parent 6d5b22fd
@@ -2,7 +2,7 @@ MODULES_NODOC := CpdtTactics MoreSpecif DepList
MODULES_PROSE := Intro
MODULES_CODE := StackMachine InductiveTypes Predicates Coinductive Subset \
MoreDep DataStruct Equality Generic Universes LogicProg Match Reflection \
-Large Firstorder DeBruijn Hoas Interps Extensional Intensional OpSem
+Large
MODULES_DOC := $(MODULES_PROSE) $(MODULES_CODE)
MODULES := $(MODULES_NODOC) $(MODULES_DOC)
VS := $(MODULES:%=src/%.v)
@@ -50,13 +50,6 @@ The license text is available at:
\include{Match.v}
\include{Reflection.v}
\include{Large.v}
-\include{Firstorder.v}
-\include{DeBruijn.v}
-\include{Hoas.v}
-\include{Interps.v}
-\include{Extensional.v}
-\include{Intensional.v}
-\include{OpSem.v}
\clearpage
\addcontentsline{toc}{chapter}{Bibliography}
(* Copyright (c) 2009-2010, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Set Implicit Arguments.
Require Import Eqdep.
Require Import DepList CpdtTactics.
(* end hide *)
(** remove printing * *)
(** printing || $\mid\mid$ *)
(** %\chapter{Dependent De Bruijn Indices}% *)
(** The previous chapter introduced the most common form of de Bruijn indices, without essential use of dependent types. In earlier chapters, we used dependent de Bruijn indices to illustrate tricks for working with dependent types. This chapter presents one complete case study with dependent de Bruijn indices, focusing on producing the most maintainable proof possible of a classic theorem about lambda calculus.
The proof that follows does not provide a complete guide to all kinds of formalization with de Bruijn indices. Rather, it is intended as an example of some simple design patterns that can make proving standard theorems much easier.
We will prove commutativity of capture-avoiding substitution for basic untyped lambda calculus:
%$$x_1 \neq x_2 \Rightarrow [e_1/x_1][e_2/x_2]e = [e_2/x_2][[e_2/x_2]e_1/x_1]e$$%
#<tt>x1 <> x2 => [e1/x1][e2/x2]e = [e2/x2][[e2/x2]e1/x1]e</tt>#
*)
(** * Defining Syntax and Its Associated Operations *)
(** Our definition of expression syntax should be unsurprising. An expression of type [exp n] may refer to [n] different free variables. *)
Inductive exp : nat -> Type :=
| Var : forall n, fin n -> exp n
| App : forall n, exp n -> exp n -> exp n
| Abs : forall n, exp (S n) -> exp n.
(** The classic implementation of substitution in de Bruijn terms requires an auxiliary operation, %\emph{%lifting%}%, which increments the indices of all free variables in an expression. We need to lift whenever we %``%#"#go under a binder.#"#%''% It is useful to write an auxiliary function [liftVar] that lifts a variable; that is, [liftVar x y] will return [y + 1] if [y >= x], and it will return [y] otherwise. This simple description uses numbers rather than our dependent [fin] family, so the actual specification is more involved.
Combining a number of dependent types tricks, we wind up with this concrete realization. *)
Fixpoint liftVar n (x : fin n) : fin (pred n) -> fin n :=
match x with
| First _ => fun y => Next y
| Next _ x' => fun y =>
match y in fin n' return fin n' -> (fin (pred n') -> fin n')
-> fin (S n') with
| First _ => fun x' _ => First
| Next _ y' => fun _ fx' => Next (fx' y')
end x' (liftVar x')
end.
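(** As a quick sanity check on this specification, we can compute a few small instances: with three variables in scope, lifting over variable one leaves variable zero alone and bumps variable one up to variable two. *)
Example liftVar_keep : liftVar (Next (First : fin 2)) (First : fin 2) = First.
  reflexivity.
Qed.
Example liftVar_bump : liftVar (Next (First : fin 2)) (Next (First : fin 1)) = Next (Next First).
  reflexivity.
Qed.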
(** Now it is easy to implement the main lifting operation. *)
Fixpoint lift n (e : exp n) : fin (S n) -> exp (S n) :=
match e with
| Var _ f' => fun f => Var (liftVar f f')
| App _ e1 e2 => fun f => App (lift e1 f) (lift e2 f)
| Abs _ e1 => fun f => Abs (lift e1 (Next f))
end.
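(** For example, lifting at the lowest position increments the one variable in scope, while lifting at a higher position leaves it untouched. *)
Example lift_bump : lift (Var (First : fin 1)) First = Var (Next First).
  reflexivity.
Qed.
Example lift_keep : lift (Var (First : fin 1)) (Next First) = Var First.
  reflexivity.
Qed.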
(** To define substitution itself, we will need to apply some explicit type casts, based on equalities between types. A single equality will suffice for all of our casts. Its statement is somewhat strange: it quantifies over a variable [f] of type [fin n], but then never mentions [f]. Rather, quantifying over [f] is useful because [fin] is a dependent type that is inhabited or not depending on its index. The body of the theorem, [S (pred n) = n], is true only for [n > 0], but we can prove it by contradiction when [n = 0], because we then have a value [f] of the uninhabited type [fin 0] available. *)
Theorem nzf : forall n (f : fin n), S (pred n) = n.
destruct 1; trivial.
Qed.
(** Now we define a notation to streamline our cast expressions. The code [[f return n, r for e]] denotes a cast of expression [e] whose type can be obtained by substituting some number [n1] for [n] in [r]. [f] should be a proof that [n1 = n2], for any [n2]. In that case, the type of the cast expression is [r] with [n2] substituted for [n]. *)
Notation "[ f 'return' n , r 'for' e ]" :=
match f in _ = n return r with
| refl_equal => e
end.
(** This notation is useful in defining a variable substitution operation. The idea is that [substVar x y] returns [None] if [x = y]; otherwise, it returns a %``%#"#squished#"#%''% version of [y] with a smaller [fin] index, reflecting that variable [x] has been substituted away. Without dependent types, this would be a simple definition. With dependency, it is reasonably intricate, and our main task in automating proofs about it will be hiding that intricacy. *)
Fixpoint substVar n (x : fin n) : fin n -> option (fin (pred n)) :=
match x with
| First _ => fun y =>
match y in fin n' return option (fin (pred n')) with
| First _ => None
| Next _ f' => Some f'
end
| Next _ x' => fun y =>
match y in fin n'
return fin (pred n') -> (fin (pred n') -> option (fin (pred (pred n'))))
-> option (fin (pred n')) with
| First _ => fun x' _ => Some [nzf x' return n, fin n for First]
| Next _ y' => fun _ fx' =>
match fx' y' with
| None => None
| Some f => Some [nzf y' return n, fin n for Next f]
end
end x' (substVar x')
end.
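(** Before moving on, we can check the intended behavior on small inputs: looking up the substituted variable itself yields [None], while a larger variable is shifted down by one. *)
Example substVar_hit : substVar (First : fin 2) (First : fin 2) = None.
  reflexivity.
Qed.
Example substVar_squish : substVar (First : fin 2) (Next (First : fin 1)) = Some First.
  reflexivity.
Qed.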
(** It is now easy to define our final substitution function. The abstraction case involves two casts, where one uses the [sym_eq] function to convert a proof of [n1 = n2] into a proof of [n2 = n1]. *)
Fixpoint subst n (e : exp n) : fin n -> exp (pred n) -> exp (pred n) :=
match e with
| Var _ f' => fun f v => match substVar f f' with
| None => v
| Some f'' => Var f''
end
| App _ e1 e2 => fun f v => App (subst e1 f v) (subst e2 f v)
| Abs _ e1 => fun f v => Abs [sym_eq (nzf f) return n, exp n for
subst e1 (Next f) [nzf f return n, exp n for lift v First]]
end.
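(** A quick computation confirms the expected behavior on a term with one free variable: substituting a closed term for that variable replaces both of its occurrences. *)
Example subst_dup : subst (App (Var (First : fin 1)) (Var First)) First (Abs (Var First))
  = App (Abs (Var First)) (Abs (Var First)).
  reflexivity.
Qed.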
(** Our final commutativity theorem is about [subst], but our proofs will rely on a few more auxiliary definitions. First, we will want an operation [more] that increments the index of a [fin] while preserving its interpretation as a number. *)
Fixpoint more n (f : fin n) : fin (S n) :=
match f with
| First _ => First
| Next _ f' => Next (more f')
end.
(** Second, we will want a kind of inverse to [liftVar]. *)
Fixpoint unliftVar n (f : fin n) : fin (pred n) -> fin (pred n) :=
match f with
| First _ => fun g => [nzf g return n, fin n for First]
| Next _ f' => fun g =>
match g in fin n'
return fin n' -> (fin (pred n') -> fin (pred n')) -> fin n' with
| First _ => fun f' _ => f'
| Next _ g' => fun _ unlift => Next (unlift g')
end f' (unliftVar f')
end.
(** * Custom Tactics *)
(* begin hide *)
Ltac simp := repeat progress (crush; try discriminate;
repeat match goal with
| [ |- context[match ?pf with refl_equal => _ end] ] => rewrite (UIP_refl _ _ pf)
| [ _ : context[match ?pf with refl_equal => _ end] |- _ ] => rewrite (UIP_refl _ _ pf) in *
| [ H : Next _ = Next _ |- _ ] => injection H; clear H
| [ H : ?E = _, H' : ?E = _ |- _ ] => rewrite H in H'
| [ H : ?P |- _ ] => rewrite H in *; [match P with
| forall x, _ => clear H
| _ => idtac
end]
end).
(* end hide *)
(** Less than a page of tactic code will be sufficient to automate our proof of commutativity. We start by defining a workhorse simplification tactic [simp], which extends [crush] in a few ways.
[[
Ltac simp := repeat progress (crush; try discriminate;
]]
We enter an inner loop of applying hints specific to our domain.
[[
repeat match goal with
]]
Our first two hints find places where equality proofs are pattern-matched on. The first hint matches pattern-matches in the conclusion, while the second hint matches pattern-matches in hypotheses. In each case, we apply the library theorem [UIP_refl], which says that any proof of a fact like [e = e] is itself equal to [refl_equal]. Rewriting with this fact enables reduction of the pattern-match that we found.
[[
| [ |- context[match ?pf with refl_equal => _ end] ] =>
rewrite (UIP_refl _ _ pf)
| [ _ : context[match ?pf with refl_equal => _ end] |- _ ] =>
rewrite (UIP_refl _ _ pf) in *
]]
The next hint finds an opportunity to invert a [fin] equality hypothesis.
[[
| [ H : Next _ = Next _ |- _ ] => injection H; clear H
]]
If we have two equality hypotheses that share a lefthand side, we can use one to rewrite the other, bringing the hypotheses' righthand sides together in a single equation.
[[
| [ H : ?E = _, H' : ?E = _ |- _ ] => rewrite H in H'
]]
Finally, we would like automatic use of quantified equality hypotheses to perform rewriting. We pattern-match a hypothesis [H] asserting proposition [P]. We try to use [H] to perform rewriting everywhere in our goal. The rewrite succeeds if it generates no additional hypotheses, and, to prevent infinite loops in proof search, we clear [H] if it begins with universal quantification.
[[
| [ H : ?P |- _ ] => rewrite H in *; [match P with
| forall x, _ => clear H
| _ => idtac
end]
end).
]]
In implementing another level of automation, it will be useful to mark which free variables we generated with tactics, as opposed to which were present in the original theorem statement. We use a dummy marker predicate [Generated] to record that information. A tactic [not_generated] fails if and only if its argument is a generated variable, and a tactic [generate] records that its argument is generated. *)
Definition Generated n (_ : fin n) := True.
Ltac not_generated x :=
match goal with
| [ _ : Generated x |- _ ] => fail 1
| _ => idtac
end.
Ltac generate x := assert (Generated x); [ constructor | ].
(** A tactic [destructG] performs case analysis on [fin] values. The built-in case analysis tactics are not smart enough to handle all situations, and we also want to mark new variables as generated, to avoid infinite loops of case analysis. Our [destructG] tactic will only proceed if its argument is not generated. *)
Theorem fin_inv : forall n (f : fin (S n)), f = First \/ exists f', f = Next f'.
intros; dep_destruct f; eauto.
Qed.
Ltac destructG E :=
not_generated E; let x := fresh "x" in
(destruct (fin_inv E) as [ | [x]] || destruct E as [ | ? x]);
[ | generate x ].
(* begin hide *)
Ltac dester := simp;
repeat (match goal with
| [ x : fin _, H : _ = _, IH : forall f : fin _, _ |- _ ] =>
generalize (IH x); clear IH; intro IH; rewrite H in IH
| [ x : fin _, y : fin _, H : _ = _, IH : forall (f : fin _) (g : fin _), _ |- _ ] =>
generalize (IH x y); clear IH; intro IH; rewrite H in IH
| [ |- context[match ?E with First _ => _ | Next _ _ => _ end] ] => destructG E
| [ _ : context[match ?E with First _ => _ | Next _ _ => _ end] |- _ ] => destructG E
| [ |- context[more ?E] ] => destructG E
| [ x : fin ?n |- _ ] =>
match goal with
| [ |- context[nzf x] ] =>
destruct n; [ inversion x | ]
end
| [ x : fin (pred ?n), y : fin ?n |- _ ] =>
match goal with
| [ |- context[nzf x] ] =>
destruct n; [ inversion y | ]
end
| [ |- context[match ?E with None => _ | Some _ => _ end] ] =>
match E with
| match _ with None => _ | Some _ => _ end => fail 1
| _ => case_eq E; firstorder
end
end; simp); eauto.
(* end hide *)
(** Our most powerful workhorse tactic will be [dester], which incorporates all of [simp]'s simplifications and adds heuristics for automatic case analysis and automatic quantifier instantiation.
[[
Ltac dester := simp;
repeat (match goal with
]]
The first hint expresses our main insight into quantifier instantiation. We identify a hypothesis [IH] that begins with quantification over [fin] values. We also identify a free [fin] variable [x] and an arbitrary equality hypothesis [H]. Given these, we try instantiating [IH] with [x]. We know we chose correctly if the instantiated proposition includes an opportunity to rewrite using [H].
[[
| [ x : fin _, H : _ = _, IH : forall f : fin _, _ |- _ ] =>
generalize (IH x); clear IH; intro IH; rewrite H in IH
]]
This basic idea suffices for all of our explicit quantifier instantiation. We add one more variant that handles cases where an opportunity for rewriting is only exposed if two different quantifiers are instantiated at once.
[[
| [ x : fin _, y : fin _, H : _ = _,
IH : forall (f : fin _) (g : fin _), _ |- _ ] =>
generalize (IH x y); clear IH; intro IH; rewrite H in IH
]]
We want to case-analyze on any [fin] expression that is the discriminee of a [match] expression or an argument to [more].
[[
| [ |- context[match ?E with First _ => _ | Next _ _ => _ end] ] =>
destructG E
| [ _ : context[match ?E with First _ => _ | Next _ _ => _ end] |- _ ] =>
destructG E
| [ |- context[more ?E] ] => destructG E
]]
Recall that [simp] will simplify equality proof terms of facts like [e = e]. The proofs in question will either be of [n = S (pred n)] or [S (pred n) = n], for some [n]. These equations do not have syntactically equal sides. We can get to the point where they %\emph{%#<i>#do#</i>#%}% have equal sides by performing case analysis on [n]. Whenever we do so, the [n = 0] case will be contradictory, allowing us to discharge it by finding a free variable of type [fin 0] and performing inversion on it. In the [n = S n'] case, the sides of these equalities will simplify to equal values, as needed. The next two hints identify [n] values that are good candidates for such case analysis.
[[
| [ x : fin ?n |- _ ] =>
match goal with
| [ |- context[nzf x] ] =>
destruct n; [ inversion x | ]
end
| [ x : fin (pred ?n), y : fin ?n |- _ ] =>
match goal with
| [ |- context[nzf x] ] =>
destruct n; [ inversion y | ]
end
]]
Finally, we find [match] discriminees of [option] type, enforcing that we do not destruct any discriminees that are themselves [match] expressions. Crucially, we do these case analyses with [case_eq] instead of [destruct]. The former adds equality hypotheses to record the relationships between old variables and their new deduced forms. These equalities will be used by our quantifier instantiation heuristic.
[[
| [ |- context[match ?E with None => _ | Some _ => _ end] ] =>
match E with
| match _ with None => _ | Some _ => _ end => fail 1
| _ => case_eq E; firstorder
end
]]
Each iteration of the loop ends by calling [simp] again, and, after no more progress can be made, we finish by calling [eauto].
[[
end; simp); eauto.
]]
*)
(** * Theorems *)
(** We are now ready to prove our main theorem, by way of a progression of lemmas.
The first pair of lemmas characterizes the interaction of substitution and lifting at the variable level. *)
Lemma substVar_unliftVar : forall n (f0 : fin n) f g,
match substVar f0 f, substVar (liftVar f0 g) f with
| Some f1, Some f2 => exists f', substVar g f1 = Some f'
/\ substVar (unliftVar f0 g) f2 = Some f'
| Some f1, None => substVar g f1 = None
| None, Some f2 => substVar (unliftVar f0 g) f2 = None
| None, None => False
end.
induction f0; dester.
Qed.
Lemma substVar_liftVar : forall n (f0 : fin n) f,
substVar f0 (liftVar f0 f) = Some f.
induction f0; dester.
Qed.
(** Next, we define a notion of %``%#"#greater-than-or-equal#"#%''% for [fin] values, prove an inversion theorem for it, and add that theorem as a hint. *)
Inductive fin_ge : forall n1, fin n1 -> forall n2, fin n2 -> Prop :=
| GeO : forall n1 (f1 : fin n1) n2,
fin_ge f1 (First : fin (S n2))
| GeS : forall n1 (f1 : fin n1) n2 (f2 : fin n2),
fin_ge f1 f2
-> fin_ge (Next f1) (Next f2).
Hint Constructors fin_ge.
Lemma fin_ge_inv' : forall n1 n2 (f1 : fin n1) (f2 : fin n2),
fin_ge f1 f2
-> match f1, f2 with
| Next _ f1', Next _ f2' => fin_ge f1' f2'
| _, _ => True
end.
destruct 1; dester.
Qed.
Lemma fin_ge_inv : forall n1 n2 (f1 : fin n1) (f2 : fin n2),
fin_ge (Next f1) (Next f2)
-> fin_ge f1 f2.
intros; generalize (fin_ge_inv' (f1 := Next f1) (f2 := Next f2)); dester.
Qed.
Hint Resolve fin_ge_inv.
(** A congruence lemma for the [fin] constructor [Next] is similarly useful. *)
Lemma Next_cong : forall n (f1 f2 : fin n),
f1 = f2
-> Next f1 = Next f2.
dester.
Qed.
Hint Resolve Next_cong.
(** We prove a crucial lemma about [liftVar] in terms of [fin_ge]. *)
Lemma liftVar_more : forall n (f : fin n) (f0 : fin (S n)) g,
fin_ge g f0
-> match liftVar f0 f in fin n'
return fin n' -> (fin (pred n') -> fin n') -> fin (S n') with
| First n0 => fun _ _ => First
| Next n0 y' => fun _ fx' => Next (fx' y')
end g (liftVar g) = liftVar (more f0) (liftVar g f).
induction f; inversion 1; dester.
Qed.
Hint Resolve liftVar_more.
(** We suggest a particular way of changing the form of a goal, so that other hints are able to match. *)
Hint Extern 1 (_ = lift _ (Next (more ?f))) =>
change (Next (more f)) with (more (Next f)).
(** We suggest applying the [f_equal] tactic to simplify equalities over expressions. For instance, this would reduce a goal [App f1 x1 = App f2 x2] to two goals [f1 = f2] and [x1 = x2]. *)
Hint Extern 1 (eq (A := exp _) _ _) => f_equal.
(** Our consideration of lifting in isolation finishes with another hint lemma. The auxiliary lemma with a strengthened induction hypothesis is where we put [fin_ge] to use, and we do not need to mention that predicate again afterward. *)
Lemma double_lift' : forall n (e : exp n) f g,
fin_ge g f
-> lift (lift e f) (Next g) = lift (lift e g) (more f).
induction e; dester.
Qed.
Lemma double_lift : forall n (e : exp n) g,
lift (lift e First) (Next g) = lift (lift e g) First.
intros; apply double_lift'; dester.
Qed.
Hint Resolve double_lift.
(** Now we characterize the interaction of substitution and lifting on variables. We start with a more general form [substVar_lift'] of the final lemma [substVar_lift], with the latter proved as a direct corollary of the former. *)
Lemma substVar_lift' : forall n (f0 : fin n) f g,
substVar [nzf f0 return n, fin (S n) for
liftVar (more g) [sym_eq (nzf f0) return n, fin n for f0]]
(liftVar (liftVar (Next f0) [nzf f0 return n, fin n for g]) f)
= match substVar f0 f with
| Some f'' => Some [nzf f0 return n, fin n for liftVar g f'']
| None => None
end.
induction f0; dester.
Qed.
Lemma substVar_lift : forall n (f0 f g : fin (S n)),
substVar (liftVar (more g) f0) (liftVar (liftVar (Next f0) g) f)
= match substVar f0 f with
| Some f'' => Some (liftVar g f'')
| None => None
end.
intros; generalize (substVar_lift' f0 f g); dester.
Qed.
(** We follow a similar decomposition for the expression-level theorem about substitution and lifting. *)
Lemma lift_subst' : forall n (e1 : exp n) f g e2,
lift (subst e1 f e2) g
= [sym_eq (nzf f) return n, exp n for
subst
(lift e1 (liftVar (Next f) [nzf f return n, fin n for g]))
[nzf f return n, fin (S n) for
liftVar (more g) [sym_eq (nzf f) return n, fin n for f]]
[nzf f return n, exp n for lift e2 g]].
induction e1; generalize substVar_lift; dester.
Qed.
Lemma lift_subst : forall n g (e2 : exp (S n)) e3,
subst (lift e2 First) (Next g) (lift e3 First) = lift (n := n) (subst e2 g e3) First.
intros; generalize (lift_subst' e2 g First e3); dester.
Qed.
Hint Resolve lift_subst.
(** Our last auxiliary lemma characterizes a situation where substitution can undo the effects of lifting. *)
Lemma undo_lift' : forall n (e1 : exp n) e2 f,
subst (lift e1 f) f e2 = e1.
induction e1; generalize substVar_liftVar; dester.
Qed.
Lemma undo_lift : forall n e2 e3 (f0 : fin (S (S n))) g,
e3 = subst (lift e3 (unliftVar f0 g)) (unliftVar f0 g)
(subst (n := S n) e2 g e3).
generalize undo_lift'; dester.
Qed.
Hint Resolve undo_lift.
(** Finally, we arrive at the substitution commutativity theorem. *)
Lemma subst_comm' : forall n (e1 : exp n) f g e2 e3,
subst (subst e1 f e2) g e3
= subst
(subst e1 (liftVar f g) [nzf g return n, exp n for
lift e3 [sym_eq (nzf g) return n, fin n for unliftVar f g]])
(unliftVar f g)
(subst e2 g e3).
induction e1; generalize (substVar_unliftVar (n := n)); dester.
Qed.
Theorem subst_comm : forall (e1 : exp 2) e2 e3,
subst (subst e1 First e2) First e3
= subst (subst e1 (Next First) (lift e3 First)) First (subst e2 First e3).
intros; generalize (subst_comm' e1 First First e2 e3); dester.
Qed.
(** The final theorem is specialized to the case of substituting in an expression with exactly two free variables, which yields a statement that is readable enough, as statements about de Bruijn indices go.
This proof script is resilient to specification changes. It is easy to add new constructors to the language being treated. The proofs adapt automatically to the addition of any constructor whose subterms each involve zero or one new bound variables. That is, to add such a constructor, we only need to add it to the definition of [exp] and add (quite obvious) cases for it in the definitions of [lift] and [subst]. *)
(* Copyright (c) 2008-2011, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Require Import String List.
Require Import CpdtTactics DepList.
Set Implicit Arguments.
(* end hide *)
(** %\chapter{Extensional Transformations}% *)
(** Last chapter's constant folding example was particularly easy to verify, because that transformation used the same source and target language. In this chapter, we verify a different translation, illustrating the added complexities in translating between languages.
Program transformations can be classified as %\textit{%#<i>#intensional#</i>#%}%, when they require some notion of inequality between variables; or %\textit{%#<i>#extensional#</i>#%}%, otherwise. This chapter's example is extensional, and the next chapter deals with the trickier intensional case. *)
(** * CPS Conversion for Simply-Typed Lambda Calculus *)
(** A convenient method for compiling functional programs begins with conversion to %\textit{%#<i>#continuation-passing style#</i>#%}%, or CPS. In this restricted form, function calls never return; instead, we pass explicit return pointers, much as in assembly language. Additionally, we make order of evaluation explicit, breaking complex expressions into sequences of primitive operations.
Our translation will operate over the same source language that we used in the first part of last chapter, so we omit most of the language definition. However, we do make one significant change: since we will be working with multiple languages that involve similar constructs, we use Coq's %\textit{%#<i>#notation scope#</i>#%}% mechanism to disambiguate. For instance, the span of code dealing with type notations looks like this: *)
(* begin hide *)
Module Source.
Inductive type : Type :=
| TNat : type
| Arrow : type -> type -> type.
(* end hide *)
Notation "'Nat'" := TNat : source_scope.
(** printing --> $\longrightarrow$ *)
Infix "-->" := Arrow (right associativity, at level 60) : source_scope.
Open Scope source_scope.
Bind Scope source_scope with type.
Delimit Scope source_scope with source.
(** We explicitly place our notations inside a scope named [source_scope], and we associate a delimiting key [source] with [source_scope]. Without further commands, our notations would only be used in expressions like [(...)%source]. We also open our scope locally within this module, so that we avoid repeating [%source] in many places. Further, we %\textit{%#<i>#bind#</i>#%}% our scope to [type]. In some circumstances where Coq is able to infer that some subexpression has type [type], that subexpression will automatically be parsed in [source_scope]. *)
(* begin hide *)
Section vars.
Variable var : type -> Type.
Inductive exp : type -> Type :=
| Var : forall t,
var t
-> exp t
| Const : nat -> exp Nat
| Plus : exp Nat -> exp Nat -> exp Nat
| App : forall t1 t2,
exp (t1 --> t2)
-> exp t1
-> exp t2
| Abs : forall t1 t2,
(var t1 -> exp t2)
-> exp (t1 --> t2).
End vars.
Definition Exp t := forall var, exp var t.
Implicit Arguments Var [var t].
Implicit Arguments Const [var].
Implicit Arguments Plus [var].
Implicit Arguments App [var t1 t2].
Implicit Arguments Abs [var t1 t2].
Notation "# v" := (Var v) (at level 70) : source_scope.
Notation "^ n" := (Const n) (at level 70) : source_scope.
Infix "+^" := Plus (left associativity, at level 79) : source_scope.
Infix "@" := App (left associativity, at level 77) : source_scope.
Notation "\ x , e" := (Abs (fun x => e)) (at level 78) : source_scope.
Notation "\ ! , e" := (Abs (fun _ => e)) (at level 78) : source_scope.
Bind Scope source_scope with exp.
Example zero : Exp Nat := fun _ => ^0.
Example one : Exp Nat := fun _ => ^1.
Example zpo : Exp Nat := fun _ => zero _ +^ one _.
Example ident : Exp (Nat --> Nat) := fun _ => \x, #x.
Example app_ident : Exp Nat := fun _ => ident _ @ zpo _.
Example app : Exp ((Nat --> Nat) --> Nat --> Nat) := fun _ =>
\f, \x, #f @ #x.
Example app_ident' : Exp Nat := fun _ => app _ @ ident _ @ zpo _.
Fixpoint typeDenote (t : type) : Set :=
match t with
| Nat => nat
| t1 --> t2 => typeDenote t1 -> typeDenote t2
end.
Fixpoint expDenote t (e : exp typeDenote t) : typeDenote t :=
match e with
| Var _ v => v
| Const n => n
| Plus e1 e2 => expDenote e1 + expDenote e2
| App _ _ e1 e2 => (expDenote e1) (expDenote e2)
| Abs _ _ e' => fun x => expDenote (e' x)
end.
Definition ExpDenote t (e : Exp t) := expDenote (e _).
(* end hide *)
(** The other critical new ingredient is a generalization of the [Closed] relation from two chapters ago. The new relation [exp_equiv] characterizes when two expressions may be considered syntactically equal. We need to be able to handle cases where each expression uses a different [var] type. Intuitively, we will want to compare expressions that use their variables to store source-level and target-level values. We express pairs of equivalent variables using a list parameter to the relation; variable expressions will be considered equivalent if and only if their variables belong to this list. The rule for function abstraction extends the list in a higher-order way. The remaining rules just implement the obvious congruence over expressions. *)
(* begin thide *)
Section exp_equiv.
Variables var1 var2 : type -> Type.
Inductive exp_equiv : list { t : type & var1 t * var2 t }%type
-> forall t, exp var1 t -> exp var2 t -> Prop :=
| EqVar : forall G t (v1 : var1 t) v2,
In (existT _ t (v1, v2)) G
-> exp_equiv G (#v1) (#v2)
| EqConst : forall G n,
exp_equiv G (^ n) (^ n)
| EqPlus : forall G x1 y1 x2 y2,
exp_equiv G x1 x2
-> exp_equiv G y1 y2
-> exp_equiv G (x1 +^ y1) (x2 +^ y2)
| EqApp : forall G t1 t2 (f1 : exp _ (t1 --> t2)) (x1 : exp _ t1) f2 x2,
exp_equiv G f1 f2
-> exp_equiv G x1 x2
-> exp_equiv G (f1 @ x1) (f2 @ x2)
| EqAbs : forall G t1 t2 (f1 : var1 t1 -> exp var1 t2) f2,
(forall v1 v2, exp_equiv (existT _ t1 (v1, v2) :: G) (f1 v1) (f2 v2))
-> exp_equiv G (Abs f1) (Abs f2).
End exp_equiv.
(** It turns out that, for any parametric expression [E], any two instantiations of [E] with particular [var] types must be equivalent, with respect to an empty variable list. The parametricity of Gallina guarantees this, in much the same way that it guaranteed the truth of the axiom about [Closed]. Thus, we assert an analogous axiom here. *)
Axiom Exp_equiv : forall t (E : Exp t) var1 var2,
exp_equiv nil (E var1) (E var2).
(* end thide *)
End Source.
(** Now we need to define the CPS language, where binary function types are replaced with unary continuation types, and we add product types because they will be useful in our translation. *)
Module CPS.
Inductive type : Type :=
| TNat : type
| Cont : type -> type
| Prod : type -> type -> type.
Notation "'Nat'" := TNat : cps_scope.
(** printing ---> $\Longrightarrow$ *)
Notation "t --->" := (Cont t) (at level 61) : cps_scope.
Infix "**" := Prod (right associativity, at level 60) : cps_scope.
Bind Scope cps_scope with type.
Delimit Scope cps_scope with cps.
Section vars.
Variable var : type -> Type.
(** A CPS program is a series of bindings of primitive operations (primops), followed by either a halt with a final program result or by a call to a continuation. The arguments to these program-ending operations are enforced to be variables. To use the values of compound expressions instead, those expressions must be decomposed into bindings of primops. The primop language itself similarly forces variables for all arguments besides bodies of function abstractions. *)
Inductive prog : Type :=
| PHalt :
var Nat
-> prog
| App : forall t,
var (t --->)
-> var t
-> prog
| Bind : forall t,
primop t
-> (var t -> prog)
-> prog
with primop : type -> Type :=
| Const : nat -> primop Nat
| Plus : var Nat -> var Nat -> primop Nat
| Abs : forall t,
(var t -> prog)
-> primop (t --->)
| Pair : forall t1 t2,
var t1
-> var t2
-> primop (t1 ** t2)
| Fst : forall t1 t2,
var (t1 ** t2)
-> primop t1
| Snd : forall t1 t2,
var (t1 ** t2)
-> primop t2.
End vars.
Implicit Arguments PHalt [var].
Implicit Arguments App [var t].
Implicit Arguments Const [var].
Implicit Arguments Plus [var].
Implicit Arguments Abs [var t].
Implicit Arguments Pair [var t1 t2].
Implicit Arguments Fst [var t1 t2].
Implicit Arguments Snd [var t1 t2].
Notation "'Halt' x" := (PHalt x) (no associativity, at level 75) : cps_scope.
Infix "@@" := App (no associativity, at level 75) : cps_scope.
Notation "x <- p ; e" := (Bind p (fun x => e))
(right associativity, at level 76, p at next level) : cps_scope.
Notation "! <- p ; e" := (Bind p (fun _ => e))
(right associativity, at level 76, p at next level) : cps_scope.
Notation "^ n" := (Const n) (at level 70) : cps_scope.
Infix "+^" := Plus (left associativity, at level 79) : cps_scope.
Notation "\ x , e" := (Abs (fun x => e)) (at level 78) : cps_scope.
Notation "\ ! , e" := (Abs (fun _ => e)) (at level 78) : cps_scope.
Notation "[ x1 , x2 ]" := (Pair x1 x2) : cps_scope.
Notation "#1 x" := (Fst x) (at level 72) : cps_scope.
Notation "#2 x" := (Snd x) (at level 72) : cps_scope.
Bind Scope cps_scope with prog primop.
Open Scope cps_scope.
(** In interpreting types, we treat continuations as functions with codomain [nat], choosing [nat] as our arbitrary program result type. *)
Fixpoint typeDenote (t : type) : Set :=
match t with
| Nat => nat
| t' ---> => typeDenote t' -> nat
| t1 ** t2 => (typeDenote t1 * typeDenote t2)%type
end.
(** A mutually-recursive definition establishes the meanings of programs and primops. *)
Fixpoint progDenote (e : prog typeDenote) : nat :=
match e with
| PHalt n => n
| App _ f x => f x
| Bind _ p x => progDenote (x (primopDenote p))
end
with primopDenote t (p : primop typeDenote t) : typeDenote t :=
match p with
| Const n => n
| Plus n1 n2 => n1 + n2
| Abs _ e => fun x => progDenote (e x)
| Pair _ _ v1 v2 => (v1, v2)
| Fst _ _ v => fst v
| Snd _ _ v => snd v
end.
Definition Prog := forall var, prog var.
Definition Primop t := forall var, primop var t.
Definition ProgDenote (E : Prog) := progDenote (E _).
Definition PrimopDenote t (P : Primop t) := primopDenote (P _).
End CPS.
Import Source CPS.
(** The translation itself begins with a type-level compilation function. We change every function into a continuation whose argument is a pair, consisting of the translation of the original argument and of an explicit return pointer. *)
(* begin thide *)
Fixpoint cpsType (t : Source.type) : CPS.type :=
match t with
| Nat => Nat%cps
| t1 --> t2 => (cpsType t1 ** (cpsType t2 ---> ) ---> )%cps
end%source.
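(** For example, the type of functions over naturals becomes the type of continuations expecting a natural paired with a return continuation for naturals. *)
Example cpsType_arrow : cpsType (Nat --> Nat)%source = (Nat ** (Nat ---> ) ---> )%cps.
  reflexivity.
Qed.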
(** Now we can define the expression translation. The notation [x <-- e1; e2] stands for translating source-level expression [e1], binding [x] to the CPS-level result of running the translated program, and then evaluating CPS-level expression [e2] in that context. *)
(** printing <-- $\longleftarrow$ *)
Reserved Notation "x <-- e1 ; e2" (right associativity, at level 76, e1 at next level).
Section cpsExp.
Variable var : CPS.type -> Type.
Import Source.
Open Scope cps_scope.
(** We implement a well-known variety of higher-order, one-pass CPS translation. The translation [cpsExp] is parameterized not only by the expression [e] to translate, but also by a meta-level continuation. The idea is that [cpsExp] evaluates the translation of [e] and calls the continuation on the result. With this convention, [cpsExp] itself is a natural match for the notation we just reserved. *)
Fixpoint cpsExp t (e : exp (fun t => var (cpsType t)) t)
: (var (cpsType t) -> prog var) -> prog var :=
match e with
| Var _ v => fun k => k v
| Const n => fun k =>
x <- ^ n;
k x
| Plus e1 e2 => fun k =>
x1 <-- e1;
x2 <-- e2;
x <- x1 +^ x2;
k x
| App _ _ e1 e2 => fun k =>
f <-- e1;
x <-- e2;
kf <- \ r, k r;
p <- [x, kf];
f @@ p
| Abs _ _ e' => fun k =>
f <- CPS.Abs (var := var) (fun p =>
x <- #1 p;
kf <- #2 p;
r <-- e' x;
kf @@ r);
k f
end
where "x <-- e1 ; e2" := (cpsExp e1 (fun x => e2)).
End cpsExp.
(** Since notations do not survive the closing of sections, we redefine the notation associated with [cpsExp]. *)
Notation "x <-- e1 ; e2" := (cpsExp e1 (fun x => e2)) : cps_scope.
Implicit Arguments cpsExp [var t].
(** We wrap [cpsExp] into the parametric version [CpsExp], passing an always-halt continuation at the root of the recursion. *)
Definition CpsExp (E : Exp Nat) : Prog :=
fun _ => cpsExp (E _) (PHalt (var := _)).
(* end thide *)
Eval compute in CpsExp zero.
(** %\vspace{-.15in}% [[
= fun var : type -> Type => x <- ^0; Halt x
: Prog
]]
*)
Eval compute in CpsExp one.
(** %\vspace{-.15in}% [[
= fun var : type -> Type => x <- ^1; Halt x
: Prog
]]
*)
Eval compute in CpsExp zpo.
(** %\vspace{-.15in}% [[
= fun var : type -> Type => x <- ^0; x0 <- ^1; x1 <- (x +^ x0); Halt x1
: Prog
]]
*)
Eval compute in CpsExp app_ident.
(** %\vspace{-.15in}% [[
= fun var : type -> Type =>
f <- (\ p, x <- #1 p; kf <- #2 p; kf @@ x);
x <- ^0;
x0 <- ^1; x1 <- (x +^ x0); kf <- (\ r, Halt r); p <- [x1, kf]; f @@ p
: Prog
]]
*)
Eval compute in CpsExp app_ident'.
(** %\vspace{-.15in}% [[
= fun var : type -> Type =>
f <-
(\ p,
x <- #1 p;
kf <- #2 p;
f <-
(\ p0,
x0 <- #1 p0;
kf0 <- #2 p0; kf1 <- (\ r, kf0 @@ r); p1 <- [x0, kf1]; x @@ p1);
kf @@ f);
f0 <- (\ p, x <- #1 p; kf <- #2 p; kf @@ x);
kf <-
(\ r,
x <- ^0;
x0 <- ^1;
x1 <- (x +^ x0); kf <- (\ r0, Halt r0); p <- [x1, kf]; r @@ p);
p <- [f0, kf]; f @@ p
: Prog
]]
*)
Eval compute in ProgDenote (CpsExp zero).
(** %\vspace{-.15in}% [[
= 0
: nat
]]
*)
Eval compute in ProgDenote (CpsExp one).
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in ProgDenote (CpsExp zpo).
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in ProgDenote (CpsExp app_ident).
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in ProgDenote (CpsExp app_ident').
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
(** Our main inductive lemma about [cpsExp] needs a notion of compatibility between source-level and CPS-level values. We express compatibility with a %\textit{%#<i>#logical relation#</i>#%}%; that is, we define a binary relation by recursion on type structure, and the function case of the relation considers functions related if they map related arguments to related results. In detail, the function case is slightly more complicated, since it must deal with our continuation-based calling convention. *)
(* begin thide *)
Fixpoint lr (t : Source.type)
: Source.typeDenote t -> CPS.typeDenote (cpsType t) -> Prop :=
match t with
| Nat => fun n1 n2 => n1 = n2
| t1 --> t2 => fun f1 f2 =>
forall x1 x2, lr _ x1 x2
-> forall k, exists r,
f2 (x2, k) = k r
/\ lr _ (f1 x1) r
end%source.
(** The main lemma is now easily stated and proved. The most surprising aspect of the statement is the presence of %\textit{%#<i>#two#</i>#%}% versions of the expression to be compiled. The first, [e1], uses a [var] choice that makes it a suitable argument to [expDenote]. The second expression, [e2], uses a [var] choice that makes its compilation, [cpsExp e2 k], a suitable argument to [progDenote]. We use [exp_equiv] to assert that [e1] and [e2] have the same underlying structure, up to a variable correspondence list [G]. A hypothesis about [G] ensures that all of its pairs of variables belong to the logical relation [lr]. We also use [lr], in concert with some quantification over continuations and program results, in the conclusion of the lemma.
The lemma's proof should be unsurprising by now. It uses our standard bag of Ltac tricks to help out with quantifier instantiation; [crush] and [eauto] can handle the rest. *)
Lemma cpsExp_correct : forall G t (e1 : exp _ t) (e2 : exp _ t),
exp_equiv G e1 e2
-> (forall t v1 v2, In (existT _ t (v1, v2)) G -> lr t v1 v2)
-> forall k, exists r,
progDenote (cpsExp e2 k) = progDenote (k r)
/\ lr t (expDenote e1) r.
induction 1; crush;
repeat (match goal with
| [ H : forall k, exists r, progDenote (cpsExp ?E k) = _ /\ _
|- context[cpsExp ?E ?K] ] =>
generalize (H K); clear H
| [ |- exists r, progDenote (_ ?R) = progDenote (_ r) /\ _ ] =>
exists R
| [ t1 : Source.type |- _ ] =>
match goal with
| [ Hlr : lr t1 ?X1 ?X2, IH : forall v1 v2, _ |- _ ] =>
generalize (IH X1 X2); clear IH; intro IH;
match type of IH with
| ?P -> _ => assert P
end
end
end; crush); eauto.
Qed.
(** A simple lemma establishes the degenerate case of [cpsExp_correct]'s hypothesis about [G]. *)
Lemma vars_easy : forall t v1 v2,
In (existT (fun t0 => (Source.typeDenote t0 * typeDenote (cpsType t0))%type) t
(v1, v2)) nil -> lr t v1 v2.
crush.
Qed.
(** A manual application of [cpsExp_correct] proves a version applicable to [CpsExp]. This is where we use the axiom [Exp_equiv]. *)
Theorem CpsExp_correct : forall (E : Exp Nat),
ProgDenote (CpsExp E) = ExpDenote E.
unfold ProgDenote, CpsExp, ExpDenote; intros;
generalize (cpsExp_correct (e1 := E _) (e2 := E _)
(Exp_equiv _ _ _) vars_easy (PHalt (var := _))); crush.
Qed.
(* end thide *)
(** * Exercises *)
(** %\begin{enumerate}%#<ol>#
%\item%#<li># When in the last chapter we implemented constant folding for simply-typed lambda calculus, it may have seemed natural to try applying beta reductions. This would have been a lot more trouble than is apparent at first, because we would have needed to convince Coq that our normalizing function always terminated.
It might also seem that beta reduction is a lost cause because we have no effective way of substituting in the [exp] type; we only managed to write a substitution function for the parametric [Exp] type. This is not as big a problem as it seems. For instance, for the language we built by extending simply-typed lambda calculus with products and sums, it also appears that we need substitution for simplifying [case] expressions whose discriminees are known to be [inl] or [inr], but the function is still implementable.
For this exercise, extend the products and sums constant folder from the last chapter so that it simplifies [case] expressions as well, by checking if the discriminee is a known [inl] or known [inr]. Also extend the correctness theorem to apply to your new definition. You will probably want to assert an axiom relating to an expression equivalence relation like the one defined in this chapter. Any such axiom should only mention syntax; it should not mention any compilation or denotation functions. Following the format of the axiom from the last chapter is the safest bet to avoid proving a worthless theorem.
#</li>#
#</ol>#%\end{enumerate}% *)
(* Copyright (c) 2008-2010, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Require Import Arith String List.
Require Import CpdtTactics.
Set Implicit Arguments.
(* end hide *)
(** %\part{Formalizing Programming Languages and Compilers}
\chapter{First-Order Abstract Syntax}% *)
(** Many people interested in interactive theorem-proving want to prove theorems about programming languages. That domain also provides a good setting for demonstrating how to apply the ideas from the earlier parts of this book. This part introduces some techniques for encoding the syntax and semantics of programming languages, along with some example proofs designed to be as practical as possible, rather than to illustrate basic Coq technique.
To prove anything about a language, we must first formalize the language's syntax. We have a broad design space to choose from, and it makes sense to start with the simplest options, so-called %\textit{%#<i>#first-order#</i>#%}% syntax encodings that do not use dependent types. These encodings are first-order because they do not use Coq function types in a critical way. In this chapter, we consider the most popular first-order encodings, using each to prove a basic type soundness theorem. *)
(** * Concrete Binding *)
(** The most obvious encoding of the syntax of programming languages follows usual context-free grammars literally. We represent variables as strings and include a variable in our AST definition wherever a variable appears in the informal grammar. Concrete binding turns out to involve a surprisingly large amount of menial bookkeeping, especially when we encode higher-order languages with nested binder scopes. This section's example should give a flavor of what is required. *)
Module Concrete.
(** We need our variable type and its decidable equality operation. *)
Definition var := string.
Definition var_eq := string_dec.
(** We will formalize basic simply-typed lambda calculus. The syntax of expressions and types follows what we would write in a context-free grammar. *)
Inductive exp : Set :=
| Const : bool -> exp
| Var : var -> exp
| App : exp -> exp -> exp
| Abs : var -> exp -> exp.
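(** For instance, the two-argument application function [fun f => fun x => f x] is encoded with explicit variable names. (The [%string] annotations select Coq's string literal notation, since we have not opened [string_scope].) *)
Example app_fun : exp := Abs "f"%string (Abs "x"%string (App (Var "f"%string) (Var "x"%string))).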
Inductive type : Set :=
| Bool : type
| Arrow : type -> type -> type.
(** It is useful to define a syntax extension that lets us write function types in more standard notation. *)
(** printing --> $\longrightarrow$ *)
Infix "-->" := Arrow (right associativity, at level 60).
(** Now we turn to a typing judgment. We will need to define it in terms of typing contexts, which we represent as lists of pairs of variables and types. *)
Definition ctx := list (var * type).
(** The definitions of our judgments will be prettier if we write them using mixfix syntax. To define a judgment for looking up the type of a variable in a context, we first %\textit{%#<i>#reserve#</i>#%}% a notation for the judgment. Reserved notations enable mutually-recursive definition of a judgment and its notation; in this sense, the reservation is like a forward declaration in C. *)
(** printing |-v $\vdash_\mathsf{v}$ *)
Reserved Notation "G |-v x : t" (no associativity, at level 90, x at next level).
(** Now we define the judgment itself, for variable typing, using a [where] clause to associate a notation definition. *)
Inductive lookup : ctx -> var -> type -> Prop :=
| First : forall x t G,
(x, t) :: G |-v x : t
| Next : forall x t x' t' G,
x <> x'
-> G |-v x : t
-> (x', t') :: G |-v x : t
where "G |-v x : t" := (lookup G x t).
Hint Constructors lookup.
(** The same technique applies to defining the main typing judgment. We use an [at next level] clause to cause the argument [e] of the notation to be parsed at a low enough precedence level. *)
(** printing |-e $\vdash_\mathsf{e}$ *)
Reserved Notation "G |-e e : t" (no associativity, at level 90, e at next level).
Inductive hasType : ctx -> exp -> type -> Prop :=
| TConst : forall G b,
G |-e Const b : Bool
| TVar : forall G v t,
G |-v v : t
-> G |-e Var v : t
| TApp : forall G e1 e2 dom ran,
G |-e e1 : dom --> ran
-> G |-e e2 : dom
-> G |-e App e1 e2 : ran
| TAbs : forall G x e' dom ran,
(x, dom) :: G |-e e' : ran
-> G |-e Abs x e' : dom --> ran
where "G |-e e : t" := (hasType G e t).
Hint Constructors hasType.
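(** As a quick check that the rules and hints fit together, [auto] derives a typing for the Boolean identity function. *)
Example ident_hasType : nil |-e Abs "x"%string (Var "x"%string) : Bool --> Bool.
  auto.
Qed.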
(** It is useful to know that variable lookup results are unchanged by adding extra bindings to the end of a context. *)
Lemma weaken_lookup : forall x t G' G1,
G1 |-v x : t
-> G1 ++ G' |-v x : t.
induction G1 as [ | [? ?] ? ]; crush;
match goal with
| [ H : _ |-v _ : _ |- _ ] => inversion H; crush
end.
Qed.
Hint Resolve weaken_lookup.
(** The same property extends to the full typing judgment. *)
Theorem weaken_hasType' : forall G' G e t,
G |-e e : t
-> G ++ G' |-e e : t.
induction 1; crush; eauto.
Qed.
Theorem weaken_hasType : forall e t,
nil |-e e : t
-> forall G', G' |-e e : t.
intros; change G' with (nil ++ G');
eapply weaken_hasType'; eauto.
Qed.
Hint Resolve weaken_hasType.
(** Much of the inconvenience of first-order encodings comes from the need to treat capture-avoiding substitution explicitly. We must start by defining a substitution function. *)
Section subst.
Variable x : var.
Variable e1 : exp.
(** We are substituting expression [e1] for every free occurrence of [x]. Note that this definition is specialized to the case where [e1] is closed; substitution is substantially more complicated otherwise, potentially involving explicit alpha-variation. Luckily, our example of type safety for a call-by-value semantics only requires this restricted variety of substitution. *)
Fixpoint subst (e2 : exp) : exp :=
match e2 with
| Const _ => e2
| Var x' => if var_eq x' x then e1 else e2
| App e1 e2 => App (subst e1) (subst e2)
| Abs x' e' => Abs x' (if var_eq x' x then e' else subst e')
end.
(** We can prove a few theorems about substitution in well-typed terms, where we assume that [e1] is closed and has type [xt]. *)
Variable xt : type.
Hypothesis Ht' : nil |-e e1 : xt.
(** It is helpful to establish a notation asserting the freshness of a particular variable in a context. *)
Notation "x # G" := (forall t' : type, In (x, t') G -> False) (no associativity, at level 90).
(** To prove type preservation, we will need lemmas proving consequences of variable lookup proofs. *)
Lemma subst_lookup' : forall x' t,
x <> x'
-> forall G1, G1 ++ (x, xt) :: nil |-v x' : t
-> G1 |-v x' : t.
induction G1 as [ | [? ?] ? ]; crush;
match goal with
| [ H : _ |-v _ : _ |- _ ] => inversion H
end; crush.
Qed.
Hint Resolve subst_lookup'.
Lemma subst_lookup : forall x' t G1,
x' # G1
-> G1 ++ (x, xt) :: nil |-v x' : t
-> t = xt.
induction G1 as [ | [? ?] ? ]; crush; eauto;
match goal with
| [ H : _ |-v _ : _ |- _ ] => inversion H
end; crush; (elimtype False; eauto;
match goal with
| [ H : nil |-v _ : _ |- _ ] => inversion H
end)
|| match goal with
| [ H : _ |- _ ] => apply H; crush; eauto
end.
Qed.
Implicit Arguments subst_lookup [x' t G1].
(** Another set of lemmas allows us to remove provably unused variables from the ends of typing contexts. *)
Lemma shadow_lookup : forall v t t' G1,
G1 |-v x : t'
-> G1 ++ (x, xt) :: nil |-v v : t
-> G1 |-v v : t.
induction G1 as [ | [? ?] ? ]; crush;
match goal with
| [ H : nil |-v _ : _ |- _ ] => inversion H
| [ H1 : _ |-v _ : _, H2 : _ |-v _ : _ |- _ ] =>
inversion H1; crush; inversion H2; crush
end.
Qed.
Lemma shadow_hasType' : forall G e t,
G |-e e : t
-> forall G1, G = G1 ++ (x, xt) :: nil
-> forall t'', G1 |-v x : t''
-> G1 |-e e : t.
Hint Resolve shadow_lookup.
induction 1; crush; eauto;
match goal with
| [ H : (?x0, _) :: _ ++ (?x, _) :: _ |-e _ : _ |- _ ] =>
destruct (var_eq x0 x); subst; eauto
end.
Qed.
Lemma shadow_hasType : forall G1 e t t'',
G1 ++ (x, xt) :: nil |-e e : t
-> G1 |-v x : t''
-> G1 |-e e : t.
intros; eapply shadow_hasType'; eauto.
Qed.
Hint Resolve shadow_hasType.
(** Disjointness facts may be extended to larger contexts when the appropriate obligations are met. *)
Lemma disjoint_cons : forall x x' t (G : ctx),
x # G
-> x' <> x
-> x # (x', t) :: G.
firstorder;
match goal with
| [ H : (_, _) = (_, _) |- _ ] => injection H
end; crush.
Qed.
Hint Resolve disjoint_cons.
(** Finally, we arrive at the main theorem about substitution: it preserves typing. *)
Theorem subst_hasType : forall G e2 t,
G |-e e2 : t
-> forall G1, G = G1 ++ (x, xt) :: nil
-> x # G1
-> G1 |-e subst e2 : t.
induction 1; crush;
try match goal with
| [ |- context[if ?E then _ else _] ] => destruct E
end; crush; eauto 6;
match goal with
| [ H1 : x # _, H2 : _ |-v x : _ |- _ ] =>
rewrite (subst_lookup H1 H2)
end; crush.
Qed.
(** We wrap the last theorem into an easier-to-apply form specialized to closed expressions. *)
Theorem subst_hasType_closed : forall e2 t,
(x, xt) :: nil |-e e2 : t
-> nil |-e subst e2 : t.
intros; eapply subst_hasType; eauto.
Qed.
End subst.
Hint Resolve subst_hasType_closed.
(** A notation for substitution will make the operational semantics easier to read. %The ASCII operator \texttt{\textasciitilde{}>} will afterward be rendered as $\leadsto$.% *)
(** printing ~> $\leadsto$ *)
Notation "[ x ~> e1 ] e2" := (subst x e1 e2) (no associativity, at level 80).
(** To define a call-by-value small-step semantics, we rely on a standard judgment characterizing which expressions are values. *)
Inductive val : exp -> Prop :=
| VConst : forall b, val (Const b)
| VAbs : forall x e, val (Abs x e).
Hint Constructors val.
(** Now the step relation is easy to define. *)
Reserved Notation "e1 ==> e2" (no associativity, at level 90).
Inductive step : exp -> exp -> Prop :=
| Beta : forall x e1 e2,
val e2
-> App (Abs x e1) e2 ==> [x ~> e2] e1
| Cong1 : forall e1 e2 e1',
e1 ==> e1'
-> App e1 e2 ==> App e1' e2
| Cong2 : forall e1 e2 e2',
val e1
-> e2 ==> e2'
-> App e1 e2 ==> App e1 e2'
where "e1 ==> e2" := (step e1 e2).
Hint Constructors step.
(** The progress theorem says that any well-typed expression that is not a value can take a step. To deal with limitations of the [induction] tactic, we put most of the proof in a lemma whose statement uses the usual trick of introducing extra equality hypotheses. *)
Lemma progress' : forall G e t, G |-e e : t
-> G = nil
-> val e \/ exists e', e ==> e'.
induction 1; crush; eauto;
try match goal with
| [ H : _ |-e _ : _ --> _ |- _ ] => inversion H
end;
match goal with
| [ H : _ |- _ ] => solve [ inversion H; crush; eauto ]
end.
Qed.
Theorem progress : forall e t, nil |-e e : t
-> val e \/ exists e', e ==> e'.
intros; eapply progress'; eauto.
Qed.
(** A similar pattern works for the preservation theorem, which says that any step of execution preserves an expression's type. *)
Lemma preservation' : forall G e t, G |-e e : t
-> G = nil
-> forall e', e ==> e'
-> nil |-e e' : t.
induction 1; inversion 2; crush; eauto;
match goal with
| [ H : _ |-e Abs _ _ : _ |- _ ] => inversion H
end; eauto.
Qed.
Theorem preservation : forall e t, nil |-e e : t
-> forall e', e ==> e'
-> nil |-e e' : t.
intros; eapply preservation'; eauto.
Qed.
End Concrete.
(** This was a relatively simple example, giving only a taste of the proof burden associated with concrete syntax. We were helped by the fact that, with call-by-value semantics, we only need to reason about substitution in closed expressions. There was also no need to alpha-vary an expression. *)
(** * De Bruijn Indices *)
(** De Bruijn indices are much more popular than concrete syntax. This technique provides a %\textit{%#<i>#canonical#</i>#%}% representation of syntax, where any two alpha-equivalent expressions have syntactically equal encodings, removing the need for explicit reasoning about alpha conversion. Variables are represented as natural numbers, where variable [n] denotes a reference to the [n]th closest enclosing binder. Because variable references in effect point to binders, there is no need to label binders, such as function abstraction, with variables. *)
Module DeBruijn.
Definition var := nat.
Definition var_eq := eq_nat_dec.
Inductive exp : Set :=
| Const : bool -> exp
| Var : var -> exp
| App : exp -> exp -> exp
| Abs : exp -> exp.
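(** For instance, the two-argument application function [fun f => fun x => f x] becomes a term where the inner argument is index [0] and the outer one is index [1]. *)
Example app_fun : exp := Abs (Abs (App (Var 1) (Var 0))).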
Inductive type : Set :=
| Bool : type
| Arrow : type -> type -> type.
Infix "-->" := Arrow (right associativity, at level 60).
(** The definition of typing proceeds much the same as in the last section. Since variables are numbers, contexts can be simple lists of types. This makes it possible to write the lookup judgment without mentioning inequality of variables. *)
Definition ctx := list type.
Reserved Notation "G |-v x : t" (no associativity, at level 90, x at next level).
Inductive lookup : ctx -> var -> type -> Prop :=
| First : forall t G,
t :: G |-v O : t
| Next : forall x t t' G,
G |-v x : t
-> t' :: G |-v S x : t
where "G |-v x : t" := (lookup G x t).
Hint Constructors lookup.
Reserved Notation "G |-e e : t" (no associativity, at level 90, e at next level).
Inductive hasType : ctx -> exp -> type -> Prop :=
| TConst : forall G b,
G |-e Const b : Bool
| TVar : forall G v t,
G |-v v : t
-> G |-e Var v : t
| TApp : forall G e1 e2 dom ran,
G |-e e1 : dom --> ran
-> G |-e e2 : dom
-> G |-e App e1 e2 : ran
| TAbs : forall G e' dom ran,
dom :: G |-e e' : ran
-> G |-e Abs e' : dom --> ran
where "G |-e e : t" := (hasType G e t).
(** In the [hasType] case for function abstraction, there is no need to choose a variable name. We simply push the function domain type onto the context [G]. *)
Hint Constructors hasType.
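(** A small example exercises the hints, including the [Next] lookup rule for a variable bound farther out. *)
Example fst_hasType : nil |-e Abs (Abs (Var 1)) : Bool --> Bool --> Bool.
  auto 6.
Qed.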
(** We prove roughly the same weakening theorems as before. *)
Lemma weaken_lookup : forall G' v t G,
G |-v v : t
-> G ++ G' |-v v : t.
induction 1; crush.
Qed.
Hint Resolve weaken_lookup.
Theorem weaken_hasType' : forall G' G e t,
G |-e e : t
-> G ++ G' |-e e : t.
induction 1; crush; eauto.
Qed.
Theorem weaken_hasType : forall e t,
nil |-e e : t
-> forall G', G' |-e e : t.
intros; change G' with (nil ++ G');
eapply weaken_hasType'; eauto.
Qed.
Hint Resolve weaken_hasType.
Section subst.
Variable e1 : exp.
(** Substitution is easier to define than with concrete syntax. While our old definition needed to use two comparisons for equality of variables, the de Bruijn substitution only needs one comparison. *)
Fixpoint subst (x : var) (e2 : exp) : exp :=
match e2 with
| Const _ => e2
| Var x' => if var_eq x' x then e1 else e2
| App e1 e2 => App (subst x e1) (subst x e2)
| Abs e' => Abs (subst (S x) e')
end.
Variable xt : type.
(** We prove similar theorems about inversion of variable lookup. *)
Lemma subst_eq : forall t G1,
G1 ++ xt :: nil |-v length G1 : t
-> t = xt.
induction G1; inversion 1; crush.
Qed.
Implicit Arguments subst_eq [t G1].
Lemma subst_neq' : forall t G1 x,
G1 ++ xt :: nil |-v x : t
-> x <> length G1
-> G1 |-v x : t.
induction G1; inversion 1; crush;
match goal with
| [ H : nil |-v _ : _ |- _ ] => inversion H
end.
Qed.
Hint Resolve subst_neq'.
Lemma subst_neq : forall v t G1,
G1 ++ xt :: nil |-v v : t
-> v <> length G1
-> G1 |-e Var v : t.
induction G1; inversion 1; crush.
Qed.
Hint Resolve subst_neq.
Hypothesis Ht' : nil |-e e1 : xt.
(** The next lemma is included solely to guide [eauto], which will not apply computational equivalences automatically. *)
Lemma hasType_push : forall dom G1 e' ran,
dom :: G1 |-e subst (length (dom :: G1)) e' : ran
-> dom :: G1 |-e subst (S (length G1)) e' : ran.
trivial.
Qed.
Hint Resolve hasType_push.
(** Finally, we are ready for the main theorem about substitution and typing. *)
Theorem subst_hasType : forall G e2 t,
G |-e e2 : t
-> forall G1, G = G1 ++ xt :: nil
-> G1 |-e subst (length G1) e2 : t.
induction 1; crush;
try match goal with
| [ |- context[if ?E then _ else _] ] => destruct E
end; crush; eauto 6;
try match goal with
| [ H : _ |-v _ : _ |- _ ] =>
rewrite (subst_eq H)
end; crush.
Qed.
Theorem subst_hasType_closed : forall e2 t,
xt :: nil |-e e2 : t
-> nil |-e subst O e2 : t.
intros; change O with (length (@nil type)); eapply subst_hasType; eauto.
Qed.
End subst.
Hint Resolve subst_hasType_closed.
(** We define the operational semantics much as before. *)
Notation "[ x ~> e1 ] e2" := (subst e1 x e2) (no associativity, at level 80).
Inductive val : exp -> Prop :=
| VConst : forall b, val (Const b)
| VAbs : forall e, val (Abs e).
Hint Constructors val.
Reserved Notation "e1 ==> e2" (no associativity, at level 90).
Inductive step : exp -> exp -> Prop :=
| Beta : forall e1 e2,
val e2
-> App (Abs e1) e2 ==> [O ~> e2] e1
| Cong1 : forall e1 e2 e1',
e1 ==> e1'
-> App e1 e2 ==> App e1' e2
| Cong2 : forall e1 e2 e2',
val e1
-> e2 ==> e2'
-> App e1 e2 ==> App e1 e2'
where "e1 ==> e2" := (step e1 e2).
Hint Constructors step.
(** Since we have added the right hints, the progress and preservation theorem statements and proofs are exactly the same as in the concrete encoding example. *)
Lemma progress' : forall G e t, G |-e e : t
-> G = nil
-> val e \/ exists e', e ==> e'.
induction 1; crush; eauto;
try match goal with
| [ H : _ |-e _ : _ --> _ |- _ ] => inversion H
end;
repeat match goal with
| [ H : _ |- _ ] => solve [ inversion H; crush; eauto ]
end.
Qed.
Theorem progress : forall e t, nil |-e e : t
-> val e \/ exists e', e ==> e'.
intros; eapply progress'; eauto.
Qed.
Lemma preservation' : forall G e t, G |-e e : t
-> G = nil
-> forall e', e ==> e'
-> nil |-e e' : t.
induction 1; inversion 2; crush; eauto;
match goal with
| [ H : _ |-e Abs _ : _ |- _ ] => inversion H
end; eauto.
Qed.
Theorem preservation : forall e t, nil |-e e : t
-> forall e', e ==> e'
-> nil |-e e' : t.
intros; eapply preservation'; eauto.
Qed.
End DeBruijn.
(** * Locally Nameless Syntax *)
(** The most popular Coq syntax encoding today is the %\textit{%#<i>#locally nameless#</i>#%}% style, which has been around for a while but was popularized recently by Aydemir et al., following a methodology summarized in their paper %``%#"#Engineering Formal Metatheory.#"#%''% #<a href="http://www.cis.upenn.edu/~plclub/oregon08/">#A specialized tutorial by that group#</a>#%\footnote{\url{http://www.cis.upenn.edu/~plclub/oregon08/}}% explains the approach, based on a library. In this section, we will build up all of the necessary ingredients from scratch.
The one-sentence summary of locally nameless encoding is that we represent free variables as concrete syntax does, and we represent bound variables with de Bruijn indices. Many proofs involve reasoning about terms transplanted into different free variable contexts; concrete encoding of free variables means that, to perform such a transplanting, we need no fix-up operation to adjust de Bruijn indices. At the same time, use of de Bruijn indices for local variables gives us canonical representations of expressions, with respect to the usual convention of alpha-equivalence. This makes many operations, including substitution of open terms in open terms, easier to implement.
The %``%#"#Engineering Formal Metatheory#"#%''% methodology involves a number of subtle design decisions, which we will describe as they appear in the latest version of our running example. *)
Module LocallyNameless.
Definition free_var := string.
Definition bound_var := nat.
Inductive exp : Set :=
| Const : bool -> exp
| FreeVar : free_var -> exp
| BoundVar : bound_var -> exp
| App : exp -> exp -> exp
| Abs : exp -> exp.
(** Note the different constructors for free vs. bound variables, and note that the lack of a variable annotation on [Abs] nodes is inherited from the de Bruijn convention. *)
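(** For intuition, here is a sketch (not part of the original development; the example name is ours) of how the closed term $\lambda x. \; \lambda y. \; x \; y$#<tt>\x. \y. x y</tt># is encoded, with indices counting enclosing binders:
[[
Example app_under_two_binders := Abs (Abs (App (BoundVar 1) (BoundVar 0))).
]]
A free variable [f], by contrast, simply appears as [FreeVar f], with no index bookkeeping required. *)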
Inductive type : Set :=
| Bool : type
| Arrow : type -> type -> type.
Infix "-->" := Arrow (right associativity, at level 60).
(** As typing only depends on types of free variables, our contexts borrow their form from the concrete binding example. *)
Definition ctx := list (free_var * type).
Reserved Notation "G |-v x : t" (no associativity, at level 90, x at next level).
Inductive lookup : ctx -> free_var -> type -> Prop :=
| First : forall x t G,
(x, t) :: G |-v x : t
| Next : forall x t x' t' G,
x <> x'
-> G |-v x : t
-> (x', t') :: G |-v x : t
where "G |-v x : t" := (lookup G x t).
Hint Constructors lookup.
(** The first unusual operation we need is %\textit{%#<i>#opening#</i>#%}%, where we replace a particular bound variable with a particular free variable. Whenever we %``%#"#go under a binder,#"#%''% in the typing judgment or elsewhere, we choose a new free variable to replace the old bound variable of the binder. Opening implements the replacement of one by the other. It is like a specialized version of the substitution function we used for pure de Bruijn terms. We implement opening using the richly-typed natural number equality test [eq_nat_dec] that we have discussed previously, together with another standard library function [le_lt_dec], an analogous richly-typed test for [<=]. *)
Section open.
Variable x : free_var.
Fixpoint open (n : bound_var) (e : exp) : exp :=
match e with
| Const _ => e
| FreeVar _ => e
| BoundVar n' =>
if eq_nat_dec n' n
then FreeVar x
else if le_lt_dec n' n
then e
else BoundVar (pred n')
| App e1 e2 => App (open n e1) (open n e2)
| Abs e1 => Abs (open (S n) e1)
end.
End open.
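(** For example (a check not spelled out in the original), for any free variable [x], [open x O (App (BoundVar 0) (Abs (BoundVar 1)))] computes to [App (FreeVar x) (Abs (FreeVar x))]; the index to replace is incremented as we pass under the inner [Abs], so both occurrences of the opened binder's variable are found. *)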
(** We will also need to reason about an expression's set of free variables. To keep things simple, we represent sets as lists that may contain duplicates. Note how much easier this operation is to implement than over pure de Bruijn terms, since we do not need to maintain a separate numeric argument that keeps track of how deeply we have descended into the input expression. *)
Fixpoint freeVars (e : exp) : list free_var :=
match e with
| Const _ => nil
| FreeVar x => x :: nil
| BoundVar _ => nil
| App e1 e2 => freeVars e1 ++ freeVars e2
| Abs e1 => freeVars e1
end.
(** It will be useful to have a well-formedness judgment for our terms. This notion is called %\textit{%#<i>#local closure#</i>#%}%. An expression may be declared to be closed up to a particular bound on its dangling de Bruijn indices. *)
Inductive lclosed : nat -> exp -> Prop :=
| CConst : forall n b, lclosed n (Const b)
| CFreeVar : forall n v, lclosed n (FreeVar v)
| CBoundVar : forall n v, v < n -> lclosed n (BoundVar v)
| CApp : forall n e1 e2, lclosed n e1 -> lclosed n e2 -> lclosed n (App e1 e2)
| CAbs : forall n e1, lclosed (S n) e1 -> lclosed n (Abs e1).
Hint Constructors lclosed.
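(** As a quick sanity check (not in the original text; the example name is ours), the identity abstraction satisfies [lclosed 0], since its single index is bound by the [Abs]:
[[
Example lclosed_id : lclosed 0 (Abs (BoundVar 0)).
  repeat constructor.
Qed.
]]
By contrast, [lclosed 0 (BoundVar 0)] is not derivable, since [CBoundVar] would demand [0 < 0]. *)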
(** Now we are ready to define the typing judgment. *)
Reserved Notation "G |-e e : t" (no associativity, at level 90, e at next level).
Inductive hasType : ctx -> exp -> type -> Prop :=
| TConst : forall G b,
G |-e Const b : Bool
| TFreeVar : forall G v t,
G |-v v : t
-> G |-e FreeVar v : t
| TApp : forall G e1 e2 dom ran,
G |-e e1 : dom --> ran
-> G |-e e2 : dom
-> G |-e App e1 e2 : ran
| TAbs : forall G e' dom ran L,
(forall x, ~ In x L -> (x, dom) :: G |-e open x O e' : ran)
-> G |-e Abs e' : dom --> ran
where "G |-e e : t" := (hasType G e t).
(** Compared to the previous versions, only the [TAbs] rule is surprising. The rule uses %\textit{%#<i>#co-finite quantification#</i>#%}%. That is, the premise of the rule quantifies over all [x] values that are not members of a finite set [L]. A proof may choose any value of [L] when applying [TAbs]. An alternate, more intuitive version of the rule would fix [L] to be [freeVars e']. It turns out that the greater flexibility of the rule above simplifies many proofs significantly. This typing judgment may be proved equivalent to the more intuitive version, though we will not carry out the proof here.
Specifically, what our version of [TAbs] says is that, to prove that [Abs e'] has a function type, we must prove that any opening of [e'] with a variable not in [L] has the proper type. For each [x] choice, we extend the context [G] in the usual way. *)
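(** As a small worked instance (not in the original text), consider deriving [nil |-e Abs (BoundVar 0) : Bool --> Bool]. Choosing [L := nil] in [TAbs], the premise becomes
[[
forall x, ~ In x nil -> (x, Bool) :: nil |-e open x O (BoundVar 0) : Bool
]]
and, since [open x O (BoundVar 0)] computes to [FreeVar x], each instance follows from [TFreeVar] and the [First] rule for variable lookup. *)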
Hint Constructors hasType.
(** We prove a standard weakening theorem for typing, adopting a more general form than in the previous sections. *)
Lemma lookup_push : forall G G' x t x' t',
(forall x t, G |-v x : t -> G' |-v x : t)
-> (x, t) :: G |-v x' : t'
-> (x, t) :: G' |-v x' : t'.
inversion 2; crush.
Qed.
Hint Resolve lookup_push.
Theorem weaken_hasType : forall G e t,
G |-e e : t
-> forall G', (forall x t, G |-v x : t -> G' |-v x : t)
-> G' |-e e : t.
induction 1; crush; eauto.
Qed.
Hint Resolve weaken_hasType.
(** We define a simple extension of [crush] to apply in many of the lemmas that follow. *)
Ltac ln := crush;
repeat (match goal with
| [ |- context[if ?E then _ else _] ] => destruct E
| [ _ : context[if ?E then _ else _] |- _ ] => destruct E
end; crush); eauto.
(** Two basic properties of local closure will be useful later. *)
Lemma lclosed_S : forall x e n,
lclosed n (open x n e)
-> lclosed (S n) e.
induction e; inversion 1; ln.
Qed.
Hint Resolve lclosed_S.
Lemma lclosed_weaken : forall n e,
lclosed n e
-> forall n', n' >= n
-> lclosed n' e.
induction 1; crush.
Qed.
Hint Resolve lclosed_weaken.
Hint Extern 1 (_ >= _) => omega.
(** To prove some further properties, we need the ability to choose a variable that is disjoint from a particular finite set. We implement a specific choice function [fresh]; its details do not matter, as all we need is the final theorem about it, [freshOk]. Concretely, to choose a variable disjoint from set [L], we sum the lengths of the variable names in [L] and choose a new variable name that is one longer than that sum. This variable can be the string ["x"], followed by a number of primes equal to the sum. *)
Open Scope string_scope.
Fixpoint primes (n : nat) : string :=
match n with
| O => "x"
| S n' => primes n' ++ "'"
end.
Fixpoint sumLengths (L : list free_var) : nat :=
match L with
| nil => O
| x :: L' => String.length x + sumLengths L'
end.
Definition fresh (L : list free_var) := primes (sumLengths L).
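(** For example (a quick check not present in the original), [fresh ("a" :: "bc" :: nil)] computes to ["x'''"], a name of length [4], which exceeds the summed length [3] of the input names and therefore cannot clash with any of them. *)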
(** A few lemmas suffice to establish the correctness theorem [freshOk] for [fresh]. *)
Theorem freshOk' : forall x L, String.length x > sumLengths L
-> ~ In x L.
induction L; crush.
Qed.
Lemma length_app : forall s2 s1,
String.length (s1 ++ s2) = String.length s1 + String.length s2.
induction s1; crush.
Qed.
Hint Rewrite length_app : cpdt.
Lemma length_primes : forall n, String.length (primes n) = S n.
induction n; crush.
Qed.
Hint Rewrite length_primes : cpdt.
Theorem freshOk : forall L, ~ In (fresh L) L.
intros; apply freshOk'; unfold fresh; crush.
Qed.
Hint Resolve freshOk.
(** Now we can prove that well-typedness implies local closure. [fresh] will be used for us automatically by [eauto] in the [Abs] case, driven by the presence of [freshOk] as a hint. *)
Lemma hasType_lclosed : forall G e t,
G |-e e : t
-> lclosed O e.
induction 1; eauto.
Qed.
(** An important consequence of local closure is that certain openings are no-ops: opening index [n] in an expression satisfying [lclosed n] leaves the expression unchanged. *)
Lemma lclosed_open : forall n e, lclosed n e
-> forall x, open x n e = e.
induction 1; ln.
Qed.
Hint Resolve lclosed_open hasType_lclosed.
Open Scope list_scope.
(** We are now almost ready to get down to the details of substitution. First, we prove six lemmas related to treating lists as sets. *)
Lemma In_cons1 : forall T (x x' : T) ls,
x = x'
-> In x (x' :: ls).
crush.
Qed.
Lemma In_cons2 : forall T (x x' : T) ls,
In x ls
-> In x (x' :: ls).
crush.
Qed.
Lemma In_app1 : forall T (x : T) ls2 ls1,
In x ls1
-> In x (ls1 ++ ls2).
induction ls1; crush.
Qed.
Lemma In_app2 : forall T (x : T) ls2 ls1,
In x ls2
-> In x (ls1 ++ ls2).
induction ls1; crush.
Qed.
Lemma freshOk_app1 : forall L1 L2,
~ In (fresh (L1 ++ L2)) L1.
intros; generalize (freshOk (L1 ++ L2)); crush.
Qed.
Lemma freshOk_app2 : forall L1 L2,
~ In (fresh (L1 ++ L2)) L2.
intros; generalize (freshOk (L1 ++ L2)); crush.
Qed.
Hint Resolve In_cons1 In_cons2 In_app1 In_app2.
(** Now we can define our simplest substitution function yet, thanks to the fact that we only substitute for free variables, which are distinguished syntactically from bound variables. *)
Section subst.
Hint Resolve freshOk_app1 freshOk_app2.
Variable x : free_var.
Variable e1 : exp.
Fixpoint subst (e2 : exp) : exp :=
match e2 with
| Const _ => e2
| FreeVar x' => if string_dec x' x then e1 else e2
| BoundVar _ => e2
| App e1 e2 => App (subst e1) (subst e2)
| Abs e' => Abs (subst e')
end.
Variable xt : type.
(** It comes in handy to define disjointness of a variable and a context differently than in previous examples. We use the standard list function [map], as well as the function [fst] for projecting the first element of a pair. We write [@fst _ _] rather than just [fst] to ask that [fst]'s implicit arguments be instantiated with inferred values. *)
Definition disj x (G : ctx) := In x (map (@fst _ _) G) -> False.
Infix "#" := disj (no associativity, at level 90).
Ltac disj := crush;
match goal with
| [ _ : _ :: _ = ?G0 ++ _ |- _ ] => destruct G0
end; crush; eauto.
(** Some basic properties of variable lookup will be needed on the road to our usual theorem connecting substitution and typing. *)
Lemma lookup_disj' : forall t G1,
G1 |-v x : t
-> forall G, x # G
-> G1 = G ++ (x, xt) :: nil
-> t = xt.
unfold disj; induction 1; disj.
Qed.
Lemma lookup_disj : forall t G,
x # G
-> G ++ (x, xt) :: nil |-v x : t
-> t = xt.
intros; eapply lookup_disj'; eauto.
Qed.
Lemma lookup_ne' : forall G1 v t,
G1 |-v v : t
-> forall G, G1 = G ++ (x, xt) :: nil
-> v <> x
-> G |-v v : t.
induction 1; disj.
Qed.
Lemma lookup_ne : forall G v t,
G ++ (x, xt) :: nil |-v v : t
-> v <> x
-> G |-v v : t.
intros; eapply lookup_ne'; eauto.
Qed.
Hint Extern 1 (_ |-e _ : _) =>
match goal with
| [ H1 : _, H2 : _ |- _ ] => rewrite (lookup_disj H1 H2)
end.
Hint Resolve lookup_ne.
Hint Extern 1 (@eq exp _ _) => f_equal.
(** We need to know that substitution and opening commute under appropriate circumstances. *)
Lemma open_subst : forall x0 e' n,
lclosed n e1
-> x <> x0
-> open x0 n (subst e') = subst (open x0 n e').
induction e'; ln.
Qed.
(** We state a corollary of the last result which will work more smoothly with [eauto]. *)
Lemma hasType_open_subst : forall G x0 e t,
G |-e subst (open x0 0 e) : t
-> x <> x0
-> lclosed 0 e1
-> G |-e open x0 0 (subst e) : t.
intros; rewrite open_subst; eauto.
Qed.
Hint Resolve hasType_open_subst.
(** Another lemma establishes the validity of weakening variable lookup judgments with fresh variables. *)
Lemma disj_push : forall x0 (t : type) G,
x # G
-> x <> x0
-> x # (x0, t) :: G.
unfold disj; crush.
Qed.
Hint Resolve disj_push.
Lemma lookup_cons : forall x0 dom G x1 t,
G |-v x1 : t
-> ~ In x0 (map (@fst _ _) G)
-> (x0, dom) :: G |-v x1 : t.
induction 1; crush;
match goal with
| [ H : _ |-v _ : _ |- _ ] => inversion H
end; crush.
Qed.
Hint Resolve lookup_cons.
Hint Unfold disj.
(** Finally, it helps to state a version of the [TAbs] rule specialized to the particular choice of [L] that arises in our main substitution proof. *)
Lemma TAbs_specialized : forall G e' dom ran L x1,
(forall x, ~ In x (x1 :: L ++ map (@fst _ _) G) -> (x, dom) :: G |-e open x O e' : ran)
-> G |-e Abs e' : dom --> ran.
eauto.
Qed.
(** Now we can prove the main inductive lemma in a manner similar to what worked for concrete binding. *)
Lemma hasType_subst' : forall G1 e t,
G1 |-e e : t
-> forall G, G1 = G ++ (x, xt) :: nil
-> x # G
-> G |-e e1 : xt
-> G |-e subst e : t.
induction 1; ln;
match goal with
| [ L : list free_var, _ : ?x # _ |- _ ] =>
apply TAbs_specialized with L x; eauto 20
end.
Qed.
(** The main theorem about substitution of closed expressions follows easily. *)
Theorem hasType_subst : forall e t,
(x, xt) :: nil |-e e : t
-> nil |-e e1 : xt
-> nil |-e subst e : t.
intros; eapply hasType_subst'; eauto.
Qed.
End subst.
Hint Resolve hasType_subst.
(** We can define the operational semantics in almost the same way as in previous examples. *)
Notation "[ x ~> e1 ] e2" := (subst x e1 e2) (no associativity, at level 60).
Inductive val : exp -> Prop :=
| VConst : forall b, val (Const b)
| VAbs : forall e, val (Abs e).
Hint Constructors val.
Reserved Notation "e1 ==> e2" (no associativity, at level 90).
Inductive step : exp -> exp -> Prop :=
| Beta : forall e1 e2 x,
val e2
-> ~ In x (freeVars e1)
-> App (Abs e1) e2 ==> [x ~> e2] (open x O e1)
| Cong1 : forall e1 e2 e1',
e1 ==> e1'
-> App e1 e2 ==> App e1' e2
| Cong2 : forall e1 e2 e2',
val e1
-> e2 ==> e2'
-> App e1 e2 ==> App e1 e2'
where "e1 ==> e2" := (step e1 e2).
Hint Constructors step.
(** The only interesting change is that the [Beta] rule requires identifying a fresh variable [x] to use in opening the abstraction body. We could have avoided this by implementing a more general [open] that allows substituting expressions for variables, not just variables for variables, but it simplifies the proofs to have just one general substitution function.
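For instance (an example not spelled out in the original), [Beta] rewrites [App (Abs (BoundVar 0)) (Const true)] by picking some [x] not in [freeVars (BoundVar 0)], opening the body to [FreeVar x], and then substituting [Const true] for [x], so the redex steps to [Const true] regardless of which fresh [x] was chosen.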
Now we are ready to prove progress and preservation. The same proof script from the last examples suffices to prove progress, though significantly different lemmas are applied for us by [eauto]. *)
Lemma progress' : forall G e t, G |-e e : t
-> G = nil
-> val e \/ exists e', e ==> e'.
induction 1; crush; eauto;
try match goal with
| [ H : _ |-e _ : _ --> _ |- _ ] => inversion H
end;
repeat match goal with
| [ H : _ |- _ ] => solve [ inversion H; crush; eauto ]
end.
Qed.
Theorem progress : forall e t, nil |-e e : t
-> val e \/ exists e', e ==> e'.
intros; eapply progress'; eauto.
Qed.
(** To establish preservation, it is useful to formalize a principle of sound alpha-variation. In particular, when we open an expression with a particular variable and then immediately substitute for the same variable, we can replace that variable with any other that is not free in the body of the opened expression. *)
Lemma alpha_open : forall x1 x2 e1 e2 n,
~ In x1 (freeVars e2)
-> ~ In x2 (freeVars e2)
-> [x1 ~> e1](open x1 n e2) = [x2 ~> e1](open x2 n e2).
induction e2; ln.
Qed.
Hint Resolve freshOk_app1 freshOk_app2.
(** Again it is useful to state a direct corollary which is easier to apply in proof search. *)
Lemma hasType_alpha_open : forall G L e0 e2 x t,
~ In x (freeVars e0)
-> G |-e [fresh (L ++ freeVars e0) ~> e2](open (fresh (L ++ freeVars e0)) 0 e0) : t
-> G |-e [x ~> e2](open x 0 e0) : t.
intros; rewrite (alpha_open x (fresh (L ++ freeVars e0))); auto.
Qed.
Hint Resolve hasType_alpha_open.
(** Now the previous sections' preservation proof scripts finish the job. *)
Lemma preservation' : forall G e t, G |-e e : t
-> G = nil
-> forall e', e ==> e'
-> nil |-e e' : t.
induction 1; inversion 2; crush; eauto;
match goal with
| [ H : _ |-e Abs _ : _ |- _ ] => inversion H
end; eauto.
Qed.
Theorem preservation : forall e t, nil |-e e : t
-> forall e', e ==> e'
-> nil |-e e' : t.
intros; eapply preservation'; eauto.
Qed.
End LocallyNameless.
(* Copyright (c) 2008-2011, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Require Import Eqdep String List FunctionalExtensionality.
Require Import CpdtTactics.
Set Implicit Arguments.
(* end hide *)
(** %\chapter{Higher-Order Abstract Syntax}% *)
(** In many cases, detailed reasoning about variable binders and substitution is a small annoyance; in other cases, it becomes the dominant cost of proving a theorem formally. No matter which of these possibilities prevails, it is clear that it would be very pragmatic to find a way to avoid reasoning about variable identity or freshness. A well-established alternative to first-order encodings is %\textit{%#<i>#higher-order abstract syntax#</i>#%}%, or HOAS. In mechanized theorem-proving, HOAS is most closely associated with the LF meta logic and the tools based on it, including Twelf.
In this chapter, we will see that HOAS cannot be implemented directly in Coq. However, a few very similar encodings are possible and are in fact very effective in some important domains. *)
(** * Classic HOAS *)
(** The motto of HOAS is simple: represent object language binders using meta language binders. Here, %``%#"#object language#"#%''% refers to the language being formalized, while the meta language is the language in which the formalization is done. Our usual meta language, Coq's Gallina, contains the standard binding facilities of functional programming, making it a promising base for higher-order encodings.
Recall the concrete encoding of basic untyped lambda calculus expressions. *)
Inductive uexp : Set :=
| UVar : string -> uexp
| UApp : uexp -> uexp -> uexp
| UAbs : string -> uexp -> uexp.
(** The explicit presence of variable names forces us to think about issues of freshness and variable capture. The HOAS alternative would look like this. *)
Reset uexp.
(** [[
Inductive uexp : Set :=
| UApp : uexp -> uexp -> uexp
| UAbs : (uexp -> uexp) -> uexp.
]]
We have avoided any mention of variables. Instead, we encode the binding done by abstraction using the binding facilities associated with Gallina functions. For instance, we might represent the term $\lambda x. \; x \; x$#<tt>\x. x x</tt># as [UAbs (fun x => UApp x x)]. Coq has built-in support for matching binders in anonymous [fun] expressions to their uses, so we avoid needing to implement our own binder-matching logic.
This definition is not quite HOAS, because of the broad variety of functions that Coq would allow us to pass as arguments to [UAbs]. We can thus construct many [uexp]s that do not correspond to normal lambda terms. These deviants are called %\textit{%#<i>#exotic terms#</i>#%}%. In LF, functions may only be written in a very restrictive computational language, lacking, among other things, pattern matching and recursive function definitions. Thus, thanks to a careful balancing act of design decisions, exotic terms are not possible with usual HOAS encodings in LF.
Our definition of [uexp] has a more fundamental problem: it is invalid in Gallina.
[[
Error: Non strictly positive occurrence of "uexp" in
"(uexp -> uexp) -> uexp".
]]
We have violated a rule that we considered before: an inductive type may not be defined in terms of functions over itself. Way back in Chapter 3, we considered this example and the reasons why we should be glad that Coq rejects it. Thus, we will need to use more cleverness to reap similar benefits.
The root problem is that our expressions contain variables representing expressions of the same kind. Many useful kinds of syntax involve no such cycles. For instance, it is easy to use HOAS to encode standard first-order logic in Coq. *)
Inductive prop : Type :=
| Eq : forall T, T -> T -> prop
| Not : prop -> prop
| And : prop -> prop -> prop
| Or : prop -> prop -> prop
| Forall : forall T, (T -> prop) -> prop
| Exists : forall T, (T -> prop) -> prop.
Fixpoint propDenote (p : prop) : Prop :=
match p with
| Eq _ x y => x = y
| Not p => ~ (propDenote p)
| And p1 p2 => propDenote p1 /\ propDenote p2
| Or p1 p2 => propDenote p1 \/ propDenote p2
| Forall _ f => forall x, propDenote (f x)
| Exists _ f => exists x, propDenote (f x)
end.
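(** For instance (a hedged check, not in the original), [propDenote (Forall (fun n : nat => Eq n n))] reduces to the ordinary proposition [forall x : nat, x = x], which Coq's standard tactics dispatch immediately. *)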
(** Unfortunately, there are other recursive functions that we might like to write but cannot. One simple example is a function to count the number of constructors used to build a [prop]. To look inside a [Forall] or [Exists], we need to look inside the quantifier's body, which is represented as a function. In Gallina, as in most statically-typed functional languages, the only way to interact with a function is to call it. We have no hope of doing that here; the domain of the function in question has an arbitrary type [T], so [T] may even be uninhabited. If we had a universal way of constructing values to look inside functions, we would have uncovered a consistency bug in Coq!
We are still suffering from the possibility of writing exotic terms, such as this example: *)
Example true_prop := Eq 1 1.
Example false_prop := Not true_prop.
Example exotic_prop := Forall (fun b : bool => if b then true_prop else false_prop).
(** Thus, the idea of a uniform way of looking inside a binder to find another well-defined [prop] is hopelessly doomed.
A clever HOAS variant called %\textit{%#<i>#weak HOAS#</i>#%}% manages to rule out exotic terms in Coq. Here is a weak HOAS version of untyped lambda terms. *)
Parameter var : Set.
Inductive uexp : Set :=
| UVar : var -> uexp
| UApp : uexp -> uexp -> uexp
| UAbs : (var -> uexp) -> uexp.
(** We postulate the existence of some set [var] of variables, and variable nodes appear explicitly in our syntax. A binder is represented as a function over %\textit{%#<i>#variables#</i>#%}%, rather than as a function over %\textit{%#<i>#expressions#</i>#%}%. This breaks the cycle that led Coq to reject the literal HOAS definition. It is easy to encode our previous example, $\lambda x. \; x \; x$#<tt>\x. x x</tt>#: *)
Example self_app := UAbs (fun x => UApp (UVar x) (UVar x)).
(** What about exotic terms? The problems they caused earlier came from the fact that Gallina is expressive enough to allow us to perform case analysis on the types we used as the domains of binder functions. With weak HOAS, we use an abstract type [var] as the domain. Since we assume the existence of no functions for deconstructing [var]s, Coq's type soundness enforces that no Gallina term of type [uexp] can take different values depending on the value of a [var] available in the typing context, %\textit{%#<i>#except#</i>#%}% by incorporating those variables into a [uexp] value in a legal way.
Weak HOAS retains the other disadvantage of our previous example: it is hard to write recursive functions that deconstruct terms. As with the previous example, some functions %\textit{%#<i>#are#</i>#%}% implementable. For instance, we can write a function to reverse the function and argument positions of every [UApp] node. *)
Fixpoint swap (e : uexp) : uexp :=
match e with
| UVar _ => e
| UApp e1 e2 => UApp (swap e2) (swap e1)
| UAbs e1 => UAbs (fun x => swap (e1 x))
end.
(** However, it is still impossible to write a function to compute the size of an expression. We would still need to manufacture a value of type [var] to peer under a binder, and that is impossible, because [var] is an abstract type. *)
(** * Parametric HOAS *)
(** In the context of Haskell, Washburn and Weirich introduced a technique called %\emph{%#<i>#parametric HOAS#</i>#%}%, or PHOAS. By making a slight alteration in the spirit of weak HOAS, we arrive at an encoding that addresses all three of the complaints above: the encoding is legal in Coq, exotic terms are impossible, and it is possible to write any syntax-deconstructing function that we can write with first-order encodings. The last of these advantages is not even present with HOAS in Twelf. In a sense, we receive it in exchange for giving up a free implementation of capture-avoiding substitution.
The first step is to change the weak HOAS type so that [var] is a variable inside a section, rather than a global parameter. *)
Reset uexp.
Section uexp.
Variable var : Set.
Inductive uexp : Set :=
| UVar : var -> uexp
| UApp : uexp -> uexp -> uexp
| UAbs : (var -> uexp) -> uexp.
End uexp.
(** Next, we can encapsulate choices of [var] inside a polymorphic function type. *)
Definition Uexp := forall var, uexp var.
(** This type [Uexp] is our final, exotic-term-free representation of lambda terms. Inside the body of a [Uexp] function, [var] values may not be deconstructed illegally, for much the same reason as with weak HOAS. We simply trade an abstract type for parametric polymorphism.
Our running example $\lambda x. \; x \; x$#<tt>\x. x x</tt># is easily expressed: *)
Example self_app : Uexp := fun var => UAbs (var := var)
(fun x : var => UApp (var := var) (UVar (var := var) x) (UVar (var := var) x)).
(** Including all mentions of [var] explicitly helps clarify what is happening here, but it is convenient to let Coq's local type inference fill in these occurrences for us. *)
Example self_app' : Uexp := fun _ => UAbs (fun x => UApp (UVar x) (UVar x)).
(** We can go further and apply the PHOAS technique to dependently-typed ASTs, where Gallina typing guarantees that only well-typed terms can be represented. For the rest of this chapter, we consider the example of simply-typed lambda calculus with natural numbers and addition. We start with a conventional definition of the type language. *)
Inductive type : Type :=
| Nat : type
| Arrow : type -> type -> type.
Infix "-->" := Arrow (right associativity, at level 60).
(** Our definition of the expression type follows the definition for untyped lambda calculus, with one important change. Now our section variable [var] is not just a type. Rather, it is a %\textit{%#<i>#function#</i>#%}% returning types. The idea is that a variable of object language type [t] is represented by a [var t]. Note how this enables us to avoid indexing the [exp] type with a representation of typing contexts. *)
Section exp.
Variable var : type -> Type.
Inductive exp : type -> Type :=
| Const' : nat -> exp Nat
| Plus' : exp Nat -> exp Nat -> exp Nat
| Var : forall t, var t -> exp t
| App' : forall dom ran, exp (dom --> ran) -> exp dom -> exp ran
| Abs' : forall dom ran, (var dom -> exp ran) -> exp (dom --> ran).
End exp.
Implicit Arguments Const' [var].
Implicit Arguments Var [var t].
Implicit Arguments Abs' [var dom ran].
(** Our final representation type wraps [exp] as before. *)
Definition Exp t := forall var, exp var t.
(* begin thide *)
(** We can define some smart constructors to make it easier to build [Exp]s without using polymorphism explicitly. *)
Definition Const (n : nat) : Exp Nat :=
fun _ => Const' n.
Definition Plus (E1 E2 : Exp Nat) : Exp Nat :=
fun _ => Plus' (E1 _) (E2 _).
Definition App dom ran (F : Exp (dom --> ran)) (X : Exp dom) : Exp ran :=
fun _ => App' (F _) (X _).
(** A case for function abstraction is not as natural, but we can implement one candidate in terms of a type family [Exp1], such that [Exp1 free result] represents an expression of type [result] with one free variable of type [free]. *)
Definition Exp1 t1 t2 := forall var, var t1 -> exp var t2.
Definition Abs dom ran (B : Exp1 dom ran) : Exp (dom --> ran) :=
fun _ => Abs' (B _).
(* end thide *)
(* EX: Define appropriate shorthands, so that these definitions type-check. *)
(** Now it is easy to encode a number of example programs. *)
Example zero := Const 0.
Example one := Const 1.
Example one_again := Plus zero one.
Example ident : Exp (Nat --> Nat) := Abs (fun _ X => Var X).
Example app_ident := App ident one_again.
Example app : Exp ((Nat --> Nat) --> Nat --> Nat) := fun _ =>
Abs' (fun f => Abs' (fun x => App' (Var f) (Var x))).
Example app_ident' := App (App app ident) one_again.
(* EX: Define a function to count the number of variable occurrences in an [Exp]. *)
(** We can write syntax-deconstructing functions, such as [CountVars], which counts how many [Var] nodes appear in an [Exp]. First, we write a version [countVars] for [exp]s. The main trick is to specialize [countVars] to work over expressions where [var] is instantiated as [fun _ => unit]. That is, every variable is just a value of type [unit], such that variables carry no information. The important thing is that we have a value [tt] of type [unit] available, to use in descending into binders. *)
(* begin thide *)
Fixpoint countVars t (e : exp (fun _ => unit) t) : nat :=
match e with
| Const' _ => 0
| Plus' e1 e2 => countVars e1 + countVars e2
| Var _ _ => 1
| App' _ _ e1 e2 => countVars e1 + countVars e2
| Abs' _ _ e' => countVars (e' tt)
end.
(** We turn [countVars] into [CountVars] with explicit instantiation of a polymorphic [Exp] value [E]. We can write an underscore for the parameter to [E], because local type inference is able to infer the proper value. *)
Definition CountVars t (E : Exp t) : nat := countVars (E _).
(* end thide *)
(** A few evaluations establish that [CountVars] behaves plausibly. *)
Eval compute in CountVars zero.
(** %\vspace{-.15in}% [[
= 0
: nat
]]
*)
Eval compute in CountVars one.
(** %\vspace{-.15in}% [[
= 0
: nat
]]
*)
Eval compute in CountVars one_again.
(** %\vspace{-.15in}% [[
= 0
: nat
]]
*)
Eval compute in CountVars ident.
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in CountVars app_ident.
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in CountVars app.
(** %\vspace{-.15in}% [[
= 2
: nat
]]
*)
Eval compute in CountVars app_ident'.
(** %\vspace{-.15in}% [[
= 3
: nat
]]
*)
(* EX: Define a function to count the number of occurrences of a single distinguished variable. *)
(** We might want to go further and count occurrences of a single distinguished free variable in an expression. In this case, it is useful to instantiate [var] as [fun _ => bool]. We will represent the distinguished variable with [true] and all other variables with [false]. *)
(* begin thide *)
Fixpoint countOne t (e : exp (fun _ => bool) t) : nat :=
match e with
| Const' _ => 0
| Plus' e1 e2 => countOne e1 + countOne e2
| Var _ true => 1
| Var _ false => 0
| App' _ _ e1 e2 => countOne e1 + countOne e2
| Abs' _ _ e' => countOne (e' false)
end.
(** We wrap [countOne] into [CountOne], which we type using the [Exp1] definition from before. [CountOne] operates on an expression [E] with a single free variable. We apply an instantiated [E] to [true] to mark this variable as the one [countOne] should look for. [countOne] itself is careful to instantiate all other variables with [false]. *)
Definition CountOne t1 t2 (E : Exp1 t1 t2) : nat :=
countOne (E _ true).
(* end thide *)
(** We can check the behavior of [CountOne] on a few examples. *)
Example ident1 : Exp1 Nat Nat := fun _ X => Var X.
Example add_self : Exp1 Nat Nat := fun _ X => Plus' (Var X) (Var X).
Example app_zero : Exp1 (Nat --> Nat) Nat := fun _ X => App' (Var X) (Const' 0).
Example app_ident1 : Exp1 Nat Nat := fun _ X =>
App' (Abs' (fun Y => Var Y)) (Var X).
Eval compute in CountOne ident1.
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in CountOne add_self.
(** %\vspace{-.15in}% [[
= 2
: nat
]]
*)
Eval compute in CountOne app_zero.
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
Eval compute in CountOne app_ident1.
(** %\vspace{-.15in}% [[
= 1
: nat
]]
*)
(* EX: Define a function to pretty-print [Exp]s as strings. *)
(** The PHOAS encoding turns out to be just as general as the first-order encodings we saw previously. To provide a taste of that generality, we implement a translation into concrete syntax, rendered in human-readable strings. This is as easy as representing variables as strings. *)
(* begin thide *)
Section ToString.
Open Scope string_scope.
Fixpoint natToString (n : nat) : string :=
match n with
| O => "O"
| S n' => "S(" ++ natToString n' ++ ")"
end.
(** Function [toString] takes an extra argument [cur], which holds the next variable name to assign to a binder. We build further variable names by extending [cur] with primes. The function returns a pair of the next available variable name and the actual expression rendering. *)
Fixpoint toString t (e : exp (fun _ => string) t) (cur : string) : string * string :=
match e with
| Const' n => (cur, natToString n)
| Plus' e1 e2 =>
let (cur', s1) := toString e1 cur in
let (cur'', s2) := toString e2 cur' in
(cur'', "(" ++ s1 ++ ") + (" ++ s2 ++ ")")
| Var _ s => (cur, s)
| App' _ _ e1 e2 =>
let (cur', s1) := toString e1 cur in
let (cur'', s2) := toString e2 cur' in
(cur'', "(" ++ s1 ++ ") (" ++ s2 ++ ")")
| Abs' _ _ e' =>
let (cur', s) := toString (e' cur) (cur ++ "'") in
(cur', "(\" ++ cur ++ ", " ++ s ++ ")")
end.
Definition ToString t (E : Exp t) : string := snd (toString (E _) "x").
End ToString.
(* end thide *)
Eval compute in ToString zero.
(** %\vspace{-.15in}% [[
= "O"%string
: string
]]
*)
Eval compute in ToString one.
(** %\vspace{-.15in}% [[
= "S(O)"%string
: string
]]
*)
Eval compute in ToString one_again.
(** %\vspace{-.15in}% [[
= "(O) + (S(O))"%string
: string
]]
*)
Eval compute in ToString ident.
(** %\vspace{-.15in}% [[
= "(\x, x)"%string
: string
]]
*)
Eval compute in ToString app_ident.
(** %\vspace{-.15in}% [[
= "((\x, x)) ((O) + (S(O)))"%string
: string
]]
*)
Eval compute in ToString app.
(** %\vspace{-.15in}% [[
= "(\x, (\x', (x) (x')))"%string
: string
]]
*)
Eval compute in ToString app_ident'.
(** %\vspace{-.15in}% [[
= "(((\x, (\x', (x) (x')))) ((\x'', x''))) ((O) + (S(O)))"%string
: string
]]
*)
(* EX: Define a substitution function. *)
(** Our final example is crucial to using PHOAS to encode standard operational semantics. We define capture-avoiding substitution in terms of a function [flatten], which takes in an expression whose variables are themselves represented as expressions; [flatten] replaces every node [Var e] with [e]. *)
(* begin thide *)
Section flatten.
Variable var : type -> Type.
Fixpoint flatten t (e : exp (exp var) t) : exp var t :=
match e with
| Const' n => Const' n
| Plus' e1 e2 => Plus' (flatten e1) (flatten e2)
| Var _ e' => e'
| App' _ _ e1 e2 => App' (flatten e1) (flatten e2)
| Abs' _ _ e' => Abs' (fun x => flatten (e' (Var x)))
end.
End flatten.
(** Flattening turns out to implement the heart of substitution. We apply [E2], which has one free variable, to [E1], replacing the occurrences of the free variable by copies of [E1]. [flatten] takes care of removing the extra [Var] applications around these copies. *)
Definition Subst t1 t2 (E1 : Exp t1) (E2 : Exp1 t1 t2) : Exp t2 := fun _ =>
flatten (E2 _ (E1 _)).
(* end thide *)
Eval compute in Subst one ident1.
(** %\vspace{-.15in}% [[
= fun var : type -> Type => Const' 1
: Exp Nat
]]
*)
Eval compute in Subst one add_self.
(** %\vspace{-.15in}% [[
= fun var : type -> Type => Plus' (Const' 1) (Const' 1)
: Exp Nat
]]
*)
Eval compute in Subst ident app_zero.
(** %\vspace{-.15in}% [[
= fun var : type -> Type =>
App' (Abs' (fun X : var Nat => Var X)) (Const' 0)
: Exp Nat
]]
*)
Eval compute in Subst one app_ident1.
(** %\vspace{-.15in}% [[
= fun var : type -> Type =>
App' (Abs' (fun x : var Nat => Var x)) (Const' 1)
: Exp Nat
]]
*)
(** * A Type Soundness Proof *)
(** With [Subst] defined, there are few surprises encountered in defining a standard small-step, call-by-value semantics for our object language. We begin by classifying a subset of expressions as values. *)
Inductive Val : forall t, Exp t -> Prop :=
| VConst : forall n, Val (Const n)
| VAbs : forall dom ran (B : Exp1 dom ran), Val (Abs B).
Hint Constructors Val.
(** Since this language is more complicated than the one we considered in the chapter on first-order encodings, we will use explicit evaluation contexts to define the semantics. A value of type [Ctx t u] is a context that yields an expression of type [u] when filled by an expression of type [t]. We have one context for each position of the [App] and [Plus] constructors. *)
Inductive Ctx : type -> type -> Type :=
| AppCong1 : forall (dom ran : type),
Exp dom -> Ctx (dom --> ran) ran
| AppCong2 : forall (dom ran : type),
Exp (dom --> ran) -> Ctx dom ran
| PlusCong1 : Exp Nat -> Ctx Nat Nat
| PlusCong2 : Exp Nat -> Ctx Nat Nat.
(** A judgment characterizes when contexts are valid, enforcing the standard call-by-value restriction that certain positions must hold values. *)
Inductive isCtx : forall t1 t2, Ctx t1 t2 -> Prop :=
| IsApp1 : forall dom ran (X : Exp dom), isCtx (AppCong1 ran X)
| IsApp2 : forall dom ran (F : Exp (dom --> ran)), Val F -> isCtx (AppCong2 F)
| IsPlus1 : forall E2, isCtx (PlusCong1 E2)
| IsPlus2 : forall E1, Val E1 -> isCtx (PlusCong2 E1).
(** A simple definition implements plugging a context with a specific expression. *)
Definition plug t1 t2 (C : Ctx t1 t2) : Exp t1 -> Exp t2 :=
match C with
| AppCong1 _ _ X => fun F => App F X
| AppCong2 _ _ F => fun X => App F X
| PlusCong1 E2 => fun E1 => Plus E1 E2
| PlusCong2 E1 => fun E2 => Plus E1 E2
end.
Infix "@" := plug (no associativity, at level 60).
(** Finally, we have the step relation itself, which combines our ingredients in the standard way. In the congruence rule, we introduce the extra variable [E1] and its associated equality to make the rule easier for [eauto] to apply. *)
(** printing ==> $\Rightarrow$ *)
Reserved Notation "E1 ==> E2" (no associativity, at level 90).
Inductive Step : forall t, Exp t -> Exp t -> Prop :=
| Beta : forall dom ran (B : Exp1 dom ran) (X : Exp dom),
Val X
-> App (Abs B) X ==> Subst X B
| Sum : forall n1 n2,
Plus (Const n1) (Const n2) ==> Const (n1 + n2)
| Cong : forall t t' (C : Ctx t t') E E' E1,
isCtx C
-> E1 = C @ E
-> E ==> E'
-> E1 ==> C @ E'
where "E1 ==> E2" := (Step E1 E2).
Hint Constructors isCtx Step.
(* EX: Prove type soundness. *)
(** To prove type soundness for this semantics, we need to overcome one crucial obstacle. Standard proofs use induction on the structure of typing derivations. Our encoding mixes typing derivations with expression syntax, so we want to induct over expression structure. Our expressions are represented as functions, which do not, in general, admit induction in Coq. However, because of our use of parametric polymorphism, we know that our expressions do, in fact, have inductive structure. In particular, every closed value of [Exp] type must belong to the following relation. *)
(* begin thide *)
Inductive Closed : forall t, Exp t -> Prop :=
| CConst : forall n,
Closed (Const n)
| CPlus : forall E1 E2,
Closed E1
-> Closed E2
-> Closed (Plus E1 E2)
| CApp : forall dom ran (E1 : Exp (dom --> ran)) E2,
Closed E1
-> Closed E2
-> Closed (App E1 E2)
| CAbs : forall dom ran (E1 : Exp1 dom ran),
Closed (Abs E1).
(** How can we prove such a fact? It probably cannot be established in Coq without axioms. Rather, one would have to establish it metatheoretically, reasoning informally outside of Coq. For now, we assert the fact as an axiom. The later chapter on intensional transformations shows one approach to removing the need for an axiom. *)
Axiom closed : forall t (E : Exp t), Closed E.
(** The usual progress and preservation theorems are now very easy to prove. In fact, preservation is implicit in our dependently-typed definition of [Step]. This is a huge win, because we avoid completely the theorem about substitution and typing that made up the bulk of each proof in the chapter on first-order encodings. The progress theorem yields to a few lines of automation.
We define a slight variant of [crush] which also looks for chances to use the theorem [inj_pair2] on hypotheses. This theorem deals with an artifact of the way that [inversion] works on dependently-typed hypotheses. *)
Ltac my_crush' :=
crush;
repeat (match goal with
| [ H : _ |- _ ] => generalize (inj_pair2 _ _ _ _ _ H); clear H
end; crush).
Hint Extern 1 (_ = _ @ _) => simpl.
(** This is the point where we need to do induction over functions, in the form of expressions [E]. The judgment [Closed] provides the perfect framework; we induct over [Closed] derivations. *)
Lemma progress' : forall t (E : Exp t),
Closed E
-> Val E \/ exists E', E ==> E'.
induction 1; crush;
repeat match goal with
| [ H : Val _ |- _ ] => inversion H; []; clear H; my_crush'
end; eauto 6.
Qed.
(** Our final proof of progress makes one top-level use of the axiom [closed] that we asserted above. *)
Theorem progress : forall t (E : Exp t),
Val E \/ exists E', E ==> E'.
intros; apply progress'; apply closed.
Qed.
(* end thide *)
(** * Big-Step Semantics *)
(** Another standard exercise in operational semantics is proving the equivalence of small-step and big-step semantics. We can carry out this exercise for our PHOAS lambda calculus. Most of the steps are just as pleasant as in the previous section, but things get complicated near the end.
We must start by defining the big-step semantics itself. The definition is completely standard. *)
(** printing ===> $\Longrightarrow$ *)
Reserved Notation "E1 ===> E2" (no associativity, at level 90).
Inductive BigStep : forall t, Exp t -> Exp t -> Prop :=
| SConst : forall n,
Const n ===> Const n
| SPlus : forall E1 E2 n1 n2,
E1 ===> Const n1
-> E2 ===> Const n2
-> Plus E1 E2 ===> Const (n1 + n2)
| SApp : forall dom ran (E1 : Exp (dom --> ran)) E2 B V2 V,
E1 ===> Abs B
-> E2 ===> V2
-> Subst V2 B ===> V
-> App E1 E2 ===> V
| SAbs : forall dom ran (B : Exp1 dom ran),
Abs B ===> Abs B
where "E1 ===> E2" := (BigStep E1 E2).
Hint Constructors BigStep.
(* EX: Prove the equivalence of the small- and big-step semantics. *)
(** To prove a crucial intermediate lemma, we will want to name the transitive-reflexive closure of the small-step relation. *)
(* begin thide *)
(** printing ==>* $\Rightarrow^*$ *)
Reserved Notation "E1 ==>* E2" (no associativity, at level 90).
Inductive MultiStep : forall t, Exp t -> Exp t -> Prop :=
| Done : forall t (E : Exp t), E ==>* E
| OneStep : forall t (E E' E'' : Exp t),
E ==> E'
-> E' ==>* E''
-> E ==>* E''
where "E1 ==>* E2" := (MultiStep E1 E2).
Hint Constructors MultiStep.
(** A few basic properties of evaluation and values admit easy proofs. *)
Theorem MultiStep_trans : forall t (E1 E2 E3 : Exp t),
E1 ==>* E2
-> E2 ==>* E3
-> E1 ==>* E3.
induction 1; eauto.
Qed.
Theorem Big_Val : forall t (E V : Exp t),
E ===> V
-> Val V.
induction 1; crush.
Qed.
Theorem Val_Big : forall t (V : Exp t),
Val V
-> V ===> V.
destruct 1; crush.
Qed.
Hint Resolve Big_Val Val_Big.
(** Another useful property deals with pushing multi-step evaluation inside of contexts. *)
Lemma Multi_Cong : forall t t' (C : Ctx t t'),
isCtx C
-> forall E E', E ==>* E'
-> C @ E ==>* C @ E'.
induction 2; crush; eauto.
Qed.
Lemma Multi_Cong' : forall t t' (C : Ctx t t') E1 E2 E E',
isCtx C
-> E1 = C @ E
-> E2 = C @ E'
-> E ==>* E'
-> E1 ==>* E2.
crush; apply Multi_Cong; auto.
Qed.
Hint Resolve Multi_Cong'.
(** Unrestricted use of transitivity of [==>*] can lead to very large [eauto] search spaces, with very inconvenient efficiency consequences. Instead, we define a special tactic [mtrans] that tries applying transitivity with a particular intermediate expression. *)
Ltac mtrans E :=
match goal with
| [ |- E ==>* _ ] => fail 1
| _ => apply MultiStep_trans with E; [ solve [ eauto ] | eauto ]
end.
(** With [mtrans], we can give a reasonably short proof of one direction of the equivalence between big-step and small-step semantics. We include proof cases specific to rules of the big-step semantics, since leaving the details to [eauto] would lead to a very slow proof script. The use of [solve] in [mtrans]'s definition keeps us from going down unfruitful paths. *)
Theorem Big_Multi : forall t (E V : Exp t),
E ===> V
-> E ==>* V.
induction 1; crush; eauto;
repeat match goal with
| [ n1 : _, E2 : _ |- _ ] => mtrans (Plus (Const n1) E2)
| [ n1 : _, n2 : _ |- _ ] => mtrans (Plus (Const n1) (Const n2))
| [ B : _, E2 : _ |- _ ] => mtrans (App (Abs B) E2)
end.
Qed.
(** We are almost ready to prove the other direction of the equivalence. First, we wrap an earlier lemma in a form that will work better with [eauto]. *)
Lemma Big_Val' : forall t (V1 V2 : Exp t),
Val V2
-> V1 = V2
-> V1 ===> V2.
crush.
Qed.
Hint Resolve Big_Val'.
(** Now we build some quite involved tactic support for reasoning about equalities over PHOAS terms. First, we will call [equate_conj F G] to determine the consequences of an equality [F = G]. When [F = f e_1 ... e_n] and [G = f e'_1 ... e'_n], [equate_conj] will return a conjunction [e_1 = e'_1 /\ ... /\ e_n = e'_n]. We hardcode a pattern for each value of [n] from 1 to 5. *)
Ltac equate_conj F G :=
match constr:(F, G) with
| (_ ?x1, _ ?x2) => constr:(x1 = x2)
| (_ ?x1 ?y1, _ ?x2 ?y2) => constr:(x1 = x2 /\ y1 = y2)
| (_ ?x1 ?y1 ?z1, _ ?x2 ?y2 ?z2) => constr:(x1 = x2 /\ y1 = y2 /\ z1 = z2)
| (_ ?x1 ?y1 ?z1 ?u1, _ ?x2 ?y2 ?z2 ?u2) =>
constr:(x1 = x2 /\ y1 = y2 /\ z1 = z2 /\ u1 = u2)
| (_ ?x1 ?y1 ?z1 ?u1 ?v1, _ ?x2 ?y2 ?z2 ?u2 ?v2) =>
constr:(x1 = x2 /\ y1 = y2 /\ z1 = z2 /\ u1 = u2 /\ v1 = v2)
end.
(** The main tactic is [my_crush], which generalizes our earlier [my_crush'] by performing inversion on hypotheses that equate PHOAS terms. Coq's built-in [inversion] is only designed to be useful on equalities over inductive types. PHOAS terms are functions, so [inversion] is not very helpful on them. To perform the equivalent of [discriminate], we instantiate the terms with [var] as [fun _ => unit] and then appeal to normal [discriminate]. This eliminates some contradictory cases. To perform the equivalent of [injection], we must consider all possible [var] instantiations. Some fairly intricate logic strings together these elements. The details are not worth discussing, since our conclusion will be that one should avoid dealing with proofs of facts like this one. *)
Ltac my_crush :=
my_crush';
repeat (match goal with
| [ H : ?F = ?G |- _ ] =>
(let H' := fresh "H'" in
assert (H' : F (fun _ => unit) = G (fun _ => unit)); [ congruence
| discriminate || injection H'; clear H' ];
my_crush';
repeat match goal with
| [ H : context[fun _ => unit] |- _ ] => clear H
end;
match type of H with
| ?F = ?G =>
let ec := equate_conj F G in
let var := fresh "var" in
assert ec; [ intuition; unfold Exp; extensionality var;
assert (H' : F var = G var); try congruence;
match type of H' with
| ?X = ?Y =>
let X := eval hnf in X in
let Y := eval hnf in Y in
change (X = Y) in H'
end; injection H'; my_crush'; tauto
| intuition; subst ]
end);
clear H
end; my_crush');
my_crush'.
(** With that complicated tactic available, the proof of the main lemma is straightforward. *)
Lemma Multi_Big' : forall t (E E' : Exp t),
E ==> E'
-> forall E'', E' ===> E''
-> E ===> E''.
induction 1; crush; eauto;
match goal with
| [ H : _ ===> _ |- _ ] => inversion H; my_crush; eauto
end;
match goal with
| [ H : isCtx _ |- _ ] => inversion H; my_crush; eauto
end.
Qed.
Hint Resolve Multi_Big'.
(** The other direction of the overall equivalence follows as an easy corollary. *)
Theorem Multi_Big : forall t (E V : Exp t),
E ==>* V
-> Val V
-> E ===> V.
induction 1; crush; eauto.
Qed.
(* end thide *)
(** The lesson here is that working directly with PHOAS terms can easily lead to extremely intricate proofs. It is usually a better idea to stick to inductive proofs about %\textit{%#<i>#instantiated#</i>#%}% PHOAS terms; in the case of this example, that means proofs about [exp] instead of [Exp]. Such results can usually be wrapped into results about [Exp] without further induction. Different theorems demand different variants of this underlying advice, and we will consider several of them in the chapters to come. *)
(* Copyright (c) 2008-2009, 2011, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Require Import Arith Eqdep List FunctionalExtensionality.
Require Import DepList CpdtTactics.
Set Implicit Arguments.
(* end hide *)
(** %\chapter{Intensional Transformations}% *)
(** The essential benefit of higher-order encodings is that variable contexts are implicit. We represent the object language's contexts within the meta language's contexts. For translations like CPS conversion, this is a clear win, since such translations do not need to keep track of details of which variables are available. Other important translations, including closure conversion, need to work with variables as first-class, analyzable values.
Another example is conversion from PHOAS terms to de Bruijn terms. The output format makes the structure of variables explicit, so the translation requires explicit reasoning about variable identity. In this chapter, we implement verified translations in both directions between last chapter's PHOAS language and a de Bruijn version of it. Along the way, we show one approach to avoiding the use of axioms with PHOAS. *)
(* begin hide *)
Inductive type : Type :=
| Nat : type
| Arrow : type -> type -> type.
Infix "-->" := Arrow (right associativity, at level 60).
Fixpoint typeDenote (t : type) : Set :=
match t with
| Nat => nat
| t1 --> t2 => typeDenote t1 -> typeDenote t2
end.
Module Phoas.
Section vars.
Variable var : type -> Type.
Inductive exp : type -> Type :=
| Var : forall t,
var t
-> exp t
| Const : nat -> exp Nat
| Plus : exp Nat -> exp Nat -> exp Nat
| App : forall t1 t2,
exp (t1 --> t2)
-> exp t1
-> exp t2
| Abs : forall t1 t2,
(var t1 -> exp t2)
-> exp (t1 --> t2).
End vars.
Definition Exp t := forall var, exp var t.
Implicit Arguments Var [var t].
Implicit Arguments Const [var].
Implicit Arguments Plus [var].
Implicit Arguments App [var t1 t2].
Implicit Arguments Abs [var t1 t2].
Notation "# v" := (Var v) (at level 70).
Notation "^ n" := (Const n) (at level 70).
Infix "+^" := Plus (left associativity, at level 79).
Infix "@" := App (left associativity, at level 77).
Notation "\ x , e" := (Abs (fun x => e)) (at level 78).
Notation "\ ! , e" := (Abs (fun _ => e)) (at level 78).
Fixpoint expDenote t (e : exp typeDenote t) : typeDenote t :=
match e with
| Var _ v => v
| Const n => n
| Plus e1 e2 => expDenote e1 + expDenote e2
| App _ _ e1 e2 => (expDenote e1) (expDenote e2)
| Abs _ _ e' => fun x => expDenote (e' x)
end.
Definition ExpDenote t (e : Exp t) := expDenote (e _).
Section exp_equiv.
Variables var1 var2 : type -> Type.
Inductive exp_equiv : list { t : type & var1 t * var2 t }%type
-> forall t, exp var1 t -> exp var2 t -> Prop :=
| EqVar : forall G t (v1 : var1 t) v2,
In (existT _ t (v1, v2)) G
-> exp_equiv G (#v1) (#v2)
| EqConst : forall G n,
exp_equiv G (^n) (^n)
| EqPlus : forall G x1 y1 x2 y2,
exp_equiv G x1 x2
-> exp_equiv G y1 y2
-> exp_equiv G (x1 +^ y1) (x2 +^ y2)
| EqApp : forall G t1 t2 (f1 : exp _ (t1 --> t2)) (x1 : exp _ t1) f2 x2,
exp_equiv G f1 f2
-> exp_equiv G x1 x2
-> exp_equiv G (f1 @ x1) (f2 @ x2)
| EqAbs : forall G t1 t2 (f1 : var1 t1 -> exp var1 t2) f2,
(forall v1 v2, exp_equiv (existT _ t1 (v1, v2) :: G) (f1 v1) (f2 v2))
-> exp_equiv G (Abs f1) (Abs f2).
End exp_equiv.
Definition Wf t (E : Exp t) := forall var1 var2, exp_equiv nil (E var1) (E var2).
End Phoas.
(* end hide *)
(** The de Bruijn version of simply-typed lambda calculus is defined in a manner that should be old hat by now. *)
Module DeBruijn.
Inductive exp : list type -> type -> Type :=
| Var : forall G t,
member t G
-> exp G t
| Const : forall G, nat -> exp G Nat
| Plus : forall G, exp G Nat -> exp G Nat -> exp G Nat
| App : forall G t1 t2,
exp G (t1 --> t2)
-> exp G t1
-> exp G t2
| Abs : forall G t1 t2,
exp (t1 :: G) t2
-> exp G (t1 --> t2).
Implicit Arguments Const [G].
Fixpoint expDenote G t (e : exp G t) : hlist typeDenote G -> typeDenote t :=
match e with
| Var _ _ v => fun s => hget s v
| Const _ n => fun _ => n
| Plus _ e1 e2 => fun s => expDenote e1 s + expDenote e2 s
| App _ _ _ e1 e2 => fun s => (expDenote e1 s) (expDenote e2 s)
| Abs _ _ _ e' => fun s x => expDenote e' (x ::: s)
end.
End DeBruijn.
Import Phoas DeBruijn.
(** * From De Bruijn to PHOAS *)
(** The heart of the translation into PHOAS is this function [phoasify], which is parameterized by an [hlist] that represents a mapping from de Bruijn variables to PHOAS variables. *)
Section phoasify.
Variable var : type -> Type.
Fixpoint phoasify G t (e : DeBruijn.exp G t) : hlist var G -> Phoas.exp var t :=
match e with
| Var _ _ v => fun s => #(hget s v)
| Const _ n => fun _ => ^n
| Plus _ e1 e2 => fun s => phoasify e1 s +^ phoasify e2 s
| App _ _ _ e1 e2 => fun s => phoasify e1 s @ phoasify e2 s
| Abs _ _ _ e' => fun s => \x, phoasify e' (x ::: s)
end.
End phoasify.
Definition Phoasify t (e : DeBruijn.exp nil t) : Phoas.Exp t :=
fun _ => phoasify e HNil.
(** It turns out to be trivial to establish the translation's soundness. *)
Theorem phoasify_sound : forall G t (e : DeBruijn.exp G t) s,
Phoas.expDenote (phoasify e s) = DeBruijn.expDenote e s.
induction e; crush; (let x := fresh in extensionality x); crush.
Qed.
(** We can prove that any output of [Phoasify] is well-formed, in a sense strong enough to let us avoid asserting last chapter's axiom. *)
Print Wf.
(** %\vspace{-.15in}% [[
Wf =
fun (t : type) (E : Exp t) =>
forall var1 var2 : type -> Type, exp_equiv nil (E var1) (E var2)
: forall t : type, Exp t -> Prop
]]
*)
Section vars.
Variables var1 var2 : type -> Type.
(** In the course of proving well-formedness, we will need to translate back and forth between the de Bruijn and PHOAS representations of free variable information. The function [zip] combines two de Bruijn substitutions into a single PHOAS context. *)
Fixpoint zip G (s1 : hlist var1 G)
: hlist var2 G -> list {t : type & var1 t * var2 t}%type :=
match s1 with
| HNil => fun _ => nil
| HCons _ _ v1 s1' => fun s2 => existT _ _ (v1, hhd s2) :: zip s1' (htl s2)
end.
(** Two simple lemmas about [zip] will make useful hints. *)
Lemma In_zip : forall t G (s1 : hlist _ G) s2 (m : member t G),
In (existT _ t (hget s1 m, hget s2 m)) (zip s1 s2).
induction s1; intro s2; dep_destruct s2; intro m; dep_destruct m; crush.
Qed.
Lemma unsimpl_zip : forall t (v1 : var1 t) (v2 : var2 t)
G (s1 : hlist _ G) s2 t' (e1 : Phoas.exp _ t') e2,
exp_equiv (zip (v1 ::: s1) (v2 ::: s2)) e1 e2
-> exp_equiv (existT _ _ (v1, v2) :: zip s1 s2) e1 e2.
trivial.
Qed.
Hint Resolve In_zip unsimpl_zip.
(** Now it is trivial to prove the main inductive lemma about well-formedness. *)
Lemma phoasify_wf : forall G t (e : DeBruijn.exp G t) s1 s2,
exp_equiv (zip s1 s2) (phoasify e s1) (phoasify e s2).
Hint Constructors exp_equiv.
induction e; crush.
Qed.
End vars.
(** We apply [phoasify_wf] manually to prove the final theorem. *)
Theorem Phoasify_wf : forall t (e : DeBruijn.exp nil t),
Wf (Phoasify e).
unfold Wf, Phoasify; intros;
apply (phoasify_wf e (HNil (B := var1)) (HNil (B := var2))).
Qed.
(** Now, if we compose [Phoasify] with any translation over PHOAS terms, we can verify the composed translation without relying on axioms. The conclusion of [Phoasify_wf] is robustly useful in verifying a wide variety of translations that use a wide variety of [var] instantiations. *)
(** * From PHOAS to De Bruijn *)
(** The translation to de Bruijn terms is more involved. We will essentially be instantiating and using a PHOAS term following a convention isomorphic to %\textit{%#<i>#de Bruijn levels#</i>#%}%, which are different from the de Bruijn indices that we have treated so far. With levels, a given bound variable is referred to by the same number at each of its occurrences. In any expression, the binders that are not enclosed by other binders are assigned level 0, a binder with just one enclosing binder is assigned level 1, and so on. The uniformity of references to any binder will be critical to our translation, since it is compatible with the pattern of filling in all of a PHOAS variable's locations at once by applying a function.
We implement a special lookup function, for reading a numbered variable's type out of a de Bruijn level typing context. The last variable in the list is taken to have level 0, the next-to-last level 1, and so on. *)
Fixpoint lookup (ts : list type) (n : nat) : option type :=
match ts with
| nil => None
| t :: ts' => if eq_nat_dec n (length ts') then Some t else lookup ts' n
end.
Infix "##" := lookup (left associativity, at level 1).
(** With [lookup], we can define a notion of well-formedness for PHOAS expressions that we are treating according to the de Bruijn level convention. *)
Fixpoint wf (ts : list type) t (e : Phoas.exp (fun _ => nat) t) : Prop :=
match e with
| Phoas.Var t n => ts ## n = Some t
| Phoas.Const _ => True
| Phoas.Plus e1 e2 => wf ts e1 /\ wf ts e2
| Phoas.App _ _ e1 e2 => wf ts e1 /\ wf ts e2
| Phoas.Abs t1 _ e1 => wf (t1 :: ts) (e1 (length ts))
end.
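(** For instance (a sketch, written with last chapter's notations), take the identity function [\x, #x] at type [Nat --> Nat], instantiated with [var := fun _ => nat]. Checking [wf nil] of it unfolds to
[[
    wf (Nat :: nil) (#(length nil))
]]
which reduces to the equation [(Nat :: nil) ## 0 = Some Nat]: the lone bound variable receives level 0, and level 0 is indeed bound, at type [Nat], in the extended context. *)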
(** ** Connecting Notions of Well-Formedness *)
(** Our first order of business now is to prove that any well-formed [Exp] instantiates to a well-formed de Bruijn level expression. We start by characterizing, as a function of de Bruijn level contexts, the set of PHOAS contexts that will occur in the proof, where we will be inducting over an [exp_equiv] derivation. *)
Fixpoint makeG (ts : list type) : list { t : type & nat * nat }%type :=
match ts with
| nil => nil
| t :: ts' => existT _ t (length ts', length ts') :: makeG ts'
end.
(** Now we prove a connection between [lookup] and [makeG], by way of a lemma about [lookup]. *)
Opaque eq_nat_dec.
Hint Extern 1 (_ >= _) => omega.
Lemma lookup_contra' : forall t ts n,
ts ## n = Some t
-> n >= length ts
-> False.
induction ts; crush;
match goal with
| [ _ : context[if ?E then _ else _] |- _ ] => destruct E; crush
end; eauto.
Qed.
Lemma lookup_contra : forall t ts,
ts ## (length ts) = Some t
-> False.
intros; eapply lookup_contra'; eauto.
Qed.
Hint Resolve lookup_contra.
Lemma lookup_In : forall t v1 v2 ts,
In (existT (fun _ : type => (nat * nat)%type) t (v1, v2)) (makeG ts)
-> ts ## v1 = Some t.
induction ts; crush;
match goal with
| [ |- context[if ?E then _ else _] ] => destruct E; crush
end; elimtype False; eauto.
Qed.
Hint Resolve lookup_In.
(** We can prove the main inductive lemma by induction over [exp_equiv] derivations. *)
Hint Extern 1 (_ :: _ = makeG (_ :: _)) => reflexivity.
Lemma Wf_wf' : forall G t e1 (e2 : Phoas.exp (fun _ => nat) t),
exp_equiv G e1 e2
-> forall ts, G = makeG ts
-> wf ts e1.
induction 1; crush; eauto.
Qed.
Lemma Wf_wf : forall t (E : Exp t),
Wf E
-> wf nil (E (fun _ => nat)).
intros; eapply Wf_wf'; eauto.
Qed.
(** ** The Translation *)
(** Implementing the translation itself will require some proofs. Our main function [dbify] will take [wf] proofs as arguments, and these proofs will be critical to constructing de Bruijn index terms. First, we use [congruence] to prove two basic theorems about [option]s. *)
Theorem None_Some : forall T (x : T),
None = Some x
-> False.
congruence.
Qed.
Theorem Some_Some : forall T (x y : T),
Some x = Some y
-> x = y.
congruence.
Qed.
(** We can use these theorems to implement [makeVar], which translates a proof about [lookup] into a de Bruijn index variable with a closely related type. *)
Fixpoint makeVar {ts n t} : ts ## n = Some t -> member t ts :=
match ts with
| nil => fun Heq => match None_Some Heq with end
| t' :: ts' => if eq_nat_dec n (length ts') as b return (if b then _ else _) = _ -> _
then fun Heq => match Some_Some Heq with refl_equal => HFirst end
else fun Heq => HNext (makeVar Heq)
end.
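(** For example (a hypothetical instance), a proof [w : (Nat :: Nat :: nil) ## 1 = Some Nat] is mapped to
[[
    makeVar w = HFirst
]]
since level 1 names the most recently bound variable in a two-element context, which corresponds to the first position of the de Bruijn context. *)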
(** Now [dbify] is straightforward to define. We use the functions [proj1] and [proj2] to decompose proofs of conjunctions. *)
Fixpoint dbify {ts} t (e : Phoas.exp (fun _ => nat) t) : wf ts e -> DeBruijn.exp ts t :=
match e in Phoas.exp _ t return wf ts e -> DeBruijn.exp ts t with
| Phoas.Var _ n => fun wf => DeBruijn.Var (makeVar wf)
| Phoas.Const n => fun _ => DeBruijn.Const n
| Phoas.Plus e1 e2 => fun wf =>
DeBruijn.Plus (dbify e1 (proj1 wf)) (dbify e2 (proj2 wf))
| Phoas.App _ _ e1 e2 => fun wf =>
DeBruijn.App (dbify e1 (proj1 wf)) (dbify e2 (proj2 wf))
| Phoas.Abs _ _ e1 => fun wf => DeBruijn.Abs (dbify (e1 (length ts)) wf)
end.
(** We define the parametric translation [Dbify] by appealing to the well-formedness translation theorem [Wf_wf] that we proved earlier. *)
Definition Dbify t (E : Phoas.Exp t) (W : Wf E) : DeBruijn.exp nil t :=
dbify (E _) (Wf_wf W).
(** To prove soundness, it is helpful to characterize, as a function of a de Bruijn index substitution, the set of PHOAS contexts that will occur in the proof. *)
Fixpoint makeG' ts (s : hlist typeDenote ts)
: list { t : type & nat * typeDenote t }%type :=
match s with
| HNil => nil
| HCons _ ts' v s' => existT _ _ (length ts', v) :: makeG' s'
end.
(** We prove an analogous lemma to the one we proved connecting [makeG] and [lookup]. This time, we connect [makeG'] and [hget]. *)
Lemma In_makeG'_contra' : forall t v2 ts (s : hlist _ ts) n,
In (existT _ t (n, v2)) (makeG' s)
-> n >= length ts
-> False.
induction s; crush; eauto.
Qed.
Lemma In_makeG'_contra : forall t v2 ts (s : hlist _ ts),
In (existT _ t (length ts, v2)) (makeG' s)
-> False.
intros; eapply In_makeG'_contra'; eauto.
Qed.
Hint Resolve In_makeG'_contra.
Lemma In_makeG' : forall t v1 v2 ts s (w : ts ## v1 = Some t),
In (existT _ t (v1, v2)) (makeG' s)
-> hget s (makeVar w) = v2.
induction s; crush;
match goal with
| [ |- context[if ?E then _ else _] ] => destruct E; crush
end;
repeat match goal with
| [ |- context[match ?pf with refl_equal => _ end] ] =>
rewrite (UIP_refl _ _ pf)
end; crush; elimtype False; eauto.
Qed.
Hint Resolve In_makeG'.
(** Now the main inductive lemma can be stated and proved simply. *)
Lemma dbify_sound : forall G t (e1 : Phoas.exp _ t) (e2 : Phoas.exp _ t),
exp_equiv G e1 e2
-> forall ts (w : wf ts e1) s,
G = makeG' s
-> DeBruijn.expDenote (dbify e1 w) s = Phoas.expDenote e2.
induction 1; crush; (let x := fresh in extensionality x); crush.
Qed.
(** In the usual way, we wrap [dbify_sound] into the final soundness theorem, formally establishing the expressive equivalence of PHOAS and de Bruijn index terms. *)
Theorem Dbify_sound : forall t (E : Exp t) (W : Wf E),
DeBruijn.expDenote (Dbify W) HNil = Phoas.ExpDenote E.
unfold Dbify, Phoas.ExpDenote; intros; eapply dbify_sound; eauto.
Qed.
(* Copyright (c) 2008-2011, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Require Import String List FunctionalExtensionality.
Require Import CpdtTactics.
Set Implicit Arguments.
(* end hide *)
(** %\chapter{Type-Theoretic Interpreters}% *)
(** Throughout this book, we have given semantics for programming languages via executable interpreters written in Gallina. PHOAS is quite compatible with that model when we want to formalize one of the many interesting non-Turing-complete programming languages. Most such languages have very straightforward elaborations into Gallina. In this chapter, we show how to extend our past approach to higher-order languages encoded with PHOAS, and we show how simple program transformations may be proved correct with respect to these elaborative semantics. *)
(** * Simply-Typed Lambda Calculus *)
(** We begin with a copy of last chapter's encoding of the syntax of simply-typed lambda calculus with natural numbers and addition. The primes at the ends of constructor names are gone, since here our primary subject is [exp]s instead of [Exp]s. *)
Module STLC.
Inductive type : Type :=
| Nat : type
| Arrow : type -> type -> type.
Infix "-->" := Arrow (right associativity, at level 60).
Section vars.
Variable var : type -> Type.
Inductive exp : type -> Type :=
| Var : forall t,
var t
-> exp t
| Const : nat -> exp Nat
| Plus : exp Nat -> exp Nat -> exp Nat
| App : forall t1 t2,
exp (t1 --> t2)
-> exp t1
-> exp t2
| Abs : forall t1 t2,
(var t1 -> exp t2)
-> exp (t1 --> t2).
End vars.
Definition Exp t := forall var, exp var t.
Implicit Arguments Var [var t].
Implicit Arguments Const [var].
Implicit Arguments Plus [var].
Implicit Arguments App [var t1 t2].
Implicit Arguments Abs [var t1 t2].
(** The definitions that follow will be easier to read if we define some parsing notations for the constructors. *)
Notation "# v" := (Var v) (at level 70).
(** printing ^ $\dag$ *)
Notation "^ n" := (Const n) (at level 70).
(** printing +^ $\hat{+}$ *)
Infix "+^" := Plus (left associativity, at level 79).
Infix "@" := App (left associativity, at level 77).
Notation "\ x , e" := (Abs (fun x => e)) (at level 78).
Notation "\ ! , e" := (Abs (fun _ => e)) (at level 78).
(** A few examples will be useful for testing the functions we will write. *)
Example zero : Exp Nat := fun _ => ^0.
Example one : Exp Nat := fun _ => ^1.
Example zpo : Exp Nat := fun _ => zero _ +^ one _.
Example ident : Exp (Nat --> Nat) := fun _ => \x, #x.
Example app_ident : Exp Nat := fun _ => ident _ @ zpo _.
Example app : Exp ((Nat --> Nat) --> Nat --> Nat) := fun _ => \f, \x, #f @ #x.
Example app_ident' : Exp Nat := fun _ => app _ @ ident _ @ zpo _.
(* EX: Define an interpreter for [Exp]s. *)
(* begin thide *)
(** To write our interpreter, we must first interpret object language types as meta language types. *)
Fixpoint typeDenote (t : type) : Set :=
match t with
| Nat => nat
| t1 --> t2 => typeDenote t1 -> typeDenote t2
end.
(** The crucial trick of the expression interpreter is to represent variables using the [typeDenote] function. Due to limitations in Coq's syntax extension system, we cannot take advantage of some of our notations when they appear in patterns, so, to be consistent, in patterns we avoid notations altogether. *)
Fixpoint expDenote t (e : exp typeDenote t) : typeDenote t :=
match e with
| Var _ v => v
| Const n => n
| Plus e1 e2 => expDenote e1 + expDenote e2
| App _ _ e1 e2 => (expDenote e1) (expDenote e2)
| Abs _ _ e' => fun x => expDenote (e' x)
end.
Definition ExpDenote t (e : Exp t) := expDenote (e _).
(* end thide *)
(** Some tests establish that [ExpDenote] produces Gallina terms like we might write manually. *)
Eval compute in ExpDenote zero.
(** %\vspace{-.15in}% [[
= 0
: typeDenote Nat
]]
*)
Eval compute in ExpDenote one.
(** %\vspace{-.15in}% [[
= 1
: typeDenote Nat
]]
*)
Eval compute in ExpDenote zpo.
(** %\vspace{-.15in}% [[
= 1
: typeDenote Nat
]]
*)
Eval compute in ExpDenote ident.
(** %\vspace{-.15in}% [[
= fun x : nat => x
: typeDenote (Nat --> Nat)
]]
*)
Eval compute in ExpDenote app_ident.
(** %\vspace{-.15in}% [[
= 1
: typeDenote Nat
]]
*)
Eval compute in ExpDenote app.
(** %\vspace{-.15in}% [[
= fun (x : nat -> nat) (x0 : nat) => x x0
: typeDenote ((Nat --> Nat) --> Nat --> Nat)
]]
*)
Eval compute in ExpDenote app_ident'.
(** %\vspace{-.15in}% [[
= 1
: typeDenote Nat
]]
*)
(* EX: Define a constant-folding function for [Exp]s. *)
(** We can update our common example of a constant-folding function to the higher-order case. The workhorse function [cfold] is parameterized to apply to an [exp] that uses any [var] type, and its output uses the same [var] type as its input. As in the definition of [expDenote], we cannot use most of our notations in patterns, but we use them freely to make the bodies of [match] cases easier to read. *)
(* begin thide *)
Section cfold.
Variable var : type -> Type.
Fixpoint cfold t (e : exp var t) : exp var t :=
match e with
| Var _ v => #v
| Const n => ^n
| Plus e1 e2 =>
let e1' := cfold e1 in
let e2' := cfold e2 in
match e1', e2' return _ with
| Const n1, Const n2 => ^(n1 + n2)
| _, _ => e1' +^ e2'
end
| App _ _ e1 e2 => cfold e1 @ cfold e2
| Abs _ _ e' => \x, cfold (e' x)
end.
End cfold.
Definition Cfold t (E : Exp t) : Exp t := fun _ => cfold (E _).
(* end thide *)
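(** A quick test (hypothetical, not part of the original development) shows [Cfold] collapsing a sum of constants before interpretation:
[[
Example one_plus_two : Exp Nat := fun _ => ^1 +^ ^2.

Eval compute in ExpDenote (Cfold one_plus_two).
    = 3
    : typeDenote Nat
]]
*)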
(* EX: Prove the correctness of constant-folding. *)
(** Now we would like to prove the correctness of [Cfold], which follows from a simple inductive lemma about [cfold]. *)
(* begin thide *)
Lemma cfold_correct : forall t (e : exp _ t),
expDenote (cfold e) = expDenote e.
induction e; crush; try (let x := fresh in extensionality x; crush);
repeat (match goal with
| [ |- context[cfold ?E] ] => dep_destruct (cfold E)
end; crush).
Qed.
Theorem Cfold_correct : forall t (E : Exp t),
ExpDenote (Cfold E) = ExpDenote E.
unfold ExpDenote, Cfold; intros; apply cfold_correct.
Qed.
(* end thide *)
End STLC.
(** * Adding Products and Sums *)
(** The example is easily adapted to support products and sums, the basis of non-recursive datatypes in ML and Haskell. *)
Module PSLC.
(* EX: Extend the development with products and sums. *)
(* begin thide *)
Inductive type : Type :=
| Nat : type
| Arrow : type -> type -> type
| Prod : type -> type -> type
| Sum : type -> type -> type.
(* end thide *)
Infix "-->" := Arrow (right associativity, at level 62).
Infix "**" := Prod (right associativity, at level 61).
Infix "++" := Sum (right associativity, at level 60).
(* begin thide *)
Section vars.
Variable var : type -> Type.
Inductive exp : type -> Type :=
| Var : forall t,
var t
-> exp t
| Const : nat -> exp Nat
| Plus : exp Nat -> exp Nat -> exp Nat
| App : forall t1 t2,
exp (t1 --> t2)
-> exp t1
-> exp t2
| Abs : forall t1 t2,
(var t1 -> exp t2)
-> exp (t1 --> t2)
| Pair : forall t1 t2,
exp t1
-> exp t2
-> exp (t1 ** t2)
| Fst : forall t1 t2,
exp (t1 ** t2)
-> exp t1
| Snd : forall t1 t2,
exp (t1 ** t2)
-> exp t2
| Inl : forall t1 t2,
exp t1
-> exp (t1 ++ t2)
| Inr : forall t1 t2,
exp t2
-> exp (t1 ++ t2)
| SumCase : forall t1 t2 t,
exp (t1 ++ t2)
-> (var t1 -> exp t)
-> (var t2 -> exp t)
-> exp t.
End vars.
Definition Exp t := forall var, exp var t.
(* end thide *)
Implicit Arguments Var [var t].
Implicit Arguments Const [var].
Implicit Arguments Abs [var t1 t2].
Implicit Arguments Inl [var t1 t2].
Implicit Arguments Inr [var t1 t2].
Notation "# v" := (Var v) (at level 70).
Notation "^ n" := (Const n) (at level 70).
Infix "+^" := Plus (left associativity, at level 78).
Infix "@" := App (left associativity, at level 77).
Notation "\ x , e" := (Abs (fun x => e)) (at level 78).
Notation "\ ! , e" := (Abs (fun _ => e)) (at level 78).
Notation "[ e1 , e2 ]" := (Pair e1 e2).
Notation "#1 e" := (Fst e) (at level 75).
Notation "#2 e" := (Snd e) (at level 75).
Notation "'case' e 'of' x => e1 | y => e2" := (SumCase e (fun x => e1) (fun y => e2))
(at level 79).
(** A few examples can be defined easily, using the notations above. *)
Example swap : Exp (Nat ** Nat --> Nat ** Nat) := fun _ => \p, [#2 #p, #1 #p].
Example zo : Exp (Nat ** Nat) := fun _ => [^0, ^1].
Example swap_zo : Exp (Nat ** Nat) := fun _ => swap _ @ zo _.
Example natOut : Exp (Nat ++ Nat --> Nat) := fun _ =>
\s, case #s of x => #x | y => #y +^ #y.
Example ns1 : Exp (Nat ++ Nat) := fun _ => Inl (^3).
Example ns2 : Exp (Nat ++ Nat) := fun _ => Inr (^5).
Example natOut_ns1 : Exp Nat := fun _ => natOut _ @ ns1 _.
Example natOut_ns2 : Exp Nat := fun _ => natOut _ @ ns2 _.
(** The semantics adapts without incident. *)
(* begin thide *)
Fixpoint typeDenote (t : type) : Set :=
match t with
| Nat => nat
| t1 --> t2 => typeDenote t1 -> typeDenote t2
| t1 ** t2 => typeDenote t1 * typeDenote t2
| t1 ++ t2 => typeDenote t1 + typeDenote t2
end%type.
Fixpoint expDenote t (e : exp typeDenote t) : typeDenote t :=
match e with
| Var _ v => v
| Const n => n
| Plus e1 e2 => expDenote e1 + expDenote e2
| App _ _ e1 e2 => (expDenote e1) (expDenote e2)
| Abs _ _ e' => fun x => expDenote (e' x)
| Pair _ _ e1 e2 => (expDenote e1, expDenote e2)
| Fst _ _ e' => fst (expDenote e')
| Snd _ _ e' => snd (expDenote e')
| Inl _ _ e' => inl _ (expDenote e')
| Inr _ _ e' => inr _ (expDenote e')
| SumCase _ _ _ e' e1 e2 =>
match expDenote e' with
| inl v => expDenote (e1 v)
| inr v => expDenote (e2 v)
end
end.
Definition ExpDenote t (e : Exp t) := expDenote (e _).
(* end thide *)
Eval compute in ExpDenote swap.
(** %\vspace{-.15in}% [[
= fun x : nat * nat => (let (_, y) := x in y, let (x0, _) := x in x0)
: typeDenote (Nat ** Nat --> Nat ** Nat)
]]
*)
Eval compute in ExpDenote zo.
(** %\vspace{-.15in}% [[
= (0, 1)
: typeDenote (Nat ** Nat)
]]
*)
Eval compute in ExpDenote swap_zo.
(** %\vspace{-.15in}% [[
= (1, 0)
: typeDenote (Nat ** Nat)
]]
*)
Eval cbv beta iota delta -[plus] in ExpDenote natOut.
(** %\vspace{-.15in}% [[
= fun x : nat + nat => match x with
| inl v => v
| inr v => v + v
end
: typeDenote (Nat ++ Nat --> Nat)
]]
*)
Eval compute in ExpDenote ns1.
(** %\vspace{-.15in}% [[
= inl nat 3
: typeDenote (Nat ++ Nat)
]]
*)
Eval compute in ExpDenote ns2.
(** %\vspace{-.15in}% [[
= inr nat 5
: typeDenote (Nat ++ Nat)
]]
*)
Eval compute in ExpDenote natOut_ns1.
(** %\vspace{-.15in}% [[
= 3
: typeDenote Nat
]]
*)
Eval compute in ExpDenote natOut_ns2.
(** %\vspace{-.15in}% [[
= 10
: typeDenote Nat
]]
*)
(** We adapt the [cfold] function using the same basic dependent-types trick that we applied in an earlier chapter to a very similar function for a language without variables. *)
(* begin thide *)
Section cfold.
Variable var : type -> Type.
Definition pairOutType t :=
match t return Type with
| t1 ** t2 => option (exp var t1 * exp var t2)
| _ => unit
end.
Definition pairOutDefault (t : type) : pairOutType t :=
match t with
| _ ** _ => None
| _ => tt
end.
Definition pairOut t1 t2 (e : exp var (t1 ** t2))
: option (exp var t1 * exp var t2) :=
match e in exp _ t return pairOutType t with
| Pair _ _ e1 e2 => Some (e1, e2)
| _ => pairOutDefault _
end.
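(** For example (a hypothetical check), [pairOut] extracts the components of a syntactic pair and falls back to [pairOutDefault], which returns [None] at product types, on anything else, such as a variable:
[[
    pairOut [e1, e2] = Some (e1, e2)
    pairOut (#v)     = None
]]
*)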
Fixpoint cfold t (e : exp var t) : exp var t :=
match e with
| Var _ v => #v
| Const n => ^n
| Plus e1 e2 =>
let e1' := cfold e1 in
let e2' := cfold e2 in
match e1', e2' return _ with
| Const n1, Const n2 => ^(n1 + n2)
| _, _ => e1' +^ e2'
end
| App _ _ e1 e2 => cfold e1 @ cfold e2
| Abs _ _ e' => \x, cfold (e' x)
| Pair _ _ e1 e2 => [cfold e1, cfold e2]
| Fst t1 _ e' =>
let e'' := cfold e' in
match pairOut e'' with
| None => #1 e''
| Some (e1, _) => e1
end
| Snd _ _ e' =>
let e'' := cfold e' in
match pairOut e'' with
| None => #2 e''
| Some (_, e2) => e2
end
| Inl _ _ e' => Inl (cfold e')
| Inr _ _ e' => Inr (cfold e')
| SumCase _ _ _ e' e1 e2 =>
case cfold e' of
x => cfold (e1 x)
| y => cfold (e2 y)
end.
End cfold.
Definition Cfold t (E : Exp t) : Exp t := fun _ => cfold (E _).
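(** A hypothetical test: constant folding eliminates both the addition and the first projection of a syntactic pair.
[[
Example fst_fold : Exp Nat := fun _ => #1 [^1 +^ ^2, ^0].

Eval compute in ExpDenote (Cfold fst_fold).
    = 3
    : typeDenote Nat
]]
*)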
(** The proofs are almost as straightforward as before. We first establish two simple theorems about pairs and their projections. *)
Section pairs.
Variables A B : Type.
Variable v1 : A.
Variable v2 : B.
Variable v : A * B.
Theorem pair_eta1 : (v1, v2) = v -> v1 = fst v.
destruct v; crush.
Qed.
Theorem pair_eta2 : (v1, v2) = v -> v2 = snd v.
destruct v; crush.
Qed.
End pairs.
Hint Resolve pair_eta1 pair_eta2.
(** To the proof script for the main lemma, we add just one more [match] case, detecting when case analysis is appropriate on discriminees of matches over sum types. *)
Lemma cfold_correct : forall t (e : exp _ t),
expDenote (cfold e) = expDenote e.
induction e; crush; try (let x := fresh in extensionality x; crush);
repeat (match goal with
| [ |- context[cfold ?E] ] => dep_destruct (cfold E)
| [ |- match ?E with inl _ => _ | inr _ => _ end = _ ] => destruct E
end; crush); eauto.
Qed.
Theorem Cfold_correct : forall t (E : Exp t),
ExpDenote (Cfold E) = ExpDenote E.
unfold ExpDenote, Cfold; intros; apply cfold_correct.
Qed.
(* end thide *)
End PSLC.
......@@ -137,7 +137,7 @@ If I do that job well, then this book should be of interest even to people who h
(**
I try to keep the required background knowledge to a minimum in this book. I will assume familiarity with the material from usual discrete math and logic courses taken by undergraduate computer science majors, and I will assume that readers have significant experience programming in one of the ML dialects, in Haskell, or in some other, closely related language. Experience with only dynamically typed functional languages might lead to befuddlement in some places, but a reader who has come to understand Scheme deeply will probably be fine.
Part IV of this manuscript is about formalizing programming languages and compilers. That part certainly depends on basic knowledge of formal type systems, operational semantics, and the theorems usually proved about such systems. As a reference on these topics, I recommend %\emph{%#<i>#Types and Programming Languages#</i>#%}~\cite{TAPL}%, by Benjamin C. Pierce. However, my current plan is to break Part IV into a separate, online-only document, since I expect the formalization interests of many readers of the book to lie outside of programming languages. I do often use examples from programming languages in the earlier parts of the book, but I have tried to design them to be comprehensible on the basis of ML or Haskell experience alone.
My background is in programming languages, formal semantics, and program verification. I sometimes use examples from that domain. As a reference on these topics, I recommend %\emph{%#<i>#Types and Programming Languages#</i>#%}~\cite{TAPL}%, by Benjamin C. Pierce; however, I have tried to choose examples so that they may be understood without background in semantics.
*)
......@@ -245,20 +245,6 @@ Proof by Reflection & \texttt{Reflection.v} \\
\hline
Proving in the Large & \texttt{Large.v} \\
\hline
First-Order Abstract Syntax & \texttt{Firstorder.v} \\
\hline
Dependent De Bruijn Indices & \texttt{DeBruijn.v} \\
\hline
Higher-Order Abstract Syntax & \texttt{Hoas.v} \\
\hline
Type-Theoretic Interpreters & \texttt{Interps.v} \\
\hline
Extensional Transformations & \texttt{Extensional.v} \\
\hline
Intensional Transformations & \texttt{Intensional.v} \\
\hline
Higher-Order Operational Semantics & \texttt{OpSem.v} \\
\hline
\end{tabular} \end{center}
% *)
(* Copyright (c) 2009-2010, Adam Chlipala
*
* This work is licensed under a
* Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
* Unported License.
* The license text is available at:
* http://creativecommons.org/licenses/by-nc-nd/3.0/
*)
(* begin hide *)
Require Import Arith List.
Require Import CpdtTactics.
Set Implicit Arguments.
(* end hide *)
(** %\chapter{Higher-Order Operational Semantics}% *)
(** The last few chapters have shown how PHOAS can make it relatively painless to reason about program transformations. Each of our example languages so far has had a semantics that is easy to implement with an interpreter in Gallina. Since Gallina is designed to rule out non-termination, we cannot hope to give interpreter-based semantics to Turing-complete programming languages. Falling back on standard operational semantics leaves us with the old bureaucratic concerns about capture-avoiding substitution. Can we encode Turing-complete, higher-order languages in Coq without sacrificing the advantages of higher-order encoding?
Any approach that applies to basic untyped lambda calculus is likely to extend to most object languages of interest. We can attempt the %``%#"#obvious#"#%''% way of equipping a PHOAS definition for use in an operational semantics, without mentioning substitution explicitly. Specifically, we try to work with expressions with [var] instantiated with a type of values. *)
Section exp.
Variable var : Type.
Inductive exp : Type :=
| Var : var -> exp
| App : exp -> exp -> exp
| Abs : (var -> exp) -> exp.
End exp.
(** [[
Inductive val : Type :=
| VAbs : (val -> exp val) -> val.
Error: Non strictly positive occurrence of "val" in
"(val -> exp val) -> val".
]]
We would like to represent values (which are all function abstractions) as functions from variables to expressions, where we represent variables as the same value type that we are defining. That way, a value can be substituted in a function body simply by applying the body to the value. Unfortunately, the positivity restriction rejects this definition, for much the same reason that we could not use the classical HOAS encoding.
We can try an alternate approach based on defining [val] like a usual class of syntax. *)
Section val.
Variable var : Type.
Inductive val : Type :=
| VAbs : (var -> exp var) -> val.
End val.
(** Now the puzzle is how to write the type of an expression whose variables are represented as values. We would like to be able to write a recursive definition like this one:
[[
Fixpoint expV := exp (val expV).
]]
Of course, this kind of definition is not structurally recursive, so Coq will not allow it. Getting %``%#"#substitution for free#"#%''% seems to require some similar kind of self-reference.
In this chapter, we will consider an alternate take on the problem. We add a level of indirection, introducing more explicit syntax to break the cycle in type definitions. Specifically, we represent function values as numbers that index into a %\emph{%#<i>#closure heap#</i>#%}% that our operational semantics maintains alongside the expression being evaluated. *)
(** * Closure Heaps *)
(** The essence of the technique is to store function bodies in lists that are extended monotonically as function abstractions are evaluated. We can define a set of functions and theorems that implement the core functionality generically. *)
Section lookup.
Variable A : Type.
(** We start with a [lookup] function that generalizes last chapter's function of the same name. It selects the element at a particular position in a list, where we number the elements starting from the end of the list, so that prepending new elements does not change the indices of old elements. *)
Fixpoint lookup (ls : list A) (n : nat) : option A :=
match ls with
| nil => None
| v :: ls' => if eq_nat_dec n (length ls') then Some v else lookup ls' n
end.
Infix "##" := lookup (left associativity, at level 1).
(** The second of our two definitions expresses when one list extends another. We will write [ls1 ~> ls2] to indicate that [ls1] could evolve into [ls2]; that is, [ls1] is a suffix of [ls2]. *)
Definition extends (ls1 ls2 : list A) := exists ls, ls2 = ls ++ ls1.
(** printing ~> $\leadsto$ *)
Infix "~>" := extends (no associativity, at level 80).
(** We prove and add as hints a few basic theorems about [lookup] and [extends]. *)
Theorem lookup1 : forall x ls,
(x :: ls) ## (length ls) = Some x.
crush; match goal with
| [ |- context[if ?E then _ else _] ] => destruct E
end; crush.
Qed.
Theorem extends_refl : forall ls, ls ~> ls.
exists nil; reflexivity.
Qed.
Theorem extends1 : forall v ls, ls ~> v :: ls.
intros; exists (v :: nil); reflexivity.
Qed.
Theorem extends_trans : forall ls1 ls2 ls3,
ls1 ~> ls2
-> ls2 ~> ls3
-> ls1 ~> ls3.
intros ? ? ? [l1 ?] [l2 ?]; exists (l2 ++ l1); crush.
Qed.
Lemma lookup_contra : forall n v ls,
ls ## n = Some v
-> n >= length ls
-> False.
induction ls; crush;
match goal with
| [ _ : context[if ?E then _ else _] |- _ ] => destruct E
end; crush.
Qed.
Hint Resolve lookup_contra.
Theorem extends_lookup : forall ls1 ls2 n v,
ls1 ~> ls2
-> ls1 ## n = Some v
-> ls2 ## n = Some v.
intros ? ? ? ? [l ?]; crush; induction l; crush;
match goal with
| [ |- context[if ?E then _ else _] ] => destruct E
end; crush; elimtype False; eauto.
Qed.
End lookup.
Infix "##" := lookup (left associativity, at level 1).
Infix "~>" := extends (no associativity, at level 80).
Hint Resolve lookup1 extends_refl extends1 extends_trans extends_lookup.
(** We are dealing explicitly with the nitty-gritty of closure heaps. Why is this better than dealing with the nitty-gritty of variables? The inconvenience of modeling lambda calculus-style binders comes from the presence of nested scopes. Program evaluation will only involve one %\emph{%#<i>#global#</i>#%}% closure heap. Also, the short development that we just finished can be reused for many different object languages. None of these definitions or theorems needs to be redone to handle specific object language features. By adding the theorems as hints, no per-object-language effort is required to apply the critical facts as needed. *)
(** * Languages and Translation *)
(** For the rest of this chapter, we will consider the example of CPS translation for untyped lambda calculus with boolean constants. It is convenient to include these constants, because their presence makes it easy to state a final translation correctness theorem. *)
Module Source.
(** We define the syntax of source expressions in our usual way. *)
Section exp.
Variable var : Type.
Inductive exp : Type :=
| Var : var -> exp
| App : exp -> exp -> exp
| Abs : (var -> exp) -> exp
| Bool : bool -> exp.
End exp.
Implicit Arguments Bool [var].
Definition Exp := forall var, exp var.
(** We will implement a big-step operational semantics, where expressions are mapped to values. A value is either a function or a boolean. We represent a function as a number that will be interpreted as an index into the global closure heap. *)
Inductive val : Set :=
| VFun : nat -> val
| VBool : bool -> val.
(** A closure, then, follows the usual representation of function abstraction bodies, where we represent variables as values. *)
Definition closure := val -> exp val.
Definition closures := list closure.
(** Our evaluation relation has four places. We map an initial closure heap and an expression into a final closure heap and a value. The interesting cases are for [Abs], where we push the body onto the closure heap; and for [App], where we perform a lookup in a closure heap, to find the proper function body to execute next. *)
Inductive eval : closures -> exp val -> closures -> val -> Prop :=
| EvVar : forall cs v,
eval cs (Var v) cs v
| EvApp : forall cs1 e1 e2 cs2 v1 c cs3 v2 cs4 v3,
eval cs1 e1 cs2 (VFun v1)
-> eval cs2 e2 cs3 v2
-> cs2 ## v1 = Some c
-> eval cs3 (c v2) cs4 v3
-> eval cs1 (App e1 e2) cs4 v3
| EvAbs : forall cs c,
eval cs (Abs c) (c :: cs) (VFun (length cs))
| EvBool : forall cs b,
eval cs (Bool b) cs (VBool b).
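(** For instance (a hypothetical example, not part of the original development), evaluating the identity abstraction in the empty closure heap pushes its body and returns address 0:
[[
Example eval_id :
  eval nil (Abs (fun x => Var x)) ((fun x => Var x) :: nil) (VFun 0).
  apply EvAbs.
Qed.
]]
*)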
(** A simple wrapper produces an evaluation relation suitable for use on the main expression type [Exp]. *)
Definition Eval (cs1 : closures) (E : Exp) (cs2 : closures) (v : val) :=
eval cs1 (E _) cs2 v.
(** To prove our translation's correctness, we will need the usual notions of expression equivalence and well-formedness. *)
Section exp_equiv.
Variables var1 var2 : Type.
Inductive exp_equiv : list (var1 * var2) -> exp var1 -> exp var2 -> Prop :=
| EqVar : forall G v1 v2,
In (v1, v2) G
-> exp_equiv G (Var v1) (Var v2)
| EqApp : forall G f1 x1 f2 x2,
exp_equiv G f1 f2
-> exp_equiv G x1 x2
-> exp_equiv G (App f1 x1) (App f2 x2)
| EqAbs : forall G f1 f2,
(forall v1 v2, exp_equiv ((v1, v2) :: G) (f1 v1) (f2 v2))
-> exp_equiv G (Abs f1) (Abs f2)
| EqBool : forall G b,
exp_equiv G (Bool b) (Bool b).
End exp_equiv.
Definition Wf (E : Exp) := forall var1 var2, exp_equiv nil (E var1) (E var2).
End Source.
(** Our target language can be defined without introducing any additional tricks. *)
Module CPS.
Section exp.
Variable var : Type.
Inductive prog : Type :=
| Halt : var -> prog
| App : var -> var -> prog
| Bind : primop -> (var -> prog) -> prog
with primop : Type :=
| Abs : (var -> prog) -> primop
| Bool : bool -> primop
| Pair : var -> var -> primop
| Fst : var -> primop
| Snd : var -> primop.
End exp.
Implicit Arguments Bool [var].
Notation "x <- p ; e" := (Bind p (fun x => e))
(right associativity, at level 76, p at next level).
Definition Prog := forall var, prog var.
Definition Primop := forall var, primop var.
Inductive val : Type :=
| VFun : nat -> val
| VBool : bool -> val
| VPair : val -> val -> val.
Definition closure := val -> prog val.
Definition closures := list closure.
Inductive eval : closures -> prog val -> val -> Prop :=
| EvHalt : forall cs v,
eval cs (Halt v) v
| EvApp : forall cs n v2 c v3,
cs ## n = Some c
-> eval cs (c v2) v3
-> eval cs (App (VFun n) v2) v3
| EvBind : forall cs1 p e cs2 v1 v2,
evalP cs1 p cs2 v1
-> eval cs2 (e v1) v2
-> eval cs1 (Bind p e) v2
with evalP : closures -> primop val -> closures -> val -> Prop :=
| EvAbs : forall cs c,
evalP cs (Abs c) (c :: cs) (VFun (length cs))
| EvPair : forall cs v1 v2,
evalP cs (Pair v1 v2) cs (VPair v1 v2)
| EvFst : forall cs v1 v2,
evalP cs (Fst (VPair v1 v2)) cs v1
| EvSnd : forall cs v1 v2,
evalP cs (Snd (VPair v1 v2)) cs v2
| EvBool : forall cs b,
evalP cs (Bool b) cs (VBool b).
Definition Eval (cs : closures) (P : Prog) (v : val) := eval cs (P _) v.
End CPS.
Import Source CPS.
(** Finally, we define a CPS translation in the same way as in our previous example for simply-typed lambda calculus. *)
(** printing <-- $\longleftarrow$ *)
Reserved Notation "x <-- e1 ; e2" (right associativity, at level 76, e1 at next level).
Section cpsExp.
Variable var : Type.
Import Source.
Fixpoint cpsExp (e : exp var)
: (var -> prog var) -> prog var :=
match e with
| Var v => fun k => k v
| App e1 e2 => fun k =>
f <-- e1;
x <-- e2;
kf <- CPS.Abs k;
p <- Pair x kf;
CPS.App f p
| Abs e' => fun k =>
f <- CPS.Abs (var := var) (fun p =>
x <- Fst p;
kf <- Snd p;
r <-- e' x;
CPS.App kf r);
k f
| Bool b => fun k =>
x <- CPS.Bool b;
k x
end
where "x <-- e1 ; e2" := (cpsExp e1 (fun x => e2)).
End cpsExp.
Notation "x <-- e1 ; e2" := (cpsExp e1 (fun x => e2)).
Definition CpsExp (E : Exp) : Prog := fun _ => cpsExp (E _) (Halt (var := _)).
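(** For example (a hypothetical check; the exact printed form may differ), compiling the source constant [true] produces a program that binds the Boolean with a primop and immediately halts with it:
[[
Eval compute in CpsExp (fun _ => Source.Bool true).
    = fun var : Type => x <- CPS.Bool true; Halt x
]]
*)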
(** * Correctness Proof *)
(** Our proof for simply-typed lambda calculus relied on a logical relation to state the key induction hypothesis. Since logical relations proceed by recursion on type structure, we cannot apply them directly in an untyped setting. Instead, we will use an inductive judgment to relate source-level and CPS-level values. First, it is helpful to define an abbreviation for the compiled version of a function body. *)
Definition cpsFunc var (e' : var -> Source.exp var) :=
fun p : var =>
x <- Fst p;
kf <- Snd p;
r <-- e' x;
CPS.App kf r.
(** Now we can define our correctness relation [cr], which is parameterized by source-level and CPS-level closure heaps. *)
Section cr.
Variable s1 : Source.closures.
Variable s2 : CPS.closures.
Import Source.
(** Only equal booleans are related. For two function addresses [l1] and [l2] to be related, they must point to valid functions in their respective closure heaps. The address [l1] must point to a function [f1], and [l2] must point to the result of compiling function [f2]. Further, [f1] and [f2] must be equivalent syntactically in some variable environment [G], and every variable pair in [G] must itself belong to the relation we are defining. *)
Inductive cr : Source.val -> CPS.val -> Prop :=
| CrBool : forall b,
cr (Source.VBool b) (CPS.VBool b)
| CrFun : forall l1 l2 G f1 f2,
(forall x1 x2, exp_equiv ((x1, x2) :: G) (f1 x1) (f2 x2))
-> (forall x1 x2, In (x1, x2) G -> cr x1 x2)
-> s1 ## l1 = Some f1
-> s2 ## l2 = Some (cpsFunc f2)
-> cr (Source.VFun l1) (CPS.VFun l2).
End cr.
(** printing |-- $\vdash$ *)
(** printing ~~ $\sim$ *)
Notation "s1 & s2 |-- v1 ~~ v2" := (cr s1 s2 v1 v2) (no associativity, at level 70).
Hint Constructors cr.
(** To prove our main lemma, it will be useful to know that source-level evaluation never removes old closures from a closure heap. *)
Lemma eval_monotone : forall cs1 e cs2 v,
Source.eval cs1 e cs2 v
-> cs1 ~> cs2.
induction 1; crush; eauto.
Qed.
(** Further, [cr] continues to hold when its closure heap arguments are evolved in legal ways. *)
Lemma cr_monotone : forall cs1 cs2 cs1' cs2',
cs1 ~> cs1'
-> cs2 ~> cs2'
-> forall v1 v2, cs1 & cs2 |-- v1 ~~ v2
-> cs1' & cs2' |-- v1 ~~ v2.
induction 3; crush; eauto.
Qed.
Hint Resolve eval_monotone cr_monotone.
(** We state a trivial fact about the validity of variable environments, so that we may add this fact as a hint that [eauto] will apply. *)
Lemma push : forall G s1 s2 v1' v2',
(forall v1 v2, In (v1, v2) G -> s1 & s2 |-- v1 ~~ v2)
-> s1 & s2 |-- v1' ~~ v2'
-> (forall v1 v2, (v1', v2') = (v1, v2) \/ In (v1, v2) G -> s1 & s2 |-- v1 ~~ v2).
crush.
Qed.
Hint Resolve push.
(** Our final preparation for the main lemma involves adding effective hints about the CPS language's operational semantics. The following tactic performs one step of evaluation. It uses the Ltac code [eval hnf in e] to compute the %\emph{%#<i>#head normal form#</i>#%}% of [e], where the head normal form of an expression in an inductive type is an application of one of that inductive type's constructors. The final line below uses [solve] to ensure that we only take a [Bind] step if a full evaluation derivation for the associated primop may be found before proceeding. *)
Ltac evalOne :=
match goal with
| [ |- CPS.eval ?cs ?e ?v ] =>
let e := eval hnf in e in
change (CPS.eval cs e v);
econstructor; [ solve [ eauto ] | ]
end.
(** For primops, we rely on [eauto]'s usual approach. For goals that evaluate programs, we instead ask to treat one or more applications of [evalOne] as a single step, which helps us avoid passing [eauto] an excessively large bound on proof tree depth. *)
Hint Constructors evalP.
Hint Extern 1 (CPS.eval _ _ _) => evalOne; repeat evalOne.
(** The final lemma proceeds by induction on an evaluation derivation for an expression [e1] that is equivalent to some [e2] in some environment [G]. An initial closure heap for each language is quantified over, such that all variable pairs in [G] are compatible. The lemma's conclusion applies to an arbitrary continuation [k], asserting that a final CPS-level closure heap [s2'] and a CPS-level program result value [r2] exist.
Three conditions establish that [s2'] and [r2] are chosen properly: Any evaluation of [k r2] in heap [s2'] must extend to an evaluation of [e2]'s compilation with continuation [k], starting from the original heap [s2] and producing the same result. The original program result [r1] must be compatible with [r2] in the final closure heaps. Finally, [s2'] must be a proper evolution of the original CPS-level heap [s2]. *)
Lemma cpsExp_correct : forall s1 e1 s1' r1,
Source.eval s1 e1 s1' r1
-> forall G (e2 : exp CPS.val),
exp_equiv G e1 e2
-> forall s2,
(forall v1 v2, In (v1, v2) G -> s1 & s2 |-- v1 ~~ v2)
-> forall k, exists s2', exists r2,
(forall r, CPS.eval s2' (k r2) r
-> CPS.eval s2 (cpsExp e2 k) r)
/\ s1' & s2' |-- r1 ~~ r2
/\ s2 ~> s2'.
(** The proof script follows our standard approach. Its main loop applies three hints. First, we perform inversion on any derivation of equivalence between a source-level function value and some other value. Second, we eliminate redundant equality hypotheses. Finally, we look for opportunities to instantiate inductive hypotheses.
We identify an IH by its syntactic form, noting the expression [E] that it applies to. It is important to instantiate IHes in the right order, since existentially-quantified variables in the conclusion of one IH may need to be used in instantiating the universal quantifiers of a different IH. Thus, we perform a quick check to [fail 1] if the IH we found applies to an expression that was evaluated after another expression [E'] whose IH we did not yet instantiate. The flow of closure heaps through source-level evaluation is used to implement the check.
If the hypothesis [H] is indeed the right IH to handle next, we use the [guess] tactic to guess values for its universal quantifiers and prove its hypotheses with [eauto]. This tactic is very similar to [inster] from Chapter 12. It takes two arguments: the first is a value to use for any properly-typed universal quantifier, and the second is the hypothesis to instantiate. The final inner [match] deduces if we are at the point of executing the body of a called function. If so, we help [guess] by saying that the initial closure heap will be the current closure heap [cs] extended with the current continuation [k]. In all other cases, [guess] is smart enough to operate alone. *)
induction 1; inversion 1; crush;
repeat (match goal with
| [ H : _ & _ |-- Source.VFun _ ~~ _ |- _ ] => inversion H; clear H
| [ H1 : ?E = _, H2 : ?E = _ |- _ ] => rewrite H1 in H2; clear H1
| [ H : forall G e2, exp_equiv G ?E e2 -> _ |- _ ] =>
match goal with
| [ _ : Source.eval ?CS E _ _, _ : Source.eval _ ?E' ?CS _,
_ : forall G e2, exp_equiv G ?E' e2 -> _ |- _ ] => fail 1
| _ => match goal with
| [ k : val -> prog val, _ : _ & ?cs |-- _ ~~ _,
_ : context[VFun] |- _ ] =>
guess (k :: cs) H
| _ => guess tt H
end
end
end; crush); eauto 13.
Qed.
(** The final theorem follows easily from this lemma. *)
Theorem CpsExp_correct : forall E cs b,
Source.Eval nil E cs (Source.VBool b)
-> Wf E
-> CPS.Eval nil (CpsExp E) (CPS.VBool b).
Hint Constructors CPS.eval.
unfold Source.Eval, CPS.Eval, CpsExp; intros ? ? ? H1 H2;
generalize (cpsExp_correct H1 (H2 _ _) (s2 := nil)
(fun _ _ pf => match pf with end) (Halt (var := _))); crush;
match goal with
| [ H : _ & _ |-- _ ~~ _ |- _ ] => inversion H
end; crush.
Qed.
......@@ -19,12 +19,5 @@
<li><a href="Match.html">Proof Search in Ltac</a>
<li><a href="Reflection.html">Proof by Reflection</a>
<li><a href="Large.html">Proving in the Large</a>
<li><a href="Firstorder.html">First-Order Abstract Syntax</a>
<li><a href="DeBruijn.html">Dependent De Bruijn Indices</a>
<li><a href="Hoas.html">Higher-Order Abstract Syntax</a>
<li><a href="Interps.html">Type-Theoretic Interpreters</a>
<li><a href="Extensional.html">Extensional Transformations</a>
<li><a href="Intensional.html">Intensional Transformations</a>
<li><a href="OpSem.html">Higher-Order Operational Semantics</a>
</body></html>
......@@ -47,7 +47,7 @@
<p>The main omissions have to do with some supporting resources that I didn't get around to implementing when I used this book for a course. Some suggested exercises are present, but only at points where I was looking to assign an exercise in the course. Some chapters are lacking the annotations used to build reduced versions of their source code, where some definitions and proofs have been elided; the course instructor can step through such a file, guiding class participants in filling in the omitted code.</p>
<p>My current tentative plan is to separate out the final part, on programming languages and compilers, into a distinct, online-only document, but I might be persuaded otherwise.</p>
<p>Previous versions included a final Part focused on programming language semantics. I have decided to separate that Part out; I plan to put it up as a supplementary resource eventually, but for now it is simply removed. (It remains present in the <a href="repo">Mercurial</a> history.)</p>
</div>
</body></html>