- Project 3 Extension
- Mystery tactics
- Crypto theory
- Project 3 is posted
- Runtime Checks to Type Checks
- Dependent Register File
- Bounded register file
- Array model from class
- distr_rev
- Eval on theorems

Blog

The due date for Project 3 is officially extended to December 18 at 11:59pm. The Blackboard submission site is available with the updated due date.

I’ve used a few tactics in the `Crypto.v` theory that I did not discuss in class. You *do not* need to understand any of them to use the theory. If you’re curious, all of them are in the Coq documentation and/or used in Pierce or Chlipala. Here’s a list of basic descriptions to get you started if you want to learn more:

- `exists` - like `intro`, but for existential quantifiers.
- `tauto` - a tautology solver. Like `omega`, but for propositional logic statements.
- `refine` - tells Coq to set up and attempt proofs that fill in holes (i.e. `_`) in the associated definition.

You’ll also see significant use of `try` and `;` as presented in class. You don’t need to understand either one, but I did go through them briefly in class.
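As a small illustration of my own (not part of the project), `;` runs the tactic on its right on every subgoal generated by the tactic on its left, and `try` keeps a tactic from failing when it does not apply:

```
Theorem andb_comm : forall a b : bool, andb a b = andb b a.
Proof.
  intros a b.
  (* destruct twice yields four subgoals; try reflexivity closes them all *)
  destruct a; destruct b; try reflexivity.
Qed.
```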

I just added a link to a simple Crypto theory for use in Project 3. I also updated the project to correct a typo or two, but there are no substantial changes to what I want you to do.

Project 3 is now available on the website. We will discuss it in class on Tuesday and again on Thursday. It builds on our current in-class discussion of perfect cryptography by introducing a key server that signs and distributes public keys.

Note that the project is in draft form. I will be updating it for clarity over the next day or so. The basic requirements for the project will not change.

I’m getting lots of questions about the goals of Problem 3 and I think some clarification is in order.

In Problems 1 and 2 you need to account for the possibility of calling `pop` and `top` on an empty stack at runtime. Specifically, you need to check whether the input is empty and return either `Unknown` or a value for `top`, and either `empty` or a value for `pop`.

The best way to think about Problem 3 is to eliminate `Unknown` and `empty` as return values when the input stack is `empty`. Instead, prove that `empty` is never the input to `top` and `pop`.

This model moves the assurance that an index used to access a register file is in range from runtime to type-check time using a simple dependent type. It is based on Chlipala’s first dependent typing example.

```
Require Import Omega.
Require Import Maybe.
Module RegFileDep.
```

First we’ll define a simple function that calculates less than or equals over naturals and returns a `bool` rather than a `Prop`. This will be useful later.

```
Fixpoint leb (m:nat) : nat -> bool :=
match m with
| O => fun _:nat => true
| S m' =>
fun n:nat => match n with
| O => false
| S n' => leb m' n'
end
end.
```
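A couple of quick checks of my own to see `leb` in action:

```
Eval compute in leb 2 5. (* = true : bool *)
Eval compute in leb 5 2. (* = false : bool *)
Eval compute in leb 3 3. (* = true : bool *)
```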

Not much changes in the basic definitions of register contents or the register file itself.

```
Inductive RC : Type :=
| val : nat -> RC
| unknown : RC.
Notation "??" := unknown.
Definition RFC : Type := nat -> RC.
Inductive RF : Type :=
| file : nat -> RFC -> RF.
Definition empty_rf(s:nat):RF := file s (fun n:nat => ??).
Definition bound_check(i:nat)(r:RF):Prop :=
match r with
| file s _ => i <= s
end.
```

`get` and `put` are the first places where definitions change significantly. In both cases, the return type is a function from a `Prop` to their traditional value. The result is a function that takes a proof that `bound_check` holds for the given index and returns a register contents value.

Note the new values returned by the `match` expression. They are now functions rather than register file or contents values.

```
Definition get(i:nat)(f:RF):bound_check i f -> RC :=
match f with
| file s c => (fun p:bound_check i f => (c i))
end.
Definition put(i:nat)(f:RF)(v:RC):bound_check i f -> RF :=
match f with
| file s c => (fun p:bound_check i f =>
file s (fun n:nat => if (beq_nat i n) then v else c n))
end.
```

Following are my working examples getting from these new functions to proofs of their properties. They represent what I would put in a research notebook as scratch work.

```
Theorem bound_check_3_10: bound_check 3 (empty_rf 10).
Proof.
simpl. omega.
Qed.
Theorem bound_check_2_10: bound_check 2 (empty_rf 10).
Proof.
simpl. omega.
Qed.
Theorem bound_check_4_10: bound_check 4 (empty_rf 10).
Proof.
simpl. omega.
Qed.
Example test1: (get 3 ((put 3 (empty_rf 10) (val 3)) bound_check_3_10)) bound_check_3_10 = (val 3).
Proof.
reflexivity.
Qed.
Example test2: (get 2 ((put 3 (empty_rf 10) (val 3)) bound_check_3_10)) bound_check_2_10 = ??.
Proof. reflexivity. Qed.
Example test3: (get 4 ((put 3 (empty_rf 10) (val 3)) bound_check_3_10)) bound_check_4_10 = ??.
Proof. reflexivity. Qed.
Theorem update_position_3: forall m, (get 3 ((put 3 (empty_rf 10) (val m)) bound_check_3_10)) bound_check_3_10 = (val m).
Proof.
intros.
trivial.
Qed.
Theorem update_position_size: forall v s rfc, forall p:bound_check 3 (file s rfc),
leb 3 s = true -> (get 3 ((put 3 (file s rfc) (val v)) p)) = (fun p => val v).
Proof.
intros. unfold put. unfold bound_check.
simpl.
reflexivity.
Qed.
Lemma beq_refl: forall i, beq_nat i i = true.
Proof.
intros.
induction i.
reflexivity.
auto.
Qed.
Theorem empty_empty: forall i s,
(leb i s) = true -> (get i (empty_rf s)) = (fun p:bound_check i (empty_rf s) => ??).
Proof.
intros. simpl. reflexivity.
Qed.
```

Earlier versions had a proof, `unknown_oob`, that showed any out-of-bounds access resulted in `??`. This is gone because such accesses are now prevented by type checking.
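To see the prevention in action, here is a small experiment of my own: an out-of-bounds index produces a `bound_check` obligation that can never be proved, so `get` can never be fully applied to it.

```
(* The goal reduces to 11 <= 10, which is unprovable, so no term of type
   bound_check 11 (empty_rf 10) exists, and get 11 (empty_rf 10) can never
   be supplied the proof argument it requires. *)
Theorem bad_bound: bound_check 11 (empty_rf 10).
Proof.
  simpl.
Abort.
```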

Following are the two major correctness theorems for the register file `put` and `get` functions: first a check that updates occur following a `put`, and then a check that values not accessed by a `put` are not modified.

```
Theorem update_immediate: forall v i s rfc, forall p:bound_check i (file s rfc),
((get i ((put i (file s rfc) (val v)) p)) p) = val v.
Proof.
intros.
simpl.
rewrite beq_refl.
trivial.
Qed.
Theorem update_invariant: forall v i j s rfc,
forall p:bound_check i (file s rfc),
forall p':bound_check j (file s rfc),
(beq_nat j i) = false ->
((get i ((put j (file s rfc) (val v)) p')) p) = rfc i.
Proof.
intros.
simpl.
rewrite H.
trivial.
Qed.
End RegFileDep.
```

Here’s the register file model that we’ve been using in class. There may be some small differences, but for the most part it’s exactly what I’ve done at the board.

Read this for Thursday, paying attention to the invariant proofs and to the use of number comparison functions.

```
Require Import Omega.
Module RegFileArray.
Fixpoint leb (m:nat) : nat -> bool :=
match m with
| O => fun _:nat => true
| S m' =>
fun n:nat => match n with
| O => false
| S n' => leb m' n'
end
end.
Inductive RC : Type :=
| val : nat -> RC
| unknown : RC.
Notation "??" := unknown.
Definition RFC : Type := nat -> RC.
Inductive RF : Type :=
| file : nat -> RFC -> RF.
Definition empty_rf(s:nat):RF := file s (fun n:nat => ??).
Definition get(i:nat)(f:RF):RC :=
match f with
| file s c => if (leb i s) then (c i) else ??
end.
Definition put(i:nat)(f:RF)(v:RC):RF :=
match f with
| file s c => if (leb i s)
then file s (fun n:nat => if (beq_nat i n) then v else c n)
else f
end.
Example test1: (get 3 (put 3 (empty_rf 10) (val 3))) = (val 3).
Proof.
reflexivity.
Qed.
Example test2: (get 2 (put 3 (empty_rf 10) (val 3))) = ??.
Proof. reflexivity. Qed.
Example test3: (get 4 (put 3 (empty_rf 10) (val 3))) = ??.
Proof. reflexivity. Qed.
Theorem update_position_3: forall m, (get 3 (put 3 (empty_rf 10) (val m))) = (val m).
Proof.
intros.
trivial.
Qed.
Theorem update_position_size: forall v s rfc,
leb 3 s = true -> (get 3 (put 3 (file s rfc) (val v))) = val v.
Proof.
intros. unfold put.
rewrite H.
unfold get.
rewrite H.
reflexivity.
Qed.
Lemma beq_refl: forall i, beq_nat i i = true.
Proof.
intros.
induction i.
reflexivity.
auto.
Qed.
Theorem empty_empty: forall i s, (leb i s) = true -> (get i (empty_rf s)) = ??.
Proof.
intros. simpl. rewrite H. reflexivity.
Qed.
Theorem unknown_oob: forall i s rfc, (leb i s) = false -> (get i (file s rfc)) = ??.
Proof.
intros.
simpl. rewrite H. reflexivity.
Qed.
Theorem update_immediate: forall v i s rfc,
(leb i s) = true -> (get i (put i (file s rfc) (val v))) = val v.
Proof.
intros.
simpl.
rewrite H.
simpl.
rewrite beq_refl.
rewrite H.
trivial.
Qed.
Theorem update_invariant: forall v i j s rfc,
(beq_nat j i) = false -> (leb j s) = true -> (leb i s) = true -> (get i (put j (file s rfc) (val v))) = rfc i.
Proof.
intros.
simpl.
rewrite H0.
simpl.
rewrite H.
rewrite H1.
trivial.
Qed.
Theorem update_invariant': forall v i j s rfc,
(beq_nat j i) = false -> (leb j s) = false -> (leb i s) = true -> (get i (put j (file s rfc) (val v))) = rfc i.
Proof.
intros.
simpl.
rewrite H0.
simpl.
rewrite H1.
trivial.
Qed.
Theorem update_invariant'': forall v i j s rfc,
(beq_nat j i) = false -> (get i (put j (file s rfc) (val v))) = (get i (file s rfc)).
Proof.
intros.
simpl.
destruct (leb j s).
simpl.
rewrite H.
trivial.
simpl.
trivial.
Qed.
Theorem update_size_invariant: forall s rfc i v,
match (put i (file s rfc) v) with
| file s' rfc' => is_true (beq_nat s s')
end.
Proof.
intros.
unfold is_true.
unfold put.
destruct (leb i s).
apply beq_refl.
apply beq_refl.
Qed.
End RegFileArray.
```

Following is the Array model of register file contents developed in class today. We will add to the model next week to capture the size of the register file.

The proofs mostly use commands defined in class, with a couple of exceptions. In some places I use `trivial` rather than `reflexivity`. `trivial` does a bit more work than `reflexivity` and can be customized. Feel free to use it in your proofs if you would like.

I use `unfold` in several places where I want a function replaced by its definition. `unfold f` expands the definition in place. If `f` is defined:

`Definition f(x:nat):nat := x + 1.`

then `unfold f` applied to the goal `f 1 = 2` results in the goal:

`1 + 1 = 2`

I use `unfold` in a couple of places rather than `simpl` when `simpl` doesn’t work or is too involved. Note that `unfold` can also be applied to assumptions using the `in` syntax to specify the assumption.
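For example (my own sketch, using the `f` from above), `unfold f in H` rewrites the hypothesis rather than the goal:

```
Definition f(x:nat):nat := x + 1.

Theorem unfold_in_example: forall n, f n = 2 -> n + 1 = 2.
Proof.
  intros n H.
  unfold f in H. (* H is now n + 1 = 2 *)
  assumption.
Qed.
```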

Here is the [Array] module complete with proofs:

```
Require Import Omega.
Module Array.
Fixpoint leb (m:nat) : nat -> bool :=
match m with
| O => fun _:nat => true
| S m' =>
fun n:nat => match n with
| O => false
| S n' => leb m' n'
end
end.
(** Register contents. Either a natural number value or unknown. *)
Inductive RC : Type :=
| val : nat -> RC
| unknown : RC.
Notation "??" := unknown.
(** Register file contents. Function from natural to RC. *)
Definition RFC : Type := nat -> RC.
(** Empty register file contents *)
Definition empty_rf:RFC := (fun n:nat => ??).
(** [get] applies the register file function to its argument *)
Definition get(i:nat)(f:RFC):RC := (f i).
(** [put] uses an [if] expression to modify an existing register file *)
Definition put(i:nat)(f:RFC)(v:RC):RFC :=
(fun n:nat => if (beq_nat i n) then v else f n).
(** Some tests to make sure the implementation is close to correct. *)
Example test1: (get 3 (put 3 empty_rf (val 3))) = (val 3).
Proof. reflexivity. Qed.
Example test2: (get 2 (put 3 empty_rf (val 3))) = ??.
Proof. reflexivity. Qed.
Example test3: (get 4 (put 3 empty_rf (val 3))) = ??.
Proof. reflexivity. Qed.
(** Making the test slightly more abstract by checking all contents values *)
Theorem update_position_3: forall m, (get 3 (put 3 empty_rf (val m))) = (val m).
Proof.
intros.
trivial.
Qed.
(** Lemma showing that [beq] is reflexive. Will be used in a later proof *)
Lemma beq_refl: forall i, beq_nat i i = true.
Proof.
intros.
induction i.
reflexivity.
auto.
Qed.
(** All elements of the empty register file are unknown *)
Theorem empty_empty: forall i , (get i empty_rf) = ??.
Proof.
intros. reflexivity.
Qed.
(** [get] applied to a [put] on the same index updates the register file
contents. This is correctness of the update function. *)
Theorem update_immediate: forall v i rfc,
(get i (put i rfc (val v))) = val v.
Proof.
intros.
unfold get.
unfold put.
rewrite beq_refl.
reflexivity.
Qed.
(** [get] applied to a [put] on different indexes results in the value
before updating. An invariant on update. *)
Theorem update_invariant: forall v i j rfc,
(beq_nat j i) = false -> (get i (put j rfc (val v))) = rfc i.
Proof.
intros.
unfold get.
unfold put.
rewrite H.
trivial.
Qed.
End Array.
```

A couple of hints on the proof of `distr_rev` in your current project. First, it goes largely the same way as the examples we did in class. Start with induction and look for lemmas that can help get rid of proof goals. Second, you may need lemmas for both the base case and inductive cases. Your mileage may vary, but my proof worked out that way. Finally, try rewriting with your inductive hypothesis *before* starting your lemma proofs. Sometimes rewriting will result in a simpler lemma.

Do not forget `simpl` before diving into an auxiliary lemma.

Do go back and check your lemma statements. I had difficulty proving one that was clearly false because of a simple typo.

The bottom line is that this proof goes just like the one in class, with only one or two small twists.
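In case it helps you orient, I’m assuming here the usual statement of `distr_rev` (rev distributing over append); check your project handout for the exact form, and note the proof itself is left to you:

```
Theorem distr_rev : forall l1 l2 : natlist,
  rev (l1 ++ l2) = (rev l2) ++ (rev l1).
```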

I’ve talked with a couple of students who are trying to use `Eval` on theorems. Specifically:

`Eval compute in some_theorem.`

This will not help. You should be using `Eval` with the functions you write for testing. `Eval` applies the tactic `compute` to its argument. If that is a function, then it evaluates the function as far as it can. For a theorem or lemma, there is nothing really that it can do. It will apply the tactic as you ask it to, but I don’t think the results will ever be useful.
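For instance (a made-up function, not from the project), this is the kind of use `Eval` is good for:

```
Definition double(n:nat):nat := n + n.
Eval compute in double 4. (* = 8 : nat *)
```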

I’ve gotten some emails regarding a proof that requires using `rewrite` with a lemma. The theorem shows that `rev` conserves length. First remember our definition of `rev`, which uses a helper function `snoc` that works like `cons` but adds to the end of a list:

```
Fixpoint snoc (l:natlist) (v:nat) : natlist :=
match l with
| nil => [v]
| h :: t => h :: (snoc t v)
end.
```

```
Fixpoint rev (l:natlist) : natlist :=
match l with
| nil => nil
| h :: t => snoc (rev t) h
end.
```

We decided to prove that `length (rev l) = length l` to show that reversing a list does not change its length. This proof, taken from Pierce, tries to prove the theorem using induction in the standard way:

```
Theorem rev_length_firsttry : forall l : natlist,
length (rev l) = length l.
Proof.
intros l. induction l as [| n l'].
(* Case "l = []". *)
reflexivity.
(* Case "l = n :: l'". *)
simpl.
rewrite <- IHl'.
Abort.
```

The base case pops out immediately using `reflexivity`. The inductive case presents a problem. In earlier theorems we’ve been able to rewrite with the induction hypothesis. That doesn’t work here, and the helper function is a hint as to why. There are two recursive functions at work here – `snoc` and `rev` – and we need to know that the length of `snoc l x` is `length l + 1`. It’s `snoc` that causes problems in the first try above.

Let’s prove the theorem we need over `snoc`, which we’ll then use in our proof over `rev`:

```
Theorem length_snoc : forall n : nat, forall l : natlist,
length (snoc l n) = S (length l).
Proof.
intros n l. induction l as [| n' l'].
(* Case "l = nil". *)
reflexivity.
(* Case "l = cons n' l'". *)
simpl. rewrite -> IHl'. reflexivity. Qed.
```

Nothing magic here. We’re just recognizing what we need for the `rev` proof and proving it. Note that the proof looks just like earlier proofs because there is no nested recursive function. We can now use the result in the main `rev` proof:

```
Theorem rev_length : forall l : natlist,
length (rev l) = length l.
Proof.
intros l. induction l as [| n l'].
(* Case "l = nil". *)
reflexivity.
(* Case "l = cons". *)
simpl. rewrite -> length_snoc.
rewrite -> IHl'. reflexivity. Qed.
```

We use the new theorem with `rewrite` the same way we use an assumption. Specifically, `rewrite -> length_snoc` appears in the proof. Now we can rewrite with the induction hypothesis and wrap up the proof.

Stacks of naturals are interesting only to a point. We would like stacks of *anything*. Furthermore, stacks are stacks regardless of what they might contain.

Coq allows us to parameterize types in the same way we parameterize traditional functions. We can parameterize the `stack` inductive type over a type parameter to allow the stack to contain whatever we want:

```
Inductive stack (T:Type) : Type :=
| empty : stack T
| push : T -> stack T -> stack T.
```

Now we can create stacks that contain whatever it is we are interested in keeping around:

`Eval compute in push nat 5 (empty nat).`

We can also update `pop` to operate on this new stack that contains anything:

```
Definition pop (T:Type) (s:stack T) : (stack T) :=
match s with
| empty => empty T
| push x s => s
end.
```

and evaluate `pop`, providing the type explicitly:

`Eval compute in pop nat (push nat 5 (empty nat)).`

We can also have the type for `pop` inferred by enclosing the type parameter in curly braces:

```
Definition pop{T:Type} (s:stack T) : (stack T) :=
match s with
| empty => empty T
| push x s' => s'
end.
```

and not include the type explicitly in the call to `pop`:

`Eval compute in pop (push nat 5 (empty nat)).`
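If you ever need to override the inference, Coq also lets you pass implicit arguments explicitly by prefixing the function name with `@`:

`Eval compute in @pop nat (push nat 5 (empty nat)).`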

The definition of `stack` remains the same when type inference is added to `pop`. Also note that the type parameter is not included in the pattern match defining `pop`. Type parameters are not matched like other parameters to `pop`.

We ended class with this quick example and I don’t expect everyone to follow it. We’ll come back to dependent types later on with more sophisticated techniques. For now, I just want you to see where we’re headed.

What does this signature say?

`Definition pop{T:Type} (s:stack T) : s <> empty T -> (stack T).`

Look first at the new return type of `pop`:

`s <> empty T -> stack T`

which is a function from the *type* `s <> empty T` to a stack. Remember Curry-Howard: a witness to the type `s <> empty T` is a proof of the theorem. So the return value is a function to a value. To get the value, you must provide the proof to the function.

Let’s look at the rest of the new pop definition before looking more carefully at this:

```
Definition pop{T:Type}(s:stack T) : s <> empty T -> (stack T) :=
match s with
| empty => (fun p:(empty T)<>(empty T) => empty T)
| push x t => (fun p:push T x t <> empty T => t)
end.
```

Look at the return value:

`Eval compute in pop (push nat 1 (empty nat)).`

The return value is now a function for both `empty` and `push`. In the `empty` case, the return value is `fun p:empty<>empty => empty`, meaning the result will be `empty` if a proof of `empty<>empty` is provided. That proof does not exist and therefore cannot be provided to the function. Thus, no stack can be returned.

In the `push` case, a proof of `push T x t <> empty T` must be provided to get `t`. That proof is trivial and shown here:

```
Theorem push_ne_empty: forall T x s, push T x s <> empty T.
Proof.
intros.
discriminate.
Qed.
```

Let’s now evaluate a legal `pop`, providing a value for the proof. Note that when the proof is provided, values for the universally quantified variables are provided just like parameters:

`Eval compute in (pop (push nat 1 (empty nat))) (push_ne_empty nat 1 (empty nat)).`

Bingo. Now we get the `empty` stack, exactly as we want. We have provided infrastructure that requires us to prove that `pop` is always called on a value created with `push`.

Let’s step away from this proof for a while and think about what we need using simpler examples. Let’s look at numbers:

```
Inductive nat : Type :=
| O : nat
| S : nat -> nat.
```

This type is built-in, so don’t add it yourself. We can also define `pred` as:

```
Definition pred(n:nat) : nat :=
match n with
| O => O
| S n' => n'
end.
```

Let’s look at this theorem over natural numbers:

```
Theorem plus_id_example : forall n m:nat, n=m -> n + n = m + m.
Proof.
intros.
rewrite -> H.
reflexivity.
Qed.
```

In this case, `intros` does two things. First, it introduces `n` and `m` as expected. However, it also makes `n=m` an assumption:

```
n : nat
m : nat
H : n = m
============================
n + n = m + m
```

It’s pretty clear how to solve this problem. If we can replace `n` with `m` in the goal, we’re done. We do this with the command `rewrite H`. The `rewrite` command uses a term of type `a=b` and replaces `a` with `b` in the goal.

`rewrite H.`

You end up with:

```
n : nat
m : nat
H : n = m
============================
m + m = m + m
```

which is easily discharged with `reflexivity`. You can also go the other way with the proof command `rewrite <- H`, which will replace `m` with `n`.

Another example creates two assumptions with `intros`:

```
Theorem plus_id_exercise : forall n m o:nat,
n = m -> m = o -> n + m = m + o.
Proof.
intros. rewrite -> H. rewrite <- H0. reflexivity.
Qed.
```

What does the arrow mean? Isn’t `T->T` a function? Precisely. The type `n=m -> n+n = m+m` says *if you give me a thing of type `n=m`, I will give you a thing of type `n+n = m+m`*. What is a thing of type `n=m`? It’s a proof of `n=m`! So the function type operator is the same as the logical implication operator. Pretty darn cool!

What do we want to do with this infrastructure? Let’s do a simple language optimization example. First, let’s define the abstract syntax for a simple arithmetic expression language. We have numbers, addition, subtraction, and multiplication:

```
Inductive AE : Type :=
| num : nat -> AE
| plus : AE -> AE -> AE
| minus : AE -> AE -> AE
| times : AE -> AE -> AE.
```

Now, let’s use `Fixpoint` to define an evaluation function:

```
Fixpoint eval(e:AE):nat :=
match e with
| num n => n
| plus l r => (eval l) + (eval r)
| minus l r => (eval l) - (eval r)
| times l r => (eval l) * (eval r)
end.
```

Let’s try it out:

```
Eval compute in eval (num 5).
Eval compute in eval (plus (num 5) (num 10)).
Eval compute in eval (minus (num 5) (num 4)).
```

The evaluation function should be reasonably straightforward. Numbers are constants and evaluate to their values. In every other case, we evaluate arguments and then apply the associated operation to get a natural number result. Classical, recursive evaluation function.

A simple optimization would be to remove zeros from addition expressions. This is an AE-to-AE transformation:

```
Fixpoint optimize(e:AE):AE :=
match e with
| num n => e
| plus (num 0) r => optimize r
| plus l r => plus (optimize l) (optimize r)
| times l r => times (optimize l) (optimize r)
| minus l r => minus (optimize l) (optimize r)
end.
```

All this function does is remove a `0` from the left argument of an addition. Let’s try it out:

```
Eval compute in eval (optimize (num 5)).
Eval compute in eval (optimize (plus (num 0) (num 10))).
Eval compute in eval (optimize (minus (plus (num 0) (num 10)) (num 4))).
```

It’s not enough to test. We want to make certain the optimization is correct. Let’s define a theorem whose proof would give us this evidence. What does correctness mean in this case? It’s simple enough to say that the evaluator works the same before and after optimization. Before we state that in Coq, we should state it in mathematics:

\[\forall e:AE \cdot eval(optimize(e)) = eval(e)\]

For every \(e\), evaluating the optimized \(e\) gives us the same thing as evaluating \(e\).

Now in Coq:

`Theorem correctness: forall e:AE, eval (optimize e) = eval e.`

How do we prove this? We can dive into the prover and start banging away, but you will never prove anything automatically that you can’t prove by hand. So, how would we do this by hand?

Inductive data type implies induction. There are five cases in the definition of `optimize`, four of which are trivial because `optimize` does not change the term. The one interesting case is where \(0+e\) is replaced by \(e\). How do we prove that?

\[eval(0 + e) = eval(e)\]

```
Theorem correctness: forall e:AE, eval (optimize e) = eval e.
Proof.
intros.
induction e.
simpl. reflexivity.
destruct e1. destruct n. simpl. assumption.
simpl. rewrite IHe2. reflexivity.
simpl. simpl in IHe1. rewrite IHe1. rewrite IHe2. reflexivity.
simpl. simpl in IHe1. rewrite IHe1. rewrite IHe2. reflexivity.
simpl. simpl in IHe1. rewrite IHe1. rewrite IHe2. reflexivity.
simpl. simpl in IHe1. rewrite IHe1. rewrite IHe2. reflexivity.
simpl. simpl in IHe1. rewrite IHe1. rewrite IHe2. reflexivity.
Qed.
```

Here’s the first proof goal:

```
n : nat
============================
eval (optimize (num n)) = eval (num n)
```

This is easily solved with `simpl` and `reflexivity`.

A more interesting case is the second proof goal:

```
e1 : AE
e2 : AE
IHe1 : eval (optimize e1) = eval e1
IHe2 : eval (optimize e2) = eval e2
============================
eval (optimize (plus e1 e2)) = eval (plus e1 e2)
```

Here we have a pair of induction hypotheses that are used to do the proof.

We need to talk about `assumption` and what it does with `IHe2`, as well as `rewrite` and what it does with `IHe1` and `IHe2`.

`simpl` is used in a different way in this proof via the `in` option. When we say `simpl in IHe2` we are applying `simpl` to `IHe2` rather than to the goal. Many commands use the `in` qualifier in exactly this way.

The new proof command, `assumption`, attempts to find an assumption that matches the goal. If it finds one, the goal is discharged. If it does not, it fails.

The new proof command, `rewrite IHe1`, attempts to treat the assumption `IHe1` like a rewrite rule. For this to work, `IHe1` must be of the form `A=B`, which it is. `rewrite IHe1` then tries to replace `A` with `B` in the goal. `rewrite <- IHe1` works the other way and tries to replace `B` with `A` in the goal. Rewriting works on assumptions as well, using the notation `rewrite IHe1 in IHe2`. We’ll learn more about automating rewriting later on, but for now we can replace a term by another term equal to it using `rewrite`.

For the curious, here’s a simpler proof that you shouldn’t understand yet, but is worth looking forward to:

```
Theorem correctness': forall e:AE, eval (optimize e) = eval e.
Proof.
intros.
induction e; try (simpl; auto).
destruct e1; try (simpl; auto).
destruct n. simpl. assumption. try (simpl; auto).
Qed.
```

I have been struggling for a while to get Emacs and ProofGeneral to play well with Coq on my Mac. It always works, but there have been bugs in the interface that caused the cursor to jump around, making ProofGeneral difficult to use. I believe I have found the problem. Rather, I believe someone else found the problem and fixed it.

Yesterday I updated to Emacs 24.5 using the Emacs Modified distribution for the Mac. I also updated to ProofGeneral 4.3pre150313. It appears all my problems are solved as Emacs and ProofGeneral are working as advertised. I also installed company-coq, but the jury is still out on that one. Some things I like pretty well, but the auto completion stuff is still annoying. Might be just a matter of getting used to it.

Let’s look at the definition of `bool` as an enumeration.

```
Inductive bool : Type :=
| true : bool
| false : bool.
```

This is exactly as it was for `day`. The type `bool` is defined as the set of values `true` and `false`. Nothing magic here. `bool` is not a keyword in the specification language. It’s just a type whose only values are `true` and `false`.

A simple definition of `not` uses the `match` expression:

```
Definition not(x:bool):bool :=
match x with
| true => false
| false => true
end.
```

Try out `not` on a couple of examples:

```
Eval compute in not true.
> false
> : bool
```

Let’s prove a simple theorem showing that `not` is its own inverse:

```
Theorem not_not : forall x:bool, not(not x) = x.
Proof.
intros. destruct x.
simpl. reflexivity.
simpl. reflexivity.
Qed.
```

We start the proof with `intros`, which eliminates the universal quantifier and introduces `x` as an arbitrary `bool`.

A more compact proof uses `reflexivity` directly and the semicolon to apply `reflexivity` to every subgoal generated by `destruct`. Don’t worry if you don’t follow this yet.

```
Theorem not_not' : forall x:bool, not(not x) = x.
Proof.
intros. destruct x; reflexivity.
Qed.
```

As their name implies, recursive types reference themselves. In Coq, this simply means that the signatures for constructors reference the type being defined.

A primitive `stack` of natural numbers has the following inductive definition:

```
Inductive natstack : Type :=
| empty : natstack
| push : nat -> natstack -> natstack.
```

The `->` symbol represents a function type. `nat -> nat` is a function from naturals to naturals. `nat -> natstack -> natstack` is a function from naturals to a function from `natstack` to `natstack`.

`Eval compute in ((push 5) empty).`

You can also view this as a function from two arguments of type `nat` and `natstack` to another `natstack`.

`Eval compute in (push 5 empty).`

Note that both `Eval` instances give you the same value.

`Eval compute in (push 5)` gives you a function rather than a stack because only one argument is provided. This is called *currying*.
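You can also ask Coq for the types directly (my own check) to see currying at work:

```
Check (push 5).
(* push 5 : natstack -> natstack *)
Check (push 5 empty).
(* push 5 empty : natstack *)
```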

Note that `push` has as one of its arguments a `natstack`. This makes sense, as a `push` by definition takes a stack and a value and pushes the value onto the original stack. This use of `natstack` is why the definition is called recursive.

Constructors for `natstack` define every *value* of the type `natstack`. A value has the property that it cannot be evaluated further. Previously, `Monday` was a value because it clearly cannot be reduced. The same is true for `empty`, and the same is true for `push`. Specifically, `(push 5 empty)` is a value and cannot be reduced further. `(push 6 (push 5 empty))` is similarly a value and cannot be reduced. This may seem odd at first, as we usually have a data structure representing a stack, but this is a perfectly valid representation of stack values.

Let’s create a function over `natstack`:

```
Definition pop(s:natstack):natstack :=
match s with
| empty => empty
| (push n s) => s
end.
```

`pop` takes a stack and returns `empty` if the stack is empty and `s` if it is the result of some `push n s` construction. The `n` value is not used in the second case of `pop`, so it can be left out:

```
Definition pop(s:natstack):natstack :=
match s with
| empty => empty
| (push _ s) => s
end.
```

Note that this is not a recursive function. Let’s define a recursive function with `Fixpoint`:

```
Fixpoint size(s:natstack):nat :=
match s with
| empty => 0
| (push _ s) => 1 + (size s)
end.
```
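Trying `size` on a couple of stacks (my own examples):

```
Eval compute in size empty. (* = 0 : nat *)
Eval compute in size (push 6 (push 5 empty)). (* = 2 : nat *)
```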

`Fixpoint` tells Coq that we are defining a recursive structure as a fixed point. All functions in Coq must terminate, so Coq must show that some argument to `size` gets smaller with every call. This turns out to be simple: `s` is smaller than `push x s`, and Coq is happy. For data structures defined using `Inductive`, this will always be the case if we recurse on the arguments to a constructor for every recursive constructor.

Now prove that the `size` of anything resulting from `push` is greater than 0.

```
Theorem push_gtz: forall x s, size (push x s) > 0.
Proof.
intros. induction s. unfold size. omega.
simpl. omega.
Qed.
```

Let’s talk about some proof commands that we just used.

`Proof` opens the proof and starts an interactive proof session. We are creating a proof object.

`intros` eliminates universal quantifiers and preconditions. In this case, `intros` replaces the quantifiers with the assumptions:

```
x:nat
s:natstack
```

These assumptions state that there is an `x` of type `nat` and an `s` of type `natstack`. However, they say nothing about what those values are. They are therefore *arbitrary* and allow us to draw conclusions about *all* natural numbers and stacks.

`induction` applies the induction principle we get automatically from the inductive definition of stacks. The goal is split into two goals, one for each constructor:

```
x : nat
============================
size (push x empty) > 0
```

The nonrecursive case has no induction hypothesis.

```
x : nat
n : nat
s : natstack
IHs : size (push x s) > 0
============================
size (push x (push n s)) > 0
```

The recursive case has an induction hypothesis that is named `IHs`.

The first case is solved with two commands, `unfold` and `omega`. `unfold size` replaces `size` with its definition, exposing a simple arithmetic goal. `omega` is a powerful Presburger arithmetic solver. In English, it discharges quantifier-free sets of mathematical and logical equations over integers. It is not part of the core system and must be loaded using `Require Import Omega.`

The second case is solved similarly, using `simpl` (which simplifies the goal by rewriting definitions and applying simplification rules) and `omega`.

`Qed` closes the proof.

Our task is to define formal systems that describe systems and to reason within them to determine correctness.

We do this all the time: (i) circuits, for example a parity checker; (ii) examples from physics.

We would like a predictive mathematics for computer systems.

A *formal system* consists of three parts:

- A *formal language*
  - An *alphabet*
  - A *grammar*
- An *inference system*
  - A set of *axioms*
  - A set of *inference rules*
- A *semantics*

A *formal language* is an *alphabet* describing a set of atomic tokens and a *grammar* that defines proper ordering of those tokens. We will spend little time in this course discussing formal languages, but for completeness I’ll include a simple example. Consider a language, \(L\), defining strings of letters:

Alphabet: \(\{a,b,c\}\)

Grammar: \(L ::= empty \mid aL \mid bbL \mid cccL\)

Examples of strings from \(L\) include:

\[a, bba, ccca, cccbbbbccc\]

while examples of strings not in \(L\) include:

\[bbb, cbc, acca\]

The study of languages and grammars is called *formal language theory* or *computation theory*. As it turns out, we can learn tons of good stuff by defining recognizers for languages using automata. A formal language theory course is taught frequently as EECS 716 should you be interested.
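As a preview of where we’re headed, the language \(L\) above can be sketched as an inductive predicate in Coq. This is my own illustrative encoding, not something we’ve defined in class; the names `letter` and `inL` are assumptions, with one constructor per grammar production:

```
Require Import List.
Import ListNotations.

Inductive letter : Type := a | b | c.

(* One constructor per production of L ::= empty | aL | bbL | cccL *)
Inductive inL : list letter -> Prop :=
| L_empty : inL []
| L_a : forall s, inL s -> inL (a :: s)
| L_bb : forall s, inL s -> inL (b :: b :: s)
| L_ccc : forall s, inL s -> inL (c :: c :: c :: s).
```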

An *inference system* is a collection of *inference rules* and *axioms*. Inference rules define immediate consequences of known facts. They have the form:

\[A_0,A_1,A_2,…,A_n \vdash C\]

or equivalently:

\[\frac{A_0,A_1,A_2,…,A_n}{C}\]

The \(A_k\) are called *antecedents* and \(C\) is a *consequent*. The rule states that if all \(A_k\) are known to be true, then \(C\) is an *immediate consequence*. Consider the example inference rule:

\[\frac{\forall x \cdot P(x)}{P(a)}\]

that says if we know \(P\) is true for all \(x\), then it is true for a specific value \(a\). A second rule defines the concept of implication. Assuming that \(A\) and \(B\) are logical statements, the inference rule:

\[\frac{A, A\rightarrow B}{B}\]

says that if we know \(A\) and we know \(A\rightarrow B\), then we immediately know \(B\).
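This implication rule, modus ponens, can be stated and proved directly in Coq (a small sketch; the theorem name is mine):

```
Theorem modus_ponens : forall A B : Prop,
  A -> (A -> B) -> B.
Proof.
  intros A B HA HAB. apply HAB. exact HA.
Qed.
```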

Finally, the inference rule:

\[\frac{\;}{\forall x \cdot bird(x) \rightarrow flies(x)}\]

says that without knowing anything else, if we know that \(x\) is a bird, then \(x\) flies. This rule is an *axiom* because the consequent depends on no antecedents and is true by definition.

A *proof* strings axioms and inference rules together starting from a set of known things and moving towards a conclusion. Using the axiom and rules above, we can prove that tweety flies:

\[\frac{\frac{\frac{\;}{\forall x \cdot bird(x) \rightarrow flies(x)}}{bird(tweety)\rightarrow flies(tweety)} \quad bird(tweety)}{flies(tweety)}\]
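The tweety proof can also be replayed in Coq. Here is a hedged sketch in which `bird`, `flies`, and `tweety` are assumed as section variables rather than real definitions:

```
Section Tweety.
  Variable animal : Type.
  Variable tweety : animal.
  Variables bird flies : animal -> Prop.

  (* The axiom: all birds fly. *)
  Hypothesis birds_fly : forall x, bird x -> flies x.
  (* The known fact. *)
  Hypothesis tweety_bird : bird tweety.

  Theorem tweety_flies : flies tweety.
  Proof. apply birds_fly. exact tweety_bird. Qed.
End Tweety.
```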

Defining models and using inference systems to verify their properties is the basis of this course.

A *semantics* links symbols to their meanings. This is the topic of programming language semantics. We’ll touch on this in our class, but it’s the primary topic of EECS 762.

In this class we will focus on defining models in a specific formal system. A model is simply a collection of definitions in a formal system. Using the formal system, one can reason about properties of those definitions - what they imply, how they are related, what properties they exhibit. You have been doing this since you started doing word problems in mathematics. However, you have quite likely not done this for software artifacts.

The basic mathematics for describing software is logic, and logic will be our formal system. There are many logics for many purposes, but most are rooted in *propositional logic* and *predicate calculus*.

Propositional logic is the arithmetic of logical systems and where we will start.

\[P ::= A,B,C,\ldots \mid P \wedge P \mid P \vee P \mid \neg P \mid P\rightarrow P \mid P\leftrightarrow P\]

This grammar gives logical variables and statements plus the classical logical operators for and, or, not, implies, and logical equivalence.

We have a collection of rules that describe how these operators work. Specifically, rules that *introduce* and *eliminate* operations. The rules for conjunction are:

\[\frac{X,Y}{X \wedge Y} \quad \frac{X\wedge Y}{X} \quad \frac{X\wedge Y}{Y}\]

They are directly used in a forward fashion:

“If I have \(X\) and \(Y\), then \(X\wedge Y\) is an immediate consequence.”

or indirectly in a backward fashion:

“If I want to have \(X\wedge Y\), then I need to know \(X\) and \(Y\).”
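These rules correspond directly to Coq tactics: `split` uses the introduction rule backward, and `destruct` applies the elimination rules. A minimal sketch (theorem names are mine):

```
Theorem and_intro : forall X Y : Prop, X -> Y -> X /\ Y.
Proof. intros X Y HX HY. split. exact HX. exact HY. Qed.

Theorem and_elim_left : forall X Y : Prop, X /\ Y -> X.
Proof. intros X Y H. destruct H as [HX HY]. exact HX. Qed.
```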

Predicate calculus adds variables and quantifiers to propositional logic.

\[P ::= \ldots \mid \forall x:T \cdot P(x) \mid \exists x:T \cdot P(x)\]

where \(x\) is a variable and \(T\) is the type of that variable. Just like propositional logic, we have introduction and elimination rules for our new predicate operations:

\[\frac{\forall x:T \cdot P(x), a:T}{P(a)}\]

\[\frac{P(a)\;\text{where}\;a:T\;\text{is arbitrary}}{\forall x:T \cdot P(x)}\]

Predicate calculus of this form will be our tool for writing specifications.
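The elimination rule for the universal quantifier amounts to function application in Coq; a small sketch (the theorem name is mine):

```
Theorem forall_elim : forall (T : Type) (P : T -> Prop) (a : T),
  (forall x : T, P x) -> P a.
Proof. intros T P a H. apply H. Qed.
```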

Coq is a proof assistant constructed around intuitionistic logic based on the calculus of constructions. Its language is called Gallina. Coq will be our Matlab, a tool for assisting us in the development and manipulation of models for software, requirements, and constraints. We can do much more with Coq and will barely scratch the surface of its capabilities.

Coq provides relatively few primitive operations. Shockingly few if you’re not used to using theorem provers. The reason for this is quite simple - the smaller a program is, the easier it is to get right. If we build something small and build our system only from the primitives it provides, our chances of getting everything right improve dramatically. Thus, the Coq approach of providing a tiny core language and building from it.

In that core is the concept of an *inductive type*. Those of you who work in Haskell or ML have seen these as data types and the inductive type structure in Coq is quite similar:

```
Inductive day : Type :=
| monday : day
| tuesday : day
| wednesday : day
| thursday : day
| friday : day
| saturday : day
| sunday : day.
```

The days of the week are the *values* or *constructors* of this type. They define every possible way that a `day` can be constructed. There are no other elements of `day`.

- `Inductive` - keyword
- `day` - type name
- `:` - type relation
- `Type` - defining a type
- `:=` - assigns a value to the type name
- `.` - terminates the definition

General format for an enumeration definition:

```
Inductive *name* : *type* :=
| *value* : *type*
| *value* : *type*
...
| *value* : *type*.
```

Writing functions over `day` is done using pattern matching:

```
Definition next_weekday (d:day) : day :=
match d with
| monday => tuesday
| tuesday => wednesday
| wednesday => thursday
| thursday => friday
| friday => monday
| saturday => monday
| sunday => monday
end.
```

Non-recursive things are defined in this way using `Definition`.

`Definition *name* (*params*) : *type* := *value*.`

The `match` expression performs a pattern match on `d`, in this case a trivial pattern match on the values of `day`. You have to cover them all, or Coq will tell you.

Notice the form of both definitions:

```
*keyword* *name* *params* : *type* := *value*.
```

Everything in Coq is defined in roughly the same way.

You can check out how `next_weekday` works using `Eval`:

```
Eval compute in (next_weekday friday).
Eval compute in (next_weekday (next_weekday saturday)).
```

You can also do proofs over `next_weekday` and `day`:

```
Example test_next_weekday:
(next_weekday (next_weekday saturday)) = tuesday.
Proof. simpl. reflexivity. Qed.
```

A more interesting proof would be:

```
Theorem irreflexive_next_weekday:
forall x, x <> next_weekday x.
Proof.
intros. induction x; unfold next_weekday; discriminate.
Qed.
```

Some interesting things are going on here. The definition looks like all the rest: a `Theorem` with the name `irreflexive_next_weekday` is being defined. Read literally, the type of `irreflexive_next_weekday` is `forall x, x <> next_weekday x`. The type of the new theorem is the theorem to be proved. This is exactly right. The period after the type throws Coq into the prover, where you define a value of that type: a proof. The term `Proof` begins that definition, and a human-readable scripting language is used to define the proof value.

This is the Curry-Howard Correspondence at work. The proof is a witness to the theorem; it is an element of the type. Hang on to this thought.

Here’s what is going on in the proof:

- Proof - start defining a proof value
- intros - eliminates the universal quantifier and creates an assumption from the quantified variable
- induction - invokes the induction principle associated with an inductive type; every inductive definition provides an induction principle
- discriminate - invokes a proof-by-cases that ignores induction hypotheses; every inductive definition provides a proof-by-cases principle
- unfold - replaces a name with its definition
- Qed - Quick, Eat Donuts (end your proof value definition)

The proof is now named `irreflexive_next_weekday`.

You can also extract code from both the type definition and the function:

```
Extraction day.
Extraction next_weekday.
```

You now have a definition of `day` that is guaranteed to have the properties that were proved.
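If memory serves, extraction targets OCaml by default; the target language can be changed with `Extraction Language` before extracting:

```
(* Switch the extraction target, then extract as before. *)
Extraction Language Haskell.
Extraction day.
Extraction next_weekday.
```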

Welcome to the EECS 755 blog for Fall 2015. I will post interesting things here about class, projects, exams and other related stuff. Please check the blog often, particularly around project time when I tend to post hints and answers to student questions.

As noted on the website, we’re going to be using the Coq verification system this semester. If you want to get started on something, my suggestion would be to download and install Coq. I’ve successfully installed the system on both my Mac and on a VM running RedHat. I’ve never tried and never want to try installing on a PC.

You can use either the Coq IDE or ProofGeneral and Emacs to interface with Coq. I suppose you can use `coqtop` in a terminal, but I would definitely not recommend that. The IDE is likely your best bet for getting started, but I suspect you’ll want to move to ProofGeneral and Emacs as your verification problems get more involved.