Skip to main content

Handling fold_left in proofs

ยท 8 min read
Natalie Klaus

OCaml developers at Tezos often use fold_left function in the code. Maybe the reason is that it is a bit easier to write efficient code using fold_left than fold_right.

fold_left and fold_right from OCaml library: List

val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a

fold_left f init [b1; ...; bn] is f (... (f (f init b1) b2) ...) bn.

val fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b

fold_right f [a1; ...; an] init is f a1 (f a2 (... (f an init) ...)).
Not tail-recursive.

And from formal verification point of view, it is much easier to prove statements about fold_right either than about fold_left. Why? The reason is in how these functions are organized. Formal verification often uses mathematical induction as an instrument for proving statements on the lists.

As you know, fold_left would fold the list from the left. So on the first step it adds the first element to the accumulator and the accumulator changes for the next step. That means we can not use the induction hypothesis anymore. Contra in fold_right function accumulator wouldn't change until the very end so the induction hypothesis can be applied within the proof.

There is no universal template for solving problems with fold_left. Indeed one should be tricky and crufty, while proving statements with fold_left, but still there are couple of useful techniques, let's get to know them.

Rewrite fold_left with fold_rightโ€‹

First of all, check if the function inside fold_left is symmetric.

note

In mathematics, a function of n variables is symmetric if its value is the same no matter the order of its arguments. Common examples are addition and multiplication.

If so, we are in the easiest case, because we can simply use the theorem fold_symmetric from the Coq Lists library.

  Theorem fold_symmetric :
forall (A : Type) (f : A -> A -> A),
(forall x y z : A, f x (f y z) = f (f x y) z) ->
forall (a0 : A), (forall y : A, f a0 y = f y a0) ->
forall (l : list A), fold_left f l a0 = fold_right f a0 l.

In more complicated cases, when folding function is not symmetric, we have to create simulations equipped with fold_right.

The solution can be described by the following sequence of steps:

  1. Write a simulation function, which does the same, but uses fold_right
  2. Prove the equality of the original function (the one with fold_left) and it's simulation (the one with fold_right)
  3. Rewrite inside the proof original function with its simulation
  4. Prove the whole statement using induction (and as we mentioned above induction on fold_right goes fine).

Generalizationโ€‹

The reason why we can not use the induction hypothesis with fold_left is an accumulator, which changes on every reduction step. Generalization of parameters is a very powerful tool: in regular proofs we usually use tactic generalize dependent parameter_name. Let's see how to do generalization for the functions with fold_left, where it is not that simple.

We plan to solve using the next steps:

  1. Create a generalized version of our function (we generalize the accumulator)
  2. Prove this simulation equals to original function
  3. Rewrite it inside lemma and go by induction
  4. We also might need some auxiliary functions (to be defined and proved).

Let's take a detailed, well-commented example


(* We create two simple inductive types. *)
Inductive good : Set := wealth | health | glory .
Inductive bad : Set := poverty | disease | oblivion.

(* And some conversion functions *)

(* Simple conversion *)
Definition fate (a : good) : bad :=
match a with
| wealth => poverty
| health => disease
| glory => oblivion
end.

Definition destiny (b : bad) : good :=
match b with
| poverty => wealth
| disease => health
| oblivion => glory
end.

(* Conversion with an accumulator. This function will be used within the [fold_left].
It receives an element and the list-accumulator, converts an element and adds
it to the list-accumulator. *)
Definition fate_acc (ls : list bad) (a : good) : list bad :=
match a with
| wealth => poverty :: ls
| health => disease :: ls
| glory => oblivion :: ls
end.

Definition destiny_acc (ls : list good) (a : bad) : list good :=
match a with
| poverty => wealth :: ls
| disease => health :: ls
| oblivion => glory :: ls
end.

(* For these definitions (containing [fold_left]) we will be proving
different statements. *)
Definition good_to_bad (ls : list good) : list bad :=
fold_left fate_acc [] ls.

Definition bad_to_good (ls : list bad) : list good :=
fold_left destiny_acc [] ls.

(* We will work with these lists. *)
Definition bad_list : list bad := [oblivion; disease; oblivion; oblivion].
Definition good_list : list good := [wealth; health; health; glory].

(* Let's check our definitions: *)
Compute good_to_bad good_list.
Compute bad_to_good bad_list.

(* These two statements are equal. *)
Compute (bad_to_good bad_list).
Compute rev (Lists.List.map destiny bad_list).

(* Let's try to prove it. *)

Lemma transformation : forall (ls : list bad),
bad_to_good ls = Lists.List.rev (Lists.List.map destiny ls).
Proof. intros.
induction ls.
simpl. reflexivity.
simpl.
unfold bad_to_good in *.
simpl. destruct a.
Abort.
(* On this step we have an inductive hypothesis:
IHls : fold_left destiny_acc [] ls = Lists.List.rev (Lists.List.map destiny ls)

And the goal: fold_left destiny_acc [wealth] ls =
(Lists.List.rev (Lists.List.map destiny ls) ++ [destiny poverty])%list

We can see that right after the first iteration we are not able to rewrite
induction hypothesis anymore, because the accumulator is not the same.

Lets do the next steps:
1. Create version of our function with a generalized accumulator
2. Prove this simulation equals to original function
3. Rewrite it inside the lemma and go by induction
4. We will also need some auxiliary lemmas to be defined and proved.
*)

Definition bad_to_good_generalized
(ls : list bad) (acc : list good) : list good :=
fold_left destiny_acc acc ls.

(* Check, before stating a Lemma. *)
Compute bad_to_good bad_list.
Compute bad_to_good_generalized bad_list [].

Lemma bad_to_good_eq : forall ls, bad_to_good ls = bad_to_good_generalized ls [].
Proof.
reflexivity.
Qed.

(* We will need the accumulator. *)
Definition accum : list good := [wealth; glory].

(* Check, before stating a Lemma. *)
Compute bad_to_good_generalized bad_list accum.
(* = [glory; glory; health; glory; wealth; glory]
: list good *)
Compute (bad_to_good_generalized bad_list [] ++ accum)%list.
(* = [glory; glory; health; glory; wealth; glory]
: list good *)

(* Auxiliary lemma, we will need it in the main proof. *)
Lemma bad_to_good_generalized_acc : forall ls acc,
bad_to_good_generalized ls acc =
(bad_to_good_generalized ls [] ++ acc)%list.
Proof.
intros.
(* We are going to do an induction by ls.
To make our inductive hypothesis to be applicable to any acc,
we generalize it. *)
generalize dependent acc.
induction ls; intros.
(* base case *)
{ simpl. sfirstorder. }
(* inductive case *)
{ destruct a eqn:A.
(* a = poverty *)
{ unfold bad_to_good_generalized. simpl.
replace (fold_left destiny_acc (wealth :: acc) ls) with
(bad_to_good_generalized ls (wealth :: acc )) by sfirstorder.
rewrite IHls.
replace (fold_left destiny_acc [wealth] ls) with
(bad_to_good_generalized ls [wealth]) by sfirstorder.
replace (bad_to_good_generalized ls [wealth]) with
((bad_to_good_generalized ls [] ++ [wealth])%list) by sfirstorder.
rewrite <- List.app_assoc. sfirstorder. }
(* using the same block of code we prove rest cases for `bad` type *)
...

(* The main lemma, trying to prove initial statement with [generalization] method. *)
Lemma transformation_attempt2 : forall (ls : list bad),
bad_to_good ls = Lists.List.rev (Lists.List.map destiny ls).
Proof.
intros.
rewrite bad_to_good_eq. (* we now work with generalized version *)
induction ls. reflexivity.
simpl.
(* we do one step reduction *)
destruct a; simpl; unfold bad_to_good_generalized; simpl.
replace (fold_left destiny_acc [wealth] ls) with
(bad_to_good_generalized ls [wealth]) by sfirstorder.
(* we are using our auxillary lemma here *)
rewrite bad_to_good_generalized_acc.
(* now we can rewrite our inductive hypothesis *)
rewrite IHls; reflexivity. (* voila! *)
(* Let's do the same for 2 more constructors of [good] type.
We could pack it all in group of tactics but we don't
do so in order to keep proof demonstrative. *)
(* Same for `health` *)
replace (fold_left destiny_acc [health] ls) with
(bad_to_good_generalized ls [health]) by sfirstorder.
rewrite bad_to_good_generalized_acc.
rewrite IHls; reflexivity.
(* Same for `glory` *)
replace (fold_left destiny_acc [glory] ls) with
(bad_to_good_generalized ls [glory]) by sfirstorder.
rewrite bad_to_good_generalized_acc.
rewrite IHls; reflexivity.
Qed.

We used this method when we were doing formal verification of Skip_list_repr.

Yes, proving statements containing fold_left is not always trivial, for instance, as an example for this blog post I tried to prove that functions good_to_bad and bad_to_good are inverse :

Lemma round_and_round (ls : list bad) : good_to_bad (bad_to_good ls) = ls.

Proof turned out to be such gymnastics that it was not good for simple visual example.

There is one very good thing about proving statements containing fold_left though: it is always fascinating, cos every Lemma is a small intellectual challenge.