Handling fold_left in proofs

OCaml developers at Tezos often use `fold_left` function in the code. Maybe the reason is that it is a bit easier to write efficient code using `fold_left` than `fold_right`.

`fold_left` and `fold_right` from OCaml library: List

``val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'afold_left f init [b1; ...; bn] is f (... (f (f init b1) b2) ...) bn.val fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'bfold_right f [a1; ...; an] init is f a1 (f a2 (... (f an init) ...)).Not tail-recursive.``

And from formal verification point of view, it is much easier to prove statements about `fold_right` either than about `fold_left`. Why? The reason is in how these functions are organized. Formal verification often uses mathematical induction as an instrument for proving statements on the lists.

As you know, `fold_left` would fold the list from the left. So on the first step it adds the first element to the accumulator and the accumulator changes for the next step. That means we can not use the induction hypothesis anymore. Contra in `fold_right` function accumulator wouldn't change until the very end so the induction hypothesis can be applied within the proof.

There is no universal template for solving problems with `fold_left`. Indeed one should be tricky and crufty, while proving statements with `fold_left`, but still there are couple of useful techniques, let's get to know them.

Rewrite `fold_left` with `fold_right`โ

First of all, check if the function inside `fold_left` is symmetric.

note

In mathematics, a function of n variables is symmetric if its value is the same no matter the order of its arguments. Common examples are addition and multiplication.

If so, we are in the easiest case, because we can simply use the theorem `fold_symmetric` from the Coq Lists library.

``  Theorem fold_symmetric :    forall (A : Type) (f : A -> A -> A),    (forall x y z : A, f x (f y z) = f (f x y) z) ->    forall (a0 : A), (forall y : A, f a0 y = f y a0) ->    forall (l : list A), fold_left f l a0 = fold_right f a0 l.``

In more complicated cases, when `folding function` is not symmetric, we have to create simulations equipped with `fold_right`.

The solution can be described by the following sequence of steps:

1. Write a simulation function, which does the same, but uses `fold_right`
2. Prove the equality of the original function (the one with `fold_left`) and it's simulation (the one with `fold_right`)
3. Rewrite inside the proof original function with its simulation
4. Prove the whole statement using induction (and as we mentioned above induction on `fold_right` goes fine).

Generalizationโ

The reason why we can not use the induction hypothesis with `fold_left` is an accumulator, which changes on every reduction step. Generalization of parameters is a very powerful tool: in regular proofs we usually use tactic `generalize dependent parameter_name`. Let's see how to do generalization for the functions with `fold_left`, where it is not that simple.

We plan to solve using the next steps:

1. Create a generalized version of our function (we generalize the accumulator)
2. Prove this simulation equals to original function
3. Rewrite it inside lemma and go by induction
4. We also might need some auxiliary functions (to be defined and proved).

Let's take a detailed, well-commented example

``(* We create two simple inductive types. *)Inductive good : Set := wealth | health | glory . Inductive bad : Set := poverty | disease | oblivion.(* And some conversion functions *)(* Simple conversion *)Definition fate (a : good) : bad :=  match a with  | wealth => poverty  | health => disease  | glory => oblivion  end.           Definition destiny (b : bad) : good :=  match b with  | poverty => wealth  | disease => health  | oblivion => glory  end.     (* Conversion with an accumulator. This function will be used within the [fold_left].   It receives an element and the list-accumulator, converts an element and adds   it to the list-accumulator. *)Definition fate_acc (ls : list bad) (a : good) : list bad :=  match a with  | wealth => poverty :: ls  | health => disease :: ls  | glory => oblivion :: ls  end.           Definition destiny_acc (ls : list good) (a : bad) : list good :=  match a with  | poverty => wealth :: ls  | disease => health :: ls  | oblivion => glory :: ls  end.              (* For these definitions (containing [fold_left]) we will be proving   different statements. *)Definition good_to_bad (ls : list good) : list bad :=  fold_left fate_acc [] ls.Definition bad_to_good (ls : list bad) : list good :=  fold_left destiny_acc [] ls.(* We will work with these lists. *)Definition bad_list : list bad := [oblivion; disease; oblivion; oblivion].Definition good_list : list good := [wealth; health; health; glory].(* Let's check our definitions: *)Compute good_to_bad good_list.Compute bad_to_good bad_list.(* These two statements are equal. *)Compute (bad_to_good bad_list).Compute rev (Lists.List.map destiny bad_list).(* Let's try to prove it. *)Lemma transformation : forall (ls : list bad),    bad_to_good ls = Lists.List.rev (Lists.List.map destiny ls).Proof. intros.       induction ls.       simpl. reflexivity.       simpl.       unfold bad_to_good in *.       simpl. destruct a.       Abort. (*  On this step we have an inductive hypothesis:    IHls : fold_left destiny_acc [] ls = Lists.List.rev (Lists.List.map destiny ls)    And the goal: fold_left destiny_acc [wealth] ls =    (Lists.List.rev (Lists.List.map destiny ls) ++ [destiny poverty])%list    We can see that right after the first iteration we are not able to rewrite     induction hypothesis anymore, because the accumulator is not the same.     Lets do the next steps:     1. Create version of our function with a generalized accumulator    2. Prove this simulation equals to original function    3. Rewrite it inside the lemma and go by induction    4. We will also need some auxiliary lemmas to be defined and proved.*)Definition bad_to_good_generalized  (ls : list bad) (acc : list good) : list good :=  fold_left destiny_acc acc ls.(* Check, before stating a Lemma. *)Compute bad_to_good bad_list.Compute bad_to_good_generalized bad_list [].Lemma bad_to_good_eq : forall ls, bad_to_good ls = bad_to_good_generalized ls [].Proof.  reflexivity.Qed. (* We will need the accumulator. *)Definition accum : list good := [wealth; glory].(* Check, before stating a Lemma. *)Compute bad_to_good_generalized bad_list accum.(*  = [glory; glory; health; glory; wealth; glory]     : list good *)Compute (bad_to_good_generalized bad_list [] ++ accum)%list. (*  = [glory; glory; health; glory; wealth; glory]     : list good *)(* Auxiliary lemma, we will need it in the main proof. *)Lemma bad_to_good_generalized_acc : forall ls acc,    bad_to_good_generalized ls acc =      (bad_to_good_generalized ls [] ++ acc)%list.Proof.  intros.  (* We are going to do an induction by ls.     To make our inductive hypothesis to be applicable to any acc,      we generalize it. *)  generalize dependent acc.  induction ls; intros.  (* base case *)  { simpl.  sfirstorder. }  (* inductive case *)  { destruct a eqn:A.    (* a = poverty *)    { unfold bad_to_good_generalized. simpl.      replace (fold_left destiny_acc (wealth :: acc) ls) with        (bad_to_good_generalized ls (wealth :: acc )) by sfirstorder.      rewrite IHls.      replace (fold_left destiny_acc [wealth] ls) with        (bad_to_good_generalized ls [wealth]) by sfirstorder.      replace (bad_to_good_generalized ls [wealth]) with        ((bad_to_good_generalized ls [] ++ [wealth])%list) by sfirstorder.      rewrite <- List.app_assoc. sfirstorder. }    (* using the same block of code we prove rest cases for `bad` type *)    ... (* The main lemma, trying to prove initial statement with [generalization] method. *)Lemma transformation_attempt2 : forall (ls : list bad),    bad_to_good ls = Lists.List.rev (Lists.List.map destiny ls).Proof.  intros.  rewrite bad_to_good_eq.  (* we now work with generalized version *)  induction ls. reflexivity.  simpl.  (* we do one step reduction *)  destruct a; simpl; unfold bad_to_good_generalized; simpl.  replace (fold_left destiny_acc [wealth] ls) with    (bad_to_good_generalized ls [wealth]) by sfirstorder.  (* we are using our auxillary lemma here *)  rewrite bad_to_good_generalized_acc.  (* now we can rewrite our inductive hypothesis *)  rewrite IHls; reflexivity. (* voila! *)  (* Let's do the same for 2 more constructors of [good] type.     We could pack it all in group of tactics but we don't     do so in order to keep proof demonstrative. *)  (* Same for `health` *)  replace (fold_left destiny_acc [health] ls) with    (bad_to_good_generalized ls [health]) by sfirstorder.  rewrite bad_to_good_generalized_acc.  rewrite IHls; reflexivity.  (* Same for `glory` *)  replace (fold_left destiny_acc [glory] ls) with    (bad_to_good_generalized ls [glory]) by sfirstorder.  rewrite bad_to_good_generalized_acc.  rewrite IHls; reflexivity.Qed.``

We used this method when we were doing formal verification of Skip_list_repr.

Yes, proving statements containing fold_left is not always trivial, for instance, as an example for this blog post I tried to prove that functions `good_to_bad` and `bad_to_good` are inverse :

``Lemma round_and_round (ls : list bad) : good_to_bad (bad_to_good ls) = ls.``

Proof turned out to be such gymnastics that it was not good for simple visual example.

There is one very good thing about proving statements containing `fold_left` though: it is always fascinating, cos every Lemma is a small intellectual challenge.