One of the points of concern in the Tezos code is integer overflows. To ensure that no overflow occurs, the development team created abstractions over the native OCaml integers. This makes it possible to check if an overflow happened and to return an error if that is the case.
But how to be sure that such abstractions are correct? In this post, we will explain how integer arithmetic can be verified using Coq.
As it is known, OCaml native integers are bounded by some bit width. We have
int
with 63 bits, int32
and int64
with 32 and 64 bits respectively. When
an operation exceeds its width capacity, we say that an overflow occurred.
In Coq this is another story, we have infinite integers, so the first thing we
need to do is to define the modular arithmetic found in OCaml. We will
use the int64
as an example, but the same reasoning applies to int
and
int32
, we just have to adjust the boundaries.
To define an OCaml integer we use Z
and define a Valid.t
type which
encloses a value between OCaml Int64.max_int
and Int64.min_int
, here is it:
Module Valid.
Defintion t (x : Z) : Prop := Int64.min_int <= x <= Int64.max_int.
End Valid.
The second thing is to define the operations, we need to ensure that operations
are closed under Valid.t
, that they respect the bounds Int64.min_int
and
Int64.max_int
. To do this, we define a normalize_int64
function that uses
Z.modulo
to ensure that the results are kept between the bounds. Here is it:
Definition two_pow_63 : Z := 9223372036854775808.
Definition two_pow_64 : Z := 18446744073709551616.
Definition normalize_int64 (n : Z) : int64 :=
Z.modulo (n + two_pow_63) two_pow_64 - two_pow_63.
Now we are ready to define the operations. We define them as the same operations
in Z, but surrounded by normalize_int64
Definition add (a : t) (b : t) : t := normalize_int64 (Z.add a b).
Definition sub (a : t) (b : t) : t := normalize_int64 (Z.sub a b).
Definition mul (a : t) (b : t) : t := normalize_int64 (Z.mul a b).
Definition div (a : t) (b : t) : t := normalize_int64 (Z.div a b).
At this point we have our definitions, now we need to ensure that they are valid and check for some properties. There are two main properties that we want to check.
First, we want to ensure that the operations are closed under Valid.t
,
in English, this would be something like "The sum of two int64
is always a valid int64
".
We verify the four operations as follows:
Lemma int64_add_is_valid a b : Valid.t (a +i64 b). lia. Qed.
Lemma int64_mul_is_valid a b : Valid.t (a *i64 b). lia. Qed.
Lemma int64_div_is_valid a b : Valid.t (a /i64 b). lia. Qed.
Lemma int64_sub_is_valid a b : Valid.t (a -i64 b). lia. Qed.
Here, a +i64 b
is a Coq
notation
to represent OCaml's Int64.(a + b)
.
At this point, we are sure that our definition is valid, that for
any two int64
we always yield a valid int64
. But this is not enough,
if we define all the operations as 0
, for example, this specification would
still be satisfied.
To fix this, we want to verify that the operation we defined is equivalent to the same operation in Z. I mean, we want to ensure that the sum is equal to the sum in Z, the subtraction is equal to the subtraction in Z, so on and so forth.
We verify the addition as follows:
Lemma add_eq : forall a b,
Valid.t (a +Z b) -> a +i64 b = a +Z b.
Proof.
lia.
Qed.
And we do the same for all the three remaining operations.
At this point, we verified that our definition of OCaml's integer arithmetic is correct, but we have not verified any Tezos code yet. Tezos abstractions are built over OCaml integers, so knowing that our definition of OCaml integers is fine we can proceed to verify a higher abstraction. The Tez_repr module is used as an example here.
Tez_repr.t
is a wrapper over an OCaml's int64
. The procedure for the
verification follows the same reasoning used before. We derive a domain over
where a Tez_repr.t
is valid, in this case, it is defined as a non-negative
int64
.
Module Valid.
Definition t (x : int64) : Prop := 0 <= x <= Int64.max_int
End Valid.t
Now comes the cool part, the definition of the operations are translated from
OCaml code using coq-of-ocaml.
So we are left to verify that each operation is valid, and that, under the
right domain, they are equivalent to the same operation in Z
.
Here is an example for the addition of two Tez_repr.t
,
(** On success, addition is closed under Tez_rept.Valid.t *)
Lemma op_plusquestion_is_valid : forall {t1 t2 : Tez_repr.t},
Valid.t t1 ->
Valid.t t2 ->
match Tez_repr.op_plusquestion t1 t2 with
| Pervasives.Ok t => Valid.t t
| Pervasives.Error _ => True
end.
Proof.
intros; unfold Tez_repr.op_plusquestion.
destruct_all Tez_repr.t.
destruct (_ <i64 _) eqn:?; simpl; lia.
Qed.
(** On success, Tez_repr.op_plusquestion equals to addition in Z *)
Lemma op_plusquestion_eq : forall {t t1 t2 : int64},
Valid.t (Tez_repr.Tez_tag t1) ->
Valid.t (Tez_repr.Tez_tag t2) ->
match Tez_repr.op_plusquestion
(Tez_repr.Tez_tag t1)
(Tez_repr.Tez_tag t2) with
| Pervasives.Ok (Tez_repr.Tez_tag t) => t = t1 +Z t2
| Pervasives.Error _ => True
end.
Proof.
Utils.tezos_z_autounfold. simpl.
intros t t1 t2 H1 H2.
destruct (_ <? _) eqn:H_lt; hfcrush.
Qed.
So with these two proofs, we verified that the sum of two Tez_repr.t
is
always a non-negative int64
, and, if no overflow occurs, the sum of two
Tez_rept.t
will be just the same as the sum of two numbers in Z, we just
have a more restricted domain than Z.