Skip to main content

Verifying the compare functions of OCaml

ยท 7 min read
Guillaume Claret

In OCaml, we create a compare function for many data structures to have:

  • an ordering, and:
  • an equality function.

These compare functions should behave as follows:

  • compare x y returns -1 when x is "lesser" than y;
  • compare x y returns 0 when x is "equal" to y;
  • compare x y returns 1 when x is "greater" than y.

The compare functions are useful to implement many algorithms and derive data structures such as maps and sets. Having an issue in the consistency of such functions can lead to bugs in the code using them. An example of inconsistency is to have both:

  • compare x y = 1 and:
  • compare y x = 1

for some values of x and y. A complex example of compare function in the code of Tezos is located in the Script_comparable.v file to compare arbitrary Michelson values.

In this blog post we will see how we specify the compare functions and verify them.

info

You can see the main result of this work in the file Proofs/Compare.v.

Specificationโ€‹

We require the compare functions to have values in the set {-1, 0, +1} and to yield "equal" and "lesser than or equal" operators verifying the usual standard mathematical properties:

  • reflexivity, symmetry, and transitivity for the equality;
  • reflexivity, anti-symmetry, and transitivity for the "lesser than or equal" operator.

These operators are defined as follows in OCaml:

let equal x y =
compare x y = 0

let lesser_than_or_equal x y =
compare x y <= 0

We simplify the conditions on these operators to four equivalent conditions on compare given in the following Coq record:

Module Valid.
Record t {a : Set} {compare : a -> a -> int} : Prop := {
image x y :
match compare x y with
| -1 | 0 | 1 => True
| _ => False
end;
zero x y :
compare x y = 0 -> x = y;
sym x y :
compare x y = - compare y x;
trans x y z :
compare x y = 1 ->
compare y z = 1 ->
compare x z = 1;
}.
Arguments t {_}.
End Valid.

When the property Valid.t compare is verified with a Coq proof, we consider the corresponding compare function to be valid. The four conditions given in Valid.t are enough to derive the usual properties on the operators equal and lesser_than_or_equal. For example, we can verify the reflexivity of the equality:

forall x, compare x x = 0

using the sym property. Having a minimized set of properties to verify simplifies the validation of the compare functions.

Extended specificationโ€‹

We needed to extend this specification for two reasons:

  1. Some compare : a -> a -> int functions are not well defined on their whole type a and require a pre-condition on their input values.
  2. Some compare functions do not check for the equality but for a more generic equivalence relation.

To solve the first issue, we add a domain : a -> Prop function stating on which subset of a the compare is defined. For the second issue, we add a function f : a -> a' describing the equivalence class of each element. We consider that two elements are equal if and only if the following condition holds:

f x = f y

Using a function to describe the equivalence classes is convenient for expressing an equivalence relation. Indeed, we get the reflexivity, symmetry, and transitivity for free because the standard equality = is an equivalence relation.

Adding these extensions, we get this new complete definition of a validity predicate for the comparison:

Module Valid.
Record t {a a' : Set}
{domain : a -> Prop} {f : a -> a'} {compare : a -> a -> int} : Prop := {
congruence_left x1 x2 y :
domain x1 -> domain x2 -> domain y ->
f x1 = f x2 -> compare x1 y = compare x2 y;
image x y :
domain x -> domain y ->
match compare x y with
| -1 | 0 | 1 => True
| _ => False
end;
zero x y :
domain x -> domain y ->
compare x y = 0 -> f x = f y;
sym x y :
domain x -> domain y ->
compare x y = - compare y x;
trans x y z :
domain x -> domain y -> domain z ->
compare x y = 1 ->
compare y z = 1 ->
compare x z = 1;
}.
Arguments t {_ _}.
End Valid.

In this new specification, we add:

  • a domain used as a pre-condition on all the parameters for the compare function;
  • a condition congruence_left to say that the value of compare does not change when we stay in the same equivalence class;
  • a new condition zero stating that if compare x y then f x = f y instead of x = y previously.

This extended specification is equivalent to the original one if we take:

  • domain := fun _ => True
  • f := id (the identity function)

In practice, for most of our examples, we have these simplified conditions with domain := fun _ => True and f := id.

Proof methodโ€‹

Because it is tedious and repetitive to verify the compare functions, we designed a set of proven-correct compare combinators. Then we prove that the compare functions are valid showing they can be expressed from these combinators.

Let us take the example of compare from Script_string_repr.v defined in OCaml as:

let compare (String_tag x) (String_tag y) = Compare.String.compare x y

and translated by coq-of-ocaml to the Coq code:

Definition compare (function_parameter : t) : t -> int :=
let 'String_tag x_value := function_parameter in
fun (function_parameter : t) =>
let 'String_tag y_value := function_parameter in
Compare.String.(Compare.S.compare) x_value y_value.

We formally verify it with the following lemma in Proofs/Script_string_repr.v:

Lemma compare_is_valid :
Compare.Valid.t (fun _ โ‡’ True) id Script_string_repr.compare.
Proof.
apply (Compare.equality (
let proj '(Script_string_repr.String_tag s) := s in
Compare.projection proj String.compare
)); [sauto lq: on|].
eapply Compare.implies.
{ eapply Compare.projection_is_valid.
apply Compare.string_is_valid.
}
all: sauto q: on.
Qed.

The statement says that Script_string_repr.compare is valid for any value and the equivalence relation defined by id, that is to say =. We proceed by the following steps:

  1. We show that this compare function is equal to a projection from the type Script_string_repr.t with a single constructor String_tag, composed to the standard String.compare function. We prove the equality using the automated tactic sauto lq: on as suggested by the tactic best.
  2. We use the Compare.implies lemma to transform the domain and the function f to existential variables. This is necessary as the proofs of the compare combinators may not syntactically yield the same domain and function f as we have.
  3. We apply the lemmas Compare.projection_is_valid and Compare.string_is_valid stating that our primitive combinators are valid.
  4. We show that the domain and function f we obtain are equivalent to the ones in the statement. We use the automated tactic sauto q: on for that, also suggested by best.

Future workโ€‹

For future work, we would like to:

  • Extend the set of combinators we have. In particular, we would like to have a generic compare combinator to define and verify comparisons over algebraic data types with many cases. An example of usage would be to verify the function compare_balance from Receipt_repr.v.
  • Automate parts of the proof, like applying the lemmas for the validity of the combinators.