Property based tests are a way to find bugs in the code by running a boolean function (representing an expected property) on randomly generated inputs. In this blog post, we show how we translate these tests to corresponding Coq theorems. Once we have a formal representation, we prove them correct on any inputs. Thanks to this technique, we can:
- have a bridge between the specification given by the tests and the specification in Coq;
- make sure that the properties verified in the tests are true with a mathematical degree of certainty.
We have done this work with students from the University of Groningen:
- Joris Peters https://github.com/jorisptrs
- Rienk Bergsma https://github.com/Dragonflew17
- Tait van Strien https://gitlab.com/7A1T
- Tino Alferink https://gitlab.com/Tinoalfie
- Tudor Roscoiu https://gitlab.com/tudoroscoiu
Here is a short presentation that we gave at a meeting with developers of Tezos: Property based tests with Coq. To see an example of a formally verified test, you can look at:
- Test_sc_rollup_tick_repr.v: the tests for Sc_rollup_tick_repr.v translated into Coq together with their proofs;
- test_sc_rollup_tick_repr.ml: the corresponding test file in OCaml.
Example in OCamlโ
As an example, we take the file sc_rollup_tick_repr.ml defining a notion of "tick":
include Z
let initial = zero
let next = succ
let pp = pp_print
let encoding = Data_encoding.n
let distance tick1 tick2 = Z.abs (Z.sub tick1 tick2)
(* ... other definitions ... *)
with its associated property based tests in test_sc_rollup_tick_repr.ml. An example of test is:
(** A generator for ticks *)
let tick : Tick.t QCheck.arbitrary =
QCheck.(
Gen.(make (Option.value ~default:Tick.initial <$> map Tick.of_int nat)))
let test_distance_triangle_inequality =
QCheck.Test.make
~name:"distance is a distance (triangle inequality)"
(triple tick tick tick)
@@ fun (x, y, z) ->
Tick.(Z.(geq (distance x y + distance y z) (distance x z)))
This checks that, for all x
, y
, and z
, the following property holds:
distance x y + distance y z >= distance x z
We use the OCaml framework QCheck to write these property based tests. The value tick
is a generator, that is to say a function generating random values of a tick. We create this generator using combinators:
nat
generates non-negativeint
values;map Tick.of_int nat
applies to these values the conversionTick.of_int
fromint
to ticks;Option.value ~default:Tick.initial <$> map Tick.of_int nat
returns a default valueTick.initial
in caseTick.of_int
returnsNone
.
Translation in Coqโ
We run coq-of-ocaml to translate the code and the tests to Coq. The generated files are the followings:
- Sc_rollup_tick_repr.v for the code;
- Test_sc_rollup_tick_repr.v for the tests.
For the test files, we do some manual transformation in addition to what coq-of-ocaml
generates to fix some missing library imports. However, given more work on the translation pipeline, these manual transformations should not be needed.
To represent property based tests as Coq theorems, we replace the randomly generated values with "forall" quantifiers. We restrict these "forall" quantifiers to the domain of values that can be generated. We do this transformation for the combinators of the generators so that the tests look the same when expressed in Coq theorems:
Definition tick : QCheck.arbitrary Sc_rollup_tick_repr.t :=
QCheck.make None None None None None
(QCheck.Gen.op_ltdollargt
(fun x_1 =>
Option.value_value x_1
Sc_rollup_tick_repr.initial)
(QCheck.Gen.map
Sc_rollup_tick_repr.of_int
QCheck.Gen.nat)).
Definition test_distance_triangle_inequality : QCheck.Test.t :=
QCheck.Test.make None None None None None None
(Some "distance is a distance (triangle inequality)")
(QCheck.triple_value tick tick tick)
(fun (function_parameter :
Sc_rollup_tick_repr.t *
Sc_rollup_tick_repr.t *
Sc_rollup_tick_repr.t) =>
let '(x_value, y_value, z_value) := function_parameter in
Z.geq
(Z.add
(Sc_rollup_tick_repr.distance x_value y_value)
(Sc_rollup_tick_repr.distance y_value z_value))
(Sc_rollup_tick_repr.distance x_value z_value)).
Combinatorsโ
The differences with the original OCaml code are mostly syntactical. However, the type QCheck.arbitrary
for the generators is now:
Definition arbitrary (a : Set) : Type := a -> Prop.
so a predicate over a type a
. The definition tick
is a predicate over values of type Sc_rollup_tick_repr.t
, corresponding to the values that can be generated by the OCaml generator tick
. As an example, here is how we define the combinator QCheck.Gen.map
:
Definition map {a b : Set} (f : a -> b)
(gen : arbitrary a) : arbitrary b :=
fun v =>
โ v', gen v' โง f v' = v.
It says that a value can be generated by a map
if there exists a predecessor for the function f
that we can generate using gen
. You can find the definition of all the combinators we are using in QCheck.v.
Theoremโ
In Coq, the definition test_distance_triangle_inequality
is of type QCheck.Test.t
that is a synonym of Prop
. We define the function make
to create a test as:
Definition make {a : Set}
(_ _ _ _ _ _ _ : option string)
(generator : QCheck.arbitrary a)
(test_function : a -> bool) : t :=
forall x,
generator x ->
test_function x = true.
All the _
parameters represent optional parameters in OCaml and are here for typing reasons. The important part is that we check:
forall x,
generator x ->
test_function x = true.
that is to say that the property function always returns true
when applied to a value in the domain of the generator. We state the theorem for the validity of our test as follows:
Lemma test_distance_triangle_inequality_proof :
test_distance_triangle_inequality.
Proofโ
We generally proceed in two steps:
- We state in a clear way what is the domain of the generators:
Lemma tick_is_non_negative_int (n : Z.t) :
tick n ->
Pervasives.Int.Valid.non_negative n.
Proof.
(* ... *)
Qed.
This states that: "the generator tick
produces non-negative integers in the range of values of the type int
".
- We write the proof using the domain of the generators. We follow the structure of the property test showing that the result must be
true
in all possible branches.
Here is the proof for the lemma test_distance_triangle_inequality_proof
. Note that we did not need to use the domain of the generator in this particular example:
Lemma test_distance_triangle_inequality_proof :
test_distance_triangle_inequality.
Proof.
unfold test_distance_triangle_inequality, QCheck.Test.make.
intros.
destruct x eqn:?.
destruct p eqn:?.
unfold Sc_rollup_tick_repr.distance.
lia.
Qed.
General translationโ
Here we summarize the general translation process from a property based test to a theorem. We translate a random generator to a predicate describing its domain. We translate a boolean property, applied to this generator, to a statement saying that this property is always true
on the corresponding domain.
Language | Generator | Property |
---|---|---|
OCaml | gen : seed โ a | p : a โ bool |
Coq | pre : a โ Prop pre(x) = (x โ image(gen)) | โ x, pre(x) โ p(x) = true |
What we haveโ
We have a translation in Coq and a part of the proofs for the following test files:
Nextโ
We plan to continue writing the proofs and translating other test files. We would also like to completely automate the translation from the OCaml test files to Coq. This task is complex because the tests depend on many other files from the shell part of Tezos, that we would need to translate also.
The techniques we described are not restricted to OCaml and Coq, and could be used for other languages and proof systems.