In Tezos, the skip-list data structure is used in several places. Verification of
skip-list
required many days of painstaking work, and this work is not yet finished.
In this post, we will talk about the techniques we used.
In particular, we describe the verification of function mem
- one of the functions
operating in this data structure. More information about formal verification of skip list
can be found here.
note
A skip list is a probabilistic data structure, the extended form of a list (ubiquitous in functional programming). The skip list is used to store a sorted list of elements or data with a linked list. It consists of cells and each cell carries some content and a list of backpointers. In one single step, it skips several elements of the entire list. That is why it has that name: "skip list".
First steps of verification​
Let us consider the function mem
- one of the functions from the
Skip_list_repr
. For example, we have two neighbor cells: the first cell
and
the second cell
in a sequence. Function mem
checks if a precise pointer
to the second cell
presents in the list of backpointers of the previous
cell - the first cell
.
#[bypass_check(guard)]
Definition mem {FArgs} {A B : Set}
(equal : A → B → bool) (x_value : A)
(l_value : FallbackArray.t (option B)) : bool :=
let n_value := FallbackArray.length l_value in
let fix aux (idx : int) {struct idx} : bool :=
if idx ≥i n_value then
false
else
match FallbackArray.get l_value idx with
| None ⇒ aux (idx +i 1)
| Some y_value ⇒
if equal x_value y_value then
true
else
aux (idx +i 1)
end in
aux 0.
First step in verification of this function is to retrieve inner fixpoint out of the definition. Nested definitions are very difficult to verify. Our primary task is to make a total decomposition of all functions, disassemble them into simple components.
The first issue we faced during the work with this function was that (like many
other functions in skip-list) it was marked with #[bypass_check(guard)]
, which
means that Coq could not guess its decreasing argument (function is ill-formed,
not total, it may diverge). In order to be able to work with the function we
created its simulation, equipped by fuel - the natural number.
Fixpoint mem_aux {FArgs} {A B : Set} (idx : int)
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) n_value fuel {struct fuel}: bool :=
match fuel with
| Datatypes.O => false
| Datatypes.S fuel =>
if idx >=i n_value then
false
else
match FallbackArray.get l_value idx with
| None => mem_aux (idx +i 1) equal x_value l_value n_value fuel
| Some y_value =>
if equal x_value y_value then
true
else
mem_aux (idx +i 1) equal x_value l_value n_value fuel
end
end.
Definition mem2 {FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) fuel : bool :=
let n_value := FallbackArray.length l_value in
mem_aux 0 equal x_value l_value n_value fuel.
Axiom mem_eq : forall {FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) fuel,
mem equal x_value l_value =
mem2 equal x_value l_value fuel.
The second issue we faced - we could not prove equality between our simulation and
initial mem
function. The reason is in how function mem
organized. There
was no way to do the next step of reduction. We had to axiomatize equality of
simulation function and original one. The solution of this problem requires
additional research and one of the possible approaches is mentioned in this
article.
Inductive predicates in formal verification​
Please consider the code below. It is yet another simulation of the function
mem
but this time it is having a form of inductive predicate.
Inductive mem_aux_Prop {FArgs} {A B : Set} (idx : int)
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) : Prop :=
| maP_None :
idx >=i FallbackArray.length l_value = false ->
FallbackArray.get l_value idx = None ->
mem_aux_Prop (idx +i 1) equal x_value l_value ->
mem_aux_Prop idx equal x_value l_value
| maP_Some_eq :
forall (y_value : B),
idx >=i FallbackArray.length l_value = false ->
FallbackArray.get l_value idx = Some y_value ->
equal x_value y_value = true ->
mem_aux_Prop idx equal x_value l_value
| maP_Some_neq :
forall (y_value : B),
idx >=i FallbackArray.length l_value = false ->
FallbackArray.get l_value idx = Some y_value ->
equal x_value y_value = false ->
mem_aux_Prop (idx +i 1) equal x_value l_value ->
mem_aux_Prop idx equal x_value l_value.
This approach can be described as follows: generally, we define a predicate
equivalent to mem
and prove that they are the same. Advantages: it is
easier to use inductive predicates for formal reasoning, since we can use
induction by that predicate (unlike with the regular definition, when we can
only do induction by arguments of the function). The next step is to use this predicate
to prove the properties of the function. The proof of the equivalence
should be done first, to make sure if the predicate was made correctly.
Lemma mem_aux_Prop_exists_general `{FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A) (ptr : B)
(l_value : FallbackArray.t (option B)) (idx : int) (idx2 : int) :
idx < FallbackArray.length l_value ->
l_value.(t.default) = None ->
get l_value idx = Some ptr ->
equal x_value ptr = true ->
0 <= idx2 <= idx ->
mem_aux_Prop idx2 equal x_value l_value.
Proof.
intros. pose (Compare.length_is_valid l_value.(t.items)).
apply mem_aux_Prop_exists with equal x_value l_value idx ptr in H1;
trivial; try lia.
destruct H4. revert H4.
apply neg_le_ind with (m := idx2) (n := idx); intuition.
rewrite <- Z.sub_1_r.
destruct (get l_value (m - 1)) eqn:getValue;
[ destruct (equal x_value b) eqn:equalValue;
[ apply maP_Some_eq with b | apply maP_Some_neq with b ]
| apply maP_None ]; trivial; try lia.
all: unfold FallbackArray.length, "+i" in *;
rewrite Pervasives.normalize_identity; rewrite Z.sub_add;
intuition.
Qed.
Lemma mem_aux_of_Prop {FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B))
(idx : int) (fuel : Datatypes.nat) :
Pervasives.Int.Valid.t idx ->
FallbackArray.length l_value < Pervasives.max_int ->
let n_value := FallbackArray.length l_value in
(Z.to_nat (n_value - idx) <= fuel)%nat ->
mem_aux_Prop idx equal x_value l_value ->
mem_aux idx equal x_value l_value fuel = true.
Proof.
intros idx_valid len_restriction ? le_fuel HmaP.
revert le_fuel. revert fuel.
induction HmaP; (destruct fuel; simpl; intro le_fuel; [intuition |]).
{ rewrite H0, H1. intuition. }
{ now rewrite H0, H1, H2. }
{ rewrite H0, H1, H2. intuition. }
Qed.
More about inductive predicates​
Having an inductive predicate defined on a Coq Set
is enough to justify case
analysis reasoning. In combination with dependent types inductive predicates
can express phantom types like OCamls GADTs. Whether to actively use inductive
predicates in proofs - depends on preferences and conditions. But in skillful
hands, it is a powerful tool for formal verification!