Skip to main content

Skip-list verification. Using inductive predicates

· 6 min read

In Tezos, the skip-list data structure is used in several places. Verification of skip-list required many days of painstaking work, and this work is not yet finished. In this post, we will talk about the techniques we used. In particular, we describe the verification of function mem - one of the functions operating in this data structure. More information about formal verification of skip list can be found here.

note

A skip list is a probabilistic data structure, the extended form of a list (ubiquitous in functional programming). The skip list is used to store a sorted list of elements or data with a linked list. It consists of cells and each cell carries some content and a list of backpointers. In one single step, it skips several elements of the entire list. That is why it has that name: "skip list".

First steps of verification​

Let us consider the function mem - one of the functions from the Skip_list_repr. For example, we have two neighbor cells: the first cell and the second cell in a sequence. Function mem checks if a precise pointer to the second cell presents in the list of backpointers of the previous cell - the first cell.

#[bypass_check(guard)]
Definition mem {FArgs} {A B : Set}
(equal : A → B → bool) (x_value : A)
(l_value : FallbackArray.t (option B)) : bool :=
let n_value := FallbackArray.length l_value in
let fix aux (idx : int) {struct idx} : bool :=
if idx ≥i n_value then
false
else
match FallbackArray.get l_value idx with
| None ⇒ aux (idx +i 1)
| Some y_value ⇒
if equal x_value y_value then
true
else
aux (idx +i 1)
end in
aux 0.

First step in verification of this function is to retrieve inner fixpoint out of the definition. Nested definitions are very difficult to verify. Our primary task is to make a total decomposition of all functions, disassemble them into simple components.

The first issue we faced during the work with this function was that (like many other functions in skip-list) it was marked with #[bypass_check(guard)], which means that Coq could not guess its decreasing argument (function is ill-formed, not total, it may diverge). In order to be able to work with the function we created its simulation, equipped by fuel - the natural number.

Fixpoint mem_aux {FArgs} {A B : Set} (idx : int)
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) n_value fuel {struct fuel}: bool :=
match fuel with
| Datatypes.O => false
| Datatypes.S fuel =>
if idx >=i n_value then
false
else
match FallbackArray.get l_value idx with
| None => mem_aux (idx +i 1) equal x_value l_value n_value fuel
| Some y_value =>
if equal x_value y_value then
true
else
mem_aux (idx +i 1) equal x_value l_value n_value fuel
end
end.

Definition mem2 {FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) fuel : bool :=
let n_value := FallbackArray.length l_value in
mem_aux 0 equal x_value l_value n_value fuel.

Axiom mem_eq : forall {FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) fuel,
mem equal x_value l_value =
mem2 equal x_value l_value fuel.

The second issue we faced - we could not prove equality between our simulation and initial mem function. The reason is in how function mem organized. There was no way to do the next step of reduction. We had to axiomatize equality of simulation function and original one. The solution of this problem requires additional research and one of the possible approaches is mentioned in this article.

Inductive predicates in formal verification​

Please consider the code below. It is yet another simulation of the function mem but this time it is having a form of inductive predicate.

Inductive mem_aux_Prop {FArgs} {A B : Set} (idx : int)
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B)) : Prop :=
| maP_None :
idx >=i FallbackArray.length l_value = false ->
FallbackArray.get l_value idx = None ->
mem_aux_Prop (idx +i 1) equal x_value l_value ->
mem_aux_Prop idx equal x_value l_value
| maP_Some_eq :
forall (y_value : B),
idx >=i FallbackArray.length l_value = false ->
FallbackArray.get l_value idx = Some y_value ->
equal x_value y_value = true ->
mem_aux_Prop idx equal x_value l_value
| maP_Some_neq :
forall (y_value : B),
idx >=i FallbackArray.length l_value = false ->
FallbackArray.get l_value idx = Some y_value ->
equal x_value y_value = false ->
mem_aux_Prop (idx +i 1) equal x_value l_value ->
mem_aux_Prop idx equal x_value l_value.

This approach can be described as follows: generally, we define a predicate equivalent to mem and prove that they are the same. Advantages: it is easier to use inductive predicates for formal reasoning, since we can use induction by that predicate (unlike with the regular definition, when we can only do induction by arguments of the function). The next step is to use this predicate to prove the properties of the function. The proof of the equivalence should be done first, to make sure if the predicate was made correctly.

Lemma mem_aux_Prop_exists_general `{FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A) (ptr : B)
(l_value : FallbackArray.t (option B)) (idx : int) (idx2 : int) :
idx < FallbackArray.length l_value ->
l_value.(t.default) = None ->
get l_value idx = Some ptr ->
equal x_value ptr = true ->
0 <= idx2 <= idx ->
mem_aux_Prop idx2 equal x_value l_value.
Proof.
intros. pose (Compare.length_is_valid l_value.(t.items)).
apply mem_aux_Prop_exists with equal x_value l_value idx ptr in H1;
trivial; try lia.
destruct H4. revert H4.
apply neg_le_ind with (m := idx2) (n := idx); intuition.
rewrite <- Z.sub_1_r.
destruct (get l_value (m - 1)) eqn:getValue;
[ destruct (equal x_value b) eqn:equalValue;
[ apply maP_Some_eq with b | apply maP_Some_neq with b ]
| apply maP_None ]; trivial; try lia.
all: unfold FallbackArray.length, "+i" in *;
rewrite Pervasives.normalize_identity; rewrite Z.sub_add;
intuition.
Qed.

Lemma mem_aux_of_Prop {FArgs} {A B : Set}
(equal : A -> B -> bool) (x_value : A)
(l_value : FallbackArray.t (option B))
(idx : int) (fuel : Datatypes.nat) :
Pervasives.Int.Valid.t idx ->
FallbackArray.length l_value < Pervasives.max_int ->
let n_value := FallbackArray.length l_value in
(Z.to_nat (n_value - idx) <= fuel)%nat ->
mem_aux_Prop idx equal x_value l_value ->
mem_aux idx equal x_value l_value fuel = true.
Proof.
intros idx_valid len_restriction ? le_fuel HmaP.
revert le_fuel. revert fuel.
induction HmaP; (destruct fuel; simpl; intro le_fuel; [intuition |]).
{ rewrite H0, H1. intuition. }
{ now rewrite H0, H1, H2. }
{ rewrite H0, H1, H2. intuition. }
Qed.

More about inductive predicates​

Having an inductive predicate defined on a Coq Set is enough to justify case analysis reasoning. In combination with dependent types inductive predicates can express phantom types like OCamls GADTs. Whether to actively use inductive predicates in proofs - depends on preferences and conditions. But in skillful hands, it is a powerful tool for formal verification!