Tezos uses the json-data-encoding library throughout the protocol to serialize data for use in RPC, writing to disk or placing in a block. RPC is a synchronous operation. Consequently, it is paramount to verify the correctness of code as well as the logical equivalence and semantics of performance-enhanced procedures to proven ones.
This library has been subject to stack overflows, as seen in issues like this. We will translate and verify this library to validate its logic and simplify debugging these issues.
Polymorphic variantsโ
The json-data-encoding library makes frequent use of Polymorphic variant types, annotated by a backtick (`). Coq supports algebraic data types but has no direct equivalent of polymorphic variants; Every type must be named. There is no structural typing in Coq. As such, we need to modify the OCaml to reformat these types into a strong type discipline and a structure that Coq will understand.
Removing polymorphic variants starts with this
let unexpected kind expected =
let kind =
match kind with
| `O [] -> "empty object"
| `A [] -> "empty array"
| `O _ -> "object"
| `A _ -> "array"
| `Null -> "null"
| `String _ -> "string"
| `Float _ -> "number"
| `Bool _ -> "boolean"
Becoming
let unexpected (kind : 'a Json_repr.view) expected =
let kind =
match kind with
| O [] -> "empty object"
| A [] -> "empty array"
| O _ -> "object"
| A _ -> "array"
| Null -> "null"
| String _ -> "string"
| Float _ -> "number"
| Bool _ -> "boolean"
We then continue changing all dependent calls of these matches.
Unreachable branchesโ
Unlike Coq, OCaml code doesn't require branches to be exhaustive.
We need to introduce a new catchall branch for remaining patterns,
even if it at first seems impossible to reach.
In Coq, we add the branch:
| _ => unreachable_gadt_branch
to deal with these.
In coq-of-ocaml
, we can add the hint [@coq_match_with_default]
.
Splitting functions into more manageable sub-functionsโ
During proof, we found it very difficult to verify the entire procedure of a function with many steps. Therefore we broke up these functions into their parts.
(* Fast tail-recursive map for the list suffix. *)
let chunked_tail_recursive_map_12 f l =
(*some code*)
in
let map_tail_chunk suffix chunk =
(*some code*)
in
let rec map_all_tail_chunks suffix chunks =
(*some code*)
in
...
end
Due to the function complexity, the code is difficult to prove and understand.
To break it down, we split it into sub-functions, detailed in AuxRecMap
.
Module AuxRecMap = struct
let rec split chunks l =
(*some code*)
let map_tail_chunk suffix chunk f =
(*some code*)
...
end AuxRecMap.
Definition chunked_tail_recursive_map_12 {A B : Set}
(f_value : A -> B) (l_value : list A) : list B :=
let chunks := AuxRecMap.split
...
end.
Propertiesโ
Some Coq lemmas require pre-conditions on the parameters of the procedures that we verify.
List_map
splits a list into chunks of length 12, so we need to make a definition of this property:
Module Chunk.
Module Valid.
Definition t {a : Set} (chunk : list a):=
(Lists.List.length chunk <= 12)%nat.
End Valid.
Module Saturated.
Definition t {a : Set} (chunk : list a):=
(Lists.List.length chunk >= 12)%nat.
End Saturated.
End Chunk.
Through simplification of the OCaml code and introduction of properties to these lists,
we will test and verify the logic of this library,
finding any unaccounted branches and errors.
We have verified the list_map.ml
file. This contains an optimised, tail recursive version of the default OCaml List.map
. We proved that it is equivalent to the aforementioned default function. Now we will adapt the rest of the OCaml files to translate to Coq and verify their functions.