Skip to main content

Verify the Michelson types of Mi-Cho-Coq

ยท 5 min read
Guillaume Claret

Mi-Cho-Coq is a Coq framework which we use to verify smart contracts written in Michelson, the language of smart contracts for Tezos. You can get an introduction to formal verification on Michelson code on opentezos.com/formal-verification.

We want the highest possible level of confidence in the result of our verification work but there is a gap between Mi-Cho-Coq, which was manually written in Coq, and the Tezos protocol. To fill this gap, we want to prove that the semantics of the Michelson language defined in Mi-Cho-Coq exactly matches the mechanic translation of the Michelson interpreter from the Tezos protocol, which we import to Coq using coq-of-ocaml. Defining a bijection between the two type grammars is a requirement for this project. A difficulty is that, because Michelson evolves quickly and Mi-Cho-Coq is handwritten, some recent features are missing in Mi-Cho-Coq. This verification work can also help in identifying them.

All our proofs and definitions are in Proofs/Script_typed_ir/Ty.v.

What we verifyโ€‹

In Mi-Cho-Coq we define the types in syntax_type.v as follows:

Inductive type : Set :=
| Comparable_type (_ : simple_comparable_type)
| key
| unit
| signature
| option (a : type)
| list (a : type)
| set (a : comparable_type)
| contract (a : type)
| operation
| pair (a : type) (b : type)
| or (a : type) (b : type)
| lambda (a b : type)
| map (k : comparable_type) (v : type)
| big_map (k : comparable_type) (v : type)
| chain_id.

In the OCaml implementation, we define the Michelson types in script_typed_ir.mli:

and 'ty ty =
| Unit_t : unit ty_metadata -> unit ty
| Int_t : z num ty_metadata -> z num ty
| Nat_t : n num ty_metadata -> n num ty
| Signature_t : signature ty_metadata -> signature ty
| String_t : Script_string.t ty_metadata -> Script_string.t ty
| Bytes_t : Bytes.t ty_metadata -> bytes ty
| Mutez_t : Tez.t ty_metadata -> Tez.t ty
| Key_hash_t : public_key_hash ty_metadata -> public_key_hash ty
| Key_t : public_key ty_metadata -> public_key ty
| Timestamp_t :
Script_timestamp.t ty_metadata -> Script_timestamp.t ty
| Address_t : address ty_metadata -> address ty
| Bool_t : bool ty_metadata -> bool ty
| Pair_t :
('a ty * field_annot option * var_annot option)
* ('b ty * field_annot option * var_annot option)
* ('a, 'b) pair ty_metadata
-> ('a, 'b) pair ty
...

The definition in the OCaml implementation contains more information because:

  • It has the ty_metadata, which are optional code annotations and a type size used to compute the gas.
  • It contains types which are not yet existing in Mi-Cho-Coq.

We will show a bijection between the two definitions, up to missing types in Mi-Cho-Coq and metadata. This way, we can have a clear view of what is implemented in Mi-Cho-Coq and what is yet to do.

Statementsโ€‹

We have the Coq translation of the interpreter types in Script_typed_ir.v. Currently, when we translate the OCaml GADTs we remove the type parameters as seen in this code extract:

with ty : Set :=
| Unit_t : ty_metadata โ†’ ty
| Int_t : ty_metadata โ†’ ty
| Nat_t : ty_metadata โ†’ ty
| Signature_t : ty_metadata โ†’ ty
| String_t : ty_metadata โ†’ ty
| Bytes_t : ty_metadata โ†’ ty
| Mutez_t : ty_metadata โ†’ ty
| Key_hash_t : ty_metadata โ†’ ty
| Key_t : ty_metadata โ†’ ty
| Timestamp_t : ty_metadata โ†’ ty
| Address_t : ty_metadata โ†’ ty
| Bool_t : ty_metadata โ†’ ty
| Pair_t :
ty ร— option field_annot ร— option var_annot โ†’
ty ร— option field_annot ร— option var_annot โ†’ ty_metadata โ†’ ty
...

This simplifies the translation of expressions with GADT types in Coq, although we frequently need to add unsafe casts during matches. We define in Coq the following functions:

Parameter to_Michocoq :
Script_typed_ir.ty -> option syntax_type.type.

Parameter of_Michocoq :
syntax_type.type -> Script_typed_ir.ty.

Parameter get_canonical :
Script_typed_ir.ty -> Script_typed_ir.ty.

The two functions to_Michocoq and of_Michocoq do the conversion between the type in the OCaml implementation and Mi-Cho-Coq. The conversion to_Michocoq may return None when there is no corresponding type in Mi-Cho-Coq. The function get_canonical puts a type in a canonical form where all the metadata are set to a default value.

We verify that:

Axiom get_canonical_is_involutive : forall ty,
get_canonical (get_canonical ty) = get_canonical ty.

Axiom to_Michocoq_of_Michocoq : forall ty,
to_Michocoq (of_Michocoq ty) = Some ty.

Axiom of_Michocoq_to_Michocoq : forall ty,
match to_Michocoq ty with
| Some ty' => of_Michocoq ty' = get_canonical ty
| None => True
end.

Proofsโ€‹

We write our proofs by induction on the types. The details are in Proofs/Script_typed_ir/Ty.v. The proofs are quite standard once we have written the statements. For example, for to_Michocoq_of_Michocoq the proof is the following:

Fixpoint to_Michocoq_of_Michocoq ty
: to_Michocoq (of_Michocoq ty) = Some ty.
destruct ty; simpl;
repeat rewrite Comparable_ty.to_Michocoq_of_Michocoq;
repeat rewrite to_Michocoq_of_Michocoq;
try reflexivity.
destruct_all syntax_type.simple_comparable_type; reflexivity.
Qed.

In Mi-Cho-Coq and in OCaml, the Michelson types are a generalization of the comparable types, which are the types with a comparison function. We also state and verify a bijection between the comparable types of Mi-Cho-Coq and the OCaml implementation.

Conclusionโ€‹

Thanks to this effort, we found no mistakes in the definition of the types in Mi-Cho-Coq. Furthermore, for future work, we documented the list of types present in the OCaml implementation but missing in Mi-Cho-Coq, for future work. Our next task is to translate and compare Michelson's instructions/data types, and finally, compare the semantics of the interpreters.