# Michelson

The main difficulty in verifying the implementation of Michelson is the use of **GADTs** in OCaml. For the OCaml developers, it gives:

- guaranty that the type-checker and interpreter produce well-typed terms;
- tagless matching on the values.

On our side, this:

- complexifies the maintenance for the translation with
`coq-of-ocaml`

; - generates a lot of
`cast`

and`cast_exists`

dynamic casts in the`match`

, preventing the proof work.

As a solution we:

- manually define a
**dependent version**of the Michelson's implementation, doing the translation from GADTs to dependent types by hand; - formally verify that this dependent translation is equivalent to the OCaml version to make sure it is correct.

We have to do this translation to dependent types by hand as `coq-of-ocaml`

does not support it. This is also the opportunity to clean a few other things, like writing definitions with verified termination.

## Simulationsâ€‹

The way we proceed is the following:

- We re-define everything in a
`Simulations/`

folder, prefixing all the new definitions with`dep_`

for dependent-type (and avoiding name collisions) to manually translate the GADTs of Michelson. For the AST, we use the notion of "type family" to have types parameterized by values rather than other types. - We show that these new definitions are equal to the OCaml definitions with the casts. These equality lemmas are the only place where we talk about these casts. As other proofs, they are in the
`Proofs/`

folder. - Once this is done, we can reason about the dependently typed code for Michelson.

## Architecture of the simulationsâ€‹

The starting file is `Script_family`

with the type `Ty.t`

describing the types of values that can appear in Michelson. With `Ty.t`

come:

`Stack_ty.t`

: like`Ty.t`

but for the stacks, so a list of`Ty.t`

,`Ty_to_set`

: the actual`Set`

implementing the values corresponding to a certain`Ty.t`

in the**GADT**version of the interpreter with`cast`

. These are the types of values on which the interpreter computes.- With_family.ty_to_dep_Set: the actual
`Set`

implementing the values corresponding to a certain`Ty.t`

in the**dependent**version of the interpreter. This type is almost the same as`Ty.to_Set`

apart from:- the
`lambda`

type, as in the dependent version we need annotations on the input and output stacks; - the
`set`

,`map`

, and`big_map`

types, as we avoid using existential types for the implementation of these data structures in the simulation to simplify the proofs; - the
`contract`

type that is parametrized by its type.

- the
- All the usual types used in the OCaml implementation, parametrized by
`Ty.t`

instead of a polymorphic type used by a GADT in OCaml. Many of these types are defined in Simulations/Script_typed_ir.v. For each of them, we also provide a conversion function back to the non-dependent version (the reverse direction is impossible in general). For example, there are both:- With_family.ty a type of signature
`Ty.t -> Set`

corresponding to Script_typed_ir.ty - With_family.to_ty a function of signature
`forall {a : Ty.t}, With_family.ty a -> Script_typed_ir.ty`

- With_family.ty a type of signature

A tricky thing is the handling of the Script_comparable.dep_compare_comparable function over Michelson values. We need this function to define the dependent version of `set`

, `map`

, and `big_map`

types using the corresponding `_Set.Make`

and `Map.Make`

functors from OCaml. What we do:

- we define this comparison function over
`Ty.to_Set`

, because at this point`With_family.ty_to_dep_Set`

is not yet defined since it depends on the sets and maps types; - ideally, it should be defined over
`With_family.ty_to_dep_Set`

but we use the fact that the two coincide over comparable types; - we use the two functions With_family.to_value and With_family.of_value to go back and forth from
`With_family.ty_to_dep_Set`

to`Ty.to_Set`

as the key values (on which the comparison function runs) do not live in the same space as the other values. The function`With_family.of_value`

is partial as it is only complete on comparable types, but this is enough for the keys. Compatibility lemmas between`to_value`

and`of_value`

are given in Proofs/Script_typed_ir.v.

The function `With_family.to_value`

is important to go from values of the dependent interpreter to the GADT interpreter.

## Simulation proofâ€‹

The main technique to write simulations is to:

- look at the OCaml code;
- look at the translated Coq code;
- look at other examples.

The main goal is to remove axioms, in particular:

`cast`

/`cast_exists`

`unreachable_gadt_branch`

- non-termination
`assert`

or other side-effects

We take an example to see how to define the simulations and simulation proofs with `unparse_comparable_data`

in `script_ir_translator.ml`

. In OCaml the code is:

`let[@coq_struct "ty"] rec unparse_comparable_data :`

type a loc.

loc:loc ->

context ->

unparsing_mode ->

a comparable_ty ->

a ->

(loc Script.michelson_node * context) tzresult Lwt.t =

fun ~loc ctxt mode ty a ->

In the translated Coq with GADTs:

`#[bypass_check(guard)]`

Fixpoint unparse_comparable_data {loc a : Set}

(loc_value : loc) (ctxt : Alpha_context.context) (mode : unparsing_mode)

(ty : Script_typed_ir.comparable_ty) (a_value : a) {struct ty}

: M? (Alpha_context.Script.michelson_node loc Ã— Alpha_context.context) :=

We write the simulation as follows:

`Fixpoint dep_unparse_comparable_data {loc : Set} {a}`

(loc_value : loc) (ctxt : Alpha_context.context) (mode : unparsing_mode)

(ty : With_family.ty a) (a_value : With_family.ty_to_dep_Set a) {struct ty}

: M? (Alpha_context.Script.michelson_node loc Ã— Alpha_context.context) :=

We lost the fact that the type should be comparable, but we will get that property back in the proofs as a pre-condition. Note how we translated:

- types parametrized by
`a`

, - the type
`a`

alone.

We also used this simulation to have a definition that terminates without disabling the termination checker with `#[bypass_check(guard)]`

. The lemma to verify this simulation is stated as follows:

`Fixpoint dep_unparse_comparable_data_eq {a} {loc : Set}`

(l_value : loc) (ctxt : Alpha_context.context) (mode : unparsing_mode)

(ty : With_family.ty a) (value : With_family.ty_to_dep_Set a) {struct ty} :

With_family.is_Comparable ty â†’

dep_unparse_comparable_data l_value ctxt mode ty value =

unparse_comparable_data l_value ctxt mode

(With_family.to_ty ty) (With_family.to_value value).

Proof.

(* ... *)

Note how we use `With_family.to_ty`

and `With_family.to_value`

to convert elements from the dependent world to the GADT world.

## Eliminate `cast`

and `cast_exists`

â€‹

The Coq translation of code using a lot of GADTs uses a lot of `cast`

and `cast_exists`

axioms. They are defined in the Basics.v file of `coq-of-ocaml`

. These axioms make dynamic casts in Coq and are necessary as we remove the type parameters of the GADTs during the translation (without these casts, the `match`

operations in Coq would not have enough information to type-check). When doing the proofs, we need to show that these axioms are actually like the identity. Generally, to eliminate the casts in a function `f`

we proceed in two steps:

- defining a function
`dep_f`

in the`Simulations/`

folder, using dependent types and no casts; - showing the two are equal, with a proof in the
`Proofs/`

folder.

All the `cast`

can be eliminated with a `rewrite cast_eval`

, unless there is an error in the user's dependent definitions. The `cast_exists`

is harder, as we need to give values for the existential variables, and the unification of Coq sometimes needs help to find them. There are a few tactics to help:

`rewrite_cast_exists_eval_eq`

from`coq-of-ocaml`

but it does not work very well;`rewrite_cast_exists`

from /docs/environment/v7 that works better (recommended approach);- sometimes we need to do a
`match goal`

by hand. Here is an example of`match goal`

that was useful:

`match goal with`

| _ : ?t |- context[cast_exists ?T _] =>

rewrite (cast_exists_eval_1 (T := T) (E1 := t))

end

The axiom `cast_exists_eval_1`

is defined in /docs/environment/v7 too.