In this blog post, I propose to make you discover **how we deal with internal errors**
and show that they do not occur while running the protocol! I will first give you
a glimpse of the **so-called error monad** (but without going into the details of
the notion of monad ๐). You actually don't need to have any knowledge of monads in computer programming or elsewhere to read this article: we will take them step by step. But
if you are already familiar on them, you can skim through the first parts of this post and have a look at the monadic `let ... in`

s we use and how we articulate them with **validity predicates**. We conclude by giving a first glimpse at how we use these validity predicates to **manually specify information which is lost during the automatic translation**
from OCaml to Coq.

## Internal Errorsโ

### What is an internal error?โ

Let us present incrementally our methodology to check that internal errors
do not occur. But what is an internal error? Well, **internal errors** are those
that should never occur whatsoever. They may be put in regard to **user errors**,
e.g., when a user produces a bad (for instance, ill-typed) block intentionally or
not:

- First, we select a list of internal errors. We have hitherto mostly focused on Tezos's "internal error" and storage errors. What should be considered as an internal error (vs. a user error) is a matter of consensus between Tezos developers and this list may be updated.
- Then, we verify the protocol one function at a time. We shall refine this
later, but let's say for now that, for each function
`f`

, we want to state and prove a theorem of the form:If the inputs

`x1`

, ...,`xn`

of`f`

satisfy some pre-conditions`P1`

,...,`Pn`

, then:- if
`f`

succeeds, then the output`f x1 ... xn`

satisfies some post-condition`P`

. - if
`f`

fails (i.e., raises errors), then it didn't raise any*internal*error.

- if

Of course, pre-conditions are pivotal to prove such a statement, and must be
specified carefully (more on this below). Checking a post-condition ensures
compositionality between our various validity lemmas, in the sense that the
output of a function `f`

may be taken as an input of another function `g`

.
The example below will illustrate all of this.

### The Error Monadโ

For each type `A`

, we define a new type `M? A`

called the **error monad** (or
the **result monad**) of `A`

. Intuitively, every `e : M? A`

is morally an expression of
type `A`

which either evaluates to a value `a : A`

, or raises an error in the
course of its evaluation.

Concretely, the error monad `M? A`

is built from `A`

(and another type parameter `B`

that
we mostly put under the carpet) with two constructors:

`Ok a`

(with`a : A`

) specifies a computation in`A`

which succeeds and outputs a result`a`

`Error b`

specifies an error`b`

. Basically,`b`

can be thought as string (or a list of strings) which contains associated error message + some extra-information.

The definition and notation of `M? A`

can be found
in Error_monad.v
and in Pervasives.v. This actually emulates the result monad of
Tezos. Monads are a very rich structure in logic, category theory, theoretical computer science as well as in practical programming, but we may blissfully ignore all their subtleties here.

Then, checking that internal errors do not occur is about verifying that the only
`Error b`

s that may pop up are not in the list of internal errors, with the
function not_internal.

### Monadic `let`

and monadic composition (bind)โ

One way to compute the **composition** of two functions `f : A -> B`

and `g : B -> C`

on an input `a : A`

is to write `let b := f a in g b`

.

Likewise, one can "compose" two functions `f : A -> M? B`

and `g : B -> M? C`

when applied to an input `a : A`

. It can be written as:

`let? b := f a in g b`

The so-called monadic `let? ... in`

works the same as a regular `let ... in`

except that it propagages any error that occurs in `f`

in the expected way, e.g. if `f`

or `g`

raise an error then `let? b := f a in g b`

raises the same error. Thus, `let? ... in`

allows performing what is called
**monadic composition** (which is sometimes referred to as the
`bind`

operator of the monad). This `bind`

operator for the monad error is written `>>=?`

in the OCamlcode.
**If you look at the Tezos repository, you will see that monadic composition is pervasive in the code of the protocol!**

**Remark:** Formally, `let? ... in`

is defined by (pseudo-)pattern-matching as follows:

- If
`f a = Error foo`

, then`let? b := f a in g b`

outputs`Error foo`

. - If
`f a = Ok b0`

and`g b0 = Error bar`

, then`let? b := f a in g b`

outputs`Error bar`

. - If
`f a = Ok b0`

and`g b0 = Ok c0`

, then`let? b := f a in g b`

outputs`c0`

.

## Specifying validityโ

### Validity predicates for valuesโ

For most types `A`

, we need to define what it is to be valid for an object of
`A`

. Intuitively, the validity predicate on a given type `A`

must be simple,
natural and *in fine* (i.e., when proofs are completed) ensure that a valid input
cannot cause an internal error (notice that the criteria must indeed be
natural to avoid (implicitly) tautological definition such as "`a`

is valid
when `a`

does not raise any internal error when taken as a parameter.")

For instance, on the type `Int32`

, validity is about not being out of bounds.

`Module Valid.`

Definition t (n : int32) : Prop :=

Int32.min_int โค n โค Int32.max_int.

This predicate allows us for instance to check that there is no overflow while doing operations such as addition or
multiplication when suitable pre-conditions are met. To have an example, have a look at the proof file of
`Saturation_repr.v`

, e.g. at the lemma
`add_is_valid`

, which
uses a validity predicate which is very similar to `Int32.Valid.t.`

above.

### Validity properties for functionsโ

Whereas specifying validity predicates on various types pertains to the realm of
*definitions*, verifying that given protocol function are valid pertains to that
of theorems and proofs.

First of all, to state the lemmas about the validity of a certain function, we
resort to the `letP?`

binder, which may be thought of as a particular case of `letP?`

,
when the second function `g`

is actually a predicate arriving in `Prop`

, the type
of propositions.
Thus, for `e : M? A`

and `P : A -> Prop`

, depending on the case, the proposition
`letP? a := e in P a`

means that:

- When the evaluation of
`e`

succeeds and outputs`a0 : A`

(i.e.,`e`

evaluates to`Ok a0`

), then`letP? a := e in P a`

iff`a0`

satisfies the predicate`P`

; - When the evaluation of
`e`

raises an error (i.e.,`e`

evaluates to`Error b`

), then`letP? a := e in P a`

iff`b`

is not internal.

Actually, `letP?`

is a notation wrapping the operator `bind_prop`

which does pattern-matching on `M? A`

In practice, the verification of a function `f`

articulates the `letP?`

both with
**validity pre- and post-conditions** on the inputs and outputs of `f`

as follows:

`Lemma f_is_valid : (* in mock-code *)`

forall (x1 ... xn), Valid1 x1 /\ ... /\ Validn xn

-> letP? (y1,...,yp) := f x1 ... xn in

Valid1' y1 /\ ... /\ Validp' yp

As expected, the above lemma specifies that, when the inputs of `f`

are valid, then
if the computation of `f`

succeeds, it outputs a valid value, and if it does not, it
does not cause any internal error. Notice that the lemma `f_is_valid`

is an (almost)
formal version of the kind of properties we desired to capture, as sketched in Paragraph *What is an internal error?*

## Building up validity predicatesโ

### Main principlesโ

Usually, the nature of the validity predicate for a type `A`

follows the nature of
`A`

, e.g., for records
(i.e., inductive types with named fields), the validity predicate often takes the
shape of a record with the same field names. For instance,
`step_constants`

has three fields, `amount`

and `balance`

of type `Tez_repr.t`

and `level`

of type
`Script_int.n_num.t`

. The validity predicate
of `step_constants`

is then canonically defined as follows:

`Module step_constants.`

Module Valid.

Record t (sc : step_constants) : Prop := {

amount : Tez_repr.Valid.t sc.(step_constants.amount);

balance : Tez_repr.Valid.t sc.(step_constants.balance);

level : Script_int.n_num.Valid.t sc.(step_constants.level);

}.

End Valid.

End step_constants.

As the example of `step_constants`

above illustrates, validity predicates are built on
top of another, e.g. if a type `C`

has a constructor
`Ctor: A -> B -> C`

, then the validity predicate of `C`

should look like:

`t x :=`

match x with

| Ctor a b => A.Valid.t a -> B.Valid.t b -> t (Ctor a b)

(* snip for other constructors *)

end.

For instance, the validity predicate for the type `Ticket_scanner.has_tickets`

, which
has a unique constructor ```
Has_tickets : Ticket_scanner.Ticket_inspection.has_tickets
-> Script_typed_ir.ty -> Ticket_scanner.has_tickets
```

should be:

`Module has_tickets.`

Module Valid.

Inductive t {a : Ty.t} (ht : Ticket_scanner.has_tickets) (ty : Script_typed_ir.ty) -> Prop :=

| Has_tickets a ht ty :

a = Script_typed_ir.ty.Valid.to_Ty_t ty ->

Ticket_inspection.has_tickets.Valid.t a ht ->

Script_typed_ir.ty.Valid.t ty ->

t (Ticket_scanner.Has_tickets ht ty).

End Valid.

End has_tickets.

As a side note, the predicate `has_tickets.Valid.t`

is not yet updated as this blog
post is being written, but it will match the above definition soon.

##### note

An interesting feature of our work at Formal Land can be noticed, whether it is for validity predicate or specifying internal errors, any modification we make is propagated through the whole Coq code;

- For instance, changing a validity predicate for a type
`A`

will automatically modify the validity predicate for any type`B`

that uses the type`A`

as a field; - or adding another error in the internal error list will modify everywhere the
`not_internal`

checks.

Then, Coq type-checker will catch every failure that could come from any modification. This ensures that our work does not regress silently when the protocol evolves.

### Specifying lost information in validity predicatesโ

Look at the implicit parameter `a : Ty.t`

in the validity predicate of `has_tickets`

quoted above. It comes from the fact that in the OCaml code, the type `has_tickets`

is parametrized with `'a`

, but observe that in the Coq code (which was
automatically produced by `coq-of-ocaml`

), `has_tickets`

*does not* have a type
parameter. Indeed, one of the roles of validity predicates is to also retrieve all the
relevant information that was lost during the automatic translation (beside
setting conditions that cannot be specified in OCaml types, such as integer
bounds). This topic is subtle and we may come back to it in the future.

To conclude this blog entry, I'll just mention the fact that **the use
of validity predicates goes beyond verifying that internal errors do not occur!**.
It is only on valid inputs that we can hope to describe the behavior of a function.
The use of validity predicates is thus crucial to capture the specification
of Michelson functions, beyond the fact that they may or (hopefully) may not
raise internal errors!