Skip to main content

The error monad, internal errors and validity predicates, step-by-step

ยท 10 min read
Pierre Vial

In this blog post, I propose to make you discover how we deal with internal errors and show that they do not occur while running the protocol! I will first give you a glimpse of the so-called error monad (but without going into the details of the notion of monad ๐Ÿ™‚). You actually don't need to have any knowledge of monads in computer programming or elsewhere to read this article: we will take them step by step. But if you are already familiar on them, you can skim through the first parts of this post and have a look at the monadic let ... ins we use and how we articulate them with validity predicates. We conclude by giving a first glimpse at how we use these validity predicates to manually specify information which is lost during the automatic translation from OCaml to Coq.

Internal Errorsโ€‹

What is an internal error?โ€‹

Let us present incrementally our methodology to check that internal errors do not occur. But what is an internal error? Well, internal errors are those that should never occur whatsoever. They may be put in regard to user errors, e.g., when a user produces a bad (for instance, ill-typed) block intentionally or not:

  • First, we select a list of internal errors. We have hitherto mostly focused on Tezos's "internal error" and storage errors. What should be considered as an internal error (vs. a user error) is a matter of consensus between Tezos developers and this list may be updated.
  • Then, we verify the protocol one function at a time. We shall refine this later, but let's say for now that, for each function f, we want to state and prove a theorem of the form:

    If the inputs x1, ..., xn of f satisfy some pre-conditions P1,..., Pn, then:

    • if f succeeds, then the output f x1 ... xn satisfies some post-condition P.
    • if f fails (i.e., raises errors), then it didn't raise any internal error.

Of course, pre-conditions are pivotal to prove such a statement, and must be specified carefully (more on this below). Checking a post-condition ensures compositionality between our various validity lemmas, in the sense that the output of a function f may be taken as an input of another function g. The example below will illustrate all of this.

The Error Monadโ€‹

For each type A, we define a new type M? A called the error monad (or the result monad) of A. Intuitively, every e : M? A is morally an expression of type A which either evaluates to a value a : A, or raises an error in the course of its evaluation.

Concretely, the error monad M? A is built from A (and another type parameter B that we mostly put under the carpet) with two constructors:

  • Ok a (with a : A) specifies a computation in A which succeeds and outputs a result a
  • Error b specifies an error b. Basically, b can be thought as string (or a list of strings) which contains associated error message + some extra-information.

The definition and notation of M? A can be found in Error_monad.v and in Pervasives.v. This actually emulates the result monad of Tezos. Monads are a very rich structure in logic, category theory, theoretical computer science as well as in practical programming, but we may blissfully ignore all their subtleties here.

Then, checking that internal errors do not occur is about verifying that the only Error bs that may pop up are not in the list of internal errors, with the function not_internal.

Monadic let and monadic composition (bind)โ€‹

One way to compute the composition of two functions f : A -> B and g : B -> C on an input a : A is to write let b := f a in g b.

Likewise, one can "compose" two functions f : A -> M? B and g : B -> M? C when applied to an input a : A. It can be written as:

let? b := f a in g b

The so-called monadic let? ... in works the same as a regular let ... in except that it propagages any error that occurs in f in the expected way, e.g. if f or g raise an error then let? b := f a in g b raises the same error. Thus, let? ... in allows performing what is called monadic composition (which is sometimes referred to as the bind operator of the monad). This bind operator for the monad error is written >>=? in the OCamlcode. If you look at the Tezos repository, you will see that monadic composition is pervasive in the code of the protocol!

Remark: Formally, let? ... in is defined by (pseudo-)pattern-matching as follows:

  • If f a = Error foo, then let? b := f a in g b outputs Error foo.
  • If f a = Ok b0 and g b0 = Error bar, then let? b := f a in g b outputs Error bar.
  • If f a = Ok b0 and g b0 = Ok c0, then let? b := f a in g b outputs c0.

Specifying validityโ€‹

Validity predicates for valuesโ€‹

For most types A, we need to define what it is to be valid for an object of A. Intuitively, the validity predicate on a given type A must be simple, natural and in fine (i.e., when proofs are completed) ensure that a valid input cannot cause an internal error (notice that the criteria must indeed be natural to avoid (implicitly) tautological definition such as "a is valid when a does not raise any internal error when taken as a parameter.")

For instance, on the type Int32, validity is about not being out of bounds.

Module Valid.
Definition t (n : int32) : Prop :=
Int32.min_int โ‰ค n โ‰ค Int32.max_int.

This predicate allows us for instance to check that there is no overflow while doing operations such as addition or multiplication when suitable pre-conditions are met. To have an example, have a look at the proof file of Saturation_repr.v, e.g. at the lemma add_is_valid, which uses a validity predicate which is very similar to Int32.Valid.t. above.

Validity properties for functionsโ€‹

Whereas specifying validity predicates on various types pertains to the realm of definitions, verifying that given protocol function are valid pertains to that of theorems and proofs.

First of all, to state the lemmas about the validity of a certain function, we resort to the letP? binder, which may be thought of as a particular case of letP?, when the second function g is actually a predicate arriving in Prop, the type of propositions. Thus, for e : M? A and P : A -> Prop, depending on the case, the proposition letP? a := e in P a means that:

  • When the evaluation of e succeeds and outputs a0 : A (i.e., e evaluates to Ok a0), then letP? a := e in P a iff a0 satisfies the predicate P;
  • When the evaluation of e raises an error (i.e., e evaluates to Error b), then letP? a := e in P a iff b is not internal.

Actually, letP? is a notation wrapping the operator bind_prop which does pattern-matching on M? A

In practice, the verification of a function f articulates the letP? both with validity pre- and post-conditions on the inputs and outputs of f as follows:

Lemma f_is_valid : (* in mock-code *)
forall (x1 ... xn), Valid1 x1 /\ ... /\ Validn xn
-> letP? (y1,...,yp) := f x1 ... xn in
Valid1' y1 /\ ... /\ Validp' yp

As expected, the above lemma specifies that, when the inputs of f are valid, then if the computation of f succeeds, it outputs a valid value, and if it does not, it does not cause any internal error. Notice that the lemma f_is_valid is an (almost) formal version of the kind of properties we desired to capture, as sketched in Paragraph What is an internal error?

Building up validity predicatesโ€‹

Main principlesโ€‹

Usually, the nature of the validity predicate for a type A follows the nature of A, e.g., for records (i.e., inductive types with named fields), the validity predicate often takes the shape of a record with the same field names. For instance, step_constants has three fields, amount and balance of type Tez_repr.t and level of type Script_int.n_num.t. The validity predicate of step_constants is then canonically defined as follows:

Module step_constants.
Module Valid.
Record t (sc : step_constants) : Prop := {
amount : Tez_repr.Valid.t sc.(step_constants.amount);
balance : Tez_repr.Valid.t sc.(step_constants.balance);
level : Script_int.n_num.Valid.t sc.(step_constants.level);
}.
End Valid.
End step_constants.

As the example of step_constants above illustrates, validity predicates are built on top of another, e.g. if a type C has a constructor Ctor: A -> B -> C, then the validity predicate of C should look like:

t x :=
match x with
| Ctor a b => A.Valid.t a -> B.Valid.t b -> t (Ctor a b)
(* snip for other constructors *)
end.

For instance, the validity predicate for the type Ticket_scanner.has_tickets, which has a unique constructor Has_tickets : Ticket_scanner.Ticket_inspection.has_tickets -> Script_typed_ir.ty -> Ticket_scanner.has_tickets should be:

Module has_tickets.
Module Valid.
Inductive t {a : Ty.t} (ht : Ticket_scanner.has_tickets) (ty : Script_typed_ir.ty) -> Prop :=
| Has_tickets a ht ty :
a = Script_typed_ir.ty.Valid.to_Ty_t ty ->
Ticket_inspection.has_tickets.Valid.t a ht ->
Script_typed_ir.ty.Valid.t ty ->
t (Ticket_scanner.Has_tickets ht ty).
End Valid.
End has_tickets.

As a side note, the predicate has_tickets.Valid.t is not yet updated as this blog post is being written, but it will match the above definition soon.

note

An interesting feature of our work at Formal Land can be noticed, whether it is for validity predicate or specifying internal errors, any modification we make is propagated through the whole Coq code;

  • For instance, changing a validity predicate for a type A will automatically modify the validity predicate for any type B that uses the type A as a field;
  • or adding another error in the internal error list will modify everywhere the not_internal checks.

Then, Coq type-checker will catch every failure that could come from any modification. This ensures that our work does not regress silently when the protocol evolves.

Specifying lost information in validity predicatesโ€‹

Look at the implicit parameter a : Ty.t in the validity predicate of has_tickets quoted above. It comes from the fact that in the OCaml code, the type has_tickets is parametrized with 'a, but observe that in the Coq code (which was automatically produced by coq-of-ocaml), has_tickets does not have a type parameter. Indeed, one of the roles of validity predicates is to also retrieve all the relevant information that was lost during the automatic translation (beside setting conditions that cannot be specified in OCaml types, such as integer bounds). This topic is subtle and we may come back to it in the future.

To conclude this blog entry, I'll just mention the fact that the use of validity predicates goes beyond verifying that internal errors do not occur!. It is only on valid inputs that we can hope to describe the behavior of a function. The use of validity predicates is thus crucial to capture the specification of Michelson functions, beyond the fact that they may or (hopefully) may not raise internal errors!