Blog | Coq Tezos of OCaml 💫

Verifying the compare functions of OCaml

April 4, 2022 · 7 min read

Guillaume Claret

Formal Land

In OCaml, we create a compare function for many data structures to have:

an ordering, and:
an equality function.

These compare functions should behave as follows:

compare x y returns -1 when x is "lesser" than y;
compare x y returns 0 when x is "equal" to y;
compare x y returns 1 when x is "greater" than y.

The compare functions are useful to implement many algorithms and derive data structures such as maps and sets. Having an issue in the consistency of such functions can lead to bugs in the code using them. An example of inconsistency is to have both:

compare x y = 1 and:
compare y x = 1

for some values of x and y. A complex example of compare function in the code of Tezos is located in the Script_comparable.v file to compare arbitrary Michelson values.

In this blog post we will see how we specify the compare functions and verify them.

Some refactoring, some profiling and some optimization

March 28, 2022 · 6 min read

Natasha Klaus

Formal Land

During our work on verification of Tezos source code, we faced a problem: some proofs were taking too long time to execute. For example, function which proves equality of parse_ty_aux and dep_parse_ty_aux was executing about 5 minutes, according to CI (Not acceptable! We couldn't even merge this proof!).

For the information why we actually needed to prove the equality of parse_ty_aux and dep_parse_ty_aux, and why we had to create a function-simulation dep_parse_ty_aux, please check this article: Simulations - dependently-typed version

What is the reason of such a long execution period? First of all the length of original parse_ty_aux is about 500 lines of code. So the term itself is rather big, and each step of its reduction is a time-taking procedure.

Many other functions that also need to be verified are even longer, for example the Fixpoint parse_data_aux is having 4932 lines of code. Therefore, it is very important that the proofs for such a long functions are optimized.

Simulations - dependently-typed version

February 26, 2022 · 5 min read

Natasha Klaus

Formal Land

During our work on verification of Tezos source code, we run into an interesting question: how to implement OCaml's GADTs in Coq? The essence of this problem, as well as its solution, were described in the article: Translating the Michelson interpreter where we show how we introduce casts in the generated Coq code.

In this article, we will consider subsequent work in this direction. To reach one of our main goals, proof of the compatibility of the parsing and unparsing functions for the Michelson types and values, we need to avoid casts. A powerful Coq's instrument, the dependent types, will help us to solve the casts problem. We create a kind of clone of coq-of-ocaml's code, which is a dependently typed version.

The whole process can be structured as follows:

the Simulations folder (along with its purpose and content),
definition of dependent types,
definition of functions with dependent types (analogs of functions with casts),
proof of the equivalence between functions with dependent types and functions with casts,
proof of assertions and properties for functions with dependent types.

and this is how it is all organized in the repository:

we keep files, generated by coq-of-ocaml in folder Proto_alpha
we keep dependently-defined types and functions in folder Simulations
we keep our proofs in folder Proofs

Formalization of code in Coq - tactics

February 10, 2022 · 4 min read

Natasha Klaus

Formal Land

For the Proof Engineer, the formalization of code in Coq is like an amazing adventure. But sometimes it can be tedious and monotonous due to the same repetitive actions that programmers should perform. Fortunately, Coq provides us with proof automation which helps us significantly increase our productivity.

In this post, we will show how we use/write tactics in our work on Tezos code formalization. In particular, we will touch the following topics:

presentation of some features of Ltac (the Coq's tactic language),
recursive proof search,
built-in Coq tactics

Integer arithmetic verification

February 8, 2022 · 5 min read

Daniel Hilst

Formal Land

One of the points of concern in the Tezos code is integer overflows. To ensure that no overflow occurs, the development team created abstractions over the native OCaml integers. This makes it possible to check if an overflow happened and to return an error if that is the case.

But how to be sure that such abstractions are correct? In this post, we will explain how integer arithmetic can be verified using Coq.

Further automation of data_encoding proofs with CoqHammer

January 24, 2022 · 3 min read

Daniel Hilst

Formal Land

This post is a continuation of the previous blog post on proof automation for the data_encoding proofs.

data_encoding is a library used in the Tezos codebase to define binary and JSON encodings, by using some primitives and then combining these primitives with combinators.

Translating the Michelson interpreter

January 12, 2022 · 7 min read

Guillaume Claret

Formal Land

Verifying the implementation of the interpreter of Michelson smart contracts is of importance because one of the strengths of Tezos is to be able to formally verify the smart contracts. In order to make this verification sound, we need to show that the smart contracts verification framework 🍬 Mi-Cho-Coq is coherent with the actual implementation of Michelson. In this blog post, we show how we translated the Michelson interpreter from the OCaml language to Coq. This is the first step to show the coherence of Mi-Cho-Coq with the implementation.

The main file of the interpreter is script_interpreter.ml (around 1,800 lines). The abstract syntax tree of smart contracts is given in script_typed_ir.mli (around 1,600 lines). We translate the OCaml code to Coq using coq-of-ocaml. A difficulty in translating the interpreter is that it heavily relies on GADTs in OCaml, a feature that does not exist in Coq. We show how by adding code annotations and cast axioms we can do this translation.

The generated Coq code for the interpreter is in Script_interpreter.v.

End of 2021 in review

January 7, 2022 · 3 min read

Guillaume Claret

Formal Land

We did quite a lot of work at the end of 2021. We present here our main results and progress on the formal verification of the protocol of Tezos. This effort aims to discover hidden bugs and prevent new ones, using mathematical methods and the proof system Coq.

Automation of data_encoding proofs

November 22, 2021 · 2 min read

Daniel Hilst

Formal Land (for Nomadic Labs)

The data_encoding library is used throughout the Tezos codebase for various serializing/deserializing tasks. Because of this, in our effort to formally verify the protocol, there are many proofs involving encoding.

Simply speaking, data_encoding has a series of primitives for encoding/decoding scalar data types and combinators for non-scalar data types, combinators work by combining primitives together with functions and letting users compose smaller encodings into bigger ones.

Having axioms and lemmas for primitives and combinators enable us to verify the encodings, but in practice, these proofs can be very mechanical.

Translate the Tenderbake's code to Coq

November 8, 2021 · 11 min read

Guillaume Claret

Formal Land (for Nomadic Labs)

Recently, an important merge request landed in Tezos to implement Tenderbake. This new consensus algorithm offers deterministic finality:

"a block that has just been appended to the chain of some node is known to be final once it has two additional blocks on top of it, regardless of network latency." (Nomadic Labs' blog)

This merge request and other changes resulted in almost 10,000 new Coq lines 🐓 in the translated code.

In this blog post, we explain how we dealt with all recent code changes to maintain the Coq translation of the protocol. We use the tool coq-of-ocaml to do our translation. The resulting Coq code is what you can browse on this website. All our changes on the OCaml code are in our fork branch guillaume-claret@proto_alpha-coq-of-ocaml. The procedure to compile the protocol to Coq is in this HOWTO.