In this blog post, I propose to make you discover how we deal with internal errors
and show that they do not occur while running the protocol! I will first give you
a glimpse of the so-called error monad (but without going into the details of
the notion of monad 🙂). You actually don't need to have any knowledge of monads in computer programming or elsewhere to read this article: we will take them step by step. But
if you are already familiar on them, you can skim through the first parts of this post and have a look at the monadic let ... in
s we use and how we articulate them with validity predicates. We conclude by giving a first glimpse at how we use these validity predicates to manually specify information which is lost during the automatic translation
from OCaml to Coq.
Absence of internal errors
We are currently verifying the absence of internal errors in the protocol of Tezos. We show that no unexpected errors are reachable in the code, such as assert false
or critical errors. We describe here what we have done recently towards the verification of such errors.
Skip-list verification. Using inductive predicates
In Tezos, the skip-list data structure is used in several places. Verification of
skip-list
required many days of painstaking work, and this work is not yet finished.
In this post, we will talk about the techniques we used.
In particular, we describe the verification of function mem
- one of the functions
operating in this data structure. More information about formal verification of skip list
can be found here.
Verifying the skip-list
Formal verification of the skip list of Tezos includes the work on verification of the following properties :
- Validity of a cell (containing
content
and a list ofback pointers
), - Validity of a dereferencing operator,
- Validity of encoding functions for
cell
andpointer
, - Validity of backpointers,
- Verification of
equality
definition, - Verification of group of functions and definitions serving the organization
of data-structure :
next
,genesis
,best_skip
,mem
,pp
and others, - Validity of a
back_path
(the list of backpointers), - Uniqueness of
express line
(back_path
, list of pointers, generated with usage ofbest_skip
function).
Verifying json-data-encoding
Tezos uses the json-data-encoding library throughout the protocol to serialize data for use in RPC, writing to disk or placing in a block. RPC is a synchronous operation. Consequently, it is paramount to verify the correctness of code as well as the logical equivalence and semantics of performance-enhanced procedures to proven ones.
This library has been subject to stack overflows, as seen in issues like this. We will translate and verify this library to validate its logic and simplify debugging these issues.
Fixing reused proofs
After each new release of the protocol there comes a need to validate it. One of the options would be to start from scratch, but fortunately there is no need for that since many parts of the protocol are only slightly changed or stay the same. That allows us to reuse old proofs in consequent versions and speed up the validation.
Of course there is a small number of proofs which stop working due to the changes and that has to be repaired. This process can sometimes be tricky since it requires an understanding of how different functions and proofs depend on each other, but that makes it very interesting. In this post, we will show examples of broken proofs, discuss reasons for their failure, and show techniques we used to fix issues.
Formal verification of property based tests
Property based tests are a way to find bugs in the code by running a boolean function (representing an expected property) on randomly generated inputs. In this blog post, we show how we translate these tests to corresponding Coq theorems. Once we have a formal representation, we prove them correct on any inputs. Thanks to this technique, we can:
- have a bridge between the specification given by the tests and the specification in Coq;
- make sure that the properties verified in the tests are true with a mathematical degree of certainty.
We have done this work with students from the University of Groningen:
- Joris Peters https://github.com/jorisptrs
- Rienk Bergsma https://github.com/Dragonflew17
- Tait van Strien https://gitlab.com/7A1T
- Tino Alferink https://gitlab.com/Tinoalfie
- Tudor Roscoiu https://gitlab.com/tudoroscoiu
Plan for backward compatibility verification
The backward compatibility is an important property to maintain between two releases of the protocol of Tezos. For example, this ensures that existing smart contracts still work after a protocol upgrade. That property can be hard to ensure by the protocol developers as the codebase's size increases. At the time of writing, there are 86,346 lines of code in src/proto_alpha/lib_protocol in the master
branch of Tezos, counting all the .ml
and .mli
files.
We propose using formal verification to check that the backward compatibility is preserved after each protocol release, with a mathematical degree of certainty. More precisely, we show that for every inputs and storage contexts, a smart contract working with version Jakarta of Tezos will generate the exact same output running on the next version K. Even if there are infinitely many possible inputs, we can verify all of them by writing a mathematical proof in the formal system Coq.
In this blog post, we describe our plan for this verification effort.
Handling fold_left in proofs
OCaml developers at Tezos often use fold_left
function
in the code. Maybe the reason is that it is a bit easier to write efficient
code using fold_left
than fold_right
.
fold_left
and fold_right
from OCaml library: List
val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a
fold_left f init [b1; ...; bn] is f (... (f (f init b1) b2) ...) bn.
val fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b
fold_right f [a1; ...; an] init is f a1 (f a2 (... (f an init) ...)).
Not tail-recursive.
And from formal verification point of view,
it is much easier to prove statements about fold_right
either than about fold_left
.
Why? The reason is in how these functions are organized. Formal verification often uses
mathematical induction as an instrument for proving statements on the lists.
Fuel - quiet escape from the halting problem
As you may already know from the article Simulations - dependently-typed version,
in the Proto_alpha
folder we keep code, generated by the coq-of-ocaml
.
Many of the fixpoints here are preceded by the annotation #[bypass_check(guard)]
,
which means that guard checking is locally disabled. And this means that
we can not consider these functions to be total. In such cases, proof engineers
usually feel irritation, because "total" is one of their favorite words.
We know that total functions never crush and they always return a well-typed
result within a finite time.
note
Thanks to Alan Turing we know that the halting problem - the difficulty of determining whether a specific program terminates or not - is undecidable. Coq can't determine if a function is total in general, according to the original definition of totality. Instead, it analyzes a function's syntax and makes a conservative approximation.
Our current goal is that the doubles defined in the folder
Simulations
, whose equivalence with their prototypes from Proto_alpha
should
be proved, would be total (have enabled Coqs guard checking on fixpoint).
We already mentioned the Guard Checking
question here.
Coq is analyzing the function's syntax to consider it to be total. One way to convince Coq that a function is total is to show that there is a decreasing argument that converges to the base case.
Let's consider the function parse_ty_aux
from Proto_alpha
and its clone
dep_parse_ty_aux
from the Simulations
folder. parse_ty_aux
is having
{struct node_value}
annotation. The point of the {struct ident}
annotation is
to tell the Coq, which argument decreases along the recursive calls.
node_value : Alpha_context.Script.node
is designed in such a way that it is
not trivial to show that the function converges. Coq can not detect that each
and every recursion call is done on the direct subterm of a given node_value
argument (and it is indeed actually not the case).