The data_encoding library is used throughout the Tezos codebase for various serializing/deserializing tasks. Because of this, in our effort to formally verify the protocol, there are many proofs involving encoding.

Simply speaking, `data_encoding`

has a series of primitives for encoding/decoding scalar
data types and combinators for non-scalar data types, combinators work by
combining primitives together with functions and letting users compose smaller
encodings into bigger ones.

Having axioms and lemmas for primitives and combinators enable us to verify the encodings, but in practice, these proofs can be very mechanical.

For example here is a proof involving data_encoding.

`Lemma fixed_encoding_is_valid`

: Data_encoding.Valid.t (fun _ => True) Constants_repr.fixed_encoding.

eapply Data_encoding.Valid.implies.

eapply

Data_encoding.Valid.conv,

Data_encoding.Valid.obj10;

constructor;

try first [

apply Data_encoding.Valid.uint8 |

apply Data_encoding.Valid.uint16 |

apply Data_encoding.Valid.int31

];

eapply Data_encoding.Valid.list_value;

eapply Data_encoding.Valid.conv_with_guard;

apply Data_encoding.Valid.int64_value.

intro x; destruct x; intuition.

repeat constructor.

Qed.

All these `Data_encoding.Valid.*`

lines can now be replaced by a simple call of
`data_encoding_auto`

. This tactic will try to apply each of the combinator axioms
from `Data_encoding.Valid`

and then use the `Data_encoding_db`

hint database to
solve the remaining subgoals regarding `data_encoding`

primitives. If the proof
cannot be fully solved it will remain subgoals that are specific for each encoding.

Now here is the same proof but with `data_encoding_auto`

.

`Lemma fixed_encoding_is_valid`

: Data_encoding.Valid.t (fun _ => True) Constants_repr.fixed_encoding.

Data_encoding.Valid.data_encoding_auto.

intro x; destruct x; intuition.

repeat constructor.

Qed.

Some `data_encoding`

combinators involve arbitrary functions; these are `case`

for encoding variants of sum types, and `conv`

for providing a conversion
between encodings. Because of this it is not always possible to fully automate the
encoding proofs, but we can leave all the mechanical work for the machine so we
can focus on things that are specific for each encoding.

As we write more encoding proofs we can increment the hints database. Since encodings have a nesting nature, we can use these hints to prove bigger encodings making it easier to write encoding proofs and avoiding repetitive proofs.