During our work on verification of Tezos source code,
we faced a problem: some proofs were taking too long time to execute.
For example, function which proves equality of parse_ty_aux
and
dep_parse_ty_aux
was executing about 5 minutes, according to CI
(Not acceptable! We couldn't even merge this proof!).
For the information why we actually needed to prove the equality of parse_ty_aux
and dep_parse_ty_aux
, and why we had to create a function-simulation
dep_parse_ty_aux
, please check this article:
Simulations - dependently-typed version
What is the reason of such a long execution period? First of all the length of
original parse_ty_aux
is about 500 lines of code. So the term itself is rather
big, and each step of its reduction is a time-taking procedure.
Many other functions that also need to be verified are even longer, for example
the Fixpoint parse_data_aux
is having 4932 lines of code. Therefore, it is
very important that the proofs for such a long functions are optimized.
One-step reductionโ
From the very beginning of the proof building it became obvious that big numbers used in both functions may be the essence of the problem:
in dep_parse_ty
we have 10 000 in nat - natural numbers,
fuel <= 10 000
and in parse_ty_aux
we have same big number in Z - integers represented in a
binary way
stack_depth <= 10 000
Tactics like simpl
, cbn
, cbv
, unfold
would reduce as much as they can,
trying to simplify 10 000, so we had to force Coq to perform just one-step
reduction in many cases.
Ways of one-step reduction we used:
- limitation of tactic
unfold
by giving it a precise string reference
unfold "let? _ := _ in _"
- building a special-case tactics for each piece of code, for example tactic
simpl
can be forced to work only in the very precise area:
| |- context [match ?e with _ => _ end] =>
match e with
| Script_typed_ir.Type_size.(Script_typed_ir.TYPE_SIZE.compound1) _ _ =>
simpl e
end
simpl
,cbn
,cbv
are customizable, for some of them debugging info can be printed
Execution timeโ
To find out what are the most expensive parts of our code, we also used the
command Time
, which executes sentence and displays the time needed to
execute it.
Command Time
helped us a lot in picking the right order for tactics
application. For example tactic lia
should go after all the simpler
tactics, like reflexivity
and others. The reason is the next: tactic
lia
works slowly, it tries many variants on the same goal, while reflexivity
either solves or reports inability to resolve goals faster. So in the sequence
of the tactics, we should suggest reflexivity
to solve all it can, and all
remaining goals we should give lia
.
Changing order of these tactics on the bunch of goals can notably accelerate calculations:
try lia; try reflexivity (** not recommended*)
try reflexivity; try lia (** recommended *)
Also, we had to get rid of such tactics as easy
in some sequences of tactics,
and replace easy
by the more simpl and precise tactics, consider this example
below: p
is the node.Prim
, when we destruct it, we receive more
than 150 subgoals.
This code takes 53.2 seconds:
Time (destruct p eqn:TP;
simpl; try reflexivity; destruct l0 eqn:L0; simpl; try easy;
destruct Script_ir_annot.check_type_annot eqn:CTA; simpl;
try easy).
This code takes 6.3 seconds! Quite a difference!
Here try easy
replaced with try (split; reflexivity)
:
Time (destruct p eqn:TP;
simpl; try reflexivity; destruct l0 eqn:L0; simpl; try (split; reflexivity);
destruct Script_ir_annot.check_type_annot eqn:CTA; simpl;
try (split; reflexivity)).
Replacing 10000 to 10 and calculating execution timeโ
After all the manipulations described above, it was not clear what was causing the slow work of prover. We were thinking that one-step reduction applied to all problem areas should have solved the problem with large numbers. But still the numbers 10000 and 10001 were on the list of suspects, it was too early to delete them. We will not describe here who from this list turned out to be innocent, instead, we will tell you how we managed to catch the criminal red-handed.
We replaced 10000 and 10001 in dep_parse_ty_aux
from Simulations
folder and
in parse_ty_aux
from Proto_alpha
everywhere they occured with 10 and 11
respectively. When we measured time of updated equality proof, difference was
the next: 5 minutes versus 1 minute. Experiment has shown that the speed of
calculations accelerated by 5 times.
Opaque and Transparentโ
We can make our numbers Opaque
, this will prevent them from unfolding and
simplifying, see the code below:
Definition number_10001 : Z.t := 10001.
Opaque number_10001.
Definition number_10000 : Z.t := 10000.
Opaque number_10000.
And if we need them to be transparent
(we do in many cases, we actually use
these numbers in calculations), we can make them transparent before usage,
like this:
Transparent number_10001.
In the code we have several such number definitions (all required for calculations)
Definition number_10001 : Z.t := 10001.
Opaque number_10001.
Definition number_10000 : Z.t := 10000.
Opaque number_10000.
Definition nat_number_10001 : nat := 10001.
Opaque nat_number_10001.
Definition nat_number_10000 : nat := 10000.
Opaque nat_number_10000.
Definition positive_number_10001 : positive := 10001.
Opaque positive_number_10001.
Definition positive_number_10000 : positive := 10000.
Opaque positive_number_10000.
We also need to switch between them inside proof (make transparent
/
make opaque
), and this is getting tiresome. But we are lucky, because
we are dealing with Coq. In the example below, with_strategy
will make all
our opaque numbers transparent just for lia
tactic where we need it.
Ltac tr_lia := with_strategy transparent
[nat_number_10000
nat_number_10001
number_10000
number_10001
positive_number_10000
positive_number_10001] lia.
"This can be useful for guarding calls to reduction in tactic automation
to ensure that certain constants are never unfolded by tactics like simpl
and cbn
or to ensure that unfolding does not fail..." says the coq reference
manual. Exactly what we need!
Here are some particularly notable results: without Opaque
solution, this
line of code below (which calls quite voluminous tactic basic
),
works 2 minutes:
all : repeat (basic Y' parse_ty_aux_dep_parse_ty_aux_eq).
With Opaque
solution this line works 2 seconds!
We are, of course, very satisfied with the result. That was educational. Thanks for reading!