[CakeML-dev] Let's remove max_app
Magnus Myreen
magnus.myreen at gmail.com
Sun Mar 5 08:21:43 UTC 2017
Hi all,
We should get rid of max_app. Reasons:
1. The benchmarks show that one hits a significant performance drop
if the compiled code includes curried functions with more
arguments than max_app. I believe the bootstrapped compiler would
run a significant constant factor faster if it was compiled with a
high (or non-existent) max_app value, since many functions take
more than 5 arguments. (I was e.g. looking at assign_def in
data_to_word which is run on every Op in the dataLang
programs. The assign function takes 7 arguments).
2. The current compilation strategy produces too much stub code in
clos_to_bvl. If we really must have max_app (I argue below that
we don't), then it should be something like 10 or 15. With the
current compilation strategy, the binary produced for max_app=9 is
176 kb larger than the binary for max_app=3. I wanted to try
compiling with max_app=10, but that was taking more than 5
minutes. Compiling with max_app=3 takes 7 seconds for me. I
believe the slow down in compilation speed with differing max_app
values is caused by the rapid increase in code size (number of
stubs in clos_to_bvl) as max_app gets larger. My measurements
are below while compiling factorial (and the basis library):
max_app=1: 10.810s (* very little room for opts in basis *)
max_app=2: 8.724s
max_app=3: 7.872s (* allows optimisations in basis code *)
max_app=4: 8.935s
max_app=5: 11.608s
max_app=6: 21.771s
max_app=7: 42.932s
max_app=8: 1m24.872s
max_app=9: 3m9.745s
One of the reasons the bootstrapped compilation times are slow is
that closLang (and subsequent) optimisations could not fire due to
clos_mti not doing enough (due to the low max_app) when
compiling the compiler in HOL. As we know, if clos_mti doesn't
do enough, a lot of other optimisations don't do much.
So the question is: why do we have max_app currently? I believe it is
only there because of the way clos_to_bvl wants to jump to specialised
code for dealing with applying n arguments to an m-argument
closure. Since n and m need to be <= max_app, we have max_app *
max_app stubs (each being a non-trivial piece of code).
My suggestion is to:
1. Remove max_app completely from the semantics, compiler and
proofs. This should make various things neater and simpler.
2. Have only one general-purpose stub (gen_app_stub) in clos_to_bvl
for dealing with implementing the semantics of applying an
arbitrary closure value to any number of arguments. Currently the
compilation of applications of unknown closures (case: loc_opt =
NONE) is as follows:
(compile_exps max_app [App loc_opt x1 xs2] aux =
let (c1,aux1) = compile_exps max_app [x1] aux in
let (c2,aux2) = compile_exps max_app xs2 aux1 in
([dtcase loc_opt of
| NONE =>
Let (c2++c1) (mk_cl_call (Var (LENGTH c2)) (GENLIST Var
(LENGTH c2)))
| SOME loc =>
(Call (LENGTH c2 - 1) (SOME (loc + (num_stubs max_app)))
(c2 ++ c1))],
aux2)) /\
mk_cl_call cl args =
If (Op Equal [mk_const (LENGTH args - 1); mk_el cl (mk_const 1)])
(Call (LENGTH args - 1) NONE (args ++ [cl] ++ [mk_el cl (mk_const 0)]))
(Call 0 (SOME (LENGTH args - 1)) (args ++ [cl]))
I suggest we change the last line above to something that tuples
up the given arguments and senda them to the gen_app_stub,
i.e. something like:
(Call 0 (SOME gen_app_stub_location)
[cl; Op (Cons 0) args; Op (Const (& (LENGTH args))) []])
The code above passes in LENGTH args to save gen_app_stub from
having to compute it from the tuple. It should probably also be
given the previously computed value of `mk_el cl (mk_const 1)` so
that it can avoid looking up the expected number of arguments from
the closure value. (This is a very minor optimisation.)
The code for gen_app_stub can of course include a few tests for
common cases at the top, e.g. it should probably first check
whether this is a two-argument closure applied to one argument (my
guess is that this is the most common case of a partial
application) and then run special-purpose code only for that
common case, but all other cases would be dealt with using generic
code.
Thoughts? Am I missing some reason for why we have max_app? I suggest
we remove it.
Cheers,
Magnus
More information about the Developers
mailing list