[CakeML-dev] Let's remove max_app
Magnus Myreen
magnus.myreen at gmail.com
Sun Mar 5 09:15:32 UTC 2017
Just to add a bit to my previous message:
An intermediate solution:
- leave the semantics as it is
- have max_app be something huge, e.g. 1000000
- alter clos_to_bvl as suggested in my previous email
Cheers,
Magnus
On 5 March 2017 at 08:21, Magnus Myreen <magnus.myreen at gmail.com> wrote:
> Hi all,
>
> We should get rid of max_app. Reasons:
>
> 1. The benchmarks show that one hits a significant performance drop
> if the compiled code includes curried functions with more
> arguments than max_app. I believe the bootstrapped compiler would
> run a significant constant factor faster if it was compiled with a
> high (or non-existent) max_app value, since many functions take
> more than 5 arguments. (I was e.g. looking at assign_def in
> data_to_word which is run on every Op in the dataLang
> programs. The assign function takes 7 arguments).
>
> 2. The current compilation strategy produces too much stub code in
> clos_to_bvl. If we really must have max_app (I argue below that
> we don't), then it should be something like 10 or 15. With the
> current compilation strategy, the binary produced for max_app=9 is
> 176 kb larger than the binary for max_app=3. I wanted to try
> compiling with max_app=10, but that was taking more than 5
> minutes. Compiling with max_app=3 takes 7 seconds for me. I
> believe the slow down in compilation speed with differing max_app
> values is caused by the rapid increase in code size (number of
> stubs in clos_to_bvl) as max_app gets larger. My measurements
> are below while compiling factorial (and the basis library):
>
> max_app=1: 10.810s (* very little room for opts in basis *)
> max_app=2: 8.724s
> max_app=3: 7.872s (* allows optimisations in basis code *)
> max_app=4: 8.935s
> max_app=5: 11.608s
> max_app=6: 21.771s
> max_app=7: 42.932s
> max_app=8: 1m24.872s
> max_app=9: 3m9.745s
>
> One of the reasons the bootstrapped compilation times are slow is
> that closLang (and subsequent) optimisations could not fire due to
> clos_mti not doing enough (due to the low max_app) when
> compiling the compiler in HOL. As we know, if clos_mti doesn't
> do enough, a lot of other optimisations don't do much.
>
> So the question is: why do we have max_app currently? I believe it is
> only there because of the way clos_to_bvl wants to jump to specialised
> code for dealing with applying n arguments to an m-argument
> closure. Since n and m need to be <= max_app, we have max_app *
> max_app stubs (each being a non-trivial piece of code).
>
> My suggestion is to:
>
> 1. Remove max_app completely from the semantics, compiler and
> proofs. This should make various things neater and simpler.
>
> 2. Have only one general-purpose stub (gen_app_stub) in clos_to_bvl
> for dealing with implementing the semantics of applying an
> arbitrary closure value to any number of arguments. Currently the
> compilation of applications of unknown closures (case: loc_opt =
> NONE) is as follows:
>
> (compile_exps max_app [App loc_opt x1 xs2] aux =
> let (c1,aux1) = compile_exps max_app [x1] aux in
> let (c2,aux2) = compile_exps max_app xs2 aux1 in
> ([dtcase loc_opt of
> | NONE =>
> Let (c2++c1) (mk_cl_call (Var (LENGTH c2)) (GENLIST Var
> (LENGTH c2)))
> | SOME loc =>
> (Call (LENGTH c2 - 1) (SOME (loc + (num_stubs max_app)))
> (c2 ++ c1))],
> aux2)) /\
>
> mk_cl_call cl args =
> If (Op Equal [mk_const (LENGTH args - 1); mk_el cl (mk_const 1)])
> (Call (LENGTH args - 1) NONE (args ++ [cl] ++ [mk_el cl (mk_const 0)]))
> (Call 0 (SOME (LENGTH args - 1)) (args ++ [cl]))
>
> I suggest we change the last line above to something that tuples
> up the given arguments and senda them to the gen_app_stub,
> i.e. something like:
>
> (Call 0 (SOME gen_app_stub_location)
> [cl; Op (Cons 0) args; Op (Const (& (LENGTH args))) []])
>
> The code above passes in LENGTH args to save gen_app_stub from
> having to compute it from the tuple. It should probably also be
> given the previously computed value of `mk_el cl (mk_const 1)` so
> that it can avoid looking up the expected number of arguments from
> the closure value. (This is a very minor optimisation.)
>
> The code for gen_app_stub can of course include a few tests for
> common cases at the top, e.g. it should probably first check
> whether this is a two-argument closure applied to one argument (my
> guess is that this is the most common case of a partial
> application) and then run special-purpose code only for that
> common case, but all other cases would be dealt with using generic
> code.
>
> Thoughts? Am I missing some reason for why we have max_app? I suggest
> we remove it.
>
> Cheers,
> Magnus
More information about the Developers
mailing list