[CakeML-Dev] Bootstrap evaluation timings
Ramana Kumar
Ramana.Kumar at cl.cam.ac.uk
Tue Nov 22 21:42:13 UTC 2016
Preliminary results with inline_size_limit set back down to 3 indicate that
word_to_stack is still taking longer than before, specifically 3.5 hours as
opposed to the 1.5 hours of two months ago. Could the exp_cut parameter
also be relevant?
On 22 November 2016 at 16:30, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk>
wrote:
> Some updated times (now included program sizes) are up (
> https://cakeml.org/bootstrap-timing.txt). Things are still surprisingly
> slow (word_to_stack, sec_ok) - guess is still that it's because of inline
> limit, so the next thing to try is changing the inline limit..
>
> Is conf_ok required if I change the inline_limit just for the bootstrap?
> Isn't it easy to show that conf_ok isn't affected by that config field?
>
> On 18 November 2016 at 10:39, Yong Kiam <tanyongkiam at gmail.com> wrote:
>
>> It shouldn't be very hard... I just thought it wouldn't be high priority
>> since the remaining ones are only in LabAsms.
>>
>> Changing the config there seems right. You won't get a conf_ok proof
>> about it though.
>>
>> On Thu, Nov 17, 2016 at 5:46 PM, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk>
>> wrote:
>>
>>> I've seen the word_to_stack slowdown twice (although it only shows up in
>>> the timings document once - I didn't record the other data).
>>>
>>> I think removing more asm_ok is always a good idea, but I'm not sure how
>>> much priority it should be given - how hard is it to remove more?
>>>
>>> I guess after the current run finishes I could try changing the inline
>>> limit (I would do this by using a different config here
>>> https://github.com/CakeML/cakeml/blob/master/compiler/bootst
>>> rap/evaluation/to_lab_x64BootstrapScript.sml#L83 - does that seem
>>> right?)
>>>
>>> On 18 November 2016 at 09:36, Magnus Myreen <magnus.myreen at gmail.com>
>>> wrote:
>>>
>>>> I think the inline_size_limit of 10 is potentially risky. I suggest
>>>> the bootstrap is run with a lower setting, e.g. 2 or 3. -- Magnus
>>>>
>>>> On 18 November 2016 at 09:26, Yong Kiam <tanyongkiam at gmail.com> wrote:
>>>> > Argh, that's a shame. Would moving more of asm_ok up help? There are
>>>> still
>>>> > some things in LabAsms that can be removed.
>>>> >
>>>> > I'm not sure know why word_to_stack -specifically- would slow down by
>>>> so
>>>> > much. Could it be a one-off thing or did you hit that slowdown
>>>> multiple
>>>> > times already?
>>>> >
>>>> > Another possibility: changes to the default compiler configuration
>>>> could
>>>> > have slowed it down.
>>>> >
>>>> > For example this:
>>>> > https://github.com/CakeML/cakeml/commit/98188addc34e5c3a0345
>>>> 72c10eca7b3b41aff71c
>>>> >
>>>> > (e.g. maybe more things got inlined)
>>>> >
>>>> > On Thu, Nov 17, 2016 at 5:09 PM, Ramana Kumar <
>>>> Ramana.Kumar at cl.cam.ac.uk>
>>>> > wrote:
>>>> >>
>>>> >> I will put the timings from various runs of the bootstrap on this
>>>> web page
>>>> >> so we can more easily refer to them:
>>>> >> https://cakeml.org/bootstrap-timing.txt.
>>>> >> I've added partial timings for a recent revision (still waiting on
>>>> the
>>>> >> rest).
>>>> >> There is currently a problem with word_to_stack, which is now taking
>>>> 5
>>>> >> hours where it used to take 1.
>>>> >> And given that it seems to be stuck on "sec_ok" I'm wondering how
>>>> much the
>>>> >> asm_ok Asm check removal has really helped... I guess it was
>>>> supposed to
>>>> >> bring it down to 6h, so we'll see...
>>>> >>
>>>> >> On 29 September 2016 at 10:47, Ramana Kumar <
>>>> Ramana.Kumar at cl.cam.ac.uk>
>>>> >> wrote:
>>>> >>>
>>>> >>> I have tried running the "sec_ok" part of the bootstrap with the
>>>> asm_ok
>>>> >>> checks for Asm lines turned off. Two things to report:
>>>> >>>
>>>> >>> The time is significantly improved:
>>>> >>> runtime: 6h02m19s, gctime: 30m06s, systime: 3m11s.
>>>> >>> (compared to previously: 10h46m09s, gctime: 1h07m37s,
>>>> systime:
>>>> >>> 5m21s.)
>>>> >>>
>>>> >>> Secondly, this timing information is bogus: the wall clock time was
>>>> >>> definitely under 2 hours (I didn't check precisely). I expect
>>>> timing for
>>>> >>> threads in parallel execution are added sequentially. So for the
>>>> parts of
>>>> >>> the bootstrap run in parallel, the timings probably need to be
>>>> divided by up
>>>> >>> to 8 to get an accurate picture. Perhaps I should annotate which
>>>> ones are in
>>>> >>> parallel and which not.
>>>> >>>
>>>> >>> I think it's worthwhile removing the asm_ok checks. I can help
>>>> doing so
>>>> >>> if someone else sets up some cheated theorems to fill in.
>>>> >>>
>>>> >>> On 8 September 2016 at 14:33, Ramana Kumar <
>>>> Ramana.Kumar at cl.cam.ac.uk>
>>>> >>> wrote:
>>>> >>>>
>>>> >>>> Here are some timings now including the last step (x64Bootstrap).
>>>> >>>> (The commit number doesn't exactly match the output labels, since I
>>>> >>>> changed some of that (about to be committed) in the middle.)
>>>> >>>>
>>>> >>>> Here we can see that the biggest problem (>10 hours) is the
>>>> checking
>>>> >>>> that all the lines in the final ASM code satisfy line_ok_light
>>>> (i.e., the
>>>> >>>> "sec_ok" thing below). But there are a few steps that take 1-2
>>>> hours each,
>>>> >>>> and then a few more that take 20-40 mins, so ... definitely more
>>>> than an
>>>> >>>> overnight job still...
>>>> >>>>
>>>> >>>> 1a939d3
>>>> >>>>
>>>> >>>> eval to_mod: runtime: 1m49s, gctime: 3.4s, systime:
>>>> 0.35667s.
>>>> >>>> eval to_con: runtime: 8m56s, gctime: 5.5s, systime:
>>>> 0.84000s.
>>>> >>>> eval to_dec: runtime: 7.4s, gctime: 2.4s, systime: 0.15667s.
>>>> >>>> eval to_exh: runtime: 3m32s, gctime: 8.9s, systime:
>>>> 0.68667s.
>>>> >>>> eval to_pat: runtime: 14.1s, gctime: 1.4s, systime:
>>>> 0.06667s.
>>>> >>>> eval to_clos: runtime: 5.3s, gctime: 1.1s, systime:
>>>> 0.05667s.
>>>> >>>> eval to_bvl: runtime: 2m36s, gctime: 12.6s, systime:
>>>> 0.88000s.
>>>> >>>> eval to_bvi:
>>>> >>>> ... inline: runtime: 6.1s, gctime: 1.2s, systime: 0.09333s.
>>>> >>>> ... optimise: runtime: 24m53s, gctime: 1m53s, systime:
>>>> 17.1s.
>>>> >>>> ... compile: runtime: 2m27s, gctime: 11.9s, systime: 1.5s.
>>>> >>>> eval to_data: runtime: 8m11s, gctime: 21.4s, systime: 1.2s.
>>>> >>>> eval data_to_word: runtime: 8m28s, gctime: 30.4s, systime:
>>>> 2.4s.
>>>> >>>>
>>>> >>>> word_to_word: runtime: 1h00m52s, gctime: 5m53s, systime:
>>>> 34.1s.
>>>> >>>>
>>>> >>>> clash: runtime: 5m06s, gctime: 22.3s, systime: 2.8s.
>>>> >>>> (external oracle) runtime: 2m31s, gctime: 1.4s, systime:
>>>> >>>> 0.77667s.
>>>> >>>>
>>>> >>>> chunk: runtime: 1h08m52s, gctime: 6m06s, systime: 28.2s.
>>>> >>>> (chunk results) runtime: 11m49s, gctime: 1m13s, systime:
>>>> 7.2s.
>>>> >>>>
>>>> >>>> eval word_to_stack: runtime: 1h35m38s, gctime: 3m53s,
>>>> systime:
>>>> >>>> 9.4s.
>>>> >>>>
>>>> >>>> stack_alloc: runtime: 23m27s, gctime: 45.7s, systime: 6.7s.
>>>> >>>> stack_remove: runtime: 19m15s, gctime: 21.5s, systime: 4.7s.
>>>> >>>> stack_names: runtime: 15m20s, gctime: 24.9s, systime: 3.2s.
>>>> >>>> stack_to_lab: runtime: 42m12s, gctime: 9m27s, systime:
>>>> 10.4s.
>>>> >>>>
>>>> >>>> filter_skip: runtime: 3m56s, gctime: 24.7s, systime: 3.2s.
>>>> >>>> ffi_limit: runtime: 2m33s, gctime: 1m39s, systime: 3.2s.
>>>> >>>> enc_sec: runtime: 25m01s, gctime: 1m40s, systime: 11.0s.
>>>> >>>> sec_length: runtime: 4m11s, gctime: 48.9s, systime: 3.5s.
>>>> >>>>
>>>> >>>> compute_labels: runtime: 1h17m12s, gctime: 13m50s, systime:
>>>> >>>> 58.7s.
>>>> >>>>
>>>> >>>> enc_secs_again: runtime: 2h00m15s, gctime: 24m27s, systime:
>>>> >>>> 1m46s.
>>>> >>>>
>>>> >>>> upd_lab_len: runtime: 5m51s, gctime: 55.9s, systime: 3.4s.
>>>> >>>> sec_length2: runtime: 3m32s, gctime: 8.4s, systime: 1.2s.
>>>> >>>>
>>>> >>>> compute_labels2: runtime: 1h27m09s, gctime: 22m29s, systime:
>>>> >>>> 1m17s.
>>>> >>>>
>>>> >>>> enc_secs_again2: runtime: 2h37m41s, gctime: 1h03m15s,
>>>> systime:
>>>> >>>> 3m01s.
>>>> >>>>
>>>> >>>> pad_sectionruntime: 8m47s, gctime: 35.8s, systime: 3.8s.
>>>> >>>>
>>>> >>>> sec_ok: runtime: 10h46m09s, gctime: 1h07m37s, systime:
>>>> 5m21s.
>>>> >>>>
>>>> >>>> all_secs_ok: runtime: 4.5s, gctime: 0.07667s, systime:
>>>> 0.00333s.
>>>> >>>> prog_to_bytes: runtime: 7m27s, gctime: 2m23s, systime: 6.3s.
>>>> >>>>
>>>> >>>> flat_bytes: runtime: 45m21s, gctime: 8m25s, systime: 8.2s.
>>>> >>>> expand_defs: runtime: 49m25s, gctime: 31m53s, systime: 1.6s.
>>>> >>>>
>>>> >>>> On 7 September 2016 at 08:43, Ramana Kumar <
>>>> Ramana.Kumar at cl.cam.ac.uk>
>>>> >>>> wrote:
>>>> >>>>>
>>>> >>>>> More timing for a later commit. I believe this one includes
>>>> Magnus's
>>>> >>>>> changes to add compile_seqs, which seem not to have had much
>>>> effect. (I'm
>>>> >>>>> hoping I haven't messed up the timing somehow... it's surprising
>>>> how stable
>>>> >>>>> they are...)
>>>> >>>>>
>>>> >>>>> commit: 83b3aaf
>>>> >>>>>
>>>> >>>>> eval to_mod: runtime: 1m50s, gctime: 5.8s, systime: 1.0s.
>>>> >>>>> eval to_con: runtime: 8m41s, gctime: 4.2s, systime:
>>>> 0.65000s.
>>>> >>>>> eval to_dec: runtime: 11.5s, gctime: 6.4s, systime:
>>>> 0.46333s.
>>>> >>>>> eval to_exh: runtime: 3m27s, gctime: 6.8s, systime:
>>>> 0.89000s.
>>>> >>>>> eval to_pat: runtime: 14.7s, gctime: 1.7s, systime:
>>>> 0.11667s.
>>>> >>>>> eval to_clos: runtime: 5.5s, gctime: 1.1s, systime:
>>>> 0.06333s.
>>>> >>>>> eval to_bvl: runtime: 2m29s, gctime: 5.6s, systime:
>>>> 0.28333s.
>>>> >>>>> eval to_bvi:
>>>> >>>>> ... inline: runtime: 6.3s, gctime: 1.3s, systime: 0.06333s.
>>>> >>>>> ... optimise: runtime: 23m26s, gctime: 28.8s, systime:
>>>> 6.3s.
>>>> >>>>> ... compile: runtime: 2m18s, gctime: 4.1s, systime:
>>>> 0.25000s.
>>>> >>>>> eval to_data: runtime: 8m23s, gctime: 44.3s, systime: 3.2s.
>>>> >>>>> eval data_to_word: runtime: 8m23s, gctime: 39.3s, systime:
>>>> 3.1s.
>>>> >>>>>
>>>> >>>>> word_to_word: runtime: 58m18s, gctime: 4m53s, systime:
>>>> 30.1s.
>>>> >>>>>
>>>> >>>>> clash: runtime: 5m01s, gctime: 18.4s, systime: 2.5s.
>>>> >>>>> (external oracle) runtime: 2m31s, gctime: 1.9s, systime:
>>>> >>>>> 0.20667s.
>>>> >>>>>
>>>> >>>>> chunk: runtime: 1h01m18s, gctime: 1m13s, systime: 22.9s.
>>>> >>>>> (chunk results) runtime: 14m26s, gctime: 5m57s, systime:
>>>> 9.1s.
>>>> >>>>>
>>>> >>>>> eval word_to_stack: runtime: 1h29m05s, gctime: 4m38s,
>>>> systime:
>>>> >>>>> 14.4s.
>>>> >>>>>
>>>> >>>>> stack_alloc: runtime: 23m03s, gctime: 58.3s, systime: 5.8s.
>>>> >>>>> stack_remove: runtime: 20m09s, gctime: 1m29s, systime:
>>>> 6.5s.
>>>> >>>>> stack_names: runtime: 15m05s, gctime: 41.6s, systime: 3.9s.
>>>> >>>>> stack_to_lab: runtime: 35m11s, gctime: 2m54s, systime:
>>>> 12.0s.
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On 6 September 2016 at 10:11, Yong Kiam <tanyongkiam at gmail.com>
>>>> wrote:
>>>> >>>>>>
>>>> >>>>>> Hmm, that is surprising. I expected the main function to have a
>>>> good
>>>> >>>>>> number of If-s (which my change should have helped with), but it
>>>> seems like
>>>> >>>>>> that didn't affect the time at all.
>>>> >>>>>>
>>>> >>>>>> Here are two possibilities:
>>>> >>>>>>
>>>> >>>>>> 1) Caching lookups into the colouring function for the chunk
>>>> step. The
>>>> >>>>>> function is defined as an option-lookup into an oracle sptree:
>>>> >>>>>> https://github.com/CakeML/cakeml/blob/9937677d34446adace9f29
>>>> d5719131f9a5b4aeac/compiler/backend/reg_alloc/reg_allocScript.sml#L110
>>>> >>>>>> I'm not sure how much that will help though, since the variable
>>>> names
>>>> >>>>>> should be in SSA already, so it is mostly looking up different
>>>> things each
>>>> >>>>>> time.
>>>> >>>>>>
>>>> >>>>>> 2) Make splitting more aggressive, i.e. the main function should
>>>> be
>>>> >>>>>> divided into more pieces so that its colouring sptree isn't too
>>>> big (it gets
>>>> >>>>>> larger when there are more variables too).
>>>> >>>>>>
>>>> >>>>>> Also, are you getting some kind of effect where every parallel
>>>> thread
>>>> >>>>>> is just waiting for one to finish (namely the one working on the
>>>> main
>>>> >>>>>> function)? 2) might help divide the work more evenly.
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On Mon, Sep 5, 2016 at 7:45 PM, Ramana Kumar
>>>> >>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Some more timings, for two recent commits. Not a big
>>>> difference, it
>>>> >>>>>>> seems.
>>>> >>>>>>>
>>>> >>>>>>> commit: 69ac2f9
>>>> >>>>>>>
>>>> >>>>>>> eval to_mod: runtime: 1m45s, gctime: 3.8s, systime:
>>>> 0.36667s.
>>>> >>>>>>> eval to_con: runtime: 8m36s, gctime: 5.5s, systime:
>>>> 0.90667s.
>>>> >>>>>>> eval to_dec: runtime: 7.3s, gctime: 2.3s, systime:
>>>> 0.17000s.
>>>> >>>>>>> eval to_exh: runtime: 3m27s, gctime: 9.5s, systime: 1.1s.
>>>> >>>>>>> eval to_pat: runtime: 13.9s, gctime: 1.5s, systime:
>>>> 0.10333s.
>>>> >>>>>>> eval to_clos: runtime: 5.2s, gctime: 1.1s, systime:
>>>> 0.07000s.
>>>> >>>>>>> eval to_bvl: runtime: 2m28s, gctime: 5.5s, systime:
>>>> 0.22667s.
>>>> >>>>>>> eval to_bvi: ... inline: runtime: 6.0s, gctime: 1.3s,
>>>> systime:
>>>> >>>>>>> 0.09000s.
>>>> >>>>>>> ... optimise:runtime: 23m27s, gctime: 6.9s, systime:
>>>> 3.2s.
>>>> >>>>>>> ... compile: runtime: 2m33s, gctime: 19.3s, systime:
>>>> 1.2s.
>>>> >>>>>>> eval to_data: runtime: 8m29s, gctime: 51.5s, systime:
>>>> 3.4s.
>>>> >>>>>>> eval data_to_word: runtime: 8m41s, gctime: 50.2s,
>>>> systime:
>>>> >>>>>>> 3.4s.
>>>> >>>>>>>
>>>> >>>>>>> word_to_word: runtime: 58m21s, gctime: 4m28s, systime:
>>>> 28.0s.
>>>> >>>>>>>
>>>> >>>>>>> clash: runtime: 4m58s, gctime: 15.5s, systime: 2.7s.
>>>> >>>>>>> (external oracle) runtime: 2m32s, gctime: 1.8s, systime:
>>>> >>>>>>> 0.24000s.
>>>> >>>>>>>
>>>> >>>>>>> chunk: runtime: 1h13m05s, gctime: 6m06s, systime: 26.9s.
>>>> >>>>>>> (chunk results) runtime: 13m31s, gctime: 2m47s, systime:
>>>> >>>>>>> 14.4s.
>>>> >>>>>>>
>>>> >>>>>>> eval word_to_stack: runtime: 1h32m52s, gctime: 6m12s,
>>>> systime:
>>>> >>>>>>> 12.7s.
>>>> >>>>>>>
>>>> >>>>>>> stack_alloc: runtime: 23m03s, gctime: 42.2s, systime:
>>>> 5.8s.
>>>> >>>>>>> stack_remove: runtime: 19m08s, gctime: 22.7s, systime:
>>>> 4.3s.
>>>> >>>>>>> stack_names: runtime: 15m45s, gctime: 1m05s, systime:
>>>> 5.3s.
>>>> >>>>>>> stack_to_lab: runtime: 41m59s, gctime: 9m22s, systime:
>>>> 8.6s.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> commit: 9937677
>>>> >>>>>>>
>>>> >>>>>>> eval to_mod: runtime: 1m51s, gctime: 5.9s, systime:
>>>> 0.90000s.
>>>> >>>>>>> eval to_con: runtime: 8m48s, gctime: 4.4s, systime:
>>>> 0.71667s.
>>>> >>>>>>> eval to_dec: runtime: 7.2s, gctime: 2.3s, systime:
>>>> 0.15667s.
>>>> >>>>>>> eval to_exh: runtime: 3m30s, gctime: 9.5s, systime: 1.2s.
>>>> >>>>>>> eval to_pat: runtime: 14.3s, gctime: 1.6s, systime:
>>>> 0.09000s.
>>>> >>>>>>> eval to_clos: runtime: 5.2s, gctime: 0.95667s, systime:
>>>> >>>>>>> 0.07333s.
>>>> >>>>>>> eval to_bvl: runtime: 2m28s, gctime: 5.2s, systime:
>>>> 0.32667s.
>>>> >>>>>>> eval to_bvi: ... inline: runtime: 5.8s, gctime: 0.88000s,
>>>> >>>>>>> systime: 0.05333s.
>>>> >>>>>>> ... optimise: runtime: 23m24s, gctime: 5.8s, systime:
>>>> 3.0s.
>>>> >>>>>>> ... compile: runtime: 2m27s, gctime: 12.5s, systime:
>>>> 0.78333s.
>>>> >>>>>>> eval to_data: runtime: 8m18s, gctime: 34.2s, systime:
>>>> 3.0s.
>>>> >>>>>>> eval data_to_word: runtime: 8m24s, gctime: 37.4s,
>>>> systime:
>>>> >>>>>>> 2.8s.
>>>> >>>>>>>
>>>> >>>>>>> word_to_word: runtime: 57m40s, gctime: 4m56s, systime:
>>>> 30.1s.
>>>> >>>>>>>
>>>> >>>>>>> clash: runtime: 5m12s, gctime: 33.6s, systime: 3.9s.
>>>> >>>>>>> (external oracle) runtime: 2m31s, gctime: 1.8s, systime:
>>>> >>>>>>> 0.15667s.
>>>> >>>>>>>
>>>> >>>>>>> chunk: runtime: 1h07m54s, gctime: 6m07s, systime: 29.5s.
>>>> >>>>>>> (chunk results) runtime: 13m21s, gctime: 2m54s, systime:
>>>> >>>>>>> 14.0s.
>>>> >>>>>>>
>>>> >>>>>>> eval word_to_stack: runtime: 1h34m00s, gctime: 7m32s,
>>>> systime:
>>>> >>>>>>> 8.1s.
>>>> >>>>>>>
>>>> >>>>>>> stack_alloc: runtime: 22m35s, gctime: 18.7s, systime:
>>>> 3.7s.
>>>> >>>>>>> stack_remove: runtime: 19m04s, gctime: 16.7s, systime:
>>>> 3.9s.
>>>> >>>>>>> stack_names: runtime: 15m13s, gctime: 37.8s, systime:
>>>> 4.0s.
>>>> >>>>>>> stack_to_lab: runtime: 36m21s, gctime: 3m29s, systime:
>>>> 13.9s.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On 1 September 2016 at 09:04, Ramana Kumar
>>>> >>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> Also, updated timings for the middle bit, as of 4d0d56b (which
>>>> is
>>>> >>>>>>>> also for the front half timings in my previous email).
>>>> >>>>>>>>
>>>> >>>>>>>> eval data_to_word: runtime: 7m42s, gctime: 35.7s,
>>>> systime:
>>>> >>>>>>>> 2.7s.
>>>> >>>>>>>>
>>>> >>>>>>>> word_to_word: runtime: 56m07s, gctime: 4m48s, systime:
>>>> 28.7s.
>>>> >>>>>>>>
>>>> >>>>>>>> clash: runtime: 4m48s, gctime: 16.0s, systime: 2.2s.
>>>> >>>>>>>>
>>>> >>>>>>>> (external oracle) runtime: 2m21s, gctime: 1.7s, systime:
>>>> >>>>>>>> 0.10667s.
>>>> >>>>>>>>
>>>> >>>>>>>> chunk: runtime: 1h10m00s, gctime: 6m06s, systime: 27.1s.
>>>> >>>>>>>>
>>>> >>>>>>>> (use chunk) runtime: 12m26s, gctime: 2m29s, systime:
>>>> 11.0s.
>>>> >>>>>>>>
>>>> >>>>>>>> eval word_to_stack: runtime: 1h29m28s, gctime: 7m23s,
>>>> >>>>>>>> systime: 8.0s.
>>>> >>>>>>>>
>>>> >>>>>>>> stack_alloc: runtime: 21m13s, gctime: 24.3s, systime:
>>>> 3.9s.
>>>> >>>>>>>> stack_remove: runtime: 18m12s, gctime: 46.6s, systime:
>>>> 4.3s.
>>>> >>>>>>>> stack_names: runtime: 14m08s, gctime: 32.1s, systime:
>>>> 4.1s.
>>>> >>>>>>>> stack_to_lab: runtime: 36m41s, gctime: 1m19s, systime:
>>>> 10.6s.
>>>> >>>>>>>>
>>>> >>>>>>>> It doesn't look like 91d539b made a huge difference.
>>>> >>>>>>>>
>>>> >>>>>>>> On 1 September 2016 at 08:42, Ramana Kumar
>>>> >>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>> I have some timings for the front half of the backend now:
>>>> >>>>>>>>> eval to_mod: runtime: 1m45s, gctime: 9.6s, systime:
>>>> >>>>>>>>> 0.40000s.
>>>> >>>>>>>>> eval to_con: runtime: 7m00s, gctime: 5.4s, systime:
>>>> >>>>>>>>> 0.80667s.
>>>> >>>>>>>>> eval to_dec: runtime: 7.0s, gctime: 2.3s, systime:
>>>> 0.13333s.
>>>> >>>>>>>>> eval to_exh: runtime: 3m18s, gctime: 4.5s, systime:
>>>> >>>>>>>>> 0.67000s.
>>>> >>>>>>>>> eval to_pat: runtime: 15.9s, gctime: 3.7s, systime:
>>>> >>>>>>>>> 0.50667s.
>>>> >>>>>>>>> eval to_clos: runtime: 5.0s, gctime: 1.1s, systime:
>>>> >>>>>>>>> 0.07000s.
>>>> >>>>>>>>> eval to_bvl: runtime: 2m18s, gctime: 5.2s, systime:
>>>> >>>>>>>>> 0.36000s.
>>>> >>>>>>>>> eval to_bvi:
>>>> >>>>>>>>> ... inline: runtime: 5.8s, gctime: 1.1s, systime:
>>>> 0.07667s.
>>>> >>>>>>>>> ... optimise: runtime: 22m52s, gctime: 30.1s, systime:
>>>> 6.1s.
>>>> >>>>>>>>> ... compile: runtime: 2m05s, gctime: 6.9s, systime:
>>>> >>>>>>>>> 0.72000s.
>>>> >>>>>>>>> eval to_data: runtime: 5m44s, gctime: 28.8s, systime:
>>>> 2.1s.
>>>> >>>>>>>>>
>>>> >>>>>>>>> I'm not sure why there are suspicious timings for things like
>>>> >>>>>>>>> stack_alloc. One possibility is that the "time" function adds
>>>> up times
>>>> >>>>>>>>> across all parallel threads. In that case, it would make
>>>> sense to divide the
>>>> >>>>>>>>> time by 8. I would need to investigate whether this is
>>>> actually what's going
>>>> >>>>>>>>> on...
>>>> >>>>>>>>>
>>>> >>>>>>>>> On 30 August 2016 at 19:36, Magnus Myreen <
>>>> magnus.myreen at gmail.com>
>>>> >>>>>>>>> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> My guess is that evaluation times get slower once we pass
>>>> dataLang
>>>> >>>>>>>>>> because the programs become a lot larger once the data
>>>> abstraction
>>>> >>>>>>>>>> has
>>>> >>>>>>>>>> been removed.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> There are however some suspicious looking timings here. I
>>>> note
>>>> >>>>>>>>>> that
>>>> >>>>>>>>>> stack_alloc does near to nothing, but still takes 31 minutes
>>>> to
>>>> >>>>>>>>>> run.
>>>> >>>>>>>>>> Here is stack_alloc's main compiler function:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> https://github.com/CakeML/cakeml/blob/master/compiler/backen
>>>> d/stack_allocScript.sml#L166
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Cheers,
>>>> >>>>>>>>>> Magnus
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On 30 August 2016 at 03:43, Ramana Kumar
>>>> >>>>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>> >>>>>>>>>> > I have some information now on where slow parts of the
>>>> bootstrap
>>>> >>>>>>>>>> > evaluation
>>>> >>>>>>>>>> > are, which we could use to direct effort in case we want to
>>>> >>>>>>>>>> > speed things up.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > Currently I have timings for between dataLang and labLang.
>>>> (I
>>>> >>>>>>>>>> > know the
>>>> >>>>>>>>>> > phases above dataLang are mostly quick, and the ones after
>>>> >>>>>>>>>> > labLang are
>>>> >>>>>>>>>> > mostly slow, but I will have to get actual numbers later)
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > eval data_to_word: runtime: 7m48s, gctime: 21.9s,
>>>> >>>>>>>>>> > systime: 2.1s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > word_to_word: runtime: 55m07s, gctime: 5m31s,
>>>> systime:
>>>> >>>>>>>>>> > 30.9s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > clash: runtime: 5m33s, gctime: 34.4s, systime: 4.4s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > (external oracle) runtime: 2m30s, gctime: 12.1s,
>>>> systime:
>>>> >>>>>>>>>> > 1.5s
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > chunk: runtime: 1h04m14s, gctime: 1m37s, systime:
>>>> 20.8s.
>>>> >>>>>>>>>> > (use results of chunk) runtime: 11m21s, gctime: 2m00s,
>>>> >>>>>>>>>> > systime:
>>>> >>>>>>>>>> > 10.1s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > eval word_to_stack: runtime: 1h34m21s, gctime: 9m42s,
>>>> >>>>>>>>>> > systime: 9.4s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > stack_alloc: runtime: 31m55s, gctime: 30.3s,
>>>> systime:
>>>> >>>>>>>>>> > 5.4s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > stack_remove: runtime: 25m43s, gctime: 1m32s,
>>>> systime:
>>>> >>>>>>>>>> > 6.9s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > stack_names: runtime: 22m03s, gctime: 3m01s,
>>>> systime:
>>>> >>>>>>>>>> > 6.6s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > stack_to_lab: runtime: 56m00s, gctime: 4m38s,
>>>> systime:
>>>> >>>>>>>>>> > 23.8s.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > What exactly these different phases are doing can be found
>>>> by
>>>> >>>>>>>>>> > looking at
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > https://github.com/CakeML/cake
>>>> ml/blob/master/compiler/bootstrap/evaluation/to_lab_x64Boots
>>>> trapScript.sml.
>>>> >>>>>>>>>> > I think the one confusingly called "chunk" means checking
>>>> the
>>>> >>>>>>>>>> > results of the
>>>> >>>>>>>>>> > external reg. alloc oracle.
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > _______________________________________________
>>>> >>>>>>>>>> > Dev mailing list
>>>> >>>>>>>>>> > Dev at cakeml.org
>>>> >>>>>>>>>> > https://lists.cakeml.org/listinfo/dev
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> _______________________________________________
>>>> >>>>>>> Dev mailing list
>>>> >>>>>>> Dev at cakeml.org
>>>> >>>>>>> https://lists.cakeml.org/listinfo/dev
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >> _______________________________________________
>>>> >> Developers mailing list
>>>> >> Developers at cakeml.org
>>>> >> https://lists.cakeml.org/listinfo/developers
>>>> >>
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > Developers mailing list
>>>> > Developers at cakeml.org
>>>> > https://lists.cakeml.org/listinfo/developers
>>>> >
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.cakeml.org/pipermail/developers/attachments/20161123/d3cb084c/attachment-0001.html>
More information about the Developers
mailing list