[CakeML-dev] [CakeML-Dev] Bootstrap evaluation timings

Ramana Kumar Ramana.Kumar at cl.cam.ac.uk
Tue Dec 13 08:26:31 UTC 2016


I was able to cut about 25 minutes off in total by reworking
compute_labels. Compare the last two runs at
https://cakeml.org/bootstrap-timing.txt (n.b. sec_lengths is gone in
the second).

On 6 December 2016 at 18:19, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk> wrote:
> https://cakeml.org/bootstrap-timing.txt now shows realtime in addition
> to cpu time, so we can see the effect of parallelism.
>
> In addition, it seems like making compute_labels and section_labels
> tail recursive did not make them faster (in fact maybe made them
> slower) (compare the tail-rec commit, 59ff22c, with the previous one).
> However, I suspect this is because the custom conversion for running
> compute_labels is not as smart as it could be. I might try to speed it
> up. Are there any tricks I should know about apart from trying to
> parallelise things and to avoid too large terms (by making
> abbreviating constants)? Maybe putting the otherwise tail-recursive
> call behind a THENC is a bad idea?
>
> On 28 November 2016 at 10:09, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk> wrote:
>> I think it's about 10 hours for everything up to the bootstrap evaluation
>> (including the translation). So we're looking at slightly over a day for the
>> full regression. However, these numbers are somewhat inflated because the
>> (par) parts are something like sums of times across 8 threads.
>>
>> On 28 November 2016 at 10:06, Yong Kiam <tanyongkiam at gmail.com> wrote:
>>>
>>> In case anyone is curious, that's about 17.4 hours in total (might be
>>> wrong, I used some quick vim magic to get that number)
>>>
>>> On Sun, Nov 27, 2016 at 5:45 PM, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk>
>>> wrote:
>>>>
>>>> Latest timings with Yong Kiam's fixes look good.
>>>> https://cakeml.org/bootstrap-timing.txt.
>>>> The slowest evaluations are in encoding, computing labels (they happen
>>>> twice at >1 hour each), and checking jumps (>5 hours).
>>>> Have a look at the times and see if it looks like anything could be sped
>>>> up.
>>>>
>>>> On 27 November 2016 at 11:09, Yong Kiam <tanyongkiam at gmail.com> wrote:
>>>>>
>>>>> Here's a summary of the issue:
>>>>>
>>>>> The latest change to put Equal into data_to_word currently produces
>>>>> unreachable code (GiveUp).
>>>>>
>>>>> These unreachable bits of code goes through wordLang, but the allocator
>>>>> and SSA don't touch it because the code refers to variables that are
>>>>> "uninitialized". This ended up producing massive stack frames in the
>>>>> bootstrap (upwards of 1000 variables on stack) going into word_to_stack
>>>>> because the variables were just passed in as is.
>>>>>
>>>>> My latest commit fixes this in wordLang, and hopefully the time for
>>>>> word_to_stack goes back down to something more reasonable (untested).
>>>>>
>>>>>
>>>>> On Sat, Nov 26, 2016 at 6:11 PM, Ramana Kumar
>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>
>>>>>> It looks like exp_cut did make a difference. Latest timings here:
>>>>>> https://cakeml.org/bootstrap-timing.txt
>>>>>>
>>>>>> Also, the asm_ok check removal has made a difference: sec_ok is only
>>>>>> ~6hours (previously it has always been >10 and sometimes >18), as originally
>>>>>> hoped.
>>>>>>
>>>>>> The longest thing in the latest timings is word_to_stack (8hrs), but as
>>>>>> we've been discussing separately, there are fixes to that coming soon...
>>>>>>
>>>>>> Other things to consider: do any passes look like they're taking
>>>>>> unexpectedly long, or there might be relatively easy tweaks to speed them
>>>>>> up? Could other tweaks to the configuration parameters improve the speed?
>>>>>> (e.g., set inline even lower)
>>>>>>
>>>>>>
>>>>>> On 23 November 2016 at 11:45, Yong Kiam <tanyongkiam at gmail.com> wrote:
>>>>>>>
>>>>>>> I went through the commit history on github, and here is a list of
>>>>>>> changes since 1a939d3 that looked like they might affect something (based on
>>>>>>> the title):
>>>>>>>
>>>>>>> 55d5fc0: Update unverified register allocator
>>>>>>> 63cbd66: Removed asm_ok
>>>>>>> 98188ad: Changed default bvl_to_bvi configurations
>>>>>>> 66ddd3d: exh_reorder
>>>>>>> 8c2f5ae: load_opt
>>>>>>>
>>>>>>> I also found that the eval/x64 directory could probably be deleted
>>>>>>> (the default configs for every backend is also defined in
>>>>>>> targets/configTheory). The only difference in the x64_compiler_config
>>>>>>> defined there and the one in configTheory is pad_bits for data_to_word
>>>>>>> though. I think it is only there because otherwise all the backends have to
>>>>>>> be built to eval anything using the x64 configuration.
>>>>>>>
>>>>>>> I also reduced the value for exp_cut locally on the nqueens benchmark,
>>>>>>> but I didn't any meaningful difference in compile times (the benchmark is
>>>>>>> probably too small though).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 22, 2016 at 4:42 PM, Ramana Kumar
>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>
>>>>>>>> Preliminary results with inline_size_limit set back down to 3
>>>>>>>> indicate that word_to_stack is still taking longer than before, specifically
>>>>>>>> 3.5 hours as opposed to the 1.5 hours of two months ago. Could the exp_cut
>>>>>>>> parameter also be relevant?
>>>>>>>>
>>>>>>>> On 22 November 2016 at 16:30, Ramana Kumar
>>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>>
>>>>>>>>> Some updated times (now included program sizes) are up
>>>>>>>>> (https://cakeml.org/bootstrap-timing.txt). Things are still surprisingly
>>>>>>>>> slow (word_to_stack, sec_ok) - guess is still that it's because of inline
>>>>>>>>> limit, so the next thing to try is changing the inline limit..
>>>>>>>>>
>>>>>>>>> Is conf_ok required if I change the inline_limit just for the
>>>>>>>>> bootstrap? Isn't it easy to show that conf_ok isn't affected by that config
>>>>>>>>> field?
>>>>>>>>>
>>>>>>>>> On 18 November 2016 at 10:39, Yong Kiam <tanyongkiam at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> It shouldn't be very hard... I just thought it wouldn't be high
>>>>>>>>>> priority since the remaining ones are only in LabAsms.
>>>>>>>>>>
>>>>>>>>>> Changing the config there seems right. You won't get a conf_ok
>>>>>>>>>> proof about it though.
>>>>>>>>>>
>>>>>>>>>> On Thu, Nov 17, 2016 at 5:46 PM, Ramana Kumar
>>>>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I've seen the word_to_stack slowdown twice (although it only shows
>>>>>>>>>>> up in the timings document once - I didn't record the other data).
>>>>>>>>>>>
>>>>>>>>>>> I think removing more asm_ok is always a good idea, but I'm not
>>>>>>>>>>> sure how much priority it should be given - how hard is it to remove more?
>>>>>>>>>>>
>>>>>>>>>>> I guess after the current run finishes I could try changing the
>>>>>>>>>>> inline limit (I would do this by using a different config here
>>>>>>>>>>> https://github.com/CakeML/cakeml/blob/master/compiler/bootstrap/evaluation/to_lab_x64BootstrapScript.sml#L83
>>>>>>>>>>> - does that seem right?)
>>>>>>>>>>>
>>>>>>>>>>> On 18 November 2016 at 09:36, Magnus Myreen
>>>>>>>>>>> <magnus.myreen at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I think the inline_size_limit of 10 is potentially risky. I
>>>>>>>>>>>> suggest
>>>>>>>>>>>> the bootstrap is run with a lower setting, e.g. 2 or 3. -- Magnus
>>>>>>>>>>>>
>>>>>>>>>>>> On 18 November 2016 at 09:26, Yong Kiam <tanyongkiam at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> > Argh, that's a shame. Would moving more of asm_ok up help?
>>>>>>>>>>>> > There are still
>>>>>>>>>>>> > some things in LabAsms that can be removed.
>>>>>>>>>>>> >
>>>>>>>>>>>> > I'm not sure know why word_to_stack -specifically- would slow
>>>>>>>>>>>> > down by so
>>>>>>>>>>>> > much. Could it be a one-off thing or did you hit that slowdown
>>>>>>>>>>>> > multiple
>>>>>>>>>>>> > times already?
>>>>>>>>>>>> >
>>>>>>>>>>>> > Another possibility: changes to the default compiler
>>>>>>>>>>>> > configuration could
>>>>>>>>>>>> > have slowed it down.
>>>>>>>>>>>> >
>>>>>>>>>>>> > For example this:
>>>>>>>>>>>> >
>>>>>>>>>>>> > https://github.com/CakeML/cakeml/commit/98188addc34e5c3a034572c10eca7b3b41aff71c
>>>>>>>>>>>> >
>>>>>>>>>>>> > (e.g. maybe more things got inlined)
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Thu, Nov 17, 2016 at 5:09 PM, Ramana Kumar
>>>>>>>>>>>> > <Ramana.Kumar at cl.cam.ac.uk>
>>>>>>>>>>>> > wrote:
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> I will put the timings from various runs of the bootstrap on
>>>>>>>>>>>> >> this web page
>>>>>>>>>>>> >> so we can more easily refer to them:
>>>>>>>>>>>> >> https://cakeml.org/bootstrap-timing.txt.
>>>>>>>>>>>> >> I've added partial timings for a recent revision (still
>>>>>>>>>>>> >> waiting on the
>>>>>>>>>>>> >> rest).
>>>>>>>>>>>> >> There is currently a problem with word_to_stack, which is now
>>>>>>>>>>>> >> taking 5
>>>>>>>>>>>> >> hours where it used to take 1.
>>>>>>>>>>>> >> And given that it seems to be stuck on "sec_ok" I'm wondering
>>>>>>>>>>>> >> how much the
>>>>>>>>>>>> >> asm_ok Asm check removal has really helped... I guess it was
>>>>>>>>>>>> >> supposed to
>>>>>>>>>>>> >> bring it down to 6h, so we'll see...
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> On 29 September 2016 at 10:47, Ramana Kumar
>>>>>>>>>>>> >> <Ramana.Kumar at cl.cam.ac.uk>
>>>>>>>>>>>> >> wrote:
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> I have tried running the "sec_ok" part of the bootstrap with
>>>>>>>>>>>> >>> the asm_ok
>>>>>>>>>>>> >>> checks for Asm lines turned off. Two things to report:
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> The time is significantly improved:
>>>>>>>>>>>> >>> runtime: 6h02m19s,    gctime: 30m06s,     systime: 3m11s.
>>>>>>>>>>>> >>> (compared to previously: 10h46m09s,    gctime: 1h07m37s,
>>>>>>>>>>>> >>> systime:
>>>>>>>>>>>> >>> 5m21s.)
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> Secondly, this timing information is bogus: the wall clock
>>>>>>>>>>>> >>> time was
>>>>>>>>>>>> >>> definitely under 2 hours (I didn't check precisely). I expect
>>>>>>>>>>>> >>> timing for
>>>>>>>>>>>> >>> threads in parallel execution are added sequentially. So for
>>>>>>>>>>>> >>> the parts of
>>>>>>>>>>>> >>> the bootstrap run in parallel, the timings probably need to
>>>>>>>>>>>> >>> be divided by up
>>>>>>>>>>>> >>> to 8 to get an accurate picture. Perhaps I should annotate
>>>>>>>>>>>> >>> which ones are in
>>>>>>>>>>>> >>> parallel and which not.
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> I think it's worthwhile removing the asm_ok checks. I can
>>>>>>>>>>>> >>> help doing so
>>>>>>>>>>>> >>> if someone else sets up some cheated theorems to fill in.
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> On 8 September 2016 at 14:33, Ramana Kumar
>>>>>>>>>>>> >>> <Ramana.Kumar at cl.cam.ac.uk>
>>>>>>>>>>>> >>> wrote:
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Here are some timings now including the last step
>>>>>>>>>>>> >>>> (x64Bootstrap).
>>>>>>>>>>>> >>>> (The commit number doesn't exactly match the output labels,
>>>>>>>>>>>> >>>> since I
>>>>>>>>>>>> >>>> changed some of that (about to be committed) in the middle.)
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Here we can see that the biggest problem (>10 hours) is the
>>>>>>>>>>>> >>>> checking
>>>>>>>>>>>> >>>> that all the lines in the final ASM code satisfy
>>>>>>>>>>>> >>>> line_ok_light (i.e., the
>>>>>>>>>>>> >>>> "sec_ok" thing below). But there are a few steps that take
>>>>>>>>>>>> >>>> 1-2 hours each,
>>>>>>>>>>>> >>>> and then a few more that take 20-40 mins, so ... definitely
>>>>>>>>>>>> >>>> more than an
>>>>>>>>>>>> >>>> overnight job still...
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> 1a939d3
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> eval to_mod: runtime: 1m49s,    gctime: 3.4s,     systime:
>>>>>>>>>>>> >>>> 0.35667s.
>>>>>>>>>>>> >>>> eval to_con: runtime: 8m56s,    gctime: 5.5s,     systime:
>>>>>>>>>>>> >>>> 0.84000s.
>>>>>>>>>>>> >>>> eval to_dec: runtime: 7.4s,    gctime: 2.4s,     systime:
>>>>>>>>>>>> >>>> 0.15667s.
>>>>>>>>>>>> >>>> eval to_exh: runtime: 3m32s,    gctime: 8.9s,     systime:
>>>>>>>>>>>> >>>> 0.68667s.
>>>>>>>>>>>> >>>> eval to_pat: runtime: 14.1s,    gctime: 1.4s,     systime:
>>>>>>>>>>>> >>>> 0.06667s.
>>>>>>>>>>>> >>>> eval to_clos: runtime: 5.3s,    gctime: 1.1s,     systime:
>>>>>>>>>>>> >>>> 0.05667s.
>>>>>>>>>>>> >>>> eval to_bvl: runtime: 2m36s,    gctime: 12.6s,     systime:
>>>>>>>>>>>> >>>> 0.88000s.
>>>>>>>>>>>> >>>> eval to_bvi:
>>>>>>>>>>>> >>>> ... inline: runtime: 6.1s,    gctime: 1.2s,     systime:
>>>>>>>>>>>> >>>> 0.09333s.
>>>>>>>>>>>> >>>> ... optimise: runtime: 24m53s,    gctime: 1m53s,
>>>>>>>>>>>> >>>> systime: 17.1s.
>>>>>>>>>>>> >>>> ... compile: runtime: 2m27s,    gctime: 11.9s,     systime:
>>>>>>>>>>>> >>>> 1.5s.
>>>>>>>>>>>> >>>> eval to_data: runtime: 8m11s,    gctime: 21.4s,     systime:
>>>>>>>>>>>> >>>> 1.2s.
>>>>>>>>>>>> >>>> eval data_to_word: runtime: 8m28s,    gctime: 30.4s,
>>>>>>>>>>>> >>>> systime: 2.4s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> word_to_word: runtime: 1h00m52s,    gctime: 5m53s,
>>>>>>>>>>>> >>>> systime: 34.1s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> clash: runtime: 5m06s,    gctime: 22.3s,     systime: 2.8s.
>>>>>>>>>>>> >>>> (external oracle) runtime: 2m31s,    gctime: 1.4s,
>>>>>>>>>>>> >>>> systime:
>>>>>>>>>>>> >>>> 0.77667s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> chunk: runtime: 1h08m52s,    gctime: 6m06s,     systime:
>>>>>>>>>>>> >>>> 28.2s.
>>>>>>>>>>>> >>>> (chunk results) runtime: 11m49s,    gctime: 1m13s,
>>>>>>>>>>>> >>>> systime: 7.2s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> eval word_to_stack: runtime: 1h35m38s,    gctime: 3m53s,
>>>>>>>>>>>> >>>> systime:
>>>>>>>>>>>> >>>> 9.4s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> stack_alloc: runtime: 23m27s,    gctime: 45.7s,     systime:
>>>>>>>>>>>> >>>> 6.7s.
>>>>>>>>>>>> >>>> stack_remove: runtime: 19m15s,    gctime: 21.5s,
>>>>>>>>>>>> >>>> systime: 4.7s.
>>>>>>>>>>>> >>>> stack_names: runtime: 15m20s,    gctime: 24.9s,     systime:
>>>>>>>>>>>> >>>> 3.2s.
>>>>>>>>>>>> >>>> stack_to_lab: runtime: 42m12s,    gctime: 9m27s,
>>>>>>>>>>>> >>>> systime: 10.4s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> filter_skip: runtime: 3m56s,    gctime: 24.7s,     systime:
>>>>>>>>>>>> >>>> 3.2s.
>>>>>>>>>>>> >>>> ffi_limit: runtime: 2m33s,    gctime: 1m39s,     systime:
>>>>>>>>>>>> >>>> 3.2s.
>>>>>>>>>>>> >>>> enc_sec: runtime: 25m01s,    gctime: 1m40s,     systime:
>>>>>>>>>>>> >>>> 11.0s.
>>>>>>>>>>>> >>>> sec_length: runtime: 4m11s,    gctime: 48.9s,     systime:
>>>>>>>>>>>> >>>> 3.5s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> compute_labels: runtime: 1h17m12s,    gctime: 13m50s,
>>>>>>>>>>>> >>>> systime:
>>>>>>>>>>>> >>>> 58.7s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> enc_secs_again: runtime: 2h00m15s,    gctime: 24m27s,
>>>>>>>>>>>> >>>> systime:
>>>>>>>>>>>> >>>> 1m46s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> upd_lab_len: runtime: 5m51s,    gctime: 55.9s,     systime:
>>>>>>>>>>>> >>>> 3.4s.
>>>>>>>>>>>> >>>> sec_length2: runtime: 3m32s,    gctime: 8.4s,     systime:
>>>>>>>>>>>> >>>> 1.2s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> compute_labels2: runtime: 1h27m09s,    gctime: 22m29s,
>>>>>>>>>>>> >>>> systime:
>>>>>>>>>>>> >>>> 1m17s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> enc_secs_again2: runtime: 2h37m41s,    gctime: 1h03m15s,
>>>>>>>>>>>> >>>> systime:
>>>>>>>>>>>> >>>> 3m01s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> pad_sectionruntime: 8m47s,    gctime: 35.8s,     systime:
>>>>>>>>>>>> >>>> 3.8s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> sec_ok: runtime: 10h46m09s,    gctime: 1h07m37s,
>>>>>>>>>>>> >>>> systime: 5m21s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> all_secs_ok: runtime: 4.5s,    gctime: 0.07667s,
>>>>>>>>>>>> >>>> systime: 0.00333s.
>>>>>>>>>>>> >>>> prog_to_bytes: runtime: 7m27s,    gctime: 2m23s,
>>>>>>>>>>>> >>>> systime: 6.3s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> flat_bytes: runtime: 45m21s,    gctime: 8m25s,     systime:
>>>>>>>>>>>> >>>> 8.2s.
>>>>>>>>>>>> >>>> expand_defs: runtime: 49m25s,    gctime: 31m53s,
>>>>>>>>>>>> >>>> systime: 1.6s.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> On 7 September 2016 at 08:43, Ramana Kumar
>>>>>>>>>>>> >>>> <Ramana.Kumar at cl.cam.ac.uk>
>>>>>>>>>>>> >>>> wrote:
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> More timing for a later commit. I believe this one includes
>>>>>>>>>>>> >>>>> Magnus's
>>>>>>>>>>>> >>>>> changes to add compile_seqs, which seem not to have had
>>>>>>>>>>>> >>>>> much effect. (I'm
>>>>>>>>>>>> >>>>> hoping I haven't messed up the timing somehow...  it's
>>>>>>>>>>>> >>>>> surprising how stable
>>>>>>>>>>>> >>>>> they are...)
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> commit: 83b3aaf
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> eval to_mod: runtime: 1m50s,    gctime: 5.8s,     systime:
>>>>>>>>>>>> >>>>> 1.0s.
>>>>>>>>>>>> >>>>> eval to_con: runtime: 8m41s,    gctime: 4.2s,     systime:
>>>>>>>>>>>> >>>>> 0.65000s.
>>>>>>>>>>>> >>>>> eval to_dec: runtime: 11.5s,    gctime: 6.4s,     systime:
>>>>>>>>>>>> >>>>> 0.46333s.
>>>>>>>>>>>> >>>>> eval to_exh: runtime: 3m27s,    gctime: 6.8s,     systime:
>>>>>>>>>>>> >>>>> 0.89000s.
>>>>>>>>>>>> >>>>> eval to_pat: runtime: 14.7s,    gctime: 1.7s,     systime:
>>>>>>>>>>>> >>>>> 0.11667s.
>>>>>>>>>>>> >>>>> eval to_clos: runtime: 5.5s,    gctime: 1.1s,     systime:
>>>>>>>>>>>> >>>>> 0.06333s.
>>>>>>>>>>>> >>>>> eval to_bvl: runtime: 2m29s,    gctime: 5.6s,     systime:
>>>>>>>>>>>> >>>>> 0.28333s.
>>>>>>>>>>>> >>>>> eval to_bvi:
>>>>>>>>>>>> >>>>> ... inline: runtime: 6.3s,    gctime: 1.3s,     systime:
>>>>>>>>>>>> >>>>> 0.06333s.
>>>>>>>>>>>> >>>>> ... optimise: runtime: 23m26s,    gctime: 28.8s,
>>>>>>>>>>>> >>>>> systime: 6.3s.
>>>>>>>>>>>> >>>>> ... compile: runtime: 2m18s,    gctime: 4.1s,     systime:
>>>>>>>>>>>> >>>>> 0.25000s.
>>>>>>>>>>>> >>>>> eval to_data: runtime: 8m23s,    gctime: 44.3s,
>>>>>>>>>>>> >>>>> systime: 3.2s.
>>>>>>>>>>>> >>>>> eval data_to_word: runtime: 8m23s,    gctime: 39.3s,
>>>>>>>>>>>> >>>>> systime: 3.1s.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> word_to_word: runtime: 58m18s,    gctime: 4m53s,
>>>>>>>>>>>> >>>>> systime: 30.1s.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> clash: runtime: 5m01s,    gctime: 18.4s,     systime: 2.5s.
>>>>>>>>>>>> >>>>> (external oracle) runtime: 2m31s,    gctime: 1.9s,
>>>>>>>>>>>> >>>>> systime:
>>>>>>>>>>>> >>>>> 0.20667s.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> chunk: runtime: 1h01m18s,    gctime: 1m13s,     systime:
>>>>>>>>>>>> >>>>> 22.9s.
>>>>>>>>>>>> >>>>> (chunk results) runtime: 14m26s,    gctime: 5m57s,
>>>>>>>>>>>> >>>>> systime: 9.1s.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> eval word_to_stack: runtime: 1h29m05s,    gctime: 4m38s,
>>>>>>>>>>>> >>>>> systime:
>>>>>>>>>>>> >>>>> 14.4s.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> stack_alloc: runtime: 23m03s,    gctime: 58.3s,
>>>>>>>>>>>> >>>>> systime: 5.8s.
>>>>>>>>>>>> >>>>> stack_remove: runtime: 20m09s,    gctime: 1m29s,
>>>>>>>>>>>> >>>>> systime: 6.5s.
>>>>>>>>>>>> >>>>> stack_names: runtime: 15m05s,    gctime: 41.6s,
>>>>>>>>>>>> >>>>> systime: 3.9s.
>>>>>>>>>>>> >>>>> stack_to_lab: runtime: 35m11s,    gctime: 2m54s,
>>>>>>>>>>>> >>>>> systime: 12.0s.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> On 6 September 2016 at 10:11, Yong Kiam
>>>>>>>>>>>> >>>>> <tanyongkiam at gmail.com> wrote:
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> Hmm, that is surprising. I expected the main function to
>>>>>>>>>>>> >>>>>> have a good
>>>>>>>>>>>> >>>>>> number of If-s (which my change should have helped with),
>>>>>>>>>>>> >>>>>> but it seems like
>>>>>>>>>>>> >>>>>> that didn't affect the time at all.
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> Here are two possibilities:
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> 1) Caching lookups into the colouring function for the
>>>>>>>>>>>> >>>>>> chunk step. The
>>>>>>>>>>>> >>>>>> function is defined as an option-lookup into an oracle
>>>>>>>>>>>> >>>>>> sptree:
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> https://github.com/CakeML/cakeml/blob/9937677d34446adace9f29d5719131f9a5b4aeac/compiler/backend/reg_alloc/reg_allocScript.sml#L110
>>>>>>>>>>>> >>>>>> I'm not sure how much that will help though, since the
>>>>>>>>>>>> >>>>>> variable names
>>>>>>>>>>>> >>>>>> should be in SSA already, so it is mostly looking up
>>>>>>>>>>>> >>>>>> different things each
>>>>>>>>>>>> >>>>>> time.
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> 2) Make splitting more aggressive, i.e. the main function
>>>>>>>>>>>> >>>>>> should be
>>>>>>>>>>>> >>>>>> divided into more pieces so that its colouring sptree
>>>>>>>>>>>> >>>>>> isn't too big (it gets
>>>>>>>>>>>> >>>>>> larger when there are more variables too).
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> Also, are you getting some kind of effect where every
>>>>>>>>>>>> >>>>>> parallel thread
>>>>>>>>>>>> >>>>>> is just waiting for one to finish (namely the one working
>>>>>>>>>>>> >>>>>> on the main
>>>>>>>>>>>> >>>>>> function)? 2) might help divide the work more evenly.
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> On Mon, Sep 5, 2016 at 7:45 PM, Ramana Kumar
>>>>>>>>>>>> >>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> Some more timings, for two recent commits. Not a big
>>>>>>>>>>>> >>>>>>> difference, it
>>>>>>>>>>>> >>>>>>> seems.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> commit: 69ac2f9
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> eval to_mod: runtime: 1m45s,    gctime: 3.8s,
>>>>>>>>>>>> >>>>>>> systime: 0.36667s.
>>>>>>>>>>>> >>>>>>> eval to_con: runtime: 8m36s,    gctime: 5.5s,
>>>>>>>>>>>> >>>>>>> systime: 0.90667s.
>>>>>>>>>>>> >>>>>>> eval to_dec: runtime: 7.3s,    gctime: 2.3s,     systime:
>>>>>>>>>>>> >>>>>>> 0.17000s.
>>>>>>>>>>>> >>>>>>> eval to_exh: runtime: 3m27s,    gctime: 9.5s,
>>>>>>>>>>>> >>>>>>> systime: 1.1s.
>>>>>>>>>>>> >>>>>>> eval to_pat: runtime: 13.9s,    gctime: 1.5s,
>>>>>>>>>>>> >>>>>>> systime: 0.10333s.
>>>>>>>>>>>> >>>>>>> eval to_clos: runtime: 5.2s,    gctime: 1.1s,
>>>>>>>>>>>> >>>>>>> systime: 0.07000s.
>>>>>>>>>>>> >>>>>>> eval to_bvl: runtime: 2m28s,    gctime: 5.5s,
>>>>>>>>>>>> >>>>>>> systime: 0.22667s.
>>>>>>>>>>>> >>>>>>> eval to_bvi: ... inline: runtime: 6.0s,    gctime: 1.3s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 0.09000s.
>>>>>>>>>>>> >>>>>>> ... optimise:runtime: 23m27s,    gctime: 6.9s,
>>>>>>>>>>>> >>>>>>> systime: 3.2s.
>>>>>>>>>>>> >>>>>>> ... compile: runtime: 2m33s,    gctime: 19.3s,
>>>>>>>>>>>> >>>>>>> systime: 1.2s.
>>>>>>>>>>>> >>>>>>> eval to_data: runtime: 8m29s,    gctime: 51.5s,
>>>>>>>>>>>> >>>>>>> systime: 3.4s.
>>>>>>>>>>>> >>>>>>> eval data_to_word: runtime: 8m41s,    gctime: 50.2s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 3.4s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> word_to_word: runtime: 58m21s,    gctime: 4m28s,
>>>>>>>>>>>> >>>>>>> systime: 28.0s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> clash: runtime: 4m58s,    gctime: 15.5s,     systime:
>>>>>>>>>>>> >>>>>>> 2.7s.
>>>>>>>>>>>> >>>>>>> (external oracle) runtime: 2m32s,    gctime: 1.8s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 0.24000s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> chunk: runtime: 1h13m05s,    gctime: 6m06s,     systime:
>>>>>>>>>>>> >>>>>>> 26.9s.
>>>>>>>>>>>> >>>>>>> (chunk results) runtime: 13m31s,    gctime: 2m47s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 14.4s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> eval word_to_stack: runtime: 1h32m52s,    gctime: 6m12s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 12.7s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> stack_alloc: runtime: 23m03s,    gctime: 42.2s,
>>>>>>>>>>>> >>>>>>> systime: 5.8s.
>>>>>>>>>>>> >>>>>>> stack_remove: runtime: 19m08s,    gctime: 22.7s,
>>>>>>>>>>>> >>>>>>> systime: 4.3s.
>>>>>>>>>>>> >>>>>>> stack_names: runtime: 15m45s,    gctime: 1m05s,
>>>>>>>>>>>> >>>>>>> systime: 5.3s.
>>>>>>>>>>>> >>>>>>> stack_to_lab: runtime: 41m59s,    gctime: 9m22s,
>>>>>>>>>>>> >>>>>>> systime: 8.6s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> commit: 9937677
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> eval to_mod: runtime: 1m51s,    gctime: 5.9s,
>>>>>>>>>>>> >>>>>>> systime: 0.90000s.
>>>>>>>>>>>> >>>>>>> eval to_con: runtime: 8m48s,    gctime: 4.4s,
>>>>>>>>>>>> >>>>>>> systime: 0.71667s.
>>>>>>>>>>>> >>>>>>> eval to_dec: runtime: 7.2s,    gctime: 2.3s,     systime:
>>>>>>>>>>>> >>>>>>> 0.15667s.
>>>>>>>>>>>> >>>>>>> eval to_exh: runtime: 3m30s,    gctime: 9.5s,
>>>>>>>>>>>> >>>>>>> systime: 1.2s.
>>>>>>>>>>>> >>>>>>> eval to_pat: runtime: 14.3s,    gctime: 1.6s,
>>>>>>>>>>>> >>>>>>> systime: 0.09000s.
>>>>>>>>>>>> >>>>>>> eval to_clos: runtime: 5.2s,    gctime: 0.95667s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 0.07333s.
>>>>>>>>>>>> >>>>>>> eval to_bvl: runtime: 2m28s,    gctime: 5.2s,
>>>>>>>>>>>> >>>>>>> systime: 0.32667s.
>>>>>>>>>>>> >>>>>>> eval to_bvi: ... inline: runtime: 5.8s,    gctime:
>>>>>>>>>>>> >>>>>>> 0.88000s,
>>>>>>>>>>>> >>>>>>> systime: 0.05333s.
>>>>>>>>>>>> >>>>>>> ... optimise: runtime: 23m24s,    gctime: 5.8s,
>>>>>>>>>>>> >>>>>>> systime: 3.0s.
>>>>>>>>>>>> >>>>>>> ... compile: runtime: 2m27s,    gctime: 12.5s,
>>>>>>>>>>>> >>>>>>> systime: 0.78333s.
>>>>>>>>>>>> >>>>>>> eval to_data: runtime: 8m18s,    gctime: 34.2s,
>>>>>>>>>>>> >>>>>>> systime: 3.0s.
>>>>>>>>>>>> >>>>>>> eval data_to_word: runtime: 8m24s,    gctime: 37.4s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 2.8s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> word_to_word: runtime: 57m40s,    gctime: 4m56s,
>>>>>>>>>>>> >>>>>>> systime: 30.1s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> clash: runtime: 5m12s,    gctime: 33.6s,     systime:
>>>>>>>>>>>> >>>>>>> 3.9s.
>>>>>>>>>>>> >>>>>>> (external oracle) runtime: 2m31s,    gctime: 1.8s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 0.15667s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> chunk: runtime: 1h07m54s,    gctime: 6m07s,     systime:
>>>>>>>>>>>> >>>>>>> 29.5s.
>>>>>>>>>>>> >>>>>>> (chunk results) runtime: 13m21s,    gctime: 2m54s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 14.0s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> eval word_to_stack: runtime: 1h34m00s,    gctime: 7m32s,
>>>>>>>>>>>> >>>>>>> systime:
>>>>>>>>>>>> >>>>>>> 8.1s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> stack_alloc: runtime: 22m35s,    gctime: 18.7s,
>>>>>>>>>>>> >>>>>>> systime: 3.7s.
>>>>>>>>>>>> >>>>>>> stack_remove: runtime: 19m04s,    gctime: 16.7s,
>>>>>>>>>>>> >>>>>>> systime: 3.9s.
>>>>>>>>>>>> >>>>>>> stack_names: runtime: 15m13s,    gctime: 37.8s,
>>>>>>>>>>>> >>>>>>> systime: 4.0s.
>>>>>>>>>>>> >>>>>>> stack_to_lab: runtime: 36m21s,    gctime: 3m29s,
>>>>>>>>>>>> >>>>>>> systime: 13.9s.
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> On 1 September 2016 at 09:04, Ramana Kumar
>>>>>>>>>>>> >>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> Also, updated timings for the middle bit, as of 4d0d56b
>>>>>>>>>>>> >>>>>>>> (which is
>>>>>>>>>>>> >>>>>>>> also for the front half timings in my previous email).
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> eval data_to_word: runtime: 7m42s,    gctime: 35.7s,
>>>>>>>>>>>> >>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>> 2.7s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> word_to_word: runtime: 56m07s,    gctime: 4m48s,
>>>>>>>>>>>> >>>>>>>> systime: 28.7s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> clash: runtime: 4m48s,    gctime: 16.0s,     systime:
>>>>>>>>>>>> >>>>>>>> 2.2s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> (external oracle) runtime: 2m21s,    gctime: 1.7s,
>>>>>>>>>>>> >>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>> 0.10667s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> chunk: runtime: 1h10m00s,    gctime: 6m06s,     systime:
>>>>>>>>>>>> >>>>>>>> 27.1s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> (use chunk) runtime: 12m26s,    gctime: 2m29s,
>>>>>>>>>>>> >>>>>>>> systime: 11.0s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> eval word_to_stack: runtime: 1h29m28s,    gctime: 7m23s,
>>>>>>>>>>>> >>>>>>>> systime: 8.0s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> stack_alloc: runtime: 21m13s,    gctime: 24.3s,
>>>>>>>>>>>> >>>>>>>> systime: 3.9s.
>>>>>>>>>>>> >>>>>>>> stack_remove: runtime: 18m12s,    gctime: 46.6s,
>>>>>>>>>>>> >>>>>>>> systime: 4.3s.
>>>>>>>>>>>> >>>>>>>> stack_names: runtime: 14m08s,    gctime: 32.1s,
>>>>>>>>>>>> >>>>>>>> systime: 4.1s.
>>>>>>>>>>>> >>>>>>>> stack_to_lab: runtime: 36m41s,    gctime: 1m19s,
>>>>>>>>>>>> >>>>>>>> systime: 10.6s.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> It doesn't look like 91d539b made a huge difference.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>> On 1 September 2016 at 08:42, Ramana Kumar
>>>>>>>>>>>> >>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> I have some timings for the front half of the backend
>>>>>>>>>>>> >>>>>>>>> now:
>>>>>>>>>>>> >>>>>>>>> eval to_mod: runtime: 1m45s,    gctime: 9.6s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.40000s.
>>>>>>>>>>>> >>>>>>>>> eval to_con: runtime: 7m00s,    gctime: 5.4s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.80667s.
>>>>>>>>>>>> >>>>>>>>> eval to_dec: runtime: 7.0s,    gctime: 2.3s,
>>>>>>>>>>>> >>>>>>>>> systime: 0.13333s.
>>>>>>>>>>>> >>>>>>>>> eval to_exh: runtime: 3m18s,    gctime: 4.5s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.67000s.
>>>>>>>>>>>> >>>>>>>>> eval to_pat: runtime: 15.9s,    gctime: 3.7s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.50667s.
>>>>>>>>>>>> >>>>>>>>> eval to_clos: runtime: 5.0s,    gctime: 1.1s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.07000s.
>>>>>>>>>>>> >>>>>>>>> eval to_bvl: runtime: 2m18s,    gctime: 5.2s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.36000s.
>>>>>>>>>>>> >>>>>>>>> eval to_bvi:
>>>>>>>>>>>> >>>>>>>>> ... inline: runtime: 5.8s,    gctime: 1.1s,
>>>>>>>>>>>> >>>>>>>>> systime: 0.07667s.
>>>>>>>>>>>> >>>>>>>>> ... optimise: runtime: 22m52s,    gctime: 30.1s,
>>>>>>>>>>>> >>>>>>>>> systime: 6.1s.
>>>>>>>>>>>> >>>>>>>>> ... compile: runtime: 2m05s,    gctime: 6.9s,
>>>>>>>>>>>> >>>>>>>>> systime:
>>>>>>>>>>>> >>>>>>>>> 0.72000s.
>>>>>>>>>>>> >>>>>>>>> eval to_data: runtime: 5m44s,    gctime: 28.8s,
>>>>>>>>>>>> >>>>>>>>> systime: 2.1s.
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> I'm not sure why there are suspicious timings for
>>>>>>>>>>>> >>>>>>>>> things like
>>>>>>>>>>>> >>>>>>>>> stack_alloc. One possibility is that the "time"
>>>>>>>>>>>> >>>>>>>>> function adds up times
>>>>>>>>>>>> >>>>>>>>> across all parallel threads. In that case, it would
>>>>>>>>>>>> >>>>>>>>> make sense to divide the
>>>>>>>>>>>> >>>>>>>>> time by 8. I would need to investigate whether this is
>>>>>>>>>>>> >>>>>>>>> actually what's going
>>>>>>>>>>>> >>>>>>>>> on...
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> On 30 August 2016 at 19:36, Magnus Myreen
>>>>>>>>>>>> >>>>>>>>> <magnus.myreen at gmail.com>
>>>>>>>>>>>> >>>>>>>>> wrote:
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>> My guess is that evaluation times get slower once we
>>>>>>>>>>>> >>>>>>>>>> pass dataLang
>>>>>>>>>>>> >>>>>>>>>> because the programs become a lot larger once the data
>>>>>>>>>>>> >>>>>>>>>> abstraction
>>>>>>>>>>>> >>>>>>>>>> has
>>>>>>>>>>>> >>>>>>>>>> been removed.
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>> There are however some suspicious looking timings
>>>>>>>>>>>> >>>>>>>>>> here. I note
>>>>>>>>>>>> >>>>>>>>>> that
>>>>>>>>>>>> >>>>>>>>>> stack_alloc does near to nothing, but still takes 31
>>>>>>>>>>>> >>>>>>>>>> minutes to
>>>>>>>>>>>> >>>>>>>>>> run.
>>>>>>>>>>>> >>>>>>>>>> Here is stack_alloc's main compiler function:
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>> https://github.com/CakeML/cakeml/blob/master/compiler/backend/stack_allocScript.sml#L166
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>> Cheers,
>>>>>>>>>>>> >>>>>>>>>> Magnus
>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>> On 30 August 2016 at 03:43, Ramana Kumar
>>>>>>>>>>>> >>>>>>>>>> <Ramana.Kumar at cl.cam.ac.uk> wrote:
>>>>>>>>>>>> >>>>>>>>>> > I have some information now on where slow parts of
>>>>>>>>>>>> >>>>>>>>>> > the bootstrap
>>>>>>>>>>>> >>>>>>>>>> > evaluation
>>>>>>>>>>>> >>>>>>>>>> > are, which we could use to direct effort in case we
>>>>>>>>>>>> >>>>>>>>>> > want to
>>>>>>>>>>>> >>>>>>>>>> > speed things up.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > Currently I have timings for between dataLang and
>>>>>>>>>>>> >>>>>>>>>> > labLang. (I
>>>>>>>>>>>> >>>>>>>>>> > know the
>>>>>>>>>>>> >>>>>>>>>> > phases above dataLang are mostly quick, and the ones
>>>>>>>>>>>> >>>>>>>>>> > after
>>>>>>>>>>>> >>>>>>>>>> > labLang are
>>>>>>>>>>>> >>>>>>>>>> > mostly slow, but I will have to get actual numbers
>>>>>>>>>>>> >>>>>>>>>> > later)
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > eval data_to_word: runtime: 7m48s,    gctime: 21.9s,
>>>>>>>>>>>> >>>>>>>>>> > systime: 2.1s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > word_to_word: runtime: 55m07s,    gctime: 5m31s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 30.9s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > clash: runtime: 5m33s,    gctime: 34.4s,
>>>>>>>>>>>> >>>>>>>>>> > systime: 4.4s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > (external oracle) runtime: 2m30s,    gctime: 12.1s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 1.5s
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > chunk: runtime: 1h04m14s,    gctime: 1m37s,
>>>>>>>>>>>> >>>>>>>>>> > systime: 20.8s.
>>>>>>>>>>>> >>>>>>>>>> > (use results of chunk) runtime: 11m21s,    gctime:
>>>>>>>>>>>> >>>>>>>>>> > 2m00s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 10.1s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > eval word_to_stack: runtime: 1h34m21s,    gctime:
>>>>>>>>>>>> >>>>>>>>>> > 9m42s,
>>>>>>>>>>>> >>>>>>>>>> > systime: 9.4s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > stack_alloc: runtime: 31m55s,    gctime: 30.3s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 5.4s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > stack_remove: runtime: 25m43s,    gctime: 1m32s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 6.9s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > stack_names: runtime: 22m03s,    gctime: 3m01s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 6.6s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > stack_to_lab: runtime: 56m00s,    gctime: 4m38s,
>>>>>>>>>>>> >>>>>>>>>> > systime:
>>>>>>>>>>>> >>>>>>>>>> > 23.8s.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > What exactly these different phases are doing can be
>>>>>>>>>>>> >>>>>>>>>> > found by
>>>>>>>>>>>> >>>>>>>>>> > looking at
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > https://github.com/CakeML/cakeml/blob/master/compiler/bootstrap/evaluation/to_lab_x64BootstrapScript.sml.
>>>>>>>>>>>> >>>>>>>>>> > I think the one confusingly called "chunk" means
>>>>>>>>>>>> >>>>>>>>>> > checking the
>>>>>>>>>>>> >>>>>>>>>> > results of the
>>>>>>>>>>>> >>>>>>>>>> > external reg. alloc oracle.
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>> > _______________________________________________
>>>>>>>>>>>> >>>>>>>>>> > Dev mailing list
>>>>>>>>>>>> >>>>>>>>>> > Dev at cakeml.org
>>>>>>>>>>>> >>>>>>>>>> > https://lists.cakeml.org/listinfo/dev
>>>>>>>>>>>> >>>>>>>>>> >
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>> _______________________________________________
>>>>>>>>>>>> >>>>>>> Dev mailing list
>>>>>>>>>>>> >>>>>>> Dev at cakeml.org
>>>>>>>>>>>> >>>>>>> https://lists.cakeml.org/listinfo/dev
>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> _______________________________________________
>>>>>>>>>>>> >> Developers mailing list
>>>>>>>>>>>> >> Developers at cakeml.org
>>>>>>>>>>>> >> https://lists.cakeml.org/listinfo/developers
>>>>>>>>>>>> >>
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> > _______________________________________________
>>>>>>>>>>>> > Developers mailing list
>>>>>>>>>>>> > Developers at cakeml.org
>>>>>>>>>>>> > https://lists.cakeml.org/listinfo/developers
>>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>



More information about the Developers mailing list