[CakeML] cakeML build fails

Magnus Myreen magnus.myreen at cl.cam.ac.uk
Tue Jan 14 22:20:06 UTC 2014


I guess it is possible to compose: the translator's certificate
theorem, Ramana's compiler correctness theorem and my verified
bytecode-to-machine-code mapping. The result is a theorem that relates
the original function (fact) with the generated machine code w.r.t.
the ISA. However, the theorem would come with a rather complicated
precondition saying that the compiler's invariant must be met, and
that the initial state of the machine code must be set up to have an
ML heap correctly represented in memory+registers and have the
runtimes helper routines (including the GC) hanging around in the
machine state.

Cheers,
Magnus

On 15 January 2014 08:55, Konrad Slind <konrad.slind at gmail.com> wrote:
> Specific application: one example would be a guard (a HOL function)
> generated from a Guardol program. The input and output for the guard
> are supplied by other code or hardware. My specification wouldn't
> mention I/O at all. All I would want is some kind of link between the
> execution
> of the ISA code and the original function.
>
> Konrad.
>
>
>
> On Tue, Jan 14, 2014 at 3:24 PM, Magnus Myreen <magnus.myreen at cl.cam.ac.uk>
> wrote:
>>
>> On 15 January 2014 08:09, Konrad Slind <konrad.slind at gmail.com> wrote:
>> > Thanks Magnus. I will do as you suggest.
>> >
>> > I haven't absorbed your last point yet, so this is half-baked,
>> > but I can imagine using the compiler to generate executables
>> > that get linked in with I/O components coming from (say) a pre-existing
>> > C library, so that the repl is not involved at all. Is that a possbile
>> > scenario for cakeML?
>>
>> Not currently, no. We aren't able to produce stand-alone executables.
>> The invariants haven't been set up for the compiler to be used as a
>> stand-alone thing. Hmmm... Do you have some specific application in
>> mind? I'm guessing you'd like a foreign function interface.
>>
>> Regarding pre-existing C libraries: the current implementation of
>> CakeML uses C's getc and putc for I/O, but these are code pointers
>> that are passed into the verified blob of machine code and the proofs
>> make assumptions about these code pointers (e.g. a call to putc puts
>> the input char onto some output stream and then returns control
>> according to the C calling convention). The C code that wraps the
>> verified code (asm_code.s) is here, in case you want to see it:
>>
>> https://github.com/xrchz/vml/blob/master/x86-64/wrapper/wrapper.c
>>
>> Cheers,
>> Magnus
>>
>> > Cheers,
>> > Konrad.
>> >
>> >
>> >
>> > On Tue, Jan 14, 2014 at 2:51 PM, Magnus Myreen
>> > <magnus.myreen at cl.cam.ac.uk>
>> > wrote:
>> >>
>> >> Hi Konrad,
>> >>
>> >> >   I've downloaded cakeML from git and have been building it. I was
>> >> > warned
>> >> > that I needed 16 GB to do everything, so I expected failure. But I
>> >> > wanted
>> >> > to see how far it got before crashing. Here's the error message:
>> >>
>> >> As far as I know, the full bootstrap has only been successfully run on
>> >> a 64 GB machine, but maybe Ramana had some data suggesting 16 GB would
>> >> is enough.
>> >>
>> >> > 294 declarations still to go
>> >> > 0: (val opn_to_prim2)0 (val div_exc_cn)1 (val cdiv_exc)2 (val
>> >> > eq_exc_cn)3
>> >> > (val ceq_exc)4 (fun pat_to_cpat*)5 (fun exp_to_cexp*)6 (datatype
>> >> > call_context)7 (fun lunion)8 (val lshift)9 (fun free_vars*)10 (val
>> >> > bind_fv)11 (fun label_closures*)12 (val get_label)13 (val pushret)14
>> >> > (fun
>> >> > compile_envref)15 (val compile_varref)16 (val el_check)17 (val
>> >> > bool_to_tag)18 (val unit_tag)19 runtime: 21m34s,    gctime: 21m07s,
>> >> > systime: 2m40s.
>> >> > 1: (val block_tag)0 (fun num_fold)1 (val push_lab)2 (val emit_ceref)3
>> >> > (val
>> >> > emit_ceenv)4 (val closure_tag)5 (val cons_closure)6 (val
>> >> > update_refptr)7
>> >> > (val compile_closures)8 (val prim1_to_bc)9 (val prim2_to_bc)10 (fun
>> >> > compile*)Warning - Unable to increase stack - interrupting thread
>> >> > runtime: 17m06s,    gctime: 16m47s,     systime: 3m42s.
>> >> > Exception- Interrupt raised
>> >> > Holmake: Failed script build for compileReplDecsScript - exited with
>> >> > code 1
>> >> > make: *** [all] Error 1
>> >> >
>> >> >
>> >> > Questions:
>> >> >
>> >> > 1. Is this the expected place of failure? (This is on an 8 GB box
>> >> > running
>> >> > PolyML 5.5).
>> >>
>> >> That looks like the expected choke point.
>> >>
>> >> > 2. I just invoked "make" in the top level. What is the invocation if
>> >> > I
>> >> > wanted to avoid
>> >> >     the memory-expensive bootstrap?
>> >>
>> >> You'll need to replace
>> >>
>> >> vml/bootstrap/compileReplDecsScript.sml
>> >>
>> >> with
>> >>
>> >> vml/bootstrap/compileReplDecsCheatScript.sml
>> >>
>> >> and also delete "Cheat" in the second line in the file, i.e. the
>> >> new_theory command.
>> >>
>> >> > 2a. The build needs lem. This should be mentioned.
>> >>
>> >> Yes, lem is required for silly Holmake reasons.
>> >>
>> >> > 2b. In lem, the hol-lib directory has a spurious file called
>> >> > stringScript.sml that
>> >> >       derails the build. I've talked to Scott about this already.
>> >> >
>> >> > 3. I would like to explore the verification aspects of the system. In
>> >> > particular, suppose
>> >> >     I want to define (executable) functions in HOL and prove
>> >> > properties,
>> >> > then somehow
>> >> >     map the functions to cakeML and get some sort of
>> >> > "correctness-of-compilation"
>> >> >     theorem out that I can use to make a formal connection between
>> >> > the
>> >> > properties, the
>> >> >     function, and the executable code.
>> >> >
>> >> >     For example, I define factorial and prove that it is always
>> >> > positive:
>> >> >
>> >> >       val fact_def = Define `fact n = if n <= 0 then 1 else n * fact
>> >> > (n-1)`;
>> >> >       (* The fact function could be over integers, if need be. *)
>> >> >
>> >> >       val fact_pos = prove(``!n. 0 < fact n``, ....);
>> >> >
>> >> >       ... and now what steps do I take? I have read the relevant
>> >> > papers
>> >> > by
>> >> > Magnus and
>> >> >       others, but now I am seeking more concrete hints.
>> >>
>> >> To turn your fact function into CakeML, you'll need to run it through
>> >> the translator, i.e. write a file like
>> >>
>> >>   vml/translator/ml_translator_demoScript.sml
>> >>
>> >> for your fact function, which can use num. The translator turns HOL
>> >> ``:num`` into CakeML ints and maintains an invariant that the CakeML
>> >> ints are not negative.
>> >>
>> >> To reason about the machine code or bytecode that executes this CakeML
>> >> is a bit more tricky. The top-level theorem makes a statement about
>> >> the whole system in terms of an input stream of characters and an
>> >> output stream of characters. I guess we'd need a verified pretty
>> >> printer to turn CakeML AST into a string that is guaranteed to parse
>> >> back as that CakeML AST. Alternatively, one can try to evaluate (in
>> >> the logic) a the application of the parser to a string representing
>> >> that AST and check that the parser maps that string into the desired
>> >> AST. Then you'll need to somehow talk about your fact function in the
>> >> context of all the other code that gets input during execution of the
>> >> read-eval-print loop. The final theorem would be about the output
>> >> string and terminate/diverge behaviour of the entire system with the
>> >> fact function fed in on the input stream.
>> >>
>> >> I'm rather vague about the last parts because we haven't yet worked
>> >> through all the details. But we aim to do so in conjunction with the
>> >> verified HOL light case study.
>> >>
>> >> Cheers,
>> >> Magnus
>> >>
>> >> > Thanks,
>> >> > Konrad.
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > Users mailing list
>> >> > Users at cakeml.org
>> >> > https://lists.cakeml.org/listinfo/users
>> >> >
>> >
>> >
>
>



More information about the Users mailing list