<div dir="auto">I think the bounds checks are more complicated than that because BoundsCheckByte checks strict < the array length, but for offset+length you want <=. Could there easily be a version that checks less or equal instead?<div dir="auto"><br></div><div dir="auto">Yes I'll try to factor out a helper for the take/drop logic. The difference between chars and bytes makes it more complicated than you might have hoped though. I guess I should map everything into the target type before calling the helper.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 20 Mar 2017 6:11 pm, "Magnus Myreen" <<a href="mailto:magnus.myreen@gmail.com">magnus.myreen@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 20 March 2017 at 07:59, Ramana Kumar <<a href="mailto:Ramana.Kumar@cl.cam.ac.uk">Ramana.Kumar@cl.cam.ac.uk</a>> wrote:<br>

> I've just committed semantics for the primitives for strings/bytearrays,<br>

> which you can see here:<br>

> <a href="https://github.com/CakeML/cakeml/blob/strcat/semantics/semanticPrimitives.lem#L509" rel="noreferrer" target="_blank">https://github.com/CakeML/<wbr>cakeml/blob/strcat/semantics/<wbr>semanticPrimitives.lem#L509</a><br>

><br>

> I'd really like to get these right before pushing them through, since it's<br>

> quite a lot of work.<br>

> I have a few concerns currently:<br>

><br>

> 1. It's not obvious that the bounds checking can be done efficiently. The<br>

> BoundsCheckByte primitive doesn't work neatly with checking whether an<br>

> offset+length pair is correct, so I'd have to do some special casing, e.g.,<br>

> when the length = 0. Should we have a different semantics which allows<br>

> negative or too long lengths and does something reasonable with them? Or<br>

> could there be another efficient operation like BoundsCheckByte for checking<br>

> an offset+length?<br>

<br>

If negative lengths are allowed, then the bounds check for such copies could be:<br>

<br>

  BoundsCheckByte offset andalso BoundsCheckByte (offset+length)<br>

<br>

otherwise:<br>

<br>

  BoundsCheckByte (offset + length) andalso 0 <= offset * length<br>

<br>

> 2. It takes work to add new primitives (update type system, inferencer, all<br>

> backend do_app proofs..), and these 4 are reasonably complicated despite<br>

> being quite similar to each other. Could we get away with a smaller or<br>

> simpler set? E.g., maybe we only really need CopyAw8Aw8 but give it a<br>

> flexible semantics taking either bytearrays or strings? I don't know if that<br>

> can work though.<br>

><br>

> If we add similar primitives for polymorphic arrays and vectors, it would be<br>

> good to factor out as much commonality as possible in the semantics... Sadly<br>

> it looks like a lot of copy-paste-tweaking is inevitable... I'd rather only<br>

> add the polymorphic ones after the byte ones are done, just to avoid too<br>

> much explosion of work.<br>

<br>

The difference between byte and value vectors/arrays is the values<br>

they carry, but the indexing and chopping the input is the same for<br>

all of them. Could you not move that logic into a helper function that<br>

is then used throughout many intermediate languages and also for both<br>

byte and value vectors/arrays?<br>

<br>

Cheers,<br>

Magnus<br>

<br>

> On 20 March 2017 at 17:32, Magnus Myreen <<a href="mailto:magnus.myreen@gmail.com">magnus.myreen@gmail.com</a>> wrote:<br>

>><br>

>> Sounds good to me. One thing to note: the semantics should read all<br>

>> the values before writing them. This is so that we give the obviously<br>

>> correct semantics to copying inside the same array. The implementation<br>

>> will need to check whether it should do left-to-right or right-to-left<br>

>> copying so that the code doesn't accidentally read from what it just<br>

>> wrote.<br>

>><br>

>> Also, as I pointed out in my other email, I think these memcpy<br>

>> primitives should also exist for normal value arrays.<br>

>><br>

>> Cheers,<br>

>> Magnus<br>

>><br>

>> On 20 March 2017 at 06:29, Ramana Kumar <<a href="mailto:Ramana.Kumar@cl.cam.ac.uk">Ramana.Kumar@cl.cam.ac.uk</a>> wrote:<br>

>> > I've been rethinking these primitives after the discussion at the last<br>

>> > hangout, and have come up with a different set altogether. Can you see a<br>

>> > simpler or more elegant approach to the one described below?<br>

>> ><br>

>> > Here is the new approach I am considering:<br>

>> ><br>

>> > 4 copying primitives in the source language going from string/bytearray<br>

>> > to<br>

>> > string/bytearray.<br>

>> > The source comes with an offset and a length to copy.<br>

>> > If the destination is a string, a new string is created. If the<br>

>> > destination<br>

>> > is a bytearray, it must be provided.<br>

>> ><br>

>> > Concatenation (of lists of strings/arrays) and conversions between<br>

>> > (whole)<br>

>> > strings and (whole) bytearrays can be implemented in the basis library<br>

>> > in<br>

>> > terms of these primitives. And the primitives should be efficiently<br>

>> > implementable in terms of a byte-based memcpy primitive further down.<br>

>> > (There<br>

>> > will need to be bounds checking in the source-level semantics (i.e.,<br>

>> > Subscript exception can be raised), and this will sometimes be<br>

>> > unfortunate<br>

>> > (i.e., when obviously in bounds), but I don't think this is too costly.)<br>

>> ><br>

>> > On 14 March 2017 at 16:52, Ramana Kumar <<a href="mailto:Ramana.Kumar@cl.cam.ac.uk">Ramana.Kumar@cl.cam.ac.uk</a>><br>

>> > wrote:<br>

>> >><br>

>> >> Hi all,<br>

>> >><br>

>> >> I've started adding string/bytearray conversion and concatenation<br>

>> >> primitives (issues 244 and 245). Before getting too deep into updating<br>

>> >> the compiler etc., may I request a review of the semantics? Here they<br>

>> >> are:<br>

>> >><br>

>> >><br>

>> >><br>

>> >> <a href="https://github.com/CakeML/cakeml/commit/67dd15bbd03f516be618ba72f1d56a2764209263" rel="noreferrer" target="_blank">https://github.com/CakeML/<wbr>cakeml/commit/<wbr>67dd15bbd03f516be618ba72f1d56a<wbr>2764209263</a><br>

>> >><br>

>> >> I noticed that v_to_char_list might be better as vs_to_char_list, to<br>

>> >> be run after v_to_list (rather than duplicating its<br>

>> >> list-deconstruction functionality). But I leave such refactoring for<br>

>> >> another time.<br>

>> >><br>

>> >> Cheers,<br>

>> >> Ramana<br>

>> ><br>

>> ><br>

>> ><br>

>> > ______________________________<wbr>_________________<br>

>> > Developers mailing list<br>

>> > <a href="mailto:Developers@cakeml.org">Developers@cakeml.org</a><br>

>> > <a href="https://lists.cakeml.org/listinfo/developers" rel="noreferrer" target="_blank">https://lists.cakeml.org/<wbr>listinfo/developers</a><br>

>> ><br>

><br>

><br>

</blockquote></div></div>