<div dir="ltr"><div>I'm down to BVI on the strcat branch now, and to go further I will soon need the memcpy primitives/stubs in wordLang, to implement the new CopyByte op.<br><br></div>We may need to coordinate this. On which branch are you planning to implement those stubs in wordLang?<br></div><div class="gmail_extra"><br><div class="gmail_quote">On 20 March 2017 at 17:29, Magnus Myreen <span dir="ltr">&lt;<a href="mailto:magnus.myreen@gmail.com" target="_blank">magnus.myreen@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 17 March 2017 at 12:28, Ramana Kumar &lt;<a href="mailto:Ramana.Kumar@cl.cam.ac.uk">Ramana.Kumar@cl.cam.ac.uk</a>&gt; wrote:<br>

> Hi Magnus,<br>

><br>

> Thanks for the heads up. I should say that by "primitive in wordLang"<br>

> I think my language was confused: I never meant a wordLang primitive,<br>

> I meant a primitive in BVL that is implemented in wordLang.<br>

<br>

</span>Ah, OK.<br>

<span class=""><br>

> Out of the discussion we had today, one point came up for when you<br>

> implement your ConsExtend stubs: it would be best to have a generic<br>

> memcpy (on words) that can be re-used to make an efficient memcpy on<br>

> bytes. In other words, I hope your stub won't fuse the functionality<br>

> required for ConsExtend with the basic copying primitive.<br>

<br>

</span>There should be two memcpy stubs: one for words and one for bytes. The<br>

one on words can assume that all the words are word aligned. The one<br>

on bytes can use the word version if the addresses and lengths it<br>

copies happen to fit various constraints.<br>
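<br>
As a rough illustration of that split (a Python model, not the actual wordLang stubs), memory is a bytearray and the byte copy hands off to the word copy when it can. The 8-byte word size, the function names, and the exact constraints (both addresses and the length are multiples of the word size, regions do not overlap) are assumptions made for this sketch.<br>
<pre>
```python
WORD = 8  # assumed word size in bytes (a 64-bit target)


def copy_words(mem, dst, src, nwords):
    """Copy nwords whole words; both addresses must be word aligned."""
    assert dst % WORD == 0 and src % WORD == 0
    for i in range(nwords):
        off = i * WORD
        # One word-sized read and one word-sized write per iteration.
        mem[dst + off:dst + off + WORD] = mem[src + off:src + off + WORD]


def copy_bytes(mem, dst, src, nbytes):
    """Copy nbytes bytes, delegating to copy_words when the constraints hold.

    Returns which path was taken, purely so the sketch is easy to check.
    """
    if dst % WORD == 0 and src % WORD == 0 and nbytes % WORD == 0:
        copy_words(mem, dst, src, nbytes // WORD)
        return "words"
    # Fallback: plain byte-by-byte copy.
    for i in range(nbytes):
        mem[dst + i] = mem[src + i]
    return "bytes"
```
</pre>
With these assumed constraints, copy_bytes(mem, 32, 0, 16) takes the word path, while copy_bytes(mem, 33, 3, 5) falls back to the byte loop.<br>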

<span class=""><br>

> Also, Scott suggested at one point looking at how GNU libc implements<br>

> memcpy for ideas on an efficient implementation... it's a<br>

> sophisticated piece of assembly code and highly arch dependent; I<br>

> don't think we're at that level of complexity yet. See e.g.<br>

> <a href="https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S;hb=HEAD" rel="noreferrer" target="_blank">https://sourceware.org/git/?p=<wbr>glibc.git;a=blob_plain;f=<wbr>sysdeps/x86_64/multiarch/<wbr>memmove-vec-unaligned-erms.S;<wbr>hb=HEAD</a><br>

<br>

</span>An efficient memcpy is a very important tool and I'm sure a lot of<br>

effort has gone into them. I think we should eventually have carefully<br>

crafted stubs, but we are restricted by our target-neutral asm<br>

language. For word memcpy, I bet the best we can do is unroll the<br>

copying loop a few times but not so much that we cause register<br>

spilling. The byte memcpy is trickier because one wants to use<br>

word-sized memory access even in cases where the addresses aren't<br>

aligned, but our asm semantics requires all word-sized accesses to be<br>

aligned. This is possible to implement efficiently with our<br>

target-neutral asm language, but will be more than just a few hours of<br>

work.<br>
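<br>
One common way to copy from an unaligned source using only aligned word accesses is to read the aligned words that straddle the source and splice neighbouring pairs together. The following is a Python model of that idea, not the CakeML implementation: the 8-byte word size, the helper names, and the simplifications (destination already word aligned, length a whole number of words, non-overlapping regions) are all assumptions of the sketch.<br>
<pre>
```python
WORD = 8  # assumed word size in bytes


def load_word(mem, addr):
    """Word-sized read; the model rejects unaligned addresses."""
    assert addr % WORD == 0, "unaligned word access"
    return bytes(mem[addr:addr + WORD])


def store_word(mem, addr, word):
    """Word-sized write; the model rejects unaligned addresses."""
    assert addr % WORD == 0 and len(word) == WORD
    mem[addr:addr + WORD] = word


def copy_unaligned_src(mem, dst, src, nwords):
    """Copy nwords * WORD bytes from an arbitrary src to a word-aligned dst."""
    shift = src % WORD   # how far src sits past an aligned address
    base = src - shift   # aligned address of the word containing src
    if shift == 0:
        # Source is aligned too: a plain word-by-word copy.
        for i in range(nwords):
            store_word(mem, dst + i * WORD, load_word(mem, base + i * WORD))
        return
    lo = load_word(mem, base)
    for i in range(nwords):
        hi = load_word(mem, base + (i + 1) * WORD)
        # Tail of the lower word followed by the head of the upper word.
        store_word(mem, dst + i * WORD, lo[shift:] + hi[:shift])
        lo = hi  # each source word is loaded only once
```
</pre>
Each output word costs one aligned load and one aligned store; a real stub would do the splice with shifts and ors on registers, and would also need to handle an unaligned destination and leftover bytes at either end.<br>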

<br>

If I am to implement the byte memcpy, then the first version won't be<br>

optimised to do word-sized memory accesses for byte memcpy.<br>

<br>

I see from Ramana's recent email that there was discussion about<br>

exposing these in the source language. I think that's a good thing,<br>

and that they should also be available for normal value arrays.<br>

(Ramana's email was only talking about byte arrays and strings).<br>

<br>

Cheers,<br>

Magnus<br>

<div class="HOEnZb"><div class="h5"><br>

<br>

> On 17 March 2017 at 16:48, Magnus Myreen &lt;<a href="mailto:magnus.myreen@gmail.com">magnus.myreen@gmail.com</a>&gt; wrote:<br>

>> Hi Ramana,<br>

>><br>

>> I'm on holiday at the moment, but want to write a short reply.<br>

>><br>

>> I will be implementing a mem copy stub in data-to-word early next week for<br>

>> Scott's ConsExtend. That mem copy will copy word for word.<br>

>><br>

>> Mem copy should not be a primitive in wordLang. Having it as a primitive<br>

>> doesn't buy you anything. This also applies to byte by byte mem copy.<br>

>><br>

>> For efficiency, you want to implement these copy routines as stubs in<br>

>> wordLang as opposed to stubs in higher levels like BVL, BVI or DataLang.<br>

>><br>

>> Cheers,<br>

>> Magnus<br>

>> On Fri, 17 Mar 2017 at 06:07, Ramana Kumar &lt;<a href="mailto:Ramana.Kumar@cl.cam.ac.uk">Ramana.Kumar@cl.cam.ac.uk</a>&gt;<br>

>> wrote:<br>

>>><br>

>>> Hi developers,<br>

>>><br>

>>> This question is especially for those familiar with wordLang.<br>

>>><br>

>>> Would it be reasonable to implement a memcpy primitive in wordLang? In<br>

>>> particular, I would want to add a primitive to BVL/BVI that given a<br>

>>> byte array, an offset, and another byte array, copies the contents of<br>

>>> the latter into the former starting at the offset.<br>

>>><br>

>>> The question is whether it is possible to do this efficiently in<br>

>>> wordLang even if the offset is not word aligned.<br>

>>><br>

>>> Obviously I can already write a byte-by-byte copying routine in BVL<br>

>>> (or even higher). I'm trying to figure out how to actually be more<br>

>>> efficient than that when implementing concatenation and<br>

>>> string/bytearray conversion primitives.<br>

>>><br>

>>> (Another annoying thing I noticed is that currently the only way to<br>

>>> create a byte array forces you to write some initial dummy replicated<br>

>>> value into it, even if you're going to overwrite them all right after.<br>

>>> But I don't know what a good primitive to use instead would look like<br>

>>> - some super create-and-copy-with-offsets primitive maybe, but that's<br>

>>> pretty complicated.)<br>

>>><br>

>>> Cheers,<br>

>>> Ramana<br>

>>><br>

>>> ______________________________<wbr>_________________<br>

>>> Developers mailing list<br>

>>> <a href="mailto:Developers@cakeml.org">Developers@cakeml.org</a><br>

>>> <a href="https://lists.cakeml.org/listinfo/developers" rel="noreferrer" target="_blank">https://lists.cakeml.org/<wbr>listinfo/developers</a><br>

</div></div></blockquote></div><br></div>