[CakeML-dev] strings

Magnus Myreen myreen at chalmers.se
Fri Jan 20 11:21:13 UTC 2017


On 20 January 2017 at 01:14, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk> wrote:
> Hi Magnus et al,
>
> I am struggling with some aspects of the design of the new string
> representation (as packed byte arrays, rather than the current blocks
> with 1 char per word). The plan I am following (on the strings branch)
> is that closLang has a new ByteVector value, and these are compiled to
> ByteArray in BVL.
>
> In particular, I want to ask about equality. As discussed, we must be
> able to distinguish ByteArrays that are compared by pointer from
> ByteArrays that must be compared by contents. The question is: Do I
> need to modify data_to_word's equality stub code to follow every
> RefPtr to check something in the header word indicating whether it's a
> compare-by-contents reference, or should I put that information into
> the pointer itself?

Yes, the code stub must be modified to read the actual header word to
examine the bits. We shouldn't waste pointer bits for such information.

> If it should be visible in the pointer itself (without following it),
> then I think I'll need some help figuring out what the pointer
> representation actually is and where things can be encoded in it. For
> example, could there be some way to use ptr_bits for these
> compare-by-contents ByteArrays that won't get them confused with
> Blocks?
>
> If we are using the pointer, then I think it's RefPtr that should get
> an extra flag, not ByteArray, in BVL.

Above is irrelevant because the answer is to not put this information
into pointers.

> If we are not using the pointer, then I'd love to know if there are
> any hints or comments for deciphering what Equal_code in data_to_word
> is doing. (Surely there must be a better approach than painstakingly
> reading the code/proof..? Wasn't there any pseudocode at some point?)

My approach would be start by adjusting the implemention of the
shallow embedding of the equality function, i.e.

https://github.com/CakeML/cakeml/blob/cbc10641edec7ea3e4920ca89374898021d38c35/compiler/backend/proofs/data_to_wordPropsScript.sml#L6277

and adjusting its verification proof

https://github.com/CakeML/cakeml/blob/cbc10641edec7ea3e4920ca89374898021d38c35/compiler/backend/proofs/data_to_wordPropsScript.sml#L6444

before moving on to adjusting the deeply embedded code in wordLang.

The shallow embedding (first link above) has a few comments and I hope
the names of the functions are understandable, e.g. word_header
returns the address of the header word and the header word itself. You
can see that various bits are checked individually in the header word
and one of the cases says "(* is byte array *)", which suggests
that's roughly where you want to be editing this definition.

Cheers,
Magnus

On 20 January 2017 at 01:14, Ramana Kumar <Ramana.Kumar at cl.cam.ac.uk> wrote:
> Hi Magnus et al,
>
> I am struggling with some aspects of the design of the new string
> representation (as packed byte arrays, rather than the current blocks
> with 1 char per word). The plan I am following (on the strings branch)
> is that closLang has a new ByteVector value, and these are compiled to
> ByteArray in BVL.
>
> In particular, I want to ask about equality. As discussed, we must be
> able to distinguish ByteArrays that are compared by pointer from
> ByteArrays that must be compared by contents. The question is: Do I
> need to modify data_to_word's equality stub code to follow every
> RefPtr to check something in the header word indicating whether it's a
> compare-by-contents reference, or should I put that information into
> the pointer itself?
>
> If it should be visible in the pointer itself (without following it),
> then I think I'll need some help figuring out what the pointer
> representation actually is and where things can be encoded in it. For
> example, could there be some way to use ptr_bits for these
> compare-by-contents ByteArrays that won't get them confused with
> Blocks?
>
> If we are using the pointer, then I think it's RefPtr that should get
> an extra flag, not ByteArray, in BVL.
>
> If we are not using the pointer, then I'd love to know if there are
> any hints or comments for deciphering what Equal_code in data_to_word
> is doing. (Surely there must be a better approach than painstakingly
> reading the code/proof..? Wasn't there any pseudocode at some point?)
>
> Cheers,
> Ramana
>
> _______________________________________________
> Developers mailing list
> Developers at cakeml.org
> https://lists.cakeml.org/listinfo/developers



More information about the Developers mailing list