[CakeML-dev] Adding polymorphic compare to CakeML

Magnus Myreen magnus.myreen at gmail.com
Tue Nov 29 22:51:30 UTC 2016


On 29 November 2016 at 22:25, Scott Owens <S.A.Owens at kent.ac.uk> wrote:
>
>> On 2016/11/28, at 23:54, Magnus Myreen <magnus.myreen at gmail.com> wrote:
>>
>> On 29 November 2016 at 10:29, Scott Owens <S.A.Owens at kent.ac.uk> wrote:
>>> So, what would the source semantics do for values that contain refs or closures? I suppose that closures can always return 0, to be compatible with =, but what about references? In OCaml, compare traverses the references, as does =. But our = does a pointer comparison on refs, and we probably want compare to do so as well, but we can’t do < on the pointer since the GC can move them. We could go back to raising exceptions, but then compare can’t use the quick pointer equality check as an optimisation for non-references.
>>
>> compare would not follow references. It would return default values
>> for references and closures that are compatible with equality, as we
>> currently have it. This means that for references that are not the
>> same it would have to return a default: either less-than or
>> greater-than. This is why I didn't promise a clean semantics for these
>> cases!
>
> Ick. So compare wouldn’t be a total order for references. One of the big uses of compare in OCaml is to stick things into balanced binary trees simply and easily. Doing so with this design will silently give wrong results without warning if references are involved.
>
>> An alternative is to have compare have type 'a -> 'a -> bool (instead
>> of int) and have it mean that compare the intuition is that it is <=
>> for most types. However, for values with non-equal references it
>> returns false (i.e. we avoid saying whether it is less-than or
>> greater-than).
>
> I don’t think this helps; it’s still not a total order. The int is better since you only need 1 comparison to determine between <, =, >, which you need to implement a binary search tree.
>
>> However, if we add compare then I'd like it to be type compatible with
>> OCaml so that one can more easily port OCaml code to CakeML (e.g.
>> the sources of HOL light).
>
> Agreed.
>
> Have we checked carefully what types HOL light uses compare on, that there are no refs?
>
> You can actually implement this correctly with the cooperation of the GC, so it might be of to have the confusing semantics at first and fix it later. There are two approaches:
>
> 1) The GC never re-orders a ref cell in memory. You could even put all ref cells in their own mark/sweep/compact collected area, exactly how you do this with a generational collection depends on how you’re doing the write barriers to detect when a ref in the old generation is made to point to the nursery.

Not rearranging the refs seems like a lot of work just to ensure a
nice semantics for compare applied to (unusual inputs) refs.

> 2) Each ref gets an extra field that is an index used for comparison. This has some cost even when not using compare. Also once you overflow the index, you need to compact the index space. The GC seems like the natural place to do this.

Ramana and I discussed this and we agree that it's ugly to have the
extra index in every ref. The index could either be a Number, i.e.
arbitrary size, or fixed width, neither of which is ideal for
different reasons.

> Hybrid strategies are possible, e.g. #2 for the nursery and #1 for the older generation. A nice side effect of doing one of these is that you can also implement hashCode:’a -> int

We could also make compare be a bit different from equal:
 A) compare could return 0 (i.e. equal-to) for all refs and all closures
 B) compare could raise an exception for all refs and all closures

I think (A) is somewhat compatible with our semantics for equality.
However, (B) is nicer for the programmer.

Cheers,
Magnus



More information about the Developers mailing list