Skip to content

Conversation

@gasche
Copy link
Member

@gasche gasche commented Mar 3, 2024

This RFC proposes to add minimal support in the OCaml implementation for uniform arrays, which always use the standard Array_tag representation and never Double_array_tag.

The compiler would offer specialized built-in and C primitives to work with uniform arrays, and a minimal CamlinternalUniform_array module in the standard library exposing these primitives.

This would make it easier to write some unsafe idioms that enforce uniform arrays in more complex ways today, simplifying the code of Domain.DLS, the unboxed implementation of Dynarray, and the Uniform_array module of Jane Street's Base library and making it easier to reason about their correctness.

Full version, rendered.

@lthls
Copy link

lthls commented Mar 3, 2024

I'd suggest to look into CamlinternalOO if you want to see some existing uses of uniform arrays in the standard library.
Given that this code is crucial to the correctness of objects and classes, it's rather safe to hope that optimisers will not do anything funny about it.
(There are a lot of interesting pieces of code in this file. For instance:

type item = DummyA | DummyB | DummyC of int
let _ = [DummyA; DummyB; DummyC 0] (* to avoid warnings *)

let dummy_item = (Obj.magic () : item)

Why not just dummy_item = DummyA ? I guess we'll never know.)

@gasche
Copy link
Member Author

gasche commented Mar 3, 2024

Thanks for the pointer. From looking at it briefly, I cannot tell whether I would recommend trying to use explicit uniform arrays for this purpose. The "methods" array seems to be a bytecode array with a mix of methods and arguments (so indeed uniform arrays would be a good choice here). The obj type itself is advertised as an array internally, but in fact uses a non-0 tag, so it is not obviously a uniform array -- or at least it falls outside the scope I would naturally consider.

Why not just dummy_item = DummyA ? I guess we'll never know.

Git suggests that this is a historic artifact, before those constructors were introduced item was an abstract type, so it was natural to inhabit it with Obj.magic (). When Jacques moved to a concrete definition in 6a940ef65d7b70f94e221f4b6731b4ed7a9c410e, he could have used DummyA at this point but he did not change the definition.

(What I cannot tell is what DummyB is for. DummyC looks like it could have been vaguely used for dummy_met, but in the end the author decided to use an atom to save an allocation or something.)

@lthls
Copy link

lthls commented Mar 4, 2024

Coming back to the proposal:
I believe that you can get a correct implementation by using Obj.new_block for Uniform_array.make, and regular array functions for all the rest. If you want an extra bit of performance (or, more likely, are annoyed about the compiler not noticing that field accesses cannot allocate), you can expose the Pfield_computed and Psetfield_computed Lambda primitives using new %-prefixed external declarations and use that for get and set.
One issue is that initialising an array now involves a loop calling caml_modify on each iteration; if this is an issue for you you will have to go back to C, although you can probably refactor caml_make_vect to expose a version that does not perform the flat float array optimisation.
So, very little new code in the runtime and the compiler.

I don't see the point on insisting on a particular tag: Array_tag doesn't exist, the tag of an array has never been specified (except for flat float arrays), so I think we should allow any tag up to Obj.last_non_constant_constructor_tag (that doesn't include the object tag used in CamlinternalOO but that's fine with me).

@gasche
Copy link
Member Author

gasche commented Mar 4, 2024

I believe that you can get a correct implementation by using Obj.new_block for Uniform_array.make, and regular array functions for all the rest. If you want an extra bit of performance (or, more likely, are annoyed about the compiler not noticing that field accesses cannot allocate), you can expose the Pfield_computed and Psetfield_computed Lambda primitives using new %-prefixed external declarations and use that for get and set.

Currently translprim.ml has:

    "%array_length", Primitive ((Parraylength gen_array_kind), 1);
    "%array_safe_get", Primitive ((Parrayrefs gen_array_kind), 2);
    "%array_safe_set", Primitive ((Parraysets gen_array_kind), 3);
    "%array_unsafe_get", Primitive ((Parrayrefu gen_array_kind), 2);
    "%array_unsafe_set", Primitive ((Parraysetu gen_array_kind), 3);

    "%floatarray_length", Primitive ((Parraylength Pfloatarray), 1);
    "%floatarray_safe_get", Primitive ((Parrayrefs Pfloatarray), 2);
    "%floatarray_safe_set", Primitive ((Parraysets Pfloatarray), 3);
    "%floatarray_unsafe_get", Primitive ((Parrayrefu Pfloatarray), 2);
    "%floatarray_unsafe_set", Primitive ((Parraysetu Pfloatarray), 3);

I would propose to add just 5 extra lines of code:

    "%uniform_array_length", Primitive ((Parraylength Paddrarray), 1);
    "%uniform_array_safe_get", Primitive ((Parrayrefs Paddrarray), 2);
    "%uniform_array_safe_set", Primitive ((Parraysets Paddrarray), 3);
    "%uniform_array_unsafe_get", Primitive ((Parrayrefu Paddrarray), 2);
    "%uniform_array_unsafe_set", Primitive ((Parraysetu Paddrarray), 3);

One issue is that initialising an array now involves a loop calling caml_modify on each iteration; if this is an issue for you you will have to go back to C, although you can probably refactor caml_make_vect to expose a version that does not perform the flat float array optimisation. [...] So, very little new code in the runtime and the compiler.

Yes, this is the idea. My idea would be to specialize primitives to carve out less-smart versions, so the amount of new code should be very small in any case.

I don't see the point on insisting on a particular tag: Array_tag doesn't exist, the tag of an array has never been specified (except for flat float arrays), so I think we should allow any tag up to Obj.last_non_constant_constructor_tag (that doesn't include the object tag used in CamlinternalOO but that's fine with me).

For now I am not really interested in extending the proposal to explicitly cover non-standard tag choices, which sound a bit dubious to me. If the implementation naturally allows them I will look the other way.

@lthls
Copy link

lthls commented Mar 5, 2024

I would propose to add just 5 extra lines of code:

I don't think you need them. On all platforms for which we support the native compiler, Parraylength is compiled the exact same way no matter the kind. The unsafe accessors are exactly Pfield_computed and Psetfield_computed (which we should expose so that CamlinternalOO uses them instead of array accesses). And it's only very slightly less efficient to implement the safe versions in OCaml compared to using the primitives.

But I guess that shows the difference in how we see this: you seem to want a full-fledged data structure with compiler support, I just want a decent way to manipulate dynamically sized blocks full of values.

@OlivierNicole
Copy link

This looks very useful to have to increase legibility and safety in the important use cases you mention. I can also imagine cases when I’d want to enforce absence of allocation in some hot paths.

@gasche
Copy link
Member Author

gasche commented May 13, 2024

Coq/Rocq also includes a small module of uniform arrays in its codebase, as part of the definition of its "persistent arrays": https://github.com/coq/coq/blob/d7d392191a367839b0f5a7772e48b8f24e9f1b3e/kernel/parray.ml . My understanding is that the need for uniformity arises from the compilation scheme of {vm,native}_compute, which introduces arrays of Coq values, which include native floats at a type that can contain non-float values, the "accumulator" values.

@SkySkimmer
Copy link

I'm not sure if we want uniform arrays or immutable uniform arrays (or both).
Also we probably want array literals to work with (i)uarrays instead of having to mess with (i)uarray.init or some such thing.
cc @ppedrot

@OlivierNicole
Copy link

Happily, literals should not be a problem now that we have type-disambiguated array literals!

@OlivierNicole
Copy link

I am still in favor of this. For the record, Jane Street’s Base library also has a Uniform_array module. Given that this RFC has received few reactions but no particular opposition, should we consider it “safe” for whoever has the time to start working on a PR?

I'm not sure if we want uniform arrays or immutable uniform arrays (or both).

We definitely want at least uniform mutable arrays as they would be useful to simplify the implementation of the Dynarray module. I don’t see a big reason to have immutable uniform arrays, but I also don’t see a reason not to have them if a use case shows up.

@goldfirere
Copy link
Contributor

I'm in favor.

@alainfrisch
Copy link
Contributor

Is there no future where the uniform representation, already available by a configure flag, becomes the only supported mode? It removes complexity (in the runtime and type system), makes generic array accesses faster, and avoids the need for this new module.

@gasche
Copy link
Member Author

gasche commented Nov 20, 2025

Such a future exists, but the path that I personally see toward it is to make floatarray as reasonably pleasant to use as array. Then we will be able to defend against Xavier's reasonable criticism that --no-flat-float-array makes beginner numeric code surprisingly slow, and the alternative is inconvenient and/or reserved to experts. We have made progress in this direction -- notably with @nojb's work on array literal overloading -- but we are far from being there.

@goldfirere
Copy link
Contributor

For what it’s worth, I agree with @gasche there. We’re on a similar journey within Jane Street. We could, at any time, just turn off the flat-float-array-optimization in the compiler we use to build our apps. But it would degrade the performance of the many float arrays we have lying around. So we don’t. And, sadly, floatarray isn’t quite the answer either, because we also want support for t arrays where type t = private float. So one possibility we’re considering is something like ’a buoyant_array, where we require that ’a floats (that is, is represented at runtime by a pointer to a float block). How do we ensure this? We could add stuff to our kind system (along the lines of my presentation at the OCaml Workshop just passed (I’m on mobile and getting a link is awkward, sorry)), or maybe a runtime check at array initialization (which could be optimized away if we know that ’a floats at compile time). Another possibility is to use our support for unboxed types to use float# array instead of float array.

But regardless of which approach, we need to build a viable alternative to float array before we can imagine removing the existing optimization, in my opinion.

@ppedrot
Copy link

ppedrot commented Nov 20, 2025

My 2cts: we can survive a while in Rocq with our own Obj.magic-based uniform arrays, the real issue about float arrays is that it pervades the runtime and makes a lot of untyped code dangerous for VM / native / extraction due to semantic concerns.

I'm sure I'm rehashing a problem well-known by the runtime experts, but basically if you define OCaml types as semantic predicates on values, then all types A must satisfy the property that if v ∈ A for some float v, then all w ∈ A are floats. In particular, this implies that Obj.t is not an OCaml type, even if we removed Obj.obj and treated it as a universal type. As a result, adding more quirks to the type system to segregate types depending on whether they may contain floats is elegant from a type-theory point of view, but still terrible from the untyped runtime view we rely on in Rocq.

@stedolan
Copy link
Contributor

And, sadly, floatarray isn’t quite the answer either, because we also want support for t arrays where type t = private float. So one possibility we’re considering is something like ’a buoyant_array, where we require that ’a floats (that is, is represented at runtime by a pointer to a float block). How do we ensure this?

Reading this comment made me realise that the stdlib's Array.Floatarray module is unnecessarily restricted, and could be implemented as follows instead, giving it a more general signature:

module Floatarray = struct
  type 'a t

  module Create (X : sig type t = private float end) = struct
    external create : int -> X.t t = "caml_floatarray_create"
  end
  include Create (struct type t = float end)

  external length : 'a t -> int = "%floatarray_length"
  external get : 'a t -> int -> 'a = "%floatarray_safe_get"
  external set : 'a t -> int -> 'a -> unit = "%floatarray_safe_set"
  external unsafe_get : 'a t -> int -> 'a = "%floatarray_unsafe_get"
  external unsafe_set : 'a t -> int -> 'a -> unit
    = "%floatarray_unsafe_set"
end

This interface allows code to manipulate foo Floatarray.t knowing nothing about foo, and create new foo Floatarray.t if it knows that foo = private float.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants