add Random.fork(rng::Xoshiro) to split rng into a new instance #58193

Open, wants to merge 2 commits into base: master

@rfourquet (Member) commented Apr 22, 2025

When a new task is spawned, the task-local RNG of the current task is "split", or "forked", into a new RNG for the new task. This is done in a fast and sound way, using a secondary RNG "hidden" in the 5th field of the `TaskLocalRNG` struct.
`Xoshiro` was made to mirror `TaskLocalRNG` and also has a 5th field, which is mostly unused. That field is useful, for example, when you save the state of `TaskLocalRNG()` via `copy(TaskLocalRNG())`, which returns a `Xoshiro`.
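
The existing behaviour described above can be observed with a small sketch, assuming Julia 1.7 or later. The helper name `two_children` is made up for this illustration; everything else is the public `Random` API.

```julia
using Random

# Snapshot the task-local state; the copy is an explicit Xoshiro object.
Random.seed!(1234)
saved = copy(TaskLocalRNG())

from_task = [rand(UInt64) for _ in 1:3]         # drawn from TaskLocalRNG()
from_copy = [rand(saved, UInt64) for _ in 1:3]  # replayed from the snapshot

# Each spawned task receives its own stream, forked from the parent's state.
# `two_children` is a hypothetical helper, not part of Random.
function two_children(seed)
    Random.seed!(seed)
    c1 = fetch(Threads.@spawn rand(UInt64))
    c2 = fetch(Threads.@spawn rand(UInt64))
    return (c1, c2)
end

run1 = two_children(42)
run2 = two_children(42)  # same seed, same task structure
```

Here `from_task` and `from_copy` should agree, and `run1` should equal `run2`, since the forked streams depend only on the parent's state at spawn time.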

This PR re-implements in Julia the forking mechanism of `jl_rng_split()` from "src/task.c", so that `Xoshiro` RNGs can be forked independently of task spawning. This is particularly useful for reproducible parallel computations, where tasks can be spawned with an explicit RNG forked from an initial `Xoshiro` object.
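
As a rough sketch of that use case, the snippet below hands each task an explicit child RNG. Since `Random.fork` is only proposed here, the code uses a hypothetical stand-in, `naive_fork`, which simply seeds a child from four draws of the parent. This is not the algorithm of `jl_rng_split()` and carries none of its guarantees; it only shows the shape of the calling code. `parallel_sums` is likewise an invented name.

```julia
using Random

# Hypothetical stand-in for the proposed Random.fork. It is NOT the
# jl_rng_split() mechanism: it just seeds a child from four parent draws.
naive_fork(rng::Xoshiro) =
    Xoshiro(rand(rng, UInt64), rand(rng, UInt64), rand(rng, UInt64), rand(rng, UInt64))

# Hypothetical driver: one explicit RNG per task, all derived from `seed`.
function parallel_sums(seed; ntasks = 4, n = 1000)
    parent = Xoshiro(seed)
    # Fork sequentially in the parent so each task's stream is fixed
    # regardless of how the tasks are later scheduled.
    children = [naive_fork(parent) for _ in 1:ntasks]
    tasks = map(children) do rng
        Threads.@spawn sum(rand(rng, n))
    end
    return fetch.(tasks)
end

r1 = parallel_sums(7)
r2 = parallel_sums(7)  # reproducible: same seed gives the same per-task sums
```

With the PR merged, `naive_fork(parent)` would presumably be replaced by `Random.fork(parent)`.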

There are alternatives, such as "jumping", or seeding new RNG objects from a master seed and a "worker id", as in `Xoshiro([master_seed, worker_id])`, but these techniques require some coordination. For example, it can be unsafe to create a new RNG locally via jumping, because it might collide with another RNG created from the parent task, also via jumping, with the same number of steps.
In contrast, "forking" lets you make local decisions about the shape of the computation without risk of collisions.
And it's simpler: you just write `forked_rng = Random.fork(src_rng)`.
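
The coordination problem with the seed-plus-id alternative can be sketched as follows. The names `master_seed` and `worker_rng` are illustrative, and the seed is passed as a `Vector{UInt64}` on the assumption that this form is accepted by the Julia version in use.

```julia
using Random

master_seed = UInt64(2025)

# Hypothetical helper: derive a worker RNG from the master seed and an id.
worker_rng(id) = Xoshiro(UInt64[master_seed, id])

draws(rng) = [rand(rng, UInt64) for _ in 1:4]

s1 = draws(worker_rng(1))
s2 = draws(worker_rng(2))

# If two parts of the program independently pick the same id, they
# silently get the same stream: ids must be coordinated globally.
s1_again = draws(worker_rng(1))
```

Here `s1_again` reproduces `s1` exactly, which is the kind of collision that forking is meant to rule out without any global bookkeeping.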

Alternative names could be

  • `spawn`, which is used in NumPy, but which I find less descriptive than `fork`; or
  • `split`, perhaps the most common term in the literature, which also appears in the name `jl_rng_split`; one problem is that `Base.split` means something else.

"Fork" seems appropriate as it's used in the name `Random.forkRand` for array-filling with SIMD, and in the long comment above `jl_rng_split`.

@rfourquet added the `randomness` (Random number generation and the Random stdlib) and `feature` (Indicates new feature / enhancement requests) labels Apr 22, 2025