execution: fix split scheduler preservation for late subscribers (P2300) by arpittkhandelwal · Pull Request #7165 · TheHPXProject/hpx

arpittkhandelwal · 2026-04-11T06:23:02Z

Problem Statement

This PR addresses a P2300 correctness violation in the split algorithm. Previously, the split sender failed to preserve the associated scheduler when a receiver connected after the predecessor had already completed (the "late subscriber" scenario). In these cases, the completion signal was fired inline, bypassing the execution context guaranteed by the sender's completion scheduler.

Proposed Changes

Virtualized Completion Hooks: Introduced a virtual void schedule_completion(continuation_type&&) method to the shared_state base class. This allows subclasses to reroute completion signals through the appropriate execution context.
Scheduler-Aware shared_state: Implemented shared_state_scheduler, a new subclass that captures the attached scheduler. Overrode schedule_completion to dispatch stored continuations via hpx::execution::experimental::schedule(sched).
Safe Asynchronous Management: Implemented a self-owning schedule_op_holder to manage the lifetime of the schedule() operation state.
Memory Safety: Adopted the standard HPX allocator pattern, rebinding the shared_state allocator to handle internal task metadata without raw new/delete.
Race Prevention: Added an intrusive_ptr owner guard before calling start() to prevent use-after-free if a scheduler executes synchronously.
CPO & Dispatch Refactoring: Updated split_t overloads to support generic schedulers, enabling both automatic scheduler discovery and explicit injection.
Cleaned up constructor SFINAE in split_sender to handle no_scheduler, run_loop_scheduler, and generic Scheduler types without ambiguity.
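
The hook-and-override idea in the list above can be sketched in plain C++ without HPX. Everything prefixed `toy_` below is a hypothetical stand-in, not the types in split.hpp: `toy_scheduler` is just a manual queue that models "runs later on the scheduler's context", and `run_late_subscriber` is an illustrative helper. Treat it as a minimal sketch under those assumptions.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for a scheduler: work is queued and only runs when
// drain() is called, which models "runs later, on the scheduler's context".
struct toy_scheduler
{
    std::queue<std::function<void()>> tasks;

    void post(std::function<void()> f) { tasks.push(std::move(f)); }

    void drain()
    {
        while (!tasks.empty())
        {
            auto f = std::move(tasks.front());
            tasks.pop();
            f();
        }
    }
};

// Base state: the default hook fires the continuation inline.
struct toy_shared_state
{
    using continuation_type = std::function<void()>;

    virtual ~toy_shared_state() = default;

    virtual void schedule_completion(continuation_type&& c) { c(); }
};

// Scheduler-aware state: the override reroutes the continuation through the
// captured scheduler instead of running it on the caller.
struct toy_shared_state_scheduler : toy_shared_state
{
    explicit toy_shared_state_scheduler(toy_scheduler& s) : sched(s) {}

    void schedule_completion(continuation_type&& c) override
    {
        sched.post(std::move(c));
    }

    toy_scheduler& sched;
};

struct probe
{
    bool ran_inline;         // true if the continuation ran before the hook returned
    bool ran_after_drain;    // true once the scheduler has processed its queue
};

// Simulates a late subscriber: the predecessor is assumed to be done already,
// so the continuation is handed straight to the completion hook.
probe run_late_subscriber(bool with_scheduler)
{
    toy_scheduler sched;
    toy_shared_state plain;
    toy_shared_state_scheduler aware(sched);
    toy_shared_state& state =
        with_scheduler ? static_cast<toy_shared_state&>(aware) : plain;

    bool done = false;
    state.schedule_completion([&done] { done = true; });

    probe p{};
    p.ran_inline = done;
    sched.drain();
    assert(done);    // by now the continuation has run in both variants
    p.ran_after_drain = done;
    return p;
}
```

With the plain state the continuation runs inside the call (`ran_inline` is true); with the scheduler-aware state it only runs once the queue is drained.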

Verification Results
New Test Suite: Added algorithm_split_scheduler.cpp which specifically targets the late-subscriber race condition.
Regression Testing: Verified that the legacy no_scheduler and run_loop paths remain unaffected.
Performance: Used HPX_NO_UNIQUE_ADDRESS and intrusive pointers to keep metadata overhead at an absolute minimum.

codacy-production · 2026-04-11T06:24:38Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


When split(scheduler, sender) is used and the predecessor has already completed (predecessor_done == true) by the time add_continuation() is called by a new subscriber, the downstream completion signal was fired inline on whatever thread called add_continuation(). This violated the P2300 get_completion_scheduler<CPO> contract, which requires that completions on the set_value and set_stopped signals are dispatched by the scheduler passed to split.

The code itself acknowledged this with a TODO comment:

// TODO: Should this preserve the scheduler? It does not
// if we call set_* inline.

Fix:

* Add a virtual schedule_completion(continuation_type&&) method to shared_state with a default implementation that fires inline (preserving existing behaviour for the scheduler-free case).
* Replace the two "fire inline" paths inside add_continuation (the predecessor_done fast-path and the lock-then-done path) with calls to schedule_completion, so all completion dispatch goes through a single overridable hook.
* Add shared_state_scheduler<Sched>, a new subclass of shared_state, that overrides schedule_completion to post the continuation through schedule(sched). The operation state is kept alive via a self-owning intrusive_ptr-based holder (mirroring the pattern in start_detached.hpp), so the async lifetime is correct regardless of how quickly the thread pool processes the work item.
* Add a second constructor overload to split_sender for generic (non-run_loop) schedulers that allocates shared_state_scheduler instead of plain shared_state.
* Add algorithm_split_scheduler unit test that covers:
  - Basic split with no scheduler (regression guard)
  - split with thread_pool_scheduler: late subscriber receives value on the pool, not inline
  - Multiple concurrent late subscribers all receive the value
  - ensure_started (eager submission) is unaffected

No behavioural change for the scheduler-free split or the run_loop split; only the generic-scheduler path gains the new subclass.
Signed-off-by: arpittkhandelwal <arpitkhandelwal810@gmail.com>
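
The two "fire inline" paths named in the commit message can be illustrated with a simplified, HPX-free sketch. The names `predecessor_done`, `add_continuation` and `schedule_completion` come from the description above; `toy_split_state`, `set_predecessor_done`, `hook_calls` and `demo_split_state` are hypothetical, and the real shared_state in split.hpp is more involved than this.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

class toy_split_state
{
public:
    using continuation_type = std::function<void()>;

    virtual ~toy_split_state() = default;

    void add_continuation(continuation_type c)
    {
        // Fast path: the predecessor finished before this subscriber arrived.
        if (predecessor_done.load(std::memory_order_acquire))
        {
            schedule_completion(std::move(c));
            return;
        }

        std::unique_lock<std::mutex> l(mtx);
        // Lock-then-done path: the predecessor finished while we were
        // waiting for the lock.
        if (predecessor_done.load(std::memory_order_relaxed))
        {
            l.unlock();
            schedule_completion(std::move(c));
            return;
        }

        // Early subscriber: store the continuation until completion.
        continuations.push_back(std::move(c));
    }

    // Hypothetical helper standing in for the predecessor's completion.
    void set_predecessor_done()
    {
        std::vector<continuation_type> pending;
        {
            std::lock_guard<std::mutex> l(mtx);
            predecessor_done.store(true, std::memory_order_release);
            pending.swap(continuations);
        }
        for (auto& c : pending)
            c();
    }

    std::atomic<int> hook_calls{0};    // counts late-subscriber dispatches

protected:
    // Single overridable hook; the default fires inline.
    virtual void schedule_completion(continuation_type&& c)
    {
        ++hook_calls;
        c();
    }

private:
    std::mutex mtx;
    std::vector<continuation_type> continuations;
    std::atomic<bool> predecessor_done{false};
};

struct split_counts
{
    int early_runs;
    int late_runs;
    int hook_calls;
};

split_counts demo_split_state()
{
    toy_split_state s;
    int early = 0;
    int late = 0;

    s.add_continuation([&early] { ++early; });    // stored, not run yet
    assert(early == 0);
    s.set_predecessor_done();                     // runs the stored one
    s.add_continuation([&late] { ++late; });      // late: goes through the hook

    return {early, late, s.hook_calls.load()};
}
```

Because both late paths funnel into `schedule_completion`, a subclass only has to override one function to change where late completions run.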

…plit

StellarBot · 2026-04-11T13:08:08Z

Performance test report

HPX Performance

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR	PARALLEL_EXECUTOR	SCHEDULER_EXECUTOR
For Each	(=)	(=)	(=)

Info

Property	Before	After
HPX Datetime	2026-03-09T14:08:29+00:00	2026-04-11T12:55:48+00:00
HPX Commit	`0eeca86`	`9deb045`
Envfile
Compiler	/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8	/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime	2026-03-09T09:15:24.034803-05:00	2026-04-11T08:05:19.438544-05:00
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Clustername	rostam	rostam

Comparison

BENCHMARK	NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch	++

Info

Property	Before	After
HPX Datetime	2026-03-09T14:08:29+00:00	2026-04-11T12:55:48+00:00
HPX Commit	`0eeca86`	`9deb045`
Envfile
Compiler	/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8	/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime	2026-03-09T09:17:15.638328-05:00	2026-04-11T08:07:02.860839-05:00
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Clustername	rostam	rostam

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR	PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR	SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add	(=)	-	--
Stream Benchmark - Scale	(=)	--	--
Stream Benchmark - Triad	(=)	-	--
Stream Benchmark - Copy	=	+++	++

Info

Property	Before	After
HPX Datetime	2026-03-09T18:50:37+00:00	2026-04-11T12:55:48+00:00
HPX Commit	`ba89f5d`	`9deb045`
Envfile
Compiler	/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8	/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime	2026-03-09T17:49:10.837937-05:00	2026-04-11T08:07:23.108751-05:00
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Clustername	rostam	rostam

Explanation of Symbols

Symbol	MEANING
=	No performance change (confidence interval within ±1%)
(=)	Probably no performance change (confidence interval within ±2%)
(+)/(-)	Very small performance improvement/degradation (≤1%)
+/-	Small performance improvement/degradation (≤5%)
++/--	Large performance improvement/degradation (≤10%)
+++/---	Very large performance improvement/degradation (>10%)
?	Probably no change, but quite large uncertainty (confidence interval within ±5%)
??	Unclear result, very large uncertainty (±10%)
???	Something unexpected…

hkaiser · 2026-04-11T14:22:08Z

@arpittkhandelwal What's the difference to #6911?

arpittkhandelwal · 2026-04-11T17:52:23Z

> @arpittkhandelwal What's the difference to #6911?

I think the main difference between this PR and #6911 is all about where the 'source of truth' for the scheduler lives. PR #6911 takes a receiver-centric approach—it essentially looks downstream at the moment of connection to see if the receiver has a preferred scheduler. This is a nice safety net, but it doesn't quite address the core problem: the split(scheduler, sender) call itself is currently ignoring the scheduler we specifically gave it whenever a subscriber arrives late.
In this PR, I’ve gone with a sender-centric approach to make sure we actually follow the algorithm's contract. By virtualizing the completion dispatch through a schedule_completion hook in the shared state, we get a few big wins:

Better Compliance: The split_sender now strictly follows the P2300 contract by completing on the provided scheduler, even if the receiver doesn't have an environment or a specific scheduler of its own.
Proper Discovery: It allows the sender to correctly report its scheduler through get_completion_scheduler.
Efficiency: Instead of using start_detached for every late subscriber (which adds extra allocations), we use a specialized shared_state_scheduler. This fits much better into the existing HPX shared-state patterns and keeps things efficient.
Basically, while #6911 is a good generic fix, this implementation feels like the architectural step needed to make HPX's split truly P2300 compliant.

hkaiser · 2026-04-11T18:38:11Z

@isidorostsa could you have a look, please?

arpittkhandelwal · 2026-04-29T19:28:54Z

@isidorostsa ping

Copilot

Pull request overview

Fixes a P2300 correctness issue in execution::split where late subscribers (connecting after the predecessor has completed) could receive completion inline instead of via the sender’s completion scheduler.

Changes:

Introduces shared_state::schedule_completion and routes late-subscriber completions through it.
Adds shared_state_scheduler<Sched> to dispatch late completions via schedule(sched) for generic schedulers.
Adds a new unit test target algorithm_split_scheduler.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
libs/core/execution/include/hpx/execution/algorithms/split.hpp	Adds scheduler-aware completion routing for late subscribers and new generic-scheduler split support.
libs/core/execution/tests/unit/algorithm_split_scheduler.cpp	Adds regression coverage intended to exercise the late-subscriber scheduler-preservation behavior.
libs/core/execution/tests/unit/CMakeLists.txt	Registers the new unit test.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Copilot · 2026-04-30T01:19:06Z

+                      , op_state(hpx::execution::experimental::connect(
+                            hpx::execution::experimental::schedule(s),
+                            schedule_receiver{
+                                hpx::intrusive_ptr<schedule_op_holder>(this)}))
+                    {


schedule_op_holder constructs schedule_receiver with intrusive_ptr<schedule_op_holder>(this) while ref_count starts at 0. If schedule(s) or connect(...) throws during op_state construction, the temporary intrusive_ptr can decrement the count back to 0 and call intrusive_ptr_release on a partially-constructed object, which is undefined behaviour. Consider establishing an owning ref before creating the receiver (e.g., set ref_count to 1 and pass intrusive_ptr(this, /*add_ref=*/false), then create the external owner with add_ref=false), or restructure so op_state is created after the owner intrusive_ptr exists.
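
The general idea behind this comment, establishing an owning reference before any temporary reference is handed out, can be sketched without HPX. `toy_holder`, `holder_ref`, `release` and `demo_holder_lifetime` are hypothetical names for illustration only; the sketch uses plain new/delete rather than the allocator pattern in the PR and does not reproduce the exact sequence the reviewer proposes.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

struct toy_holder;
void release(toy_holder* p) noexcept;

// Minimal owning handle, a stand-in for an intrusive pointer.
class holder_ref
{
public:
    // add_ref == false adopts a reference that already exists.
    holder_ref(toy_holder* p, bool add_ref) noexcept;

    holder_ref(holder_ref&& o) noexcept
      : p_(std::exchange(o.p_, nullptr))
    {
    }

    holder_ref(holder_ref const&) = delete;
    holder_ref& operator=(holder_ref const&) = delete;

    ~holder_ref()
    {
        if (p_)
            release(p_);
    }

    toy_holder* get() const noexcept { return p_; }

private:
    toy_holder* p_;
};

struct toy_holder
{
    explicit toy_holder(int* destroyed) : destroyed_(destroyed) {}
    ~toy_holder() { ++*destroyed_; }

    // The count starts at 1: the creator's reference exists before any
    // other reference can be taken and dropped.
    std::atomic<int> ref_count{1};
    int* destroyed_;
};

holder_ref::holder_ref(toy_holder* p, bool add_ref) noexcept
  : p_(p)
{
    if (add_ref)
        p_->ref_count.fetch_add(1, std::memory_order_relaxed);
}

void release(toy_holder* p) noexcept
{
    if (p->ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1)
        delete p;
}

// Returns how many times the holder was destroyed (expected: exactly once).
int demo_holder_lifetime()
{
    int destroyed = 0;
    {
        // Adopt the initial reference instead of incrementing from 0.
        holder_ref owner(new toy_holder(&destroyed), /*add_ref=*/false);
        {
            // A short-lived reference, like the one a receiver would hold.
            holder_ref receiver_ref(owner.get(), /*add_ref=*/true);
        }
        // Dropping the temporary reference must not destroy the holder.
        assert(destroyed == 0);
    }
    return destroyed;
}
```

Since the count never starts from zero, a temporary reference that is created and dropped early cannot trigger destruction while the owner is still being set up.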

Copilot · 2026-04-30T01:19:07Z

+                    continuation_type&& continuation) override
+                {
+                    using holder_alloc_type =
+                        typename schedule_op_holder::holder_alloc_type;
+                    using holder_alloc_traits =
+                        std::allocator_traits<holder_alloc_type>;
+                    using holder_unique_ptr =
+                        std::unique_ptr<schedule_op_holder,
+                            util::allocator_deleter<holder_alloc_type>>;
+
+                    holder_alloc_type holder_alloc(this->alloc);
+                    holder_unique_ptr p(
+                        holder_alloc_traits::allocate(holder_alloc, 1),
+                        hpx::util::allocator_deleter<holder_alloc_type>{
+                            holder_alloc});
+                    holder_alloc_traits::construct(holder_alloc, p.get(),
+                        HPX_MOVE(continuation), sched, holder_alloc);
+
+                    hpx::intrusive_ptr<schedule_op_holder> owner(p.release());
+                    hpx::execution::experimental::start(owner->op_state);


shared_state_scheduler::schedule_completion is invoked from split_sender::operation_state::start (which is noexcept). However this implementation performs allocations and calls schedule/connect, any of which may throw; an exception escaping here will trigger std::terminate and can make this path much less robust than the previous inline completion. Consider making schedule_completion noexcept and handling failures explicitly (e.g., catch and either invoke the continuation inline as a fallback, or terminate with a clear message).

Suggested change

-                    continuation_type&& continuation) override
-                {
-                    using holder_alloc_type =
-                        typename schedule_op_holder::holder_alloc_type;
-                    using holder_alloc_traits =
-                        std::allocator_traits<holder_alloc_type>;
-                    using holder_unique_ptr =
-                        std::unique_ptr<schedule_op_holder,
-                            util::allocator_deleter<holder_alloc_type>>;
-
-                    holder_alloc_type holder_alloc(this->alloc);
-                    holder_unique_ptr p(
-                        holder_alloc_traits::allocate(holder_alloc, 1),
-                        hpx::util::allocator_deleter<holder_alloc_type>{
-                            holder_alloc});
-                    holder_alloc_traits::construct(holder_alloc, p.get(),
-                        HPX_MOVE(continuation), sched, holder_alloc);
-
-                    hpx::intrusive_ptr<schedule_op_holder> owner(p.release());
-                    hpx::execution::experimental::start(owner->op_state);
+                    continuation_type&& continuation) noexcept override
+                {
+                    try
+                    {
+                        using holder_alloc_type =
+                            typename schedule_op_holder::holder_alloc_type;
+                        using holder_alloc_traits =
+                            std::allocator_traits<holder_alloc_type>;
+                        using holder_unique_ptr =
+                            std::unique_ptr<schedule_op_holder,
+                                util::allocator_deleter<holder_alloc_type>>;
+
+                        holder_alloc_type holder_alloc(this->alloc);
+                        holder_unique_ptr p(
+                            holder_alloc_traits::allocate(holder_alloc, 1),
+                            hpx::util::allocator_deleter<holder_alloc_type>{
+                                holder_alloc});
+                        holder_alloc_traits::construct(holder_alloc, p.get(),
+                            HPX_MOVE(continuation), sched, holder_alloc);
+
+                        hpx::intrusive_ptr<schedule_op_holder> owner(
+                            p.release());
+                        hpx::execution::experimental::start(owner->op_state);
+                    }
+                    catch (...)
+                    {
+                        try
+                        {
+                            HPX_MOVE(continuation)();
+                        }
+                        catch (...)
+                        {
+                            std::terminate();
+                        }
+                    }

isidorostsa · 2026-04-30T08:04:21Z

Hey @arpittkhandelwal thanks for the PR!
As you may have seen, we are in the process of retiring our internal stdexec implementation (#7123).
Is there a reason to single out split and keep our own implementation active?

arpittkhandelwal requested a review from hkaiser as a code owner April 11, 2026 06:23

arpittkhandelwal closed this Apr 11, 2026

arpittkhandelwal reopened this Apr 11, 2026

arpittkhandelwal force-pushed the fix/split-scheduler-preservation branch from 5d7786a to 4afbe1b Compare April 11, 2026 12:28

execution: fix Clang-20 redeclaration and friend-function errors in s…

97300ee

…plit

hkaiser added category: senders/receivers Implementations of the p0443r14 / p2300 + p1897 proposals type: enhancement type: compatibility issue labels Apr 11, 2026

Fix split test compile errors and clang-format

a8dc458

Copilot AI review requested due to automatic review settings April 29, 2026 19:24

Copilot started reviewing on behalf of arpittkhandelwal April 29, 2026 19:24

Apply clang-format to split.hpp

c0f2bb3

Copilot AI reviewed Apr 29, 2026

Comment thread libs/core/execution/include/hpx/execution/algorithms/split.hpp Outdated

Comment thread libs/core/execution/tests/unit/algorithm_split_scheduler.cpp

Comment thread libs/core/execution/tests/unit/algorithm_split_scheduler.cpp Outdated

arpittkhandelwal added 2 commits April 30, 2026 01:08

Address Copilot feedback on split and tests

5253d68

Fix clang-21 C++20 modules ADL issues with get and emplace

3b7a3c0

Copilot AI review requested due to automatic review settings April 30, 2026 01:11

Copilot started reviewing on behalf of arpittkhandelwal April 30, 2026 01:12

Copilot AI reviewed Apr 30, 2026
