Add execution plan chaining #446
Conversation
This will be helpful to solve the issue in Katello where a composite content view auto publish can be triggered for content views that are still publishing. They are separate execution chains that need to rely on each other.
I'm finally starting to look into the Katello issue more seriously, so I'm going to test this out.
- def find_past_delayed_plans(time)
+ def find_ready_delayed_plans(time)
I had an issue where one composite CV publish was waiting on two child component CV publishes. However, the task would still start after the quicker child finished.
I don't know Dynflow super well, so I employed some AI tool help:
/begin robot
Issue: The find_ready_delayed_plans query in lib/dynflow/persistence_adapters/sequel.rb had a bug when handling execution plans with multiple dependencies.
It would return a delayed plan as "ready" if ANY dependency was stopped, instead of waiting for ALL dependencies to stop.
Root Cause: The original query used LEFT JOINs:
LEFT JOIN dependencies ON delayed.uuid = dependencies.execution_plan_uuid
LEFT JOIN execution_plans ON dependencies.blocked_by_uuid = execution_plans.uuid
WHERE (state IS NULL OR state = 'stopped')
With multiple dependencies (e.g., plan D depends on A and B):
If A is 'running' and B is 'stopped', the LEFT JOIN produces 2 rows
The WHERE clause filters out the row with A ('running')
But keeps the row with B ('stopped')
Result: D is returned as "ready" even though A is still running
Fix: Changed to NOT EXISTS subquery to ensure NO dependencies are in a non-stopped state:
WHERE NOT EXISTS (
SELECT 1 FROM dependencies
LEFT JOIN execution_plans ON dependencies.blocked_by_uuid = execution_plans.uuid
WHERE dependencies.execution_plan_uuid = delayed.execution_plan_uuid
AND execution_plans.state IS NOT NULL
AND execution_plans.state != 'stopped'
)
Result: Chained execution plans now correctly wait for ALL dependencies to complete before running, as documented in the original PR description.
/end robot
I tested this out, and afterwards the publish did indeed wait properly for the slower child to finish.
It's possible I'm using this chaining method incorrectly in my development branch, but let me know what you think of the above.
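The ANY-vs-ALL distinction described above can be sketched with a plain-Ruby model of the two queries' semantics (toy data and names, not Dynflow's actual API): the buggy LEFT JOIN marks a plan ready as soon as any dependency row passes the filter, while the NOT EXISTS fix requires that no dependency be in a non-stopped state.

```ruby
# Toy model of delayed-plan readiness. plan_states maps a blocker's UUID
# to its execution plan state (nil would mean no plan record exists).
plan_states = { 'A' => 'running', 'B' => 'stopped' }
deps = { 'D' => %w[A B] } # plan D is blocked by A and B

# Buggy LEFT JOIN semantics: one surviving row is enough, so D counts as
# "ready" if ANY dependency is missing or stopped.
buggy_ready = deps.select do |_plan, blockers|
  blockers.any? { |b| plan_states[b].nil? || plan_states[b] == 'stopped' }
end.keys

# NOT EXISTS semantics: D is ready only when NO dependency is in a
# non-stopped state.
fixed_ready = deps.select do |_plan, blockers|
  blockers.none? { |b| !plan_states[b].nil? && plan_states[b] != 'stopped' }
end.keys

buggy_ready # => ["D"]  (wrong: A is still running)
fixed_ready # => []     (correct: D keeps waiting)
```

Once A also reaches 'stopped', both formulations agree that D is ready; they only diverge while some, but not all, blockers have stopped.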
So:
diff --git a/lib/dynflow/persistence_adapters/sequel.rb b/lib/dynflow/persistence_adapters/sequel.rb
index b36298b..502aadf 100644
--- a/lib/dynflow/persistence_adapters/sequel.rb
+++ b/lib/dynflow/persistence_adapters/sequel.rb
@@ -146,14 +146,22 @@ module Dynflow
def find_ready_delayed_plans(time)
table_name = :delayed
+ # Find delayed plans where ALL dependencies (if any) are either non-existent or stopped
+ # We use NOT EXISTS to ensure no dependency is in a non-stopped state
table(table_name)
- .left_join(TABLES[:execution_plan_dependency], execution_plan_uuid: :execution_plan_uuid)
- .left_join(TABLES[:execution_plan], uuid: :blocked_by_uuid)
.where(::Sequel.lit('start_at IS NULL OR (start_at <= ? OR (start_before IS NOT NULL AND start_before <= ?))', time, time))
- .where(::Sequel[{ state: nil }] | ::Sequel[{ state: 'stopped' }])
.where(:frozen => false)
+ .where(::Sequel.lit(
+ "NOT EXISTS (
+ SELECT 1
+ FROM #{TABLES[:execution_plan_dependency]} dep
+ LEFT JOIN #{TABLES[:execution_plan]} ep ON dep.blocked_by_uuid = ep.uuid
+ WHERE dep.execution_plan_uuid = #{TABLES[table_name]}.execution_plan_uuid
+ AND ep.state IS NOT NULL
+ AND ep.state != 'stopped'
+ )"
+ ))
.order_by(:start_at)
- .select_all(TABLES[table_name])
.all
.map { |plan| load_data(plan, table_name) }
end
Oh yeah, this was put together rather quickly; I'll have to take a look at this again.
And the suggestion looks reasonable, although I'll try to reduce raw SQL as much as possible.
Oh yeah, this was put together rather quickly; I'll have to take a look at this again.
It seems to work well for being a prototype!
I'm planning on submitting a Foreman Tasks PR for integration there. In Katello I'm currently including some ForemanTasks add-ons for the proof of concept.
Is there a good way to get the input data from the scheduled chained task? So far it looks like the inputs aren't available yet so they need to be grabbed from
How would you define input data? What you get from …? What would be the use case?
Is there a good way to get the input data from the scheduled chained task? So far it looks like the inputs aren't available yet so they need to be grabbed from
Okay, bear with me here. When publishing component 2+ CVs that are part of a composite with auto publish enabled, 2+ composite CV publishes will be triggered. This is bad because the latter publish will error out on the Lock being taken. Plus, since the parent composite publish will wait on its publishing component children, only 1 composite CV publish is needed. So, my implementation in Katello needs to look up the chained & scheduled composite CV publish task in order to skip auto publishing if there is already a task waiting. And thus I need to access the arguments that were passed to plan (which I was calling the input, but I suppose that is different from the actual task input). From the arguments I can find the content view ID to ensure that it is the composite waiting to be published.
I am rethinking this. If a child publishes while the parent is publishing, we would want a new CCV version to be published. That would mean chaining the new composite content view publish on the scheduled/running one. However, when I tried this, there were Lock errors. Could it work such that the lock checking is deferred to when the scheduled task actually runs? Katello has a workaround for this via the Event queue, and it polls on lock failures. When it was implemented 6 years ago though, chaining in Dynflow was wished for :) Katello/katello#8188 For now I'm going to just poll on the locking error for this one particular case, it is a bit of a corner case I think.
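The poll-on-lock-failure workaround mentioned above can be sketched in plain Ruby (hypothetical `LockError` and method names for illustration, not Katello's actual code): retry the trigger a bounded number of times while the lock is taken, and re-raise once attempts are exhausted.

```ruby
class LockError < StandardError; end

# Retry the block up to max_attempts times when the lock is taken,
# sleeping between attempts; re-raise the error once attempts run out.
def trigger_with_lock_polling(max_attempts: 5, wait: 1)
  attempts = 0
  begin
    yield
  rescue LockError
    attempts += 1
    raise if attempts >= max_attempts
    sleep wait
    retry
  end
end

calls = 0
result = trigger_with_lock_polling(wait: 0) do
  calls += 1
  raise LockError, 'lock taken' if calls < 3 # lock free on the 3rd attempt
  :published
end
result # => :published
```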
Here is the related Katello PR: Katello/katello#11540
From a meeting about this and Katello/katello#11540, the following was determined to be added to this PR:
@adamruzicka, feel free to correct me if the expected outcome was different.
@ianballou I borrowed bits and pieces from what you suggested here, could you please take it for a spin?
@adamruzicka I've tested this a bunch now with Katello/katello#11540. It seems to be working really well. The scheduled task waits on its children and cancels itself if one of them fails. Do you need me to perform a code review as well? Or can we have someone with an outside opinion come do that?
ofedoren left a comment
Few cents, can be ignored:
| ::Dynflow::Persistence.prepend ::Dynflow::Debug::Persistence |
I might be missing something, but shouldn't it use ::Dynflow::Debug::Telemetry::Persistence instead? And if so, I think the whole file is not being loaded then since there was no error to imply that...
Probably? But let's address that elsewhere, tracked as #463
foreign_key :execution_plan_uuid, :dynflow_execution_plans, on_delete: :cascade, **column_properties
foreign_key :blocked_by_uuid, :dynflow_execution_plans, on_delete: :cascade, **column_properties
index :blocked_by_uuid
Maybe it's also worth indexing :execution_plan_uuid?
Added
module Debug
module Telemetry
module Persistence
methods = [
Why does this list contain quite a few duplicates?
Same as #446 (comment)
That wasn't there before, I just added that.
I won't say no to that offer
Strange, I wonder what canceled them then. Anyway, I'll give this a re-test.
web/views/show.erb (Outdated)
<%= h(@plan.ended_at) %>
</p>
<% if @plan.state == :scheduled && @plan.delay_record %>
Don't know if it's presentable on the UI, but the chaining being displayed only while the chained task is scheduled and waiting seems like losing information after the job runs on the Dynflow UI. Might be good to persist the UI chain info for debugging reasons?
Do you mean like, for example, change from "Waiting for execution plans:" to "Waited for execution plans:" once the task actually runs? And start showing the old chained plans, of course.
Changed, in the end I went with static "depends on"
if plan.start_before.nil?
blocker_ids = world.persistence.find_execution_plan_dependencies(execution_plan_id)
statuses = world.persistence.find_execution_plan_statuses({ filters: { uuid: blocker_ids } })
failed = statuses.select { |_uuid, status| status[:state] == 'stopped' && status[:result] == 'error' }
To test this I tried the following:
- Create a chained composite CV task by publishing two child content views
- Force cancel one of the dependencies, which left the task in a funny state "stopped - pending"
- Force canceling didn't seem to trigger the chained task to run, so I did the following on the Force cancelled task:
ForemanTasks::Task.where(id: '63fcef24-bc37-4445-9221-44382f216442').update(result: 'error')
After that, I noticed the chained task actually started running - I thought it would halt itself with an error.
Is my test here flawed somehow?
Actually - @sjha4 in your testing, this might be good to try to reproduce. Maybe I just had a timing issue.
This is weird. Updating the ForemanTasks::Task object should have no impact on anything as that's completely external to dynflow.
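For reference, the failure check quoted in this thread boils down to selecting only blockers that stopped with an error; a plain-Ruby sketch of that filtering (data shapes assumed from the snippet above) shows why a force-cancelled task that never gets result 'error' would not cancel the chained plan:

```ruby
# Statuses keyed by blocker UUID, as in the snippet above:
# each entry carries the plan's state and result.
statuses = {
  'uuid-1' => { state: 'stopped', result: 'error' },
  'uuid-2' => { state: 'stopped', result: 'success' },
  'uuid-3' => { state: 'running', result: 'pending' },
}

# Only blockers that actually stopped with an error should cancel the
# chained plan; anything still running, or stopped without an error,
# is not selected here.
failed = statuses.select do |_uuid, status|
  status[:state] == 'stopped' && status[:result] == 'error'
end

failed.keys # => ["uuid-1"]
```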
lib/dynflow/delayed_plan.rb (Outdated)
def failed_dependencies(uuids)
bullets = uuids.map { |u| "- #{u}" }.join("\n")
msg = "Execution plan could not be started because some of its preqrequisite execution plans failed:\n#{bullets}"
Typo in prerequisite
Fixed
ianballou left a comment
Just some less important comments, generally this is looking fine to me and has been working well in testing.
# Subquery to find delayed plans that have at least one non-stopped dependency
plans_with_unfinished_deps = table(:execution_plan_dependency)
.join(TABLES[:execution_plan], uuid: :blocked_by_uuid)
.where(::Sequel.~(state: 'stopped'))
Just to check - when you force unlock a task, does it go to the 'stopped' state? If it doesn't, we might need a workflow for unlinking the scheduled task from the one that was force unlocked.
Actually, my comment above says it goes to the stopped state. In which case, since it doesn't go to stopped - error, I believe the parent chained task should start running. I'm not sure how feasible it would be to cause force unlock to unschedule chained parents.
I'd be okay with force unlock continuing to run the parent tasks since it's pretty much a debug action.
Actually, my comment above says it goes to the stopped state.
It does
I'm not sure how feasible it would be to cause force unlock to unschedule chained parents.
It would probably be on the more difficult end of the spectrum, so I'd prefer to not go down that path.
_(plan2.errors.first.message).must_match(/#{plan1.id}/)
end
it 'cancels the chained plan if at least one prerequisite fails' do
Is it worth adding a test as well for force unlocking a prerequisite task?
Added
<% dependencies = @plan.world.persistence.find_execution_plan_dependencies(@plan.id) %>
<% if dependencies.any? %>
<p>
<b>Depends on execution plans:</b>
@sjha4 it looks like what you asked for, a permanent dependency history, is here now :)
Awesome! It was hard to test the dependencies on the UI without this. :)
The only thing I haven't been able to test yet is multiple dependencies showing up. I'd like to get that going first, and then we should be good to merge here. Testing has been solid so far.
save :delayed, { execution_plan_uuid: execution_plan_id }, value, with_data: false
end
def chain_execution_plan(first, second)
This patch:
diff --git a/lib/dynflow/persistence_adapters/sequel.rb b/lib/dynflow/persistence_adapters/sequel.rb
index c673090..0de3ea7 100644
--- a/lib/dynflow/persistence_adapters/sequel.rb
+++ b/lib/dynflow/persistence_adapters/sequel.rb
@@ -196,7 +196,11 @@ module Dynflow
end
def chain_execution_plan(first, second)
- save :execution_plan_dependency, { execution_plan_uuid: second }, { execution_plan_uuid: second, blocked_by_uuid: first }, with_data: false
+ # Insert dependency directly without checking for existing records.
+ # The table is designed to allow multiple dependencies per execution plan.
+ # Using save() causes upsert behavior that overwrites existing dependencies.
+ record = { execution_plan_uuid: second, blocked_by_uuid: first }
+ with_retry { table(:execution_plan_dependency).insert(record) }
end
def load_step(execution_plan_id, step_id)
Caused multiple dependencies to start showing up for me. I'm unsure if it's safe to do the inserts here like this, but it worked around the upserting causing trouble.
I took a different approach, but it should be fixed.
Scheduled[execution_plan.id]
end
def chain(plan_uuids, action_class, *args)
I think I'm seeing the chaining only keeping one of the chained tasks instead of them all.
I tested publishing 3 content views. The first takes a really long time. After publishing them in order of speed, I sometimes see that only one of the faster content views was made as a dependency of the composite content view publish.
I thought it could be a Katello PR issue, but after adding debug logging, I found I was passing in two children, yet the slower child was not waited on (and the new Dynflow chaining UI showed this).
Claude dug around in the code a bit, and looked at:
def save(what, condition, value, with_data: true, update_conditions: {})
table = table(what)
existing_record = with_retry { table.first condition } unless condition.empty?
if value
record = prepare_record(what, value, (existing_record || condition), with_data)
if existing_record
record = prune_unchanged(what, existing_record, record)
return value if record.empty?
condition = update_conditions.merge(condition)
return with_retry { table.where(condition).update(record) } <--------
else
with_retry { table.insert record }
end
else
existing_record and with_retry { table.where(condition).delete }
end
value
end
It's suggesting that the upsert logic in here is causing the other chained dependencies to be overwritten. I tried getting around the upsert logic, and only then did I see multiple dependencies in the Dynflow UI for the composite task. See my other comment for the patch.
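The overwriting behavior can be reproduced with a plain-Ruby stand-in for the table (a sketch of the semantics, not Sequel itself): because the lookup condition only includes execution_plan_uuid, a second chain call matches the first row and updates it instead of inserting a new one.

```ruby
# In-memory stand-in for the dependency table.
table = []

# Mimics save()'s upsert: find a row matching condition, update it if
# present, otherwise insert value.
def upsert(table, condition, value)
  existing = table.find { |row| condition.all? { |k, v| row[k] == v } }
  existing ? existing.merge!(value) : table << value.dup
end

# Chaining D onto A, then onto B, with the condition save() used:
upsert(table, { execution_plan_uuid: 'D' },
       { execution_plan_uuid: 'D', blocked_by_uuid: 'A' })
upsert(table, { execution_plan_uuid: 'D' },
       { execution_plan_uuid: 'D', blocked_by_uuid: 'B' })
table.map { |r| r[:blocked_by_uuid] } # => ["B"] — the A dependency was overwritten

# A plain insert keeps both dependencies:
table2 = []
table2 << { execution_plan_uuid: 'D', blocked_by_uuid: 'A' }
table2 << { execution_plan_uuid: 'D', blocked_by_uuid: 'B' }
table2.map { |r| r[:blocked_by_uuid] } # => ["A", "B"]
```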
ianballou left a comment
Nice, this is looking good now! I verified that multiple dependencies can exist at the same time.
This commit enables execution plans to be chained. Assuming there is an execution plan EP1, another execution plan EP2 can be chained onto EP1. When chained, EP2 will stay in scheduled state until EP1 goes to stopped state. An execution plan can be chained onto multiple prerequisite execution plans, in which case it will be run once all the prerequisite execution plans are stopped and have not failed. If a prerequisite execution plan ends with stopped-error, the chained execution plan(s) will fail. If a prerequisite execution plan is halted, the chained execution plan(s) will be run. It builds on mechanisms which were already present. When an execution plan is chained, it behaves in the same way as if it was scheduled for future execution. A record is created in dynflow_delayed_table and once the conditions for it to execute are right, it is dispatched by the delayed executor. Because of this, there might be a small delay between when the prerequisites finish and the chained plan is started.
Thank you @ianballou & @sjha4!


This commit enables execution plans to be chained. Assuming there is an execution plan EP1, another execution plan EP2 can be chained onto EP1. When chained, EP2 will stay in scheduled state until EP1 goes to stopped state. An execution plan can be chained onto multiple prerequisite execution plans, in which case it will be run once all the prerequisite execution plans are stopped.
It builds on mechanisms which were already present. When an execution plan is chained, it behaves in the same way as if it was scheduled for future execution. A record is created in dynflow_delayed_table and once the conditions for it to execute are right, it is dispatched by the delayed executor. Because of this, there might be a small delay between when the prerequisites finish and the chained plan is started.
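The lifecycle described above can be modeled in a few lines of plain Ruby (a toy illustration of the stated rules, not Dynflow code): the chained plan stays scheduled until every prerequisite is stopped, fails if any prerequisite stopped with an error, and runs otherwise.

```ruby
# Decide what should happen to a chained plan given its prerequisites'
# states and results, following the rules stated in the description.
def chained_plan_outcome(prereqs)
  # Still waiting while any prerequisite has not reached 'stopped'.
  return :scheduled unless prereqs.all? { |p| p[:state] == 'stopped' }
  # All stopped: fail if any ended with an error, otherwise run.
  prereqs.any? { |p| p[:result] == 'error' } ? :failed : :run
end

chained_plan_outcome([{ state: 'running' },
                      { state: 'stopped', result: 'success' }]) # => :scheduled
chained_plan_outcome([{ state: 'stopped', result: 'success' },
                      { state: 'stopped', result: 'success' }]) # => :run
chained_plan_outcome([{ state: 'stopped', result: 'error' },
                      { state: 'stopped', result: 'success' }]) # => :failed
```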
TODOs: