feat(cron): goal-driven auto-disable usercron jobs (implements #816) #818
OpenAB PR Screening
This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Screening report

## Intent

PR #818 tries to let [...]

The operator-visible problem: today, a goal-oriented scheduled job can keep running even after it has succeeded, causing repeated agent runs, noise, wasted compute, and possibly repeated Discord/thread activity. This PR adds a completion check so a job can prove success and then persist [...]

## Feat

Feature. Behaviorally, [...] It also persists [...]

## Who It Serves

Primary beneficiary: agent runtime operators and deployers running goal-driven scheduled jobs.

Secondary beneficiaries: maintainers and reviewers, because completed recurring work becomes explicit state instead of ambient scheduler behavior.

## Rewritten Prompt

Implement goal-completion auto-disable for [...] Add optional [...]

    disable_on_success = "command"
    disable_on_success_match = "required output marker"
    disable_on_success_timeout_secs = 120
    disable_on_success_working_dir = "/path"

Before running the normal cron prompt, execute the completion command when configured. Treat the goal as complete only if the command exits [...] Persist scheduler writebacks atomically where possible, avoid affecting non-usercron config, and add focused tests for success, missing marker, nonzero exit, timeout, missing [...]

## Merge Pitch

This is worth advancing because it closes a real scheduler lifecycle gap: goal-driven jobs need a first-class way to stop themselves after success.

Risk profile is moderate. The behavior touches scheduler execution and config persistence, so reviewer concern will likely center on writeback safety, race conditions, TOML preservation, and whether shell-command success checks are too footgun-prone. The PR is directionally useful, but the large [...]

## Best-Practice Comparison

OpenClaw principles that apply:
Hermes Agent principles that apply:
## Implementation Options

- Conservative option: merge only the config fields, completion check, and skip behavior, but do not write back [...]
- Balanced option: keep the PR's core behavior, but harden writeback. Require stable [...]
- Ambitious option: introduce a durable scheduler state layer separate from the user-authored TOML. Store job completion, thread routing, run history, retry state, and disable reasons in a scheduler-owned state file or database. This aligns better with OpenClaw/Hermes long-term patterns but is larger than this PR.

## Comparison Table
## Recommendation

Advance the balanced path. The feature solves a concrete lifecycle problem and matches the direction of gateway-owned scheduling, but merge discussion should focus on making persistence boring: stable [...]

If the current PR does not already guarantee safe writeback semantics, split that hardening into a required follow-up before merge rather than treating it as optional polish.
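
As a rough illustration of the "persist scheduler writebacks atomically where possible" requirement, one common approach is to write the new file contents to a sibling temporary file and then rename it over the target. The sketch below uses only the Rust standard library; `write_atomically` is a hypothetical helper name and is not taken from the PR, and the sketch does not attempt TOML preservation or protection against concurrent user edits.

```rust
use std::ffi::OsString;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Replaces `path` with `contents` via a sibling temp file plus rename.
/// `write_atomically` is a hypothetical helper name, not taken from the PR.
fn write_atomically(path: &Path, contents: &str) -> io::Result<()> {
    // Build "<path>.tmp" next to the target so the rename stays on one filesystem.
    let mut tmp_name: OsString = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    {
        let mut tmp = File::create(&tmp_path)?;
        tmp.write_all(contents.as_bytes())?;
        // Flush file data to disk before the rename makes it visible.
        tmp.sync_all()?;
    }

    // On POSIX systems rename() replaces the destination atomically,
    // so readers see either the old file or the new one, never a partial write.
    fs::rename(&tmp_path, path)
}
```

A caller would pass the already-serialized TOML text, for example `write_atomically(Path::new("cronjob.toml"), &new_text)`. A reader that reloads on mtime change then never observes a half-written file, which is the property the recommendation is asking for.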
chaodu-agent left a comment
Blocking note before merge: the timeout path around `disable_on_success` does not actually guarantee that the spawned command is terminated. A process spawned through Tokio keeps running after its handle is dropped unless `kill_on_drop` is enabled or the child is explicitly killed and reaped. In `check_disable_on_success`, `timeout(child.output())` drops the output future on timeout and returns `NotAchieved`, but a long-running command may keep executing in the background. That violates the documented runaway-command mitigation and can leave repeated goal checks piling up. Please switch to an explicit spawn with the timeout around wait/output and a kill/reap on timeout, or set `kill_on_drop(true)` before `output` and add a regression test using a long sleep.
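
The "explicit spawn, then kill and reap on timeout" shape requested above can be sketched with the Rust standard library alone. This is a synchronous analogue for illustration, not the PR's Tokio code: `run_with_timeout` is a hypothetical name, the command is assumed to run through a Unix `sh`, and only the direct child is killed (grandchildren started by the shell are out of scope here).

```rust
use std::io;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `sh -c <cmd>` and waits at most `timeout`.
/// Returns Ok(Some(success)) if the command finished in time,
/// or Ok(None) if it was killed and reaped after the deadline.
/// Hypothetical helper for illustration; assumes a Unix `sh` is available.
fn run_with_timeout(cmd: &str, timeout: Duration) -> io::Result<Option<bool>> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(cmd)
        .stdin(Stdio::null())
        .stdout(Stdio::null()) // output is discarded in this sketch
        .stderr(Stdio::null())
        .spawn()?;

    let deadline = Instant::now() + timeout;
    loop {
        // try_wait() polls without blocking and reaps the child if it has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status.success()));
        }
        if Instant::now() >= deadline {
            child.kill()?; // terminate the runaway command
            child.wait()?; // reap it so no zombie is left behind
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

A regression test in the spirit of the request would call this with a long `sleep` and a short timeout and assert that the call returns promptly with `None`.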
LGTM ✅ Clean implementation of goal-driven auto-disable for usercron jobs. CI green, well-tested, ready to merge.
Four-Question Framework Review
1. What problem does it solve?
Usercron jobs fire indefinitely until manually disabled. This PR adds a "stop condition": before sending the scheduled prompt, run a command; if it exits `0` AND prints a configured marker, the goal is achieved and the job auto-disables. This enables an "escape room" mode where agents work autonomously until an objective is met.
2. How does it solve it?
3. What was considered?
4. Is this the best approach?
Yes for Phase 1. The design is minimal and correct:
Traffic Light
- 🟢 INFO: Pipe draining via separate tokio tasks before [...]
- 🟢 INFO: The decision to NOT clear in-flight indices on usercron reload is correct. A scheduler writeback (`thread_id` or `enabled = false`) changes mtime; clearing indices would allow the same job to fire concurrently while its previous run is still active.
- 🟢 INFO: [...]
- 🟢 INFO: Test coverage is thorough: validation tests, writeback tests, success/failure/timeout async tests, and the existing `validate_cronjobs` tests updated with new fields.
Implements the design from #816 (ADR: goal-driven cronjob).
Summary
Adds goal-driven cronjobs — scheduled tasks that automatically disable themselves once a goal is achieved.
Motivation
Today, cronjobs fire indefinitely on a schedule. But many real-world use cases are goal-oriented: "keep running tests and fixing code until they pass." Once the goal is met, the job should stop. Without this, users must manually disable jobs or the agent keeps working on a solved problem.
How It Works
A usercron job can define a completion check: a command that runs before each scheduled prompt. If the command exits `0` AND its output contains a configurable marker, the goal is considered complete.
Execution Flow
Why Both Exit Code AND Marker?
Plain `exit 0` is too easy to satisfy accidentally (e.g., `npm test` exits 0 if no tests exist). The marker (`OPENAB_GOAL_SUCCESS`) is an explicit signal that the specific goal was met, not just that a command ran without error.
Implementation
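
As a minimal sketch of the exit-code-plus-marker rule described above, the decision can be expressed as a pure function over the command's exit code and captured output. The names `GoalStatus` and `evaluate_goal` are hypothetical and chosen for illustration; treating an empty marker as "not achieved" is also an assumption of this sketch rather than documented behavior.

```rust
/// Result of a completion check (illustrative names, not the PR's types).
#[derive(Debug, PartialEq)]
enum GoalStatus {
    Achieved,
    NotAchieved,
}

/// Decides whether the goal is met: the command must exit 0 AND its
/// output must contain the configured marker.
/// `exit_code` is None when the process was terminated by a signal.
fn evaluate_goal(exit_code: Option<i32>, output: &str, marker: &str) -> GoalStatus {
    // Assumption: an empty marker never counts as a match, because
    // `str::contains("")` is always true and would reduce the check
    // to the exit code alone.
    let marker_found = !marker.is_empty() && output.contains(marker);
    if exit_code == Some(0) && marker_found {
        GoalStatus::Achieved
    } else {
        GoalStatus::NotAchieved
    }
}
```

With this shape, a command that exits 0 without printing `OPENAB_GOAL_SUCCESS` (for example, a test runner that found no tests) yields `NotAchieved`, as does a failing command that happens to print the marker.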
- `disable_on_success` command with configurable timeout and working directory
- Writes `enabled = false` and `thread_id` back to `$HOME/.openab/cronjob.toml` by stable `id`
- Applies to usercron jobs (`[[jobs]]` in external file), not baseline `[[cron.jobs]]`; this keeps writeback limited to user-managed files
New Usercron Fields
- `id`
- `disable_on_success`
- `disable_on_success_match`
- `disable_on_success_timeout_secs` (`60`)
- `disable_on_success_working_dir`
Use Cases
Re-enabling a Disabled Job
When a goal is achieved, OAB writes `enabled = false` to `$HOME/.openab/cronjob.toml`. To restart the job, re-enable it in that file. This is currently a manual edit (or the AI agent can do it if asked). A future enhancement could add a slash command (e.g. `/cron enable fix-unit-tests`) or allow the agent to re-enable jobs via a tool call.
Tests
`git diff --check` ✅
Discord thread: https://discord.com/channels/1491295327620169908/1504239931940409587
Comparison with Other Agents
`enabled = true`
Pros and Cons
Best for: Engineering goals with clear pass/fail signals (tests pass, deploy healthy, migration complete).