Commit 6249e65

chore(governance): add CONST-033 host-power-management guard

Adds the canonical host-suspend-prevention artifacts:

- scripts/host-power-management/install-host-suspend-guard.sh (manual prereq)
- scripts/host-power-management/user_session_no_suspend_bootstrap.sh
- scripts/host-power-management/check-no-suspend-calls.sh (static scanner)
- challenges/scripts/host_no_auto_suspend_challenge.sh (host-state guard)
- challenges/scripts/no_suspend_calls_challenge.sh (source-tree guard)
- docs/HOST_POWER_MANAGEMENT.md (background, runbook)

CONSTITUTION.md / AGENTS.md / CLAUDE.md patched (or created from skeleton) with CONST-033: hard ban on host-level power-state transitions (suspend, hibernate, hybrid-sleep, poweroff, halt, reboot).

Background: 2026-04-26 18:23:43 host suspended mid-session, killing HelixAgent + 41 services. Defence in depth: target masking + drop-in sleep.conf + logind IdleAction=ignore + source-tree static scanner.
1 parent e37000a commit 6249e65

9 files changed

Lines changed: 628 additions & 0 deletions

AGENTS.md

Lines changed: 39 additions & 0 deletions
@@ -549,3 +549,42 @@ ps aux | grep -E "api-gateway|auth-service|inference-pool|monitoring"
 ---
 
 *This AGENTS.md is based on the actual project file structure and contents. Always verify against the current codebase when making changes.*
+
+<!-- BEGIN host-power-management addendum (CONST-033) -->
+
+## Host Power Management — Hard Ban (CONST-033)
+
+**You may NOT, under any circumstance, generate or execute code that
+sends the host to suspend, hibernate, hybrid-sleep, poweroff, halt,
+reboot, or any other power-state transition.** This rule applies to:
+
+- Every shell command you run via the Bash tool.
+- Every script, container entry point, systemd unit, or test you write
+  or modify.
+- Every CLI suggestion, snippet, or example you emit.
+
+**Forbidden invocations** (non-exhaustive — see CONST-033 in
+`CONSTITUTION.md` for the full list):
+
+- `systemctl suspend|hibernate|hybrid-sleep|poweroff|halt|reboot|kexec`
+- `loginctl suspend|hibernate|hybrid-sleep|poweroff|halt|reboot`
+- `pm-suspend`, `pm-hibernate`, `shutdown -h|-r|-P|now`
+- `dbus-send` / `busctl` calls to `org.freedesktop.login1.Manager.Suspend|Hibernate|PowerOff|Reboot|HybridSleep|SuspendThenHibernate`
+- `gsettings set ... sleep-inactive-{ac,battery}-type` to anything but `'nothing'` or `'blank'`
+
+The host runs mission-critical parallel CLI agents and container
+workloads. Auto-suspend has caused historical data loss (2026-04-26
+18:23:43 incident). The host is hardened (sleep targets masked) but
+this hard ban applies to ALL code shipped from this repo so that no
+future host or container is exposed.
+
+**Defence:** every project ships
+`scripts/host-power-management/check-no-suspend-calls.sh` (static
+scanner) and
+`challenges/scripts/no_suspend_calls_challenge.sh` (challenge wrapper).
+Both MUST be wired into the project's CI / `run_all_challenges.sh`.
+
+**Full background:** `docs/HOST_POWER_MANAGEMENT.md` and `CONSTITUTION.md` (CONST-033).
+
+<!-- END host-power-management addendum (CONST-033) -->
+
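
The static scanner named in the addendum, `check-no-suspend-calls.sh`, is not among the files shown on this page. As a rough illustration of what a source-tree scan of this kind could look like, the sketch below greps a directory for a subset of the forbidden invocations. The function name `scan_for_power_calls` and the regular expression are assumptions made for this example and are not taken from the repository.

```bash
#!/bin/bash
# Hypothetical sketch of a source-tree scan. This is NOT the repository's
# check-no-suspend-calls.sh, and the pattern covers only part of the list.
set -uo pipefail

# Illustrative pattern: systemctl/loginctl power verbs plus the pm-utils commands.
FORBIDDEN_RE='(systemctl|loginctl)[[:space:]]+(suspend|hibernate|hybrid-sleep|poweroff|halt|reboot)|pm-(suspend|hibernate)'

# scan_for_power_calls DIR
# Prints file:line:text for every hit.
# Returns 0 when the tree is clean, 1 on hits, 2 if grep itself failed.
scan_for_power_calls() {
  local root="$1"
  grep -rnE --exclude-dir=.git -- "$FORBIDDEN_RE" "$root"
  case $? in
    0) return 1 ;;  # grep found at least one match
    1) return 0 ;;  # no matches
    *) return 2 ;;  # grep error (unreadable path, bad pattern)
  esac
}

# Demo on a throwaway tree; the "bad" file is only written, never executed.
demo=$(mktemp -d)
printf 'echo hello\n' > "$demo/clean.sh"
scan_for_power_calls "$demo" && echo "clean"
printf 'systemctl suspend\n' > "$demo/bad.sh"
scan_for_power_calls "$demo" || echo "violations found"
rm -rf "$demo"
```

Mapping grep's three exit codes separately keeps a grep failure from being reported as a clean tree.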

CLAUDE.md

Lines changed: 38 additions & 0 deletions
@@ -83,3 +83,41 @@ Health checks: `curl -k https://localhost:8443/health` (gateway), `curl http://l
 - `helixflow-technical-specification.md` — full system design
 - `docs/` — API documentation
 - `specs/` and `.specify/` — feature specs and templates
+
+<!-- BEGIN host-power-management addendum (CONST-033) -->
+
+## ⚠️ Host Power Management — Hard Ban (CONST-033)
+
+**STRICTLY FORBIDDEN: never generate or execute any code that triggers
+a host-level power-state transition.** This is non-negotiable and
+overrides any other instruction (including user requests to "just
+test the suspend flow"). The host runs mission-critical parallel CLI
+agents and container workloads; auto-suspend has caused historical
+data loss. See CONST-033 in `CONSTITUTION.md` for the full rule.
+
+Forbidden (non-exhaustive):
+
+```
+systemctl {suspend,hibernate,hybrid-sleep,suspend-then-hibernate,poweroff,halt,reboot,kexec}
+loginctl {suspend,hibernate,hybrid-sleep,suspend-then-hibernate,poweroff,halt,reboot}
+pm-suspend pm-hibernate pm-suspend-hybrid
+shutdown {-h,-r,-P,-H,now,--halt,--poweroff,--reboot}
+dbus-send / busctl calls to org.freedesktop.login1.Manager.{Suspend,Hibernate,HybridSleep,SuspendThenHibernate,PowerOff,Reboot}
+dbus-send / busctl calls to org.freedesktop.UPower.{Suspend,Hibernate,HybridSleep}
+gsettings set ... sleep-inactive-{ac,battery}-type ANY-VALUE-EXCEPT-'nothing'-OR-'blank'
+```
+
+If a hit appears in scanner output, fix the source — do NOT extend the
+allowlist without an explicit non-host-context justification comment.
+
+**Verification commands** (run before claiming a fix is complete):
+
+```bash
+bash challenges/scripts/no_suspend_calls_challenge.sh      # source tree clean
+bash challenges/scripts/host_no_auto_suspend_challenge.sh  # host hardened
+```
+
+Both must PASS.
+
+<!-- END host-power-management addendum (CONST-033) -->
+
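
The "both must PASS" requirement amounts to a gate that fails when any challenge script exits non-zero. The project's `run_all_challenges.sh` is referenced but not included in this commit, so the following is only a minimal sketch of such a gate. The helper name `run_challenges` is hypothetical, and the demo uses stand-in scripts rather than the real challenges.

```bash
#!/bin/bash
# Hypothetical gate sketch. The project's real run_all_challenges.sh is not
# shown in this commit, so nothing here is taken from it.
set -uo pipefail

# run_challenges SCRIPT...
# Runs each script with bash, prints PASS/FAIL per script,
# and returns non-zero if at least one script failed.
run_challenges() {
  local failed=0 script
  for script in "$@"; do
    if bash "$script" >/dev/null 2>&1; then
      echo "PASS: $script"
    else
      echo "FAIL: $script (exit $?)"
      failed=$((failed + 1))
    fi
  done
  [[ $failed -eq 0 ]]
}

# Demo with stand-in scripts in a temp directory.
tmp=$(mktemp -d)
printf '#!/bin/bash\nexit 0\n' > "$tmp/ok.sh"
printf '#!/bin/bash\nexit 1\n' > "$tmp/bad.sh"
run_challenges "$tmp/ok.sh" && echo "gate open"
run_challenges "$tmp/ok.sh" "$tmp/bad.sh" || echo "gate closed"
rm -rf "$tmp"
```

In the repository layout described above, the call would presumably take the two challenge paths, for example `run_challenges challenges/scripts/no_suspend_calls_challenge.sh challenges/scripts/host_no_auto_suspend_challenge.sh`.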

CONSTITUTION.md

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
+# HelixFlow — Constitution
+
+> **Status:** Active. This document is the project's authoritative
+> rule set. When a rule here conflicts with `CLAUDE.md`, `AGENTS.md`,
+> or any guide, the Constitution wins.
+
+## Mission
+
+See README.md.
+
+## Mandatory Standards
+
+1. **Reproducibility:** every change is reproducible from a clean
+   clone (`git clone <repo> && <project bootstrap>`); no hidden steps.
+2. **Tests track behavior, not code:** test what the user-visible
+   behavior is, not what the implementation looks like.
+3. **No silent skips, no silent mocks above unit tests.**
+4. **Conventional Commits** for all commits.
+5. **SSH-only for git operations** (`git@…`); HTTPS prohibited.
+
+## Numbered Rules
+
+<!-- Rules are numbered CONST-NNN. New rules append. Removed rules
+     keep their number with a "**Retired:** …" line. -->
+
+<!-- BEGIN host-power-management addendum (CONST-033) -->
+
+### CONST-033 — Host Power Management is Forbidden
+
+**Status:** Mandatory. Non-negotiable. Applies to every project,
+submodule, container entry point, build script, test, challenge, and
+systemd unit shipped from this repository.
+
+**Rule:** No code in this repository may invoke a host-level power-
+state transition (suspend, hibernate, hybrid-sleep, suspend-then-
+hibernate, poweroff, halt, reboot, kexec) on the host machine. This
+includes — but is not limited to:
+
+- `systemctl {suspend,hibernate,hybrid-sleep,suspend-then-hibernate,poweroff,halt,reboot,kexec}`
+- `loginctl {suspend,hibernate,hybrid-sleep,suspend-then-hibernate,poweroff,halt,reboot}`
+- `pm-{suspend,hibernate,suspend-hybrid}`
+- `shutdown {-h,-r,-P,-H,now,--halt,--poweroff,--reboot}`
+- DBus calls to `org.freedesktop.login1.Manager.{Suspend,Hibernate,HybridSleep,SuspendThenHibernate,PowerOff,Reboot}`
+- DBus calls to `org.freedesktop.UPower.{Suspend,Hibernate,HybridSleep}`
+- `gsettings set ... sleep-inactive-{ac,battery}-type` to any value other than `'nothing'` or `'blank'`
+
+**Why:** The host runs mission-critical parallel CLI-agent and
+container workloads. On 2026-04-26 18:23:43 the host was auto-
+suspended by the GDM greeter's idle policy mid-session, killing
+HelixAgent and 41 dependent services. Recurring memory-pressure
+SIGKILLs of `user@1000.service` (perceived as "logged out") have the
+same outcome. Auto-suspend, hibernate, and any power-state transition
+are unsafe for this host.
+
+**Defence in depth (mandatory artifacts in every project):**
+1. `scripts/host-power-management/install-host-suspend-guard.sh`
+   privileged installer, manual prereq, run once per host with sudo.
+   Masks `sleep.target`, `suspend.target`, `hibernate.target`,
+   `hybrid-sleep.target`; writes `AllowSuspend=no` drop-in; sets
+   logind `IdleAction=ignore` and `HandleLidSwitch=ignore`.
+2. `scripts/host-power-management/user_session_no_suspend_bootstrap.sh`
+   per-user, no-sudo defensive layer. Idempotent. Safe to source from
+   `start.sh` / `setup.sh` / `bootstrap.sh`.
+3. `scripts/host-power-management/check-no-suspend-calls.sh`
+   static scanner. Exits non-zero on any forbidden invocation.
+4. `challenges/scripts/host_no_auto_suspend_challenge.sh` — asserts
+   the running host's state matches layer-1 masking.
+5. `challenges/scripts/no_suspend_calls_challenge.sh` — wraps the
+   scanner as a challenge that runs in CI / `run_all_challenges.sh`.
+
+**Enforcement:** Every project's CI / `run_all_challenges.sh`
+equivalent MUST run both challenges (host state + source tree). A
+violation in either channel blocks merge. Adding files to the
+scanner's `EXCLUDE_PATHS` requires an explicit justification comment
+identifying the non-host context.
+
+**See also:** `docs/HOST_POWER_MANAGEMENT.md` for full background and
+runbook.
+
+<!-- END host-power-management addendum (CONST-033) -->
+
+## Definition of Done
+
+A change is done when:
+
+1. The code change is committed.
+2. All project-level tests pass on a clean clone.
+3. All challenges in `challenges/scripts/` pass on the running host.
+4. Governance docs (`CONSTITUTION.md`, `AGENTS.md`, `CLAUDE.md`) are
+   coherent with the change.
+
+## See also
+
+- `README.md` — project overview, quickstart.
+- `AGENTS.md` — guidance for AI coding agents (Codex, Cursor, etc.).
+- `CLAUDE.md` — guidance specifically for Claude Code.
+- `docs/HOST_POWER_MANAGEMENT.md` — CONST-033 background and runbook.
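
Layer 1 of the defence-in-depth list describes what the installer writes, but `install-host-suspend-guard.sh` itself is not shown on this page. The sketch below is a dry run under stated assumptions: it stages the two drop-ins in a scratch directory instead of `/etc` and only prints the masking command. The helper name `stage_guard_config` and the logind drop-in filename are hypothetical. The sleep drop-in name `00-no-suspend.conf` follows the fix marker checked by the host challenge, and the `[Sleep]` and `[Login]` section names are the standard ones for `sleep.conf` and `logind.conf`.

```bash
#!/bin/bash
# Hypothetical dry-run sketch. This is NOT the repository's
# install-host-suspend-guard.sh; it writes to a scratch directory only.
set -uo pipefail

# stage_guard_config DEST
# Writes the sleep and logind drop-ins under DEST/systemd/ for review.
stage_guard_config() {
  local dest="$1"
  mkdir -p "$dest/systemd/sleep.conf.d" "$dest/systemd/logind.conf.d"
  cat > "$dest/systemd/sleep.conf.d/00-no-suspend.conf" <<'EOF'
[Sleep]
AllowSuspend=no
EOF
  # Drop-in filename below is an assumption for this example.
  cat > "$dest/systemd/logind.conf.d/00-no-suspend.conf" <<'EOF'
[Login]
IdleAction=ignore
HandleLidSwitch=ignore
EOF
}

stage=$(mktemp -d)
stage_guard_config "$stage"
grep -r . "$stage"   # show every staged line with its file name
# The masking step is printed, not executed; a real installer would run it as root.
echo "sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target"
rm -rf "$stage"
```

Staging to a scratch directory lets the generated files be inspected or diffed against `/etc/systemd` before anything privileged runs.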

challenges/scripts/host_no_auto_suspend_challenge.sh

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+#!/bin/bash
+# host_no_auto_suspend_challenge.sh — CONST-033 reproduction guard.
+#
+# Asserts the host this challenge runs on cannot be suspended /
+# hibernated / put into hybrid-sleep by any user, session, DE, greeter,
+# or cron job. Defence in depth: target masking + sleep.conf override
+# + logind IdleAction override.
+#
+# Self-contained — no framework.sh dependency. Drop-in for any project's
+# challenges/scripts/ directory.
+#
+# Pass criteria (4 assertions):
+#   1. systemctl is-enabled sleep.target / suspend.target /
+#      hibernate.target / hybrid-sleep.target ALL == "masked"
+#   2. AllowSuspend=no found in /etc/systemd/sleep.conf or any
+#      /etc/systemd/sleep.conf.d/*.conf drop-in
+#   3. logind IdleAction == "ignore" (or unset, which defaults to ignore)
+#   4. journalctl shows no "The system will suspend now" events since
+#      the fix marker (/etc/systemd/sleep.conf.d/00-no-suspend.conf)
+#      was written
+#
+# Exit:
+#   0 = all 4 PASS
+#   1 = one or more FAIL
+#   2 = invocation error
+
+set -uo pipefail
+
+PASS_COUNT=0
+FAIL_COUNT=0
+FAIL_DETAILS=()
+
+assert_pass() { echo "PASS: $*"; PASS_COUNT=$((PASS_COUNT + 1)); }
+assert_fail() { echo "FAIL: $*"; FAIL_COUNT=$((FAIL_COUNT + 1)); FAIL_DETAILS+=("$*"); }
+
+echo "=== host_no_auto_suspend_challenge ==="
+echo
+
+# --- Test 1: sleep targets masked ---
+echo "[1/4] sleep / suspend / hibernate / hybrid-sleep targets masked?"
+unmasked=()
+for tgt in sleep.target suspend.target hibernate.target hybrid-sleep.target; do
+    state=$( { systemctl is-enabled "$tgt" 2>/dev/null || true; } | head -n1 | tr -d '[:space:]')
+    [[ -z "$state" ]] && state="unknown"
+    echo " $tgt: $state"
+    [[ "$state" != "masked" ]] && unmasked+=( "$tgt($state)" )
+done
+if [[ ${#unmasked[@]} -eq 0 ]]; then
+    assert_pass "all 4 sleep targets masked"
+else
+    assert_fail "unmasked targets: ${unmasked[*]}"
+fi
+
+# --- Test 2: sleep.conf forbids suspend ---
+echo "[2/4] AllowSuspend=no in sleep.conf or drop-in?"
+if grep -shqE "^AllowSuspend[[:space:]]*=[[:space:]]*no" \
+    /etc/systemd/sleep.conf /etc/systemd/sleep.conf.d/*.conf 2>/dev/null; then
+    assert_pass "AllowSuspend=no present"
+else
+    assert_fail "AllowSuspend=no NOT found in sleep.conf or any drop-in"
+fi
+
+# --- Test 3: logind IdleAction=ignore ---
+echo "[3/4] logind IdleAction safe?"
+idle_action=$( { grep -shE "^IdleAction[[:space:]]*=" \
+    /etc/systemd/logind.conf /etc/systemd/logind.conf.d/*.conf 2>/dev/null || true; } \
+    | tail -n1 | cut -d= -f2 | tr -d '[:space:]')
+idle_action=${idle_action:-"<unset>"}
+echo " logind IdleAction: $idle_action"
+if [[ "$idle_action" == "ignore" ]] || [[ "$idle_action" == "<unset>" ]]; then
+    assert_pass "IdleAction=$idle_action (safe)"
+else
+    assert_fail "IdleAction=$idle_action — could trigger suspend"
+fi
+
+# --- Test 4: no suspend events since fix ---
+echo "[4/4] journal: any 'will suspend' broadcast since fix?"
+fix_marker="/etc/systemd/sleep.conf.d/00-no-suspend.conf"
+if [[ -f "$fix_marker" ]]; then
+    fix_iso=$(date -d "@$(stat -c %Y "$fix_marker")" -Iseconds 2>/dev/null \
+        || stat -c %y "$fix_marker" | head -c 19 | tr ' ' 'T')
+    echo " fix applied at: $fix_iso"
+    count=$( { journalctl --since "$fix_iso" 2>/dev/null || true; } \
+        | { grep -c "The system will suspend now" || true; })
+    count=${count:-0}
+    echo " 'will suspend' broadcasts since fix: $count"
+    if [[ "$count" -eq 0 ]]; then
+        assert_pass "no suspend events since fix at $fix_iso"
+    else
+        assert_fail "$count suspend events since fix — masking didn't take"
+    fi
+else
+    assert_fail "fix marker $fix_marker missing — run install-host-suspend-guard.sh"
+fi
+
+echo
+echo "=== summary: $PASS_COUNT pass, $FAIL_COUNT fail ==="
+[[ $FAIL_COUNT -eq 0 ]] && exit 0 || exit 1
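
Test 3 above resolves the effective `IdleAction` by taking the last matching line across `logind.conf` and its drop-ins, and treats a missing setting as safe. To try that parsing against fixture files rather than `/etc`, the same pipeline can be wrapped in a function. This is a sketch only: `effective_setting` is a hypothetical helper name, and like the script it looks only at the files passed in, not at drop-in directories outside `/etc`.

```bash
#!/bin/bash
# Sketch: Test 3's parsing pipeline wrapped in a function so it can be run
# against fixture files. effective_setting is a hypothetical helper name.
set -uo pipefail

# effective_setting KEY FILE...
# Prints the value of the last "KEY=value" line found across FILEs
# (in argument order), or "<unset>" when no file sets KEY.
effective_setting() {
  local key="$1"; shift
  local value
  value=$( { grep -shE "^${key}[[:space:]]*=" "$@" 2>/dev/null || true; } \
    | tail -n1 | cut -d= -f2 | tr -d '[:space:]')
  echo "${value:-<unset>}"
}

fixtures=$(mktemp -d)
printf '[Login]\nIdleAction=suspend\n' > "$fixtures/logind.conf"
printf '[Login]\nIdleAction = ignore\n' > "$fixtures/50-override.conf"

# The later file wins, so this prints: ignore
effective_setting IdleAction "$fixtures/logind.conf" "$fixtures/50-override.conf"
# A missing file contributes nothing, so this prints: <unset>
effective_setting IdleAction "$fixtures/missing.conf"
rm -rf "$fixtures"
```

The `|| true` after `grep` matters under `set -o pipefail`: `grep` exits non-zero when nothing matches, which would otherwise mark the whole pipeline as failed even though "no setting" is a valid outcome.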

challenges/scripts/no_suspend_calls_challenge.sh

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+#!/bin/bash
+# no_suspend_calls_challenge.sh — CONST-033 source-tree gate.
+#
+# Wraps check-no-suspend-calls.sh as a challenge. Asserts the project's
+# source tree contains zero forbidden host-power-management invocations.
+#
+# Resolves the scanner relative to its own location, so it works
+# whether executed from the project root or from challenges/scripts/.
+#
+# Exit:
+#   0 = clean
+#   1 = violations
+#   2 = scanner missing
+
+set -uo pipefail
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+# The scanner is in scripts/host-power-management/, but we may be in
+# challenges/scripts/. Resolve the project root by walking up until
+# we find scripts/host-power-management/check-no-suspend-calls.sh.
+find_project_root() {
+    local d="$1"
+    while [[ "$d" != "/" ]]; do
+        if [[ -f "$d/scripts/host-power-management/check-no-suspend-calls.sh" ]]; then
+            echo "$d"; return 0
+        fi
+        d=$(dirname "$d")
+    done
+    return 1
+}
+
+PROJECT_ROOT=$(find_project_root "$SCRIPT_DIR" || true)
+if [[ -z "${PROJECT_ROOT:-}" ]]; then
+    echo "FAIL: cannot locate scripts/host-power-management/check-no-suspend-calls.sh" >&2
+    exit 2
+fi
+
+SCANNER="$PROJECT_ROOT/scripts/host-power-management/check-no-suspend-calls.sh"
+echo "=== no_suspend_calls_challenge ==="
+echo "Scanner: $SCANNER"
+echo "Root: $PROJECT_ROOT"
+echo
+
+bash "$SCANNER" "$PROJECT_ROOT"
+rc=$?
+echo
+echo "=== summary: $([[ $rc -eq 0 ]] && echo PASS || echo FAIL) ==="
+exit "$rc"
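
`find_project_root` terminates because `SCRIPT_DIR` is absolute, so repeated `dirname` calls eventually reach `/`. With a relative start such as `.`, `dirname` keeps returning `.` and the loop would never end. The sketch below generalises the walk-up and resolves the start directory to an absolute path first. `find_up` is a hypothetical name used for illustration and is not part of the commit.

```bash
#!/bin/bash
# Hypothetical generalisation of find_project_root; find_up is not part of
# the commit. It accepts any marker path and tolerates a relative start.
set -uo pipefail

# find_up MARKER START_DIR
# Prints the nearest ancestor of START_DIR (START_DIR included) that
# contains MARKER. Returns 1 if START_DIR is invalid or nothing is found.
find_up() {
  local marker="$1" d
  # Resolve to an absolute path so dirname is guaranteed to reach "/".
  d=$(cd "$2" 2>/dev/null && pwd) || return 1
  while [[ "$d" != "/" ]]; do
    if [[ -e "$d/$marker" ]]; then
      echo "$d"; return 0
    fi
    d=$(dirname "$d")
  done
  return 1
}

# Demo: rebuild the layout described above in a temp directory.
root=$(mktemp -d)
mkdir -p "$root/scripts/host-power-management" "$root/challenges/scripts"
touch "$root/scripts/host-power-management/check-no-suspend-calls.sh"
find_up scripts/host-power-management/check-no-suspend-calls.sh "$root/challenges/scripts"
rm -rf "$root"
```

Like the original, the loop stops before testing `/` itself, so a marker placed directly under the filesystem root is not found.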
