lossyrob · lossyrob · Mar 31, 2026 · Apr 2, 2026 · Apr 2, 2026 · Apr 2, 2026
diff --git a/agents/PAW Review.agent.md b/agents/PAW Review.agent.md
@@ -48,6 +48,13 @@ When the user requests society-of-thought review mode:
 - Pass any user-specified interaction mode, interactive setting, or model preferences
 - Do NOT select specialists yourself — let the SoT engine handle adaptive selection from the full set
 
+## Embedded Review Control State
+
+- If `ReviewContext.md` contains `## Hardened State`, treat that section as the durable source of truth for review-stage items and terminal external-review facts.
+- Keep any built-in TODOs aligned only as an execution mirror.
+- Before yield, delegation, or GitHub posting, reconcile the embedded state when present. If reconciliation cannot make the state `current`, STOP and report the blocker instead of delegating or posting. If the section is absent, continue in legacy best-effort mode and explicitly note that hardened protections are inactive.
+- Do not advance past review-stage items or terminal external-review facts that remain unresolved when hardened state is present.
+
 ## Skill-Based Execution
 
 {{#vscode}}

diff --git a/agents/PAW.agent.md b/agents/PAW.agent.md
@@ -7,7 +7,7 @@ You are a workflow orchestrator using a **hybrid execution model**: interactive
 
 ## Initialization
 
-On first request, identify work context from environment (current branch, `.paw/work/` directories) or user input. If no matching WorkflowContext.md exists, load `paw-init` to bootstrap. If resuming existing work, derive TODO state from completed artifacts. Load `paw-workflow` skill for reference documentation (activity tables, artifact structure, PR routing).
+On first request, identify work context from environment (current branch, `.paw/work/` directories) or user input. If no matching WorkflowContext.md exists, load `paw-init` to bootstrap. If resuming existing work, derive TODO state from the embedded `## Hardened State` section when present; otherwise derive it from completed artifacts. Load `paw-workflow` skill for reference documentation (activity tables, artifact structure, PR routing).
 
 ## Workflow Rules
 
@@ -105,7 +105,12 @@ When calling `paw_new_session`, include resume hint: intended next activity + re
 
 ## Workflow Tracking
 
-Use TODOs to externalize workflow steps.
+Use TODOs as a mirror of active required workflow items.
+
+- If `WorkflowContext.md` contains `## Hardened State`, treat that section as the durable source of truth for required items, gate items, configured procedure items, and later terminal states.
+- Keep TODOs aligned only with items whose status is `pending`, `in_progress`, or `blocked`; do not treat TODOs as the portable source of truth when the embedded state exists.
+- Before yield, delegation, or side-effect execution, reconcile the embedded state when present. If reconciliation cannot make the state `current`, STOP and report the blocker instead of delegating or mutating git/GitHub state. If the section is absent, continue in legacy best-effort mode and explicitly note that hardened protections are inactive.
+- When hardened state is present, do not advance past required activity items, gate items, or configured procedure items that remain `pending`, `in_progress`, or `blocked`.
 
 **Core rule**: After completing ANY activity, determine if you're at a stage boundary (see Stage Boundary Rule). If yes, delegate to `paw-transition` before doing anything else.
 

diff --git a/docs/guide/stage-transitions.md b/docs/guide/stage-transitions.md
@@ -36,6 +36,16 @@ continue
 
 This tells the PAW agent to proceed to the recommended next activity.
 
+## Hardened Control-State and Reconciliation
+
+Current PAW workflows embed a compact `## Hardened State` section inside `WorkflowContext.md` (and `ReviewContext.md` for PAW Review).
+
+- The embedded state records required activities, gate items, configured-procedure items, and terminal outcomes that matter for progression.
+- Built-in TODOs mirror the active required items for execution convenience, but the embedded artifact state remains the durable source of truth across resume and handoff.
+- Transition, status, handoff, resume, and repository-mutation paths reconcile against this state before proceeding.
+
+If the hardened section is absent, PAW continues in **legacy best-effort mode** and reports that hardened protections are inactive.
+
 ## Review Policies
 
 PAW supports four review policies that control when the workflow pauses for human review at artifact boundaries:
@@ -134,6 +144,16 @@ Session Policy: per-stage
 
 To change the policy for an existing workflow, edit `WorkflowContext.md` directly.
 
+## Exact Configured Procedure Enforcement
+
+Planning-docs review, final review, and PAW Review evaluation run the configured review mode exactly:
+
+- `single-model` runs the single-model procedure
+- `multi-model` runs the configured multi-model procedure
+- `society-of-thought` runs the specialist-based procedure
+
+If the configured procedure cannot run in the current context, PAW blocks with an explicit reason instead of silently downgrading to a different mode.
+
 ## Inline Instructions
 
 You can customize activity behavior by adding instructions to your requests:
@@ -202,7 +222,8 @@ The `paw-status` skill analyzes:
 2. **Phase Progress**: Current implementation phase and completion status
 3. **Git State**: Branch, commits ahead/behind, uncommitted changes
 4. **PR Analysis**: Open PRs, review comments needing attention
-5. **Recommended Actions**: Clear next steps based on current state
+5. **Control-State Status**: Reconciled gate items, configured-procedure state, terminal review state, and legacy-mode reporting
+6. **Recommended Actions**: Clear next steps based on current state
 
 ## Handling PR Review Comments
 
@@ -273,3 +294,4 @@ Session Policy controls chat context management. This is primarily relevant for
 
 Session Policy is stored in `WorkflowContext.md` and can be changed by editing the file directly.
 
+When `per-stage` mode opens a fresh session in VS Code, the next session resumes from the same embedded control state in `WorkflowContext.md` or `ReviewContext.md`. Session boundaries do not reset required activities or review-stage progress.
diff --git a/docs/reference/agents.md b/docs/reference/agents.md
@@ -11,10 +11,19 @@ PAW uses two AI chat modes ("agents") that orchestrate workflow activities throu
 
 Both agents follow the same pattern: a compact orchestrator that loads a workflow skill for guidance, then delegates activities to specialized skills via subagents.
 
-> **Platform runtime note:** Copilot CLI runs installed `agents/`, `skills/`, and prompt content directly. The VS Code extension uses `src/` TypeScript for commands, setup, and handoff, but that code is not executed in CLI sessions.
+> **Platform runtime note:** Copilot CLI runs installed `agents/`, `skills/`, and prompt content directly. The VS Code extension uses `src/` TypeScript for commands, setup, status, and handoff, but that code is not executed in CLI sessions. Both runtimes rely on the same durable contract in `WorkflowContext.md` and `ReviewContext.md`; VS Code surfaces preserve and forward that contract rather than implementing alternate workflow logic.
 
 > **PAW Lite** is available as the `paw-lite` skill — any agent can load it on demand.
 
+## Hardened-State Parity
+
+Current PAW and PAW Review sessions embed a compact `## Hardened State` section inside their context artifacts.
+
+- `WorkflowContext.md` tracks required implementation activities, gate items, configured review procedures, and lifecycle markers
+- `ReviewContext.md` tracks review-stage progression, configured review mode, and terminal external-review outcomes
+- Built-in TODOs mirror the active required items for execution convenience, but the embedded artifact state remains the durable source of truth across resume and handoff
+- If hardened state is absent, agents continue in legacy best-effort mode and say so explicitly
+
 ---
 
 ## Implementation Workflow
@@ -32,7 +41,7 @@ Both agents follow the same pattern: a compact orchestrator that loads a workflo
 1. Loads the `paw-workflow` skill for orchestration guidance
 2. Resolves activity skills using the current platform's mechanism (`paw_get_skills` / `paw_get_skill` in VS Code; installed skill files in Copilot CLI)
 3. Delegates activities to specialized skills
-4. Applies Review Policy and Session Policy for workflow control
+4. Reads `WorkflowContext.md`, reconciles hardened state when present, and applies Review Policy and Session Policy for workflow control
 
 **Hybrid Execution Model:**
 
@@ -43,6 +52,8 @@ Both agents follow the same pattern: a compact orchestrator that loads a workflo
 
 This preserves conversation flow for interactive work while leveraging fresh context for focused research and review.
 
+Before transitions, resume/handoff, or repository mutation, the PAW agent reconciles the embedded workflow state with artifact, git, and PR reality. Unresolved required items block progression with explicit reasons.
+
 **Activity Skills:**
 
 | Skill | Capabilities | Primary Artifacts |
@@ -123,7 +134,7 @@ This preserves conversation flow for interactive work while leveraging fresh con
 
 **Invocation (Copilot CLI):** `copilot --agent PAW-Review` then provide the PR number or URL
 
-**Architecture:** The PAW Review agent uses a skills-based architecture. The shared review logic lives in skills; VS Code-specific automation remains in `src/` only.
+**Architecture:** The PAW Review agent uses a skills-based architecture. The shared review logic lives in skills; VS Code-specific automation remains in `src/` only. The agent reads `ReviewContext.md`, enforces hardened review state when present, and falls back to legacy best-effort mode when it is absent.
 
 1. Loads the `paw-review-workflow` skill for orchestration
 2. Executes activity skills via subagents for each stage
@@ -162,7 +173,7 @@ This preserves conversation flow for interactive work while leveraging fresh con
    - **Critique response**: Updates comments per recommendations, marks final status
    - **GitHub posting**: Creates pending review with only approved comments
 
-**Comment Evolution:** ReviewComments.md shows full history for each comment: original → assessment → updated → posted status. Skipped comments remain visible for manual inclusion if reviewer disagrees with critique.
+**Comment Evolution:** ReviewComments.md shows full history for each comment: original → assessment → updated → posted status. Skipped comments remain visible for manual inclusion if reviewer disagrees with critique. Terminal external-review outcomes such as pending review creation or manual-posting guidance are persisted in `ReviewContext.md`.
 
 **Human Control:** Pending review is never auto-submitted. User reviews comments, edits/deletes as needed, consults ReviewComments.md for full context, then submits manually.
 

diff --git a/docs/reference/artifacts.md b/docs/reference/artifacts.md
@@ -71,6 +71,9 @@ PAW workflows produce durable Markdown artifacts that trace reasoning and decisi
 | Final Review Models | Comma-separated model names |
 | Final Review Specialists | `all`, comma-separated names, or `adaptive:<N>` (society-of-thought only) |
 | Final Review Interaction Mode | `parallel` or `debate` (society-of-thought only) |
+| Final Review Specialist Models | `none`, model pool, pinned pairs, or mixed (society-of-thought only) |
+| Final Review Perspectives | `none`, `auto`, or comma-separated perspective names (society-of-thought only) |
+| Final Review Perspective Cap | Positive integer (society-of-thought only) |
 | Planning Docs Review | `enabled` or `disabled` |
 | Plan Generation Mode | `single-model` or `multi-model` |
 | Plan Generation Models | Comma-separated model names |
@@ -89,6 +92,13 @@ PAW workflows produce durable Markdown artifacts that trace reasoning and decisi
 - `Artifact Lifecycle` controls git tracking for `.paw/work/` files only; it does **not** manage worktree creation, retention, or cleanup
 - `WorkflowContext.md` deliberately stores portable metadata only; VS Code may keep machine-specific worktree paths in local extension state, while Copilot CLI sessions rely on the committed metadata plus git/worktree evidence instead
 
+**Hardened-state section (current workflows):**
+
+- `WorkflowContext.md` includes a compact `## Hardened State` block recording required activities, gate items, configured-procedure items, and lifecycle markers
+- Built-in TODOs mirror the active required items from this block for in-session execution, but the embedded artifact state remains the durable source of truth
+- `paw-transition`, `paw-status`, handoff/resume paths, and final PR readiness checks reconcile against this section before proceeding
+- If the section is absent, the workflow runs in legacy best-effort mode and reports that hardened protections are inactive
+
 ### Spec.md
 
 **Purpose:** Testable requirements document defining what the feature must do.
@@ -222,6 +232,8 @@ This decouples intent capture from phase elaboration, preserving implementer mom
 
 When perspectives are active, each finding includes a `**Perspective**` field indicating which lens surfaced it (`baseline` for findings without a perspective overlay).
 
+`REVIEW-SYNTHESIS.md` is also the artifact that resolves the configured evaluation/review procedure for society-of-thought flows. If the configured mode requires synthesis and this artifact is missing, the workflow stays blocked instead of silently falling back to another procedure.
+
 ---
 
 ## Review Workflow Artifacts
@@ -243,6 +255,12 @@ When perspectives are active, each finding includes a `**Perspective**` field in
 | CI Status | Passing, failing, pending |
 | Flags | CI failures, breaking changes suspected |
 
+**Hardened-state section (current reviews):**
+
+- `ReviewContext.md` includes a compact `## Hardened State` block for review-job identifier, configured review mode, required stage items, and terminal external-review outcomes
+- The section lets resume, status, critique, and posting paths re-enter the same review job without re-inferring state from prompt prose
+- If the section is absent, PAW Review uses legacy best-effort mode and says so explicitly
+
 ### ResearchQuestions.md
 
 **Purpose:** Research questions to guide baseline codebase analysis.
@@ -329,6 +347,8 @@ When perspectives are active, each finding includes a `**Perspective**` field in
 - **Comment text** — What gets posted
 - **Rationale** — Evidence, baseline pattern, impact, best practice
 - **Assessment** — Usefulness, accuracy, trade-offs (never posted)
+- **Final marker** — `**Final**:` status that determines whether the comment is ready for posting
+- **Posted/manual state** — Pending review IDs or manual-posting guidance reflected in the associated review control state
 
 ---