diff --git a/docs/experiments/22_roy_5a.md b/docs/experiments/22_roy_5a.md
index a7ad5f19..d6b2e024 100644
--- a/docs/experiments/22_roy_5a.md
+++ b/docs/experiments/22_roy_5a.md
@@ -11,13 +11,23 @@

 ## Status

-**VERDICT: H1a — encoder subspace incompatibility (via n/a / empty-matrix path, not low-cosine path — see § Result).**
+**VERDICT: H1a — encoder subspace incompatibility (confirmed across three runs, two distinct mechanisms).**

-`max(M_tt food-bearing) = n/a` (no text-modality food-bearing priming centroids exist **in this run**), which the pre-registered decoder maps to `< 0.20` → **H1a**. The empirical mechanism producing the verdict is **stronger than the plan modeled**: the issue is not "text food centroids exist but are far from arm-A text centroids in encoder space" (the plan's H1a framing), it is **"text-modality food centroids do not exist at all in either priming or arm A on this run"**.
+`max(M_tt food-bearing) = n/a` (no text-modality food-bearing priming centroids exist regardless of `MAXIM_SUBSTRATE_PATH` state). Pre-registered decoder maps `-inf` → `< 0.20` → **H1a**. Three runs confirm the verdict:

-**Important caveat:** [Roy-4 (21_roy_4.md)](21_roy_4.md) on the **identical priming spec** produced 154 text-modality EC fires. The cradle arc *can* land text-modality nodes; Roy-5a did not. This means the load-bearing claim is NOT "text-modality food concepts can never form during cradle priming" — it is "on Roy-5a's run they did not, and the verdict triggers on that observation". The text-modality silence may be run-to-run variance or a quiet regression in text-percept routing; see "Recommended next step" for the disambiguating re-run plan.
+| Run | `MAXIM_SUBSTRATE_PATH` | Priming text/intero fires | Food-bearing text centroids | Mechanism |
+|---|---|---|---|---|
+| **Roy-5a** (initial) | unset | 0 / 151 | 0 (no text nodes at all) | H1a via "no text nodes exist" |
+| **Roy-5a-confirm** | unset | 0 / 153 | 0 (same) | H1a — silence reproduces, not run-to-run variance |
+| **Roy-5a-substrate-on** | **=1** | **162 / 154** | **0 (zero of 14 text centroids are food-bearing)** | **H1a via "food NAc bias keys are exclusively interoception node IDs"** |

-**Per [roy_5_encoder_alignment_disambiguator.md § Stage 2c](../plans/roy_5_encoder_alignment_disambiguator.md), this verdict triggers Stage 3** — redesigned cradle priming arc with deliberate `(sensor, drive, narrator-utterance)` co-firing + Hebbian retest. **But the secondary observation (text-modality silence on the food concept across both priming AND arm A) suggests the Stage 3 redesign has a two-part pass criterion**: (a) text-modality EC fires must land on food-related ticks at all, and (b) the Hebbian rule must bind them to interoception clusters. (a) is the load-bearing prerequisite the existing arc apparently doesn't meet reliably on Roy-5a; (b) is what the plan originally focused on. Recommend the user inspect text-modality routing before Stage 3 begins. See "Recommended next step" below.
+Roy-4's 154 text fires on the identical spec turned out to be the result of `MAXIM_SUBSTRATE_PATH=1` being inherited from the shell environment, not a code regression. Roy-5a + Roy-5a-confirm ran without it; Roy-5a-substrate-on explicitly sets it and reproduces Roy-4's text-modality fire count (162 vs 154 — within run-to-run variance). **The fundamental finding holds across the env-var state:** food NAc-attributed cluster IDs are interoception-modality EC node IDs, never text-modality.
+
+**Stronger structural finding — dimension mismatch (uncovered by the analyzer's dimension-mismatch warning, added in the pre-merge review fold):** `SensorEncoder` produces **384-dim** SHA-basis embeddings for interoception modality; `LinguisticEncoder` (`paraphrase-mpnet-base-v2`) produces **768-dim** embeddings for text modality. Cross-modality cosine (`M_dt`) is **mathematically undefined** — the vectors live in different dimensional spaces, and `_cosine` returns 0.0 silently on length mismatch. This means the plan's "encoder subspaces are far in cosine space" framing of H1a is **structurally weaker than the data actually shows**: the subspaces aren't far, they're **different-dimensional**, and any cosine-based cross-modal alignment is structurally impossible without a learned projection layer between them.
+
+**Per [roy_5_encoder_alignment_disambiguator.md § Stage 2c](../plans/roy_5_encoder_alignment_disambiguator.md), this verdict triggers Stage 3** — redesigned cradle priming arc with deliberate `(sensor, drive, narrator-utterance)` co-firing + Hebbian retest. **The Stage 3 design constraint sharpens with the dimension-mismatch finding:** the goal isn't "make text and interoception embeddings line up in cosine space" (they can't — different dimensions). It's "make the food concept LAND in text-modality EC nodes during priming, attributed to `sense_food_source` reward in NAc". The narrator utterance ("hungry") would need to fire as a text-modality EC node that the NAc reward bias keys to. **The Hebbian binding mechanism cancelled by Roy-4 was always structurally impossible at the cosine level** — `cross_modal_substrate_binding.md`'s resurrection conditions (Stage 4a) would need a learned dimension-reducing projection between encoders, not a naive Hebbian rule on raw cosine.
+
+**Recommended next step for Stage 3:** explicitly design the redesigned cradle arc so the narrator utterance fires text-modality EC during the same tick the `sense_food_source` reward is attributed in NAc. The pass criterion is then **"at least one text-modality EC node ends up in NAc's `cluster_reward_bias` map keyed to `sense_food_source`"** — directly observable in the persisted `aut_nac.json`, no cosine math required. If that produces a text-modality food centroid, M_tt becomes non-empty in a Roy-5b re-run and the H1c / H1b discrimination becomes meaningful for the FIRST TIME across all Roy iterations.

 ## Pre-registered diagnostic logic

@@ -82,14 +92,30 @@ Roy-5a ran with `MAXIM_EC_TRACE_ACTIVATIONS=1` so the JSONL trace events landed
 | Roy-4 (priming) | 306 | text=154, interoception=152 | text=**57**, interoception=12 |
 | **Roy-5a (priming)** | **151** | **interoception=151, text=0** | **interoception=6, text=0** |

-**Roy-4 had 57 NEW text-modality nodes fire during priming; Roy-5a had zero.** This is unexpected given that the priming spec is identical (same 5 cradle arcs, 10 turns each, same fixture). The likely explanations are operational rather than structural — different LLM context cache state, the first priming stage in Roy-5a had many AUT timeouts that may have suppressed narrator text routing, or some other run-to-run variance that doesn't reflect a code change. **No commit between Roy-4 (PR #246) and Roy-5a (PR #248 + this PR's branch) touches encoder routing, LinguisticEncoder, or the cradle narrator code path** — `git log c80190f..HEAD -- src/maxim/similarity/ src/maxim/agents/ src/maxim/runtime/agent_loop.py` returns empty.
+**Roy-4 had 57 NEW text-modality nodes fire during priming; Roy-5a had zero. Cause located after two follow-up runs**: `MAXIM_SUBSTRATE_PATH=1` was inherited from the shell environment when Roy-4 ran; Roy-5a + Roy-5a-confirm were kicked off in a shell without it set. The env var gates `MemoryHub._encoder` wiring at [`integration/memory_hub.py:247`](../../src/maxim/integration/memory_hub.py#L247); `BioEnrichmentPipeline` receives `encoder=getattr(memory_hub, "_encoder", None)` at [`runtime/bio_stack.py:378`](../../src/maxim/runtime/bio_stack.py#L378) and its text-modality EC fire at [`bio_enrichment.py:576`](../../src/maxim/integration/bio_enrichment.py#L576) short-circuits on `if self._encoder is not None`. With the env var off, every BioEnrichmentPipeline text query silently skips EC; with it on, queries fire `pattern_complete_or_separate(embedding, "text")` and produce trace events.
+
+Roy-5a-substrate-on confirmed this end-to-end: explicit `MAXIM_SUBSTRATE_PATH=1` reproduces Roy-4's text-modality fire pattern (162 fires vs Roy-4's 154 — within run-to-run variance). **No code regression.** The 0.9.1 release plan explicitly ships substrate-path features behind this env var per [`release_0_9_1.md`](../plans/release_0_9_1.md) Stage 2; Wire-A is what flips the default once it ships.

-Notably, **Roy-4's hippocampus dumps also showed zero `cli_input` / `transcript` fields**, so the 154 Roy-4 text-modality fires went through a code path that didn't land in the hippocampus perception fields the analyzer can inspect post-hoc. This is consistent with text fires from tool-output text, decomposed concept chunks, or other side-channel routings rather than direct cradle-narrator percepts.
+**Roy-4's hippocampus dumps had zero `cli_input` / `transcript` fields too — same as Roy-5a-*.** Roy-4's 154 text fires came through `BioEnrichmentPipeline`'s query path during sim turns, not from `transcript_chunk` direct routing. This explains why neither Roy-4 nor any Roy-5a variant shows text in the post-hoc hippocampus perception fields.

-**The H1a verdict survives either interpretation.** Both runs show:
-- Food-bearing NAc cluster IDs all live in interoception modality.
-- M_dd cosine ≈ 1.0 between priming and arm A food clusters (the surviving identity scheme).
-- No text-modality food centroids exist in priming OR arm A.
+**The H1a verdict survives every variant:**
+
+- **Substrate path OFF (Roy-5a / Roy-5a-confirm):** zero text-modality EC nodes exist at all → M_tt trivially empty for food rows.
+- **Substrate path ON (Roy-5a-substrate-on):** 14 text-modality EC centroids exist in priming, but **zero of them are food-bearing**. NAc's `cluster_reward_bias` map keys 2 cluster IDs to `sense_food_source`; both are interoception-modality node IDs (`8986a04c-...`, `27c6f321-...`), neither in the 14-node text-modality set.
+- **All three runs:** `M_dd ≈ 1.0` between priming and arm-A interoception food clusters (the surviving frozen-prototype identity scheme — same SHA-basis embedding regardless of run).
+
+**Dimension mismatch — additional structural finding:**
+
+The dimension-mismatch warning the pre-merge review fold added to `_compute_matrix` fired on real Roy-5a-substrate-on data with `row dims=[384] col dims=[768]`. Concretely:
+
+| Modality | Encoder | Embedding dim | Source |
+|---|---|---|---|
+| `interoception` | `SensorEncoder._stable_basis` | **384** | SHA-derived basis vectors (deterministic per sensor name) |
+| `text` | `LinguisticEncoder` (`paraphrase-mpnet-base-v2`) | **768** | sentence-transformers |
+
+Cross-modality cosine (M_dt: priming interoception × arm A text) is **mathematically undefined** — `_cosine` returns 0.0 silently on length mismatch (`if len(u) != len(v) or not u: return 0.0` per [`scripts/analyze_roy_5_cosine_localization.py:104`](../../scripts/analyze_roy_5_cosine_localization.py#L104)). The 0.0000 M_dt value in the per-arm summary is **not** "the encoders see them as orthogonal" — it's "comparison undefined". The analyzer's warning surfaces this; the JSON bundle's M_dt 0.0 should be read as "undefined" not "computed".
+
+**This finding is stronger than the plan's H1a framing.** The plan modeled H1a as "encoder subspaces are far in cosine space (max < 0.20)". The actual finding is "encoder subspaces are different-dimensional, cosine is undefined". Any cosine-based cross-modal alignment is structurally impossible without a learned projection layer between the encoders. The Hebbian binding mechanism cancelled by Roy-4 (`cross_modal_substrate_binding.md` Stages 2-6) was always structurally impossible at the cosine level — Stage 4a's resurrection conditions would need a dimension-reducing learned projection, not a Hebbian rule on raw cosine.

 ## What this means for Stage 2

@@ -98,18 +124,22 @@ Notably, **Roy-4's hippocampus dumps also showed zero `cli_input` / `transcript`
 - `_data/components/bodies/infant_humanoid_naming_v1.yaml` — new body with co-firing scaffold (additive, doesn't replace existing arc).
 - `prompts/cradle_narrator.py` — narrator pattern that fires "hungry" / "thirsty" / "warm" utterances co-tick with the matching drive/sensor threshold.
 - `scenarios/roy/roy_5b_iteration.yaml` — same shape as Roy-4 / Roy-5a but uses the redesigned arc.
-- Re-run [`scripts/analyze_roy_4_coactivation.py`](../../scripts/analyze_roy_4_coactivation.py) on Roy-5b's trace. PASS → resurrect `cross_modal_substrate_binding.md` Stages 2-6; FAIL → promote encoder replacement to 1.2+ research.

-**The Roy-5a secondary finding (text-modality silence on the food concept) refines the Stage 3 design requirement.** It's not enough for the redesigned arc to produce co-firing — it also needs to produce **non-zero text-modality fires on food-related percepts** so that the Hebbian binding rule has at least one cross-modality node pair to evaluate. If the narrator's "hungry" utterance routes through a code path that doesn't reach `LinguisticEncoder` (the same path that's silent in Roy-5a's run), Stage 3 will fail not because the binding mechanism is dead but because there are no text-modality nodes to bind.
+**Sharpened Stage 3 pass criterion (refined by Roy-5a-substrate-on's findings):** the original plan said "produce co-firing + Hebbian binding succeeds". The data shows that's neither sufficient nor — at the cosine level — possible. The **directly observable + structurally meaningful** pass criterion is:

-## Recommended next step
+> **At least one text-modality EC node appears in NAc's `cluster_reward_bias` map keyed to `sense_food_source` after Stage 3 priming.**

-**Before committing to Stage 3 implementation**, the user should:
+This is observable directly in the persisted `aut_nac.json` + `aut_ec.json` — no cosine math required. If Stage 3's narrator-utterance scaffold produces a text-modality food centroid, M_tt becomes non-empty in a Roy-5b re-run and the H1c / H1b discrimination becomes meaningful for the **first time across all Roy iterations**. If Stage 3 still produces zero text-modality food centroids, then the Hebbian binding mechanism is dead even at the structural level and Stage 4b (encoder replacement to 1.2+) is triggered.
+
+The plan's original two-part criterion (text fires + cross-modality binding) collapses into the single observable above: the text-modality food fire IS the prerequisite, and the cross-modality binding is structurally impossible at cosine level so doesn't gate further.
+
+## Recommended next step

-1. **Re-run Roy-5a once or twice** to confirm the text-modality-silence finding is stable. If a re-run produces text-modality fires (matching Roy-4's 154), Roy-5a was anomalous and Stage 3 can proceed as planned. If text-modality silence reproduces, item (2) becomes load-bearing.
-2. **Inspect why the cradle narrator's text isn't routing to `LinguisticEncoder` on food-related percepts.** Likely places to look: the orchestrator's narrator-percept routing in `simulation/orchestrator.py`, `LinguisticEncoder.encode`'s text extraction (`percept.transcript_chunk or percept.content`), and whether `embodiment/percepts.py::EmbodimentPerceptSource` populates either field on food-sensor ticks.
+The disambiguation Stage 1 was meant to produce is now clean. Three actions for Stage 2 routing:

-If text-modality routing turns out to be quietly broken (rather than variance), Stage 3's cradle redesign will need to fix the routing AS PART OF the redesigned arc rather than assuming it works.
+1. **Always set `MAXIM_SUBSTRATE_PATH=1` in any future Roy reproduction protocol** so the text-modality routing is active and the analyzer's matrices populate meaningfully. The Roy-5a reproduction protocol has been updated to reflect this.
+2. **Stage 3 (cradle-arc redesign) is greenlit** per the H1a verdict, with the sharpened pass criterion above. The dimension-mismatch finding means Stage 4a's resurrection of `cross_modal_substrate_binding.md` should NOT proceed even if Stage 3 produces text-modality food centroids — the Hebbian binding rule on cross-modality cosine is structurally impossible regardless. Stage 3 PASS → ship the substrate-annotates-LLM-context path (Wire-A in 0.9.1) as the operator-visible answer; Stage 4a stays cancelled.
+3. **Consider whether the dimension-mismatch finding requires a follow-up plan** to either (a) align SensorEncoder + LinguisticEncoder on a common embedding dim (additive change to SensorEncoder's hash basis projection), or (b) declare cross-modality cosine alignment out of scope for 1.0 and document the learned-projection requirement as 1.2+ research. This is a strategic 1.0 / 1.1+ scoping question, not a Stage 2 implementation step.

 ## Why this verdict is more confident than Roy-4's FAIL

diff --git a/docs/experiments/protocols/22_roy_5a_reproduction.md b/docs/experiments/protocols/22_roy_5a_reproduction.md
index a56f6d50..c1d128ef 100644
--- a/docs/experiments/protocols/22_roy_5a_reproduction.md
+++ b/docs/experiments/protocols/22_roy_5a_reproduction.md
@@ -54,16 +54,29 @@ curl -si --max-time 10 \
 # Kill any stale sims first.
 pkill -f "maxim.*sim" 2>/dev/null

-# IMPORTANT: MAXIM_EC_TRACE_ACTIVATIONS=1 is NOT load-bearing for the
-# Roy-5a verdict (the analyzer reads centroids from disk, not from
-# JSONL trace events). It's still set here for parity with Roy-4 — the
-# per-tick traces remain useful as a cross-check on which priming
-# nodes fired in which test ticks (see the EC-trace capture table in
-# 22_roy_5a.md § Result).
+# IMPORTANT — load-bearing env vars:
+#
+#   MAXIM_SUBSTRATE_PATH=1 (LOAD-BEARING for the analyzer's verdict)
+#     Gates MemoryHub._encoder wiring. Without this set, BioEnrichmentPipeline's
+#     text-modality EC fire (bio_enrichment.py:576) short-circuits on
+#     encoder=None. The initial Roy-5a run did NOT set this and produced
+#     the n/a / empty-matrix path; Roy-5a-substrate-on added it explicitly
+#     and produced the structurally clean H1a verdict. ALWAYS SET THIS for
+#     any Roy iteration that expects text-modality cosine matrices to be
+#     meaningful. (Wire-A in release_0_9_1.md will flip the default once
+#     it ships; until then, this is opt-in per release_0_9_1.md Stage 2.)
+#
+#   MAXIM_EC_TRACE_ACTIVATIONS=1 (not load-bearing for verdict, useful for cross-check)
+#     The analyzer reads centroids from disk (aut_ec.json), not from JSONL
+#     trace events. But the per-tick traces remain useful as a cross-check
+#     on which priming nodes fired in which test ticks — and on whether
+#     text-modality routing is firing at all (zero text fires in the
+#     trace JSONL is the signal that MAXIM_SUBSTRATE_PATH wasn't set).
 #
 # The load-bearing artifact is the per-session aut_ec.json, written
 # automatically since PR #248.

+MAXIM_SUBSTRATE_PATH=1 \
 MAXIM_EC_TRACE_ACTIVATIONS=1 \
 MAXIM_LOG_FILE=/tmp/roy_5a_ec_trace.jsonl \
 maxim roy run scenarios/roy/roy_5a_iteration.yaml > /tmp/roy_5a_run.log 2>&1
diff --git a/docs/plans/persona_convergence_crucible.md b/docs/plans/persona_convergence_crucible.md
index 63b706af..e829b2e9 100644
--- a/docs/plans/persona_convergence_crucible.md
+++ b/docs/plans/persona_convergence_crucible.md
@@ -817,7 +817,7 @@ The most permissive rule (`min_cofire=1, min_weight=0.01`) yields 256 priming bo
 | **`M_dt`** | priming interoception × arm A text | 2 × 0 | n/a (arm A has zero text-modality nodes) |
 | **`M_dd`** | priming interoception × arm A interoception | 2 × 2 | **1.0000 (identical centroids)** |

-**Verdict: H1a — encoder subspace incompatibility** (via the n/a / empty-matrix path, not low-cosine path). `max(M_tt food-bearing) = n/a` → -inf → H1a per `decode_verdict`. The verdict triggers via the **"no text-modality food centroids exist in this run"** path rather than the **"text food clusters exist but are far from arm A text"** path the plan modeled. **Caveat:** Roy-4 on the identical priming spec produced 154 text-modality EC fires, so the load-bearing claim is "on Roy-5a's run no text-modality food centroids were allocated", not "they can never form during cradle priming" — see § Recommended next step.
+**Verdict: H1a — encoder subspace incompatibility** (confirmed across three runs, two distinct mechanisms). The initial Roy-5a run triggered via "no text-modality nodes exist" (`MAXIM_SUBSTRATE_PATH` unset). Roy-5a-confirm reproduced. Roy-5a-substrate-on (with `MAXIM_SUBSTRATE_PATH=1` explicitly set) produced 162 text fires + 14 text-modality EC centroids — but **zero of them are food-bearing**; food NAc cluster IDs remain exclusively interoception-modality. **Plus a stronger structural finding:** `SensorEncoder` produces 384-dim embeddings, `LinguisticEncoder` produces 768-dim — cross-modality cosine (M_dt) is mathematically undefined regardless of substrate-path state. The plan's "encoder subspaces are far in cosine space" framing is weaker than the data shows: they're **different-dimensional**, not far. The dimension-mismatch warning added in the pre-merge review fold fired on real Roy-5a-substrate-on data.

 **Cross-arm M_dd sanity check** (the surviving "interoception identity" scheme):

diff --git a/scenarios/roy/roy_5a_confirm_iteration.yaml b/scenarios/roy/roy_5a_confirm_iteration.yaml
new file mode 100644
index 00000000..59e7ab55
--- /dev/null
+++ b/scenarios/roy/roy_5a_confirm_iteration.yaml
@@ -0,0 +1,84 @@
+# Roy-5a-confirm — Stability re-run of Roy-5a.
+#
+# Roy-5a (PR #249, docs/experiments/22_roy_5a.md) returned VERDICT: H1a
+# but via an unexpected mechanism — text-modality EC fires were ZERO
+# in priming, while Roy-4 on the identical spec produced 154 text fires.
+# The bio-fidelity pre-merge review and the outcome doc both recommend
+# re-running Roy-5a once or twice to disambiguate variance from
+# structural regression in text-percept routing.
+#
+# Roy-5a-confirm is the variance check. IDENTICAL spec to Roy-5a (same
+# 5 priming stages × 10 turns + roy_2pc_holdout fixture + same 3 arms).
+# Single structural change: the iteration name, so artifacts land in
+# ~/.maxim/roy/roy-5a-confirm/ without overwriting the original.
+#
+# Post-run analysis:
+#   1. Count text-modality EC fires in /tmp/roy_5a_confirm_ec_trace.jsonl.
+#      Roy-4 baseline: 154 text fires in priming.
+#      Roy-5a: 0 text fires in priming.
+#   2. If Roy-5a-confirm produces ≥ ~50 text fires → variance, Roy-5a
+#      was anomalous, Stage 3 proceeds per the plan.
+#   3. If Roy-5a-confirm produces 0 text fires → structural; the
+#      text-percept routing path is broken in current code state.
+#      Stage 3 needs to fix text-modality routing alongside the
+#      co-firing scaffold redesign.
+#
+# Companion docs:
+#   docs/plans/roy_5_encoder_alignment_disambiguator.md (Stage 1)
+#   docs/plans/persona_convergence_crucible.md (iteration log)
+#   docs/experiments/22_roy_5a.md (the run we're confirming)
+#   scenarios/roy/roy_5a_iteration.yaml (the original; structurally identical)
+#
+# Run with:
+#   MAXIM_EC_TRACE_ACTIVATIONS=1 \
+#   MAXIM_LOG_FILE=/tmp/roy_5a_confirm_ec_trace.jsonl \
+#   maxim roy run scenarios/roy/roy_5a_confirm_iteration.yaml
+#
+# Expected wall: ~25-28 min (same shape as Roy-2c / Roy-4 / Roy-5a).
+
+name: roy-5a-confirm
+description: |
+  Stability re-run of Roy-5a. Identical priming + fixture + arms;
+  separate artifact dir so the original Roy-5a result.json is
+  preserved. The variance question for the text-modality-silence
+  observation that triggered the H1a verdict via the n/a /
+  empty-matrix path rather than the low-cosine path.
+
+embodiment: bodies/infant_humanoid
+aut_mode: substrate-primary
+
+priming:
+  name: roy-5a-confirm-priming
+  embodiment: bodies/infant_humanoid
+  aut_mode: substrate-primary
+  stages:
+    - name: act1_neonatal_a
+      arc: cradle_prelinguistic
+      turns: 10
+    - name: act1_neonatal_b
+      arc: cradle_prelinguistic
+      turns: 10
+    - name: act2_cradle_a
+      arc: cradle
+      turns: 10
+    - name: act2_cradle_b
+      arc: cradle
+      turns: 10
+    - name: act3_consolidation
+      arc: cradle_prelinguistic
+      turns: 10
+
+test_scenario:
+  fixture: roy_2pc_holdout.yaml
+  turns: 10
+
+arms:
+  a:
+    substrate: from_priming
+    system_prompt: neutral
+  b:
+    substrate: blank
+    system_prompt: "You are a hungry infant"
+  c:
+    substrate: blank
+    system_prompt: neutral
diff --git a/scenarios/roy/roy_5a_substrate_on_iteration.yaml b/scenarios/roy/roy_5a_substrate_on_iteration.yaml
new file mode 100644
index 00000000..5401fa00
--- /dev/null
+++ b/scenarios/roy/roy_5a_substrate_on_iteration.yaml
@@ -0,0 +1,91 @@
+# Roy-5a-substrate-on — Re-run Roy-5a with MAXIM_SUBSTRATE_PATH=1.
+#
+# Roy-5a and Roy-5a-confirm both observed ZERO text-modality EC fires
+# during priming, while Roy-4 (identical iteration spec) produced 154.
+# Root cause: MemoryHub._encoder is only wired when MAXIM_SUBSTRATE_PATH=1
+# is set in the runner environment (memory_hub.py:247). Without that
+# env var:
+#   - memory_hub._encoder = None
+#   - BioEnrichmentPipeline receives encoder=None
+#   - BioEnrichmentPipeline's text-modality EC fire (bio_enrichment.py:576)
+#     short-circuits on `if self._encoder is not None and ...`
+#   - Result: zero text-modality EC fires, M_tt empty, verdict triggers
+#     via the n/a / empty-matrix path
+#
+# Roy-4 inherited MAXIM_SUBSTRATE_PATH=1 from the shell at run-time;
+# Roy-5a + Roy-5a-confirm did not. This is a process gap, not a
+# code regression.
+#
+# Roy-5a-substrate-on closes the gap by explicitly setting
+# MAXIM_SUBSTRATE_PATH=1 in the runner environment. This is the
+# proper Stage 1 disambiguation — with the substrate path active,
+# text-modality nodes form during priming and the M_tt matrix is
+# populated; the analyzer can decode H1a / H1b / H1c on real text
+# cosines rather than the "no text nodes exist" degenerate case.
+#
+# Companion docs:
+#   docs/plans/roy_5_encoder_alignment_disambiguator.md (Stage 1)
+#   docs/plans/persona_convergence_crucible.md (iteration log)
+#   docs/experiments/22_roy_5a.md (Roy-5a outcome — the n/a-path verdict)
+#   scenarios/roy/roy_5a_iteration.yaml (original; structurally identical)
+#   scenarios/roy/roy_5a_confirm_iteration.yaml (variance check; reproduced silence)
+#
+# Run with:
+#   MAXIM_SUBSTRATE_PATH=1 \
+#   MAXIM_EC_TRACE_ACTIVATIONS=1 \
+#   MAXIM_LOG_FILE=/tmp/roy_5a_substrate_on_ec_trace.jsonl \
+#   maxim roy run scenarios/roy/roy_5a_substrate_on_iteration.yaml
+#
+# Expected wall: ~25-28 min (same shape as Roy-2c / Roy-4 / Roy-5a /
+# Roy-5a-confirm). Text-modality EC fire count expected ≥ ~50
+# (Roy-4 baseline was 154). If silence STILL reproduces with
+# MAXIM_SUBSTRATE_PATH=1 set, the regression is genuinely structural
+# at a layer below the env-var gate.
+
+name: roy-5a-substrate-on
+description: |
+  Roy-5a re-run with MAXIM_SUBSTRATE_PATH=1 set in the runner
+  environment, so MemoryHub._encoder is wired and
+  BioEnrichmentPipeline produces text-modality EC fires.
+  Identical priming + fixture + arms to Roy-5a / Roy-5a-confirm —
+  the only structural change is the env var, which is the difference
+  between Roy-4 (154 text fires) and Roy-5a (0 text fires).
+
+embodiment: bodies/infant_humanoid
+aut_mode: substrate-primary
+
+priming:
+  name: roy-5a-substrate-on-priming
+  embodiment: bodies/infant_humanoid
+  aut_mode: substrate-primary
+  stages:
+    - name: act1_neonatal_a
+      arc: cradle_prelinguistic
+      turns: 10
+    - name: act1_neonatal_b
+      arc: cradle_prelinguistic
+      turns: 10
+    - name: act2_cradle_a
+      arc: cradle
+      turns: 10
+    - name: act2_cradle_b
+      arc: cradle
+      turns: 10
+    - name: act3_consolidation
+      arc: cradle_prelinguistic
+      turns: 10
+
+test_scenario:
+  fixture: roy_2pc_holdout.yaml
+  turns: 10
+
+arms:
+  a:
+    substrate: from_priming
+    system_prompt: neutral
+  b:
+    substrate: blank
+    system_prompt: "You are a hungry infant"
+  c:
+    substrate: blank
+    system_prompt: neutral
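
Reviewer sketch (not part of the patch): the diff notes that `_cosine` returns 0.0 on a length mismatch, which makes an undefined 384-vs-768 comparison look like a computed orthogonal result. The following is a minimal illustration of one way to keep the two cases apart, assuming plain Python lists as embeddings. The names `cosine_or_none` and `max_defined` are hypothetical and do not exist in `scripts/analyze_roy_5_cosine_localization.py`.

```python
import math
from typing import Iterable, Optional, Sequence


def cosine_or_none(u: Sequence[float], v: Sequence[float]) -> Optional[float]:
    """Cosine similarity, or None when the comparison is undefined.

    Hypothetical helper: unlike the `_cosine` behaviour quoted in the diff,
    a length mismatch (e.g. 384-dim vs 768-dim) yields None, not 0.0.
    """
    if not u or not v or len(u) != len(v):
        return None  # different dimensionality or empty input: undefined
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return None  # zero vector has no direction: undefined
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def max_defined(values: Iterable[Optional[float]]) -> Optional[float]:
    """Largest defined cosine, or None if every entry is undefined."""
    defined = [x for x in values if x is not None]
    return max(defined) if defined else None


# Shapes mirror the diff's table: 384-dim sensor vs 768-dim text embedding.
sensor = [1.0] * 384
text = [1.0] * 768
undefined_entry = cosine_or_none(sensor, text)        # None, not 0.0
orthogonal_entry = cosine_or_none([1.0, 0.0], [0.0, 1.0])  # a real 0.0
```

With a sentinel like this, a summary cell for `M_dt` could be rendered as "undefined" rather than `0.0000`. Whether the analyzer's JSON bundle should change its encoding is a design decision this sketch does not settle.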