fix(roy-5a): MAXIM_SUBSTRATE_PATH=1 was the missing env var #251
Merged
User asked for the recommended re-run cycle to confirm Roy-5a's
text-modality silence. Three follow-up runs traced the root cause:
Run 1: Roy-5a-confirm — identical spec, no env var changes.
Result: 0 text-modality EC fires (same as Roy-5a). N=2 confirmed
silence is NOT run-to-run variance.
Investigation: traced the gate. MemoryHub._encoder is only wired
when MAXIM_SUBSTRATE_PATH=1 (memory_hub.py:247). Bio_stack passes
encoder=getattr(memory_hub, "_encoder", None) to
BioEnrichmentPipeline (bio_stack.py:378). BioEnrichmentPipeline's
text-modality EC fire (bio_enrichment.py:576) short-circuits on
encoder=None. Roy-4's 154 text fires came from BioEnrichmentPipeline
queries, requiring the env var. Roy-4 had it inherited from the
shell; Roy-5a + Roy-5a-confirm didn't.
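The gate chain above can be sketched in isolation. This is a minimal illustration, not the project's code: StubEncoder, StubMemoryHub, text_modality_fire, run_once and require_substrate_path are hypothetical names, and only the env check, the getattr default and the None short-circuit are taken from the description above.

```python
import os


class StubEncoder:
    """Hypothetical placeholder for the text encoder."""

    def encode(self, text):
        # Toy embedding: one number per input, enough to show a fire happened.
        return [float(len(text))]


class StubMemoryHub:
    """Hypothetical stand-in for MemoryHub; only the env gate is modeled."""

    def __init__(self, env=None):
        env = os.environ if env is None else env
        # Assumption: the attribute is simply never set when the flag is absent.
        if env.get("MAXIM_SUBSTRATE_PATH") == "1":
            self._encoder = StubEncoder()


def text_modality_fire(encoder, text):
    """Return an embedding, or None when the encoder was never wired."""
    if encoder is None:
        return None  # silent short-circuit: no error, no fire
    return encoder.encode(text)


def run_once(env):
    hub = StubMemoryHub(env)
    # Same lookup pattern as described above: a missing attribute becomes None.
    encoder = getattr(hub, "_encoder", None)
    return text_modality_fire(encoder, "drive:hunger")


def require_substrate_path(env=None):
    """Hypothetical preflight guard a runner script could call before a run."""
    env = os.environ if env is None else env
    if env.get("MAXIM_SUBSTRATE_PATH") != "1":
        raise RuntimeError("MAXIM_SUBSTRATE_PATH=1 is not set")
```

Because every step degrades to None without raising, a run from a shell that lacks the variance-free flag looks healthy and simply records zero text fires. A guard like require_substrate_path at runner start would turn that silence into an explicit failure.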
Run 2: Roy-5a-substrate-on — same spec, MAXIM_SUBSTRATE_PATH=1
explicitly set in the runner env.
Result: 162 text-modality fires (Roy-4 had 154), 34 NEW text
centroids. The substrate path was the gating issue, not a
code regression.
Analyzer re-run on Roy-5a-substrate-on data:
- Priming: 14 text centroids + 2 interoception centroids.
- But ZERO of the 14 text centroids are food-bearing. NAc's
cluster_reward_bias still keys food clusters exclusively to
interoception node IDs.
- VERDICT: H1a (same as before, but now via the stronger
mechanism: food NAc bias keys are exclusively interoception,
not "no text nodes exist at all").
Additional structural finding (uncovered by the dimension-mismatch
warning added in the pre-merge review fold — it fired on real data):
SensorEncoder produces 384-dim SHA-basis embeddings;
LinguisticEncoder (paraphrase-mpnet-base-v2) produces 768-dim
embeddings. Cross-modality cosine (M_dt) is mathematically
undefined — the vectors live in different-dimensional spaces.
_cosine returns 0.0 silently on length mismatch.
This is stronger than the plan's H1a framing ("encoder subspaces
are far in cosine space"). The actual finding is "encoder
subspaces are different-dimensional, cosine is undefined". Any
cosine-based cross-modal alignment is structurally impossible
without a learned projection layer. The Hebbian binding mechanism
cancelled by Roy-4 was always structurally impossible at the
cosine level.
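Why the silent 0.0 matters can be shown with a small sketch. The two functions below are illustrative assumptions, not the analyzer's _cosine: cosine_silent mimics the described behavior (0.0 on length mismatch), and cosine_strict is a hypothetical variant that reports the mismatch instead.

```python
import math
import warnings


def _cos(a, b):
    """Plain cosine similarity for equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def cosine_silent(a, b):
    """Mimics the described behavior: a length mismatch yields 0.0, no signal."""
    if len(a) != len(b):
        return 0.0
    return _cos(a, b)


def cosine_strict(a, b):
    """Hypothetical variant: a length mismatch yields None plus a warning."""
    if len(a) != len(b):
        warnings.warn(f"dimension mismatch: {len(a)} vs {len(b)}; cosine undefined")
        return None
    return _cos(a, b)


sensor_vec = [1.0] * 384  # same width as the SensorEncoder output
text_vec = [1.0] * 768    # same width as the LinguisticEncoder output

mismatch_score = cosine_silent(sensor_vec, text_vec)
orthogonal_score = cosine_silent([1.0, 0.0], [0.0, 1.0])
```

Here mismatch_score and orthogonal_score are both 0.0, so a downstream reader of M_dt cannot tell "undefined" apart from "genuinely orthogonal". That ambiguity is what the dimension-mismatch warning exposed.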
Sharpened Stage 3 pass criterion (refined by these findings):
Original: produce co-firing + Hebbian binding succeeds.
Refined: at least one text-modality EC node appears in NAc's
cluster_reward_bias map keyed to sense_food_source
after Stage 3 priming.
This is directly observable in aut_nac.json + aut_ec.json — no
cosine math required. The cross-modality Hebbian binding part
collapses out because it's structurally impossible.
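A check for the refined criterion could look like the sketch below. The layout of aut_nac.json and aut_ec.json is not shown here, so every field name ("nodes", "id", "modality", "tool", and the shape of cluster_reward_bias as a mapping from node ID to an entry) is an assumption made for illustration; text_food_nodes is a hypothetical helper.

```python
def text_food_nodes(nac_state, ec_state, tool="sense_food_source"):
    """Return text-modality EC node IDs that NAc's bias map keys to `tool`.

    Assumed shapes (hypothetical, adjust to the real files):
      ec_state["nodes"]                 -> list of {"id": str, "modality": str}
      nac_state["cluster_reward_bias"]  -> {node_id: {"tool": str, "bias": float}}
    """
    modality = {node["id"]: node["modality"] for node in ec_state["nodes"]}
    hits = []
    for node_id, entry in nac_state["cluster_reward_bias"].items():
        if entry.get("tool") == tool and modality.get(node_id) == "text":
            hits.append(node_id)
    return sorted(hits)


# Toy data shaped like the current finding: food bias keyed only to interoception.
ec_now = {"nodes": [{"id": "ec_i1", "modality": "interoception"},
                    {"id": "ec_t1", "modality": "text"}]}
nac_now = {"cluster_reward_bias": {
    "ec_i1": {"tool": "sense_food_source", "bias": 1.0}}}

# Toy data shaped like a Stage 3 pass: a text node also carries the food bias.
nac_pass = {"cluster_reward_bias": {
    "ec_i1": {"tool": "sense_food_source", "bias": 1.0},
    "ec_t1": {"tool": "sense_food_source", "bias": 0.4}}}

stage3_fails = text_food_nodes(nac_now, ec_now) == []
stage3_passes = len(text_food_nodes(nac_pass, ec_now)) >= 1
```

In practice the two dicts would come from json.load on the two state files. The pass condition is a set-membership test, so encoder dimensionality never enters into it.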
Files changed:
- scenarios/roy/roy_5a_confirm_iteration.yaml (variance check spec)
- scenarios/roy/roy_5a_substrate_on_iteration.yaml (substrate-path-on spec)
- docs/experiments/22_roy_5a.md (outcome doc updated with 3-run findings)
- docs/experiments/protocols/22_roy_5a_reproduction.md (load-bearing env var documented)
- docs/plans/persona_convergence_crucible.md (iteration log entry updated)
37 tests still passing. No analyzer code changes needed —
dimension-mismatch warning the pre-merge review added did its job.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
dennys246 added a commit that referenced this pull request May 25, 2026
…2 cause is non-code encoder-layer drift
Background: A1 refuted the env-var hypothesis. The post-A1 working hypothesis was LLM narrator drift on the leader between 5/13 and 5/23, producing different cradle scene text -> different food-bearing percepts -> different EC clusters. A3 ran the narrator capture to test that hypothesis.
A3 result: narrator-drift hypothesis REFUTED.
Captured the structured event stream at cd51be5 via MAXIM_LOG_FILE=/tmp/roy_3c_narrator_capture.jsonl. Three findings from the 39,121-event log:
1. There is NO narrator scene text reaching the AUT during cradle priming. Every tick logs "Imagination skipped: no percept_text (obs keys: [])". Substrate-primary AUT in cradle priming arcs does not consume narrator-generated text.
2. The 12 text-modality EC nodes come from substrate-INTERNAL state strings (drive:hunger(0.51), bias=+1.00, causal, cluster, pos=0.99, ->food, sense_food_source), not LLM scenes. These are deterministically generated from the substrate's own causal-link active_goal text every tick.
3. AUT behavior is byte-identical between 5/12 historical and 5/24 Step 0: 138 vs 139 sense_food_source calls, byte-identical tool output on every call ({'portions': 5.0, 'freshness': 0.9}), 667 vs 664 hippocampus memories, 2001 vs 1992 NAc total_observations. The agent does the exact same activity volume with the exact same byte-level tool outputs. Only EC cluster attribution differs.
Updated key-count axis: non-code environmental drift in encoder layer.
Ruled out: wire merges (bisect), env var (A1), narrator drift (A3), AUT behavior shift (A3).
Across 4 cd51be5 runs today, interoception EC count is rock-stable at 2 (text varies 10-13; the reward-bias keys are interoception-modality). Historical 5/12-5/13 produced 6 interoception clusters for the same priming activity.
With no clustering-affecting code change between 5/12 and 5/14 (only PR #246 EC instrumentation, PR #248 sim_reports persistence, PR #251 docs), remaining suspects are all non-code:
- paraphrase-mpnet weight state (HF revision drift)
- SensorEncoder SHA-basis depending on process startup state (PYTHONHASHSEED, numpy random state)
- Persistent state in ~/.maxim/ affecting encoder warmup
- CPU/numpy floating-point determinism between hosts/dates
Closing this axis requires historical encoder-output snapshots that don't exist (PR #248 added EC persistence to sim_reports on 5/14, AFTER the historical baseline runs).
Memory hygiene:
- feedback_narrator_state_confound.md updated to clarify that the rule applies to llm-primary AUT modes (generative campaigns) but NOT to substrate-primary AUT cradle priming, where no narrator text flows through the percept channel.
- project_roy_3c_bisect_verdict.md updated with the A3 refutation.
Magnitude axis unchanged: Wire-A's bee42ca decay is the confirmed cause (A2 already proved this; A3 doesn't touch magnitude).
Files:
- docs/experiments/29_roy_3c_bisect.md (writeup updated with A3 section)
- ~/.maxim/roy/roy-2-bisect-cd51be5-NARRATOR-CAP/ + structured capture at /tmp/roy_3c_narrator_capture.jsonl
Cost: ~17 min wall (single A3 run). No source changes. Diagnostic-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
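One way to make future drift of this kind checkable is to record a small encoder-output fingerprint with each run. The sketch below is an assumption about how such a snapshot could be built, not an existing tool in the repo: fingerprint, snapshot, toy_encode and the probe strings are all hypothetical, and the rounding precision is an arbitrary choice.

```python
import hashlib
import struct


def fingerprint(vectors, decimals=6):
    """SHA-256 over rounded vector components, stable across identical outputs."""
    h = hashlib.sha256()
    for vec in vectors:
        h.update(struct.pack("<I", len(vec)))  # include the dimension itself
        for x in vec:
            # Adding 0.0 folds -0.0 into 0.0 so the sign of zero cannot differ.
            h.update(struct.pack("<d", round(float(x), decimals) + 0.0))
    return h.hexdigest()


def snapshot(encoder_name, probe_texts, encode, env):
    """Record what is needed to compare encoder output between dates or hosts."""
    vectors = [encode(text) for text in probe_texts]
    return {
        "encoder": encoder_name,
        "dim": len(vectors[0]),
        "pythonhashseed": env.get("PYTHONHASHSEED"),
        "sha256": fingerprint(vectors),
    }


def toy_encode(text):
    """Hypothetical deterministic encoder, used only to exercise the sketch."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


# Probe strings borrowed from the substrate-internal examples listed above.
probes = ["drive:hunger(0.51)", "sense_food_source"]
snap_a = snapshot("toy", probes, toy_encode, {"PYTHONHASHSEED": "0"})
snap_b = snapshot("toy", probes, toy_encode, {"PYTHONHASHSEED": "0"})
```

Two runs whose snapshots share a sha256 produced the same probe embeddings to six decimals, which would rule the encoder layer in or out directly instead of by elimination. Rounding trades sensitivity for robustness against last-bit floating-point noise, so the precision would need tuning against real encoder output.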
Summary
Follow-up to PR #249 (Roy-5a analyzer, merged). The user asked me to do the recommended "re-run Roy-5a once or twice to confirm the text-modality silence is stable". Three follow-up runs traced the root cause to a missing env var — not variance, not a regression.
What the investigation found
Run 1: Roy-5a-confirm (identical spec, no env changes). 0 text-modality EC fires. The silence reproduces, so it is not run-to-run variance.
Investigation between runs: traced the gate at integration/memory_hub.py:247: MemoryHub._encoder is only wired when MAXIM_SUBSTRATE_PATH=1. BioEnrichmentPipeline receives encoder=getattr(memory_hub, "_encoder", None) at runtime/bio_stack.py:378. bio_enrichment.py:576 short-circuits on encoder=None.
Roy-4's 154 text fires came from BioEnrichmentPipeline queries, which required MAXIM_SUBSTRATE_PATH=1 to have been inherited from the shell. Roy-5a + Roy-5a-confirm ran from a shell without it; Roy-4 had it.
Run 2: Roy-5a-substrate-on (same spec, env var explicitly set). 162 text-modality EC fires, 34 NEW text centroids, within run-to-run variance of Roy-4's 154/57. The substrate path was the gate, not a code regression.
The structurally clean H1a verdict
Analyzer re-run on Roy-5a-substrate-on data:
Even with the substrate path active, ZERO of the 14 text-modality centroids are food-bearing. NAc's cluster_reward_bias map keys food clusters exclusively to interoception node IDs. H1a still triggers: same verdict, stronger mechanism.
Bonus structural finding (the dimension-mismatch warning earned its keep)
The dimension-mismatch warning added in PR #249's pre-merge review fold fired on real data: row dims=[384] col dims=[768].
- interoception (SensorEncoder SHA-basis): 384-dim
- text (LinguisticEncoder paraphrase-mpnet-base-v2): 768-dim
Cross-modality cosine (M_dt) is mathematically undefined because the two modalities live in different-dimensional spaces. _cosine returns 0.0 silently on length mismatch. This is stronger than the plan's H1a framing ("encoder subspaces are far in cosine space"). The real finding: the subspaces are different-dimensional, cosine is undefined, and any cross-modal cosine-based alignment is structurally impossible without a learned projection layer.
The Hebbian binding mechanism cancelled by Roy-4 was always structurally impossible at the cosine level: cross_modal_substrate_binding.md's Stage 4a resurrection conditions would need a learned dimension-reducing projection between encoders, not a Hebbian rule on raw cosine.
Sharpened Stage 3 pass criterion
The original plan said "produce co-firing + Hebbian binding succeeds". The dimension-mismatch finding shows the second clause is structurally impossible at cosine level. The refined criterion is:
At least one text-modality EC node appears in NAc's cluster_reward_bias map keyed to sense_food_source after Stage 3 priming.
Directly observable in aut_nac.json + aut_ec.json, with no cosine math required. If Stage 3's narrator-utterance scaffold produces a text-modality food centroid, M_tt becomes non-empty in Roy-5b for the first time across all Roy iterations and H1c / H1b becomes discriminable.
What changed in the repo
- scenarios/roy/roy_5a_confirm_iteration.yaml: variance-check spec.
- scenarios/roy/roy_5a_substrate_on_iteration.yaml: substrate-path-on spec.
- docs/experiments/22_roy_5a.md: outcome doc updated with the three-run findings + dimension-mismatch section + sharpened Stage 3 criterion.
- docs/experiments/protocols/22_roy_5a_reproduction.md: MAXIM_SUBSTRATE_PATH=1 documented as load-bearing in the runner env.
- docs/plans/persona_convergence_crucible.md: iteration log entry updated.
No analyzer code changes needed: the dimension-mismatch warning from PR #249's pre-merge fold did its job and surfaced the architectural finding without changes here.
Test plan
- python -m pytest tests/unit/test_roy_5_cosine_localization.py tests/unit/test_save_aut_state.py -q: 37 passed.
- ruff check clean on touched Python files (the only Python files touched here are the existing scenarios; no Python source changes in this PR).
Out of scope (deferred)
🤖 Generated with Claude Code