
fix(roy-5a): MAXIM_SUBSTRATE_PATH=1 was the missing env var #251

Merged
dennys246 merged 1 commit into main from fix/roy-5a-substrate-path-env-var on May 15, 2026

Conversation

@dennys246
Owner

Summary

Follow-up to PR #249 (Roy-5a analyzer, merged). The user asked me to do the recommended "re-run Roy-5a once or twice to confirm the text-modality silence is stable". Three follow-up runs traced the root cause to a missing env var — not variance, not a regression.

What the investigation found

Run 1 — Roy-5a-confirm (identical spec, no env changes). 0 text-modality EC fires — silence reproduces, not run-to-run variance.

Investigation between runs: traced the gate at integration/memory_hub.py:247:

  • MemoryHub._encoder is only wired when MAXIM_SUBSTRATE_PATH=1.
  • BioEnrichmentPipeline receives encoder=getattr(memory_hub, "_encoder", None) at runtime/bio_stack.py:378.
  • Its text-modality EC fire at bio_enrichment.py:576 short-circuits on encoder=None.
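The gate chain above can be sketched as follows. Class names and file references come from the PR; the surrounding scaffolding (constructor shapes, the `maybe_fire_text_ec` method name) is hypothetical, illustrating only the gating logic:

```python
import os

class MemoryHub:
    def __init__(self):
        # Per memory_hub.py:247 — the encoder is only wired when the
        # substrate path is enabled; otherwise the attribute is absent.
        if os.environ.get("MAXIM_SUBSTRATE_PATH") == "1":
            self._encoder = object()  # stand-in for the real encoder

class BioEnrichmentPipeline:
    def __init__(self, encoder):
        self.encoder = encoder

    def maybe_fire_text_ec(self, text):
        # Per bio_enrichment.py:576 — the text-modality EC fire
        # short-circuits when no encoder was injected, so a missing
        # env var yields zero text fires with no error anywhere.
        if self.encoder is None:
            return None
        return "fired"

# Per bio_stack.py:378 — a missing attribute silently degrades to None.
hub = MemoryHub()
pipeline = BioEnrichmentPipeline(encoder=getattr(hub, "_encoder", None))
```

Note the failure is silent at every link: no attribute error, no warning, just an absent fire.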

Roy-4's 154 text fires came from BioEnrichmentPipeline queries, which require MAXIM_SUBSTRATE_PATH=1. Roy-4 inherited the env var from its shell; Roy-5a and Roy-5a-confirm ran from a shell without it.

Run 2 — Roy-5a-substrate-on (same spec, env var explicitly set). 162 text-modality EC fires, 34 NEW text centroids — within run-to-run variance of Roy-4's 154/57. The substrate path was the gate, not a code regression.

The structurally clean H1a verdict

Analyzer re-run on Roy-5a-substrate-on data:

| Run | Priming text | Priming intero | Food-bearing text | Food-bearing intero |
| --- | --- | --- | --- | --- |
| Roy-5a-substrate-on | 14 | 2 | 0 | 2 |

Even with the substrate path active, ZERO of the 14 text-modality centroids are food-bearing. NAc's cluster_reward_bias map keys food clusters exclusively to interoception node IDs. H1a still triggers — same verdict, stronger mechanism.

Bonus structural finding (the dimension-mismatch warning earned its keep)

The dimension-mismatch warning added in PR #249's pre-merge review fold fired on real data: row dims=[384] col dims=[768].

  • interoception (SensorEncoder SHA-basis): 384-dim
  • text (LinguisticEncoder paraphrase-mpnet-base-v2): 768-dim

Cross-modality cosine (M_dt) is mathematically undefined — different dimensional spaces. _cosine returns 0.0 silently on length mismatch. This is stronger than the plan's H1a framing ("encoder subspaces are far in cosine space"). The real finding: the subspaces are different-dimensional, cosine is undefined, and any cross-modal cosine-based alignment is structurally impossible without a learned projection layer.

The Hebbian binding mechanism cancelled by Roy-4 was always structurally impossible at the cosine level. cross_modal_substrate_binding.md's Stage 4a resurrection conditions would need a learned dimension-reducing projection between encoders, not a Hebbian rule on raw cosine.

Sharpened Stage 3 pass criterion

The original plan said "produce co-firing + Hebbian binding succeeds". The dimension-mismatch finding shows the second clause is structurally impossible at cosine level. The refined criterion is:

At least one text-modality EC node appears in NAc's cluster_reward_bias map keyed to sense_food_source after Stage 3 priming.

Directly observable in aut_nac.json + aut_ec.json — no cosine math required. If Stage 3's narrator-utterance scaffold produces a text-modality food centroid, M_tt becomes non-empty in Roy-5b for the first time across all Roy iterations and H1c / H1b become discriminable.
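The refined criterion could be checked mechanically along these lines. The file names come from the PR; the JSON schemas (an EC `nodes` list with `id`/`modality` fields, a NAc `cluster_reward_bias` map from reward source to node IDs) are assumptions for illustration:

```python
import json
from pathlib import Path

def stage3_pass(session_dir: str) -> bool:
    """Refined Stage 3 criterion: at least one text-modality EC node
    appears in NAc's cluster_reward_bias keyed to sense_food_source.

    Assumed schemas — the real aut_ec.json / aut_nac.json layouts
    may differ:
      aut_ec.json:  {"nodes": [{"id": ..., "modality": ...}, ...]}
      aut_nac.json: {"cluster_reward_bias": {"sense_food_source": [ids]}}
    """
    ec = json.loads(Path(session_dir, "aut_ec.json").read_text())
    nac = json.loads(Path(session_dir, "aut_nac.json").read_text())
    text_nodes = {n["id"] for n in ec["nodes"] if n["modality"] == "text"}
    food_nodes = set(nac["cluster_reward_bias"].get("sense_food_source", []))
    # Pass iff some food-biased NAc key is a text-modality EC node.
    return bool(text_nodes & food_nodes)
```

On Roy-5a-substrate-on data this returns False (all 14 text centroids are food-silent); Stage 3 passes only when it flips to True.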

What changed in the repo

  • scenarios/roy/roy_5a_confirm_iteration.yaml — variance-check spec.
  • scenarios/roy/roy_5a_substrate_on_iteration.yaml — substrate-path-on spec.
  • docs/experiments/22_roy_5a.md — outcome doc updated with the three-run findings + dimension-mismatch section + sharpened Stage 3 criterion.
  • docs/experiments/protocols/22_roy_5a_reproduction.md — MAXIM_SUBSTRATE_PATH=1 documented as load-bearing in the runner env.
  • docs/plans/persona_convergence_crucible.md — iteration log entry updated.

No analyzer code changes needed — the dimension-mismatch warning from PR #249's pre-merge fold did its job and surfaced the architectural finding without changes here.

Test plan

  • python -m pytest tests/unit/test_roy_5_cosine_localization.py tests/unit/test_save_aut_state.py -q — 37 passed.
  • ruff check clean on touched files — no Python source changes in this PR (only YAML scenarios and docs).
  • Analyzer end-to-end against real Roy-5a-substrate-on session dirs returns rc=0 + correct H1a verdict + dimension-mismatch warning visible in stderr.

Out of scope (deferred)

  • Stage 3 implementation (cradle-arc redesign with narrator co-firing). Verdict gates it; this PR doesn't ship it.
  • Strategic question for 1.0 / 1.1+ scoping: does the dimension-mismatch finding require a follow-up plan to either (a) align SensorEncoder + LinguisticEncoder on a common embedding dim, or (b) document the learned-projection requirement as 1.2+ research? Belongs in a separate planning thread.

🤖 Generated with Claude Code

Commit message:

User asked for the recommended re-run cycle to confirm Roy-5a's
text-modality silence. Three follow-up runs traced the root cause:

Run 1: Roy-5a-confirm — identical spec, no env var changes.
  Result: 0 text-modality EC fires (same as Roy-5a). N=2 confirmed
  silence is NOT run-to-run variance.

Investigation: traced the gate. MemoryHub._encoder is only wired
when MAXIM_SUBSTRATE_PATH=1 (memory_hub.py:247). Bio_stack passes
encoder=getattr(memory_hub, "_encoder", None) to
BioEnrichmentPipeline (bio_stack.py:378). BioEnrichmentPipeline's
text-modality EC fire (bio_enrichment.py:576) short-circuits on
encoder=None. Roy-4's 154 text fires came from BioEnrichmentPipeline
queries, requiring the env var. Roy-4 had it inherited from the
shell; Roy-5a + Roy-5a-confirm didn't.

Run 2: Roy-5a-substrate-on — same spec, MAXIM_SUBSTRATE_PATH=1
  explicitly set in the runner env.
  Result: 162 text-modality fires (Roy-4 had 154), 34 NEW text
  centroids. The substrate path was the gating issue, not a
  code regression.

Analyzer re-run on Roy-5a-substrate-on data:
  - Priming: 14 text centroids + 2 interoception centroids.
  - But ZERO of the 14 text centroids are food-bearing. NAc's
    cluster_reward_bias still keys food clusters exclusively to
    interoception node IDs.
  - VERDICT: H1a (same as before, but now via the stronger
    mechanism: food NAc bias keys are exclusively interoception,
    not "no text nodes exist at all").

Additional structural finding (uncovered by the dimension-mismatch
warning added in the pre-merge review fold — it fired on real data):

  SensorEncoder produces 384-dim SHA-basis embeddings;
  LinguisticEncoder (paraphrase-mpnet-base-v2) produces 768-dim
  embeddings. Cross-modality cosine (M_dt) is mathematically
  undefined — the vectors live in different-dimensional spaces.
  _cosine returns 0.0 silently on length mismatch.

This is stronger than the plan's H1a framing ("encoder subspaces
are far in cosine space"). The actual finding is "encoder
subspaces are different-dimensional, cosine is undefined". Any
cosine-based cross-modal alignment is structurally impossible
without a learned projection layer. The Hebbian binding mechanism
cancelled by Roy-4 was always structurally impossible at the
cosine level.

Sharpened Stage 3 pass criterion (refined by these findings):

  Original: produce co-firing + Hebbian binding succeeds.
  Refined:  at least one text-modality EC node appears in NAc's
            cluster_reward_bias map keyed to sense_food_source
            after Stage 3 priming.

This is directly observable in aut_nac.json + aut_ec.json — no
cosine math required. The cross-modality Hebbian binding part
collapses out because it's structurally impossible.

Files changed:
- scenarios/roy/roy_5a_confirm_iteration.yaml (variance check spec)
- scenarios/roy/roy_5a_substrate_on_iteration.yaml (substrate-path-on spec)
- docs/experiments/22_roy_5a.md (outcome doc updated with 3-run findings)
- docs/experiments/protocols/22_roy_5a_reproduction.md (load-bearing env var documented)
- docs/plans/persona_convergence_crucible.md (iteration log entry updated)

37 tests still passing. No analyzer code changes needed — the
dimension-mismatch warning added in the pre-merge review did its job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dennys246 dennys246 merged commit d22d5a3 into main May 15, 2026
5 checks passed
@dennys246 dennys246 deleted the fix/roy-5a-substrate-path-env-var branch May 15, 2026 05:14
