CLAUDE.md
8 additions & 2 deletions
@@ -115,6 +115,11 @@ Simulations call a live LLM for every turn and can burn cost + time quickly. Whe
- **`runtime/bio_stack.py::build_bio_stack` is the canonical bio-pipeline construction site** (Wave 3 of biosystem_unification, 2026-04-17). Composes the four individual Wave 1+2 builders (`build_reaction_bus`, `build_pain_bus`, `build_memory_hub`, `build_default_network`) in the correct dependency order. Returns a frozen `BioStack` dataclass containing all wired bio-systems. `persistence_dir: Path | str | None` is the primary configuration — sub-paths (`hippocampus.json`, `atl.json`, `angular_gyrus.json`) are derived internally. `pain_bus=` parameter accepts a pre-built PainBus (sim AUT pattern where the sandbox needs the bus before the rest of the stack); standard learners are subscribed to the pre-existing bus. `with_default_network=True` constructs a DefaultNetwork (Reachy + sim AUT only). Four production callers: cli.py non-sim, simulation/orchestrator.py AUT + orch NPC, embodied_runtime/agentic_runtime.py Reachy. AgentFactory (site #7) deferred to `agent_factory_canonicalization.md` Wave 4 — conditional `remembers`/`learns` + auto_load doesn't fit the umbrella. CLI sim modes stay as-is (just `build_pain_bus`). See [docs/plans/bio_stack_unification.md](docs/plans/bio_stack_unification.md).
- **`Episode.valence` defaults to 0.0 on old data.** Backward compatible. Old episode dicts without the valence field deserialize cleanly.
- **`spreading_activation(propagate_valence=False)` returns `dict[str, float]` unchanged.** The `propagate_valence=True` path returns `dict[str, tuple[float, float]]`. Existing callers are unaffected.
- **NAc `_reward_bias` clamps to [0, max_reward_bias].** Negative rewards (pain) produce 0.0 bias. Bias only widens EC recognition, never narrows. Pain avoidance is handled by valence annotation on edges, not by reward bias.
- **`BioStack.save_cerebellum()` must be called at session end.** Without it, learned forward models are lost.
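The `Episode.valence` backward-compatibility note can be pictured with a minimal sketch. The `episode_id` field and the `from_dict` helper are assumptions made for illustration; the project's real `Episode` class may be shaped differently.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Episode:
    # Hypothetical minimal episode; only `valence` is named in the notes above.
    episode_id: str
    valence: float = 0.0  # neutral default for data written before the field existed

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Episode":
        # .get() keeps old dicts (no "valence" key) loadable without a migration.
        return cls(
            episode_id=data["episode_id"],
            valence=float(data.get("valence", 0.0)),
        )


old = Episode.from_dict({"episode_id": "ep-1"})                    # pre-valence data
new = Episode.from_dict({"episode_id": "ep-2", "valence": -0.4})   # current data
```

Under these assumptions, old records come back with a neutral valence rather than raising on the missing key.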
## `maxim doctor` — environment diagnostics
Runs platform-aware checks + prints fix hints with the user's actual IPs filled in.
@@ -237,7 +242,7 @@ Project structure is documented in [docs/reference.md](docs/reference.md).
| Tools |`tools/` (register in registry), `runtime/executor.py` (aliases) |
docs/decisions.md
8 additions & 0 deletions
@@ -329,6 +329,14 @@ config = NACConfig(
)
```
### Reward Distribution (SEM Learning Loop)
`NAc.distribute_reward(agent_id, reward)` distributes reward across eligible nodes via `credit_node()`. Eligibility traces are set by `update_eligibility()` when percepts complete to substrate nodes. The ReactionBus subscriber in `build_bio_stack` maps reactions to rewards:
- `Valence.NEGATIVE` -- reward = -intensity (clamps to 0 in `credit_node` -- bias only widens)
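A minimal sketch of the reaction-to-reward mapping, assuming a `Reaction` carries a `Valence` and a non-negative `intensity`. Only the `Valence.NEGATIVE` rule comes from the text above; `Valence.POSITIVE`, the `Reaction` dataclass, and the `reaction_to_reward` name are hypothetical, and the positive branch is a guess at the symmetric case.

```python
from dataclasses import dataclass
from enum import Enum


class Valence(Enum):
    POSITIVE = "positive"  # assumed member; not named in the notes above
    NEGATIVE = "negative"


@dataclass(frozen=True)
class Reaction:
    # Hypothetical shape of a reaction seen by the ReactionBus subscriber.
    valence: Valence
    intensity: float  # assumed to be a non-negative magnitude


def reaction_to_reward(reaction: Reaction) -> float:
    """Map a reaction to the signed scalar handed to the reward distributor."""
    if reaction.valence is Valence.NEGATIVE:
        return -reaction.intensity  # documented rule: reward = -intensity
    # Assumption: positive reactions pass their intensity through unchanged.
    return reaction.intensity


pain_reward = reaction_to_reward(Reaction(Valence.NEGATIVE, 0.4))
success_reward = reaction_to_reward(Reaction(Valence.POSITIVE, 0.2))
```

The negative value is produced here and only clamped later, inside the crediting step, which keeps the subscriber itself free of NAc policy.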
docs/embodiment_guide.md
26 additions & 2 deletions
@@ -302,10 +302,34 @@ Executes motor programs step by step with:
- **PainBus subscription** for mid-sequence interrupts
- **Gate tightening** after painful executions (10% per failure)
## SEM Learning Loop (Phase 2 -- Shipped)
When a SEM entity interaction produces a reaction (pain on failure, satisfaction on confident prediction), the signal flows through the full bio-pipeline:
CerebellumModulator emits `_emit_success_reaction` when confident enough to skip LLM fallback. Intensity is lower than failure (0.1-0.3 vs 0.3-0.5) -- biologically motivated negativity bias.
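The asymmetry between the two intensity bands can be sketched as a small classifier. The 0.1-0.3 and 0.3-0.5 ranges come from the text above, and the 0.3 confidence cutoff comes from this PR's site copy; the variance cutoff, the `severity` input, the linear scaling, and every name below are assumptions rather than the `CerebellumModulator` implementation.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.3      # confident-path cutoff quoted in the PR's site copy
SUCCESS_INTENSITY = (0.1, 0.3)  # documented success band
FAILURE_INTENSITY = (0.3, 0.5)  # documented failure band


@dataclass(frozen=True)
class Reaction:
    kind: str        # "success" or "pain"
    intensity: float


def _scale(level: float, bounds: tuple[float, float]) -> float:
    """Linearly map a level in [0, 1] into the given band (assumed scaling)."""
    low, high = bounds
    level = min(max(level, 0.0), 1.0)
    return low + (high - low) * level


def classify_outcome(
    confidence: float,
    variance: float,
    failed: bool,
    severity: float = 0.5,      # hypothetical failure-severity input
    max_variance: float = 0.1,  # hypothetical "low variance" cutoff
) -> Optional[Reaction]:
    if failed:
        return Reaction("pain", _scale(severity, FAILURE_INTENSITY))
    if confidence >= CONFIDENCE_THRESHOLD and variance <= max_variance:
        return Reaction("success", _scale(confidence, SUCCESS_INTENSITY))
    return None  # LLM-teacher path: still learning, no reaction emitted


confident = classify_outcome(0.9, 0.01, failed=False)
painful = classify_outcome(0.9, 0.01, failed=True)
```

Because the bands only touch at 0.3, a failure is never weaker than a success in this sketch, which is the negativity bias described above.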
### NAc reward distribution
`distribute_reward` credits eligible substrate nodes proportionally to eligibility traces. Positive rewards widen EC recognition (lower threshold); negative rewards clamp to 0 (bias never narrows).
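As a rough sketch of proportional crediting, assuming eligibility traces and biases are plain per-node dictionaries: the standalone function below is illustrative only, and its signature, the additive update, and the default `max_reward_bias` are assumptions, not the `NAc` API.

```python
def distribute_reward(
    eligibility: dict[str, float],
    biases: dict[str, float],
    reward: float,
    max_reward_bias: float = 1.0,  # assumed default; the real limit is configured elsewhere
) -> dict[str, float]:
    """Return updated per-node biases after sharing `reward` by eligibility."""
    updated = dict(biases)
    total = sum(eligibility.values())  # traces assumed non-negative
    if total <= 0.0:
        return updated  # nothing eligible, nothing to credit
    credit = max(reward, 0.0)  # negative rewards clamp to 0, so bias never narrows
    for node, trace in eligibility.items():
        share = credit * trace / total  # proportional to the node's trace
        updated[node] = min(updated.get(node, 0.0) + share, max_reward_bias)
    return updated


traces = {"sword": 0.75, "door": 0.25}
after_success = distribute_reward(traces, {}, reward=0.4)
after_pain = distribute_reward(traces, after_success, reward=-0.5)
```

In this reading a painful outcome leaves every bias where it was; avoidance has to come from valence on edges, as noted above.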
### Cerebellum activation in production
`BioStack.cerebellum` is now constructed by `build_bio_stack` and forwarded via `build_executor(cerebellum=...)` to `generate_tools_for_entity`, which creates `CerebellumModulator` instances with a wired `reaction_bus`. This means every SEM affordance tool now has a live Cerebellum backing it -- predictions, training, and reaction emission all happen automatically.
| EC | Memory indexing + substrate recognition |`similarity/`| Routes queries via similarity; pattern_complete_or_separate for substrate nodes (P1) |
| Angular Gyrus | Cross-modal algebra |`math/`| Combines memories across different modalities |
-| Cerebellum | Motor prediction |`embodiment/`| Predicts outcomes of physical actions, learns motor programs |
+| Cerebellum | Motor prediction |`embodiment/`| Predicts outcomes of physical actions, learns motor programs. Now activated in production via `BioStack.cerebellum` and `build_executor(cerebellum=...)`|
| Valence | Affective edge signal |`memory/episode.py`| Affective signal on Hebbian edges (`Edge.metadata["valence"]`), computed from Reactions at episode close via `apply_hebbian_on_close`. Propagated by `spreading_activation(propagate_valence=True)`|
| Episode Boundary Rules | Pluggable boundary detection |`memory/episode.py`|`BoundaryRule` callables on `EpisodeBoundaryDetector`. Defaults: tick gap, channel change, scn_tag change. New: `salience_spike_rule(min_intensity=0.5)` triggers boundary on pain/salience spikes via `CaptureEvent.salience_spike`|
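The pluggable boundary rules in the last row can be sketched as plain callables. The rule signature, the `CaptureEvent` fields, and the `is_boundary` method are assumptions for illustration; only the names `BoundaryRule`, `EpisodeBoundaryDetector`, `salience_spike_rule(min_intensity=0.5)`, and `CaptureEvent.salience_spike` come from the table.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CaptureEvent:
    # Hypothetical shape; the real CaptureEvent likely carries more fields.
    tick: int
    salience_spike: Optional[float] = None  # assumed: spike intensity, or None


# Assumed rule signature: (previous event, current event) -> start a new episode?
BoundaryRule = Callable[[Optional[CaptureEvent], CaptureEvent], bool]


def salience_spike_rule(min_intensity: float = 0.5) -> BoundaryRule:
    """Build a rule that fires when the current event's spike is strong enough."""
    def rule(previous: Optional[CaptureEvent], current: CaptureEvent) -> bool:
        spike = current.salience_spike
        return spike is not None and spike >= min_intensity
    return rule


class EpisodeBoundaryDetector:
    """Reports a boundary if any registered rule fires."""

    def __init__(self, rules: list[BoundaryRule]) -> None:
        self.rules = list(rules)

    def is_boundary(self, previous: Optional[CaptureEvent], current: CaptureEvent) -> bool:
        return any(rule(previous, current) for rule in self.rules)


detector = EpisodeBoundaryDetector([salience_spike_rule(min_intensity=0.5)])
calm = CaptureEvent(tick=1)
pain = CaptureEvent(tick=2, salience_spike=0.8)
```

Keeping each rule a closure over its own threshold lets callers add or tune rules without subclassing the detector.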
- **English only** (Stage 1). spaCy `en_core_web_sm` is English-only. Multi-language support requires a multilingual model (`xx_ent_wiki_sm`) or per-language model selection.
- **Short fragments** may over-decompose. The `min_chunk_len` filter helps, but domain-specific inputs may need a custom strategy.
- **No relation tagging yet** (Stage 2). Chunks are bound with untagged Hebbian edges. Role-tagged edges (`relation="spatial"`) are planned for a future stage.
## Connection to Valence Annotation
With concept decomposition enabled, valence annotation targets individual concept nodes ("rusty sword", "heavy", "sharp") rather than whole-sentence blobs. This means the agent learns "rusty sword is associated with pain" rather than "the entire sentence about picking up a rusty sword is painful." See [valence_annotation_poc.md](../experiments/valence_annotation_poc.md) for the demonstration.
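A toy sketch of concept-level annotation, assuming decomposition has already produced the chunks and that reactions arrive as signed intensities. The `Edge` class, both helper functions, the learning rate, and the overwrite-on-update behaviour are hypothetical simplifications; only the `metadata["valence"]` key mirrors the project's naming.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Edge:
    # Hypothetical untagged Hebbian edge between two concept nodes.
    weight: float = 0.0
    metadata: dict = field(default_factory=dict)


def annotate_concepts(edges, concepts, signed_intensities, learning_rate=0.1):
    """Strengthen edges between co-occurring concepts and tag them with valence."""
    if not signed_intensities:
        return edges
    valence = sum(signed_intensities) / len(signed_intensities)  # mean signed intensity
    for a, b in combinations(sorted(concepts), 2):
        edge = edges.setdefault((a, b), Edge())
        edge.weight += learning_rate
        edge.metadata["valence"] = valence  # simplification: latest episode wins
    return edges


def concept_valence(edges, concept):
    """Mean valence over every annotated edge touching `concept` (0.0 if none)."""
    values = [
        edge.metadata["valence"]
        for pair, edge in edges.items()
        if concept in pair and "valence" in edge.metadata
    ]
    return sum(values) / len(values) if values else 0.0


edges = {}
annotate_concepts(edges, ["rusty sword", "heavy", "sharp"], [-0.5])  # painful pickup
annotate_concepts(edges, ["wooden door", "heavy"], [0.2])            # mild success
```

Here "rusty sword" ends up strongly negative while "heavy", shared by both episodes, is only mildly so, which is the per-concept granularity the paragraph above describes.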
<p class="text-slate-300 leading-relaxed">The LLM is a <strong class="text-white">teacher</strong>, not a per-tick oracle. After enough observations, the Cerebellum handles predictions deterministically. In testing, LLM calls drop from 100 to ≤40 over 100 actions.</p>
<p class="text-slate-300">In the brain, the cerebellum doesn't just predict; it emits signals when predictions fail <em>or</em> succeed. These signals propagate to the hippocampus (contextual memory) and nucleus accumbens (reward learning) simultaneously, closing the loop between motor execution and long-term behavioral adaptation. Success and failure are not symmetric: negative outcomes carry disproportionate weight, a phenomenon known as <em>negativity bias</em>.</p>
</div>
<p class="text-slate-300 leading-relaxed">When the Cerebellum evaluates an affordance, the outcome flows through the bio-pipeline as a <em class="text-indigo-400 not-italic font-medium">reaction</em>, a typed evaluative signal that drives learning across multiple systems simultaneously. This is the SEM Learning Loop.</p>
<p class="text-slate-300 leading-relaxed">The <code class="bg-slate-950 px-1 rounded text-xs">CerebellumModulator</code> classifies each affordance execution into one of three outcome paths:</p>
<p class="text-slate-300 text-sm">Confidence ≥ 0.3 and low variance. The Cerebellum handles the prediction without LLM fallback. Emits a <strong class="text-white">success reaction</strong> with positive valence.</p>
<p class="text-slate-300 text-sm">Confidence &lt; 0.3 or high variance. The LLM acts as teacher. The Cerebellum trains on the LLM's response via a Rescorla-Wagner update. No reaction is emitted because the system is still learning.</p>
<p class="text-slate-300 text-sm">Affordance execution triggers a failure mode (e.g., shatter, overheat). Emits a <strong class="text-white">pain reaction</strong> with negative valence via the PainBus.</p>
<p class="text-slate-300 leading-relaxed">Both success and pain reactions flow through the <code class="bg-slate-950 px-1 rounded text-xs">ReactionBus</code>, which dispatches to two subscribers in parallel:</p>
<p class="text-slate-300 leading-relaxed">Success reactions carry positive valence but at lower intensity than pain reactions, mirroring biological negativity bias. A single painful failure creates a stronger learning signal than several routine successes. This asymmetry means the agent develops caution around dangerous affordances faster than it develops confidence around safe ones, which is the correct survival trade-off for an embodied system.</p>
<h3 class="text-white font-semibold mt-6">Cerebellum Activation via BioStack</h3>
<p class="text-slate-300 leading-relaxed">In production, the Cerebellum is wired through <code class="bg-slate-950 px-1 rounded text-xs">BioStack.cerebellum</code> and activated by <code class="bg-slate-950 px-1 rounded text-xs">build_executor</code>. This means every agent entry point that constructs a bio-stack gets Cerebellum forward models and the full SEM Learning Loop automatically, with no per-caller wiring required.</p>