feat(search): EXP-05 — instruction-type tagging and search boost #5

Draft
moralespanitz wants to merge 1 commit into main from feature/exp-05-instruction-tagging

Summary

EXP-05 from BEAM Sprint 2: tag explicit instructions at extraction time and boost them during retrieval ranking. Targets the BEAM IF (instruction-following) ability, where memory currently hurts relative to the no-memory baseline (Honcho 0.844, no-memory 0.760, AtomicMemory below baseline) because instruction memories get diluted in mixed retrieval pools.

Why

From phase2-implementation-plans-2026-04-29.md (EXP-05):

Memory adds +0.084 vs no-memory on most abilities but is hurting more than helping on IF because instruction memories get diluted at retrieval time. Tag them at extraction; boost them at retrieval. Mirrors ENGRAM's procedural-memory channel.

What

Ingest path (always-on, behaviorally identical when boost is off)

src/services/extraction-enrichment.ts — new INSTRUCTION_MARKERS constant detects imperative phrasing in extracted facts. When matched, enrichExtractedFact writes:

  • metadata.fact_role: 'instruction'
  • importance: max(prev, 0.95) (floor — keeps lifecycle decay from evicting them)

Markers: 'always ', 'never ', 'from now on', 'please remember', 'make sure to', "don't forget", 'do not forget', 'every time', 'whenever you', 'going forward', 'in the future', 'remember to'.

Tagging is unconditional; only the retrieval boost is gated. This is intentional: stored data is the same whether the flag is on or off, so flipping the flag requires no re-ingestion of existing data.
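A minimal sketch of the tagging step described above, for reviewers who want to reason about it without opening the diff. The marker list is copied from this description; the `text` field name, the `looksLikeInstruction` and `tagInstruction` helper names, and the case-insensitive substring match are assumptions, not the actual code in extraction-enrichment.ts.

```typescript
// Hypothetical, trimmed-down fact shape; the real ExtractedFact has more fields.
interface ExtractedFact {
  text: string; // assumed field name
  importance: number;
  metadata?: Record<string, unknown>;
}

// Marker list as given in the PR description.
const INSTRUCTION_MARKERS: readonly string[] = [
  'always ', 'never ', 'from now on', 'please remember', 'make sure to',
  "don't forget", 'do not forget', 'every time', 'whenever you',
  'going forward', 'in the future', 'remember to',
];

const INSTRUCTION_IMPORTANCE_FLOOR = 0.95;

// Assumption: detection is a case-insensitive substring match.
function looksLikeInstruction(text: string): boolean {
  const lowered = text.toLowerCase();
  return INSTRUCTION_MARKERS.some((marker) => lowered.includes(marker));
}

// Returns the fact unchanged unless a marker matches. On a match, sets
// fact_role and raises importance to the floor (never lowers it).
function tagInstruction(fact: ExtractedFact): ExtractedFact {
  if (!looksLikeInstruction(fact.text)) return fact;
  return {
    ...fact,
    importance: Math.max(fact.importance, INSTRUCTION_IMPORTANCE_FLOOR),
    metadata: { ...fact.metadata, fact_role: 'instruction' },
  };
}
```

Because the floor is applied with `Math.max`, a fact that already carries an importance above 0.95 keeps its original value.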

Retrieval path (gated behind instructionBoostEnabled)

New file src/services/instruction-boost.ts (~55 LOC). applyInstructionBoost adds instructionBoostWeight to the score of every result whose metadata.fact_role === 'instruction', then re-sorts.

Wired into search-pipeline.ts:applyRankingProtectionStages between the current-state-ranking stage and the conciseness-penalty stage. No-op when the flag is off.
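The boost stage can be sketched as follows. This is an illustration under stated assumptions, not the contents of instruction-boost.ts: the `SearchResult` shape, the `InstructionBoostConfig` interface, and the function signature are hypothetical, and only the behavior (add the weight to tagged results, re-sort, no-op when the flag is off) comes from the description above.

```typescript
// Hypothetical result shape; the real pipeline type carries more fields.
interface SearchResult {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Hypothetical slice of the runtime config with just the two new keys.
interface InstructionBoostConfig {
  instructionBoostEnabled: boolean;
  instructionBoostWeight: number;
}

function applyInstructionBoost(
  results: readonly SearchResult[],
  config: InstructionBoostConfig,
): SearchResult[] {
  // Flag off: return the results in their incoming order, scores untouched.
  if (!config.instructionBoostEnabled) return [...results];

  // Add the weight to every result tagged as an instruction.
  const boosted = results.map((result) =>
    result.metadata?.fact_role === 'instruction'
      ? { ...result, score: result.score + config.instructionBoostWeight }
      : result,
  );

  // Re-sort by descending score. Array.prototype.sort is stable,
  // so results with equal scores keep their incoming relative order.
  return boosted.sort((a, b) => b.score - a.score);
}
```

With the default weight of 0.15, an instruction scored 0.70 moves ahead of an untagged result scored 0.80; with the flag off, the order is unchanged.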

Plumbing

  • metadata?: Record<string, unknown> added to ExtractedFact (extraction.ts) and FactInput (memory-service-types.ts) so the tag survives the extraction → enrichment → audn → storage hop.
  • memory-storage.ts:storeProjection — new mergeStoreMetadata helper merges fact.metadata with cmo_id when persisting the parent memory. cmo_id always wins on conflict.
  • memcell-projection.ts:buildAtomicFactProjection — propagates fact.metadata onto the atomic-fact row metadata.
  • 'fact_role' added to RESERVED_METADATA_KEYS so callers cannot spoof it via the public ingest API. Drift-guard test reserved-metadata-keys.test.ts continues to pass.
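Two of the plumbing rules above, sketched for illustration. The signatures are assumptions: the real mergeStoreMetadata may take different arguments, the real RESERVED_METADATA_KEYS contains more than the single key shown, and `stripReservedKeys` is a hypothetical helper (the public ingest API might reject such keys rather than drop them).

```typescript
type Metadata = Record<string, unknown>;

// Merge fact-level metadata with the parent cmo_id.
// On conflict the server-supplied cmo_id always wins.
function mergeStoreMetadata(
  factMetadata: Metadata | undefined,
  cmoId: string | undefined,
): Metadata {
  const merged: Metadata = { ...factMetadata };
  if (cmoId !== undefined) merged.cmo_id = cmoId;
  return merged;
}

// Only the key added by this PR is listed; the real set is larger.
const RESERVED_METADATA_KEYS: ReadonlySet<string> = new Set(['fact_role']);

// Hypothetical guard: drop reserved keys from caller-supplied metadata
// so a client cannot mark its own facts as instructions.
function stripReservedKeys(callerMetadata: Metadata): Metadata {
  const safe: Metadata = {};
  for (const key of Object.keys(callerMetadata)) {
    if (!RESERVED_METADATA_KEYS.has(key)) safe[key] = callerMetadata[key];
  }
  return safe;
}
```

Applying the cmo_id after the spread is what gives it precedence over any cmo_id already present in the fact metadata.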

Config

Two new keys in src/config.ts (and mirrored on CoreRuntimeConfig and SearchPipelineRuntimeConfig):

Key                       Default   Env var
instructionBoostEnabled   false     INSTRUCTION_BOOST_ENABLED
instructionBoostWeight    0.15      INSTRUCTION_BOOST_WEIGHT

Both are added to INTERNAL_POLICY_CONFIG_FIELDS, which makes them config_override-allowlisted. The BEAM adapter can A/B per-ability per-run without a server restart:

// POST /v1/memories/search body
{
  "query": "what should I always do?",
  "config_override": {
    "instructionBoostEnabled": true,
    "instructionBoostWeight": 0.15
  }
}
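To illustrate what "config_override-allowlisted" means for these two keys, here is a rough sketch of per-request resolution. The `resolveConfig` name, the `RuntimeConfig` slice, and the choice to throw on unknown keys or wrong types are assumptions for illustration; the real handling lives behind INTERNAL_POLICY_CONFIG_FIELDS and may differ.

```typescript
// Hypothetical slice of the runtime config containing only the new keys.
interface RuntimeConfig {
  instructionBoostEnabled: boolean;
  instructionBoostWeight: number;
}

// Defaults as listed in the Config table.
const DEFAULTS: RuntimeConfig = {
  instructionBoostEnabled: false,
  instructionBoostWeight: 0.15,
};

// Apply a per-request override on top of the server config.
// Assumption: keys outside the allowlist, or values of the wrong
// type, are rejected loudly instead of being ignored.
function resolveConfig(
  base: RuntimeConfig,
  override: Record<string, unknown> = {},
): RuntimeConfig {
  const resolved: RuntimeConfig = { ...base };
  for (const [key, value] of Object.entries(override)) {
    if (key === 'instructionBoostEnabled' && typeof value === 'boolean') {
      resolved.instructionBoostEnabled = value;
    } else if (key === 'instructionBoostWeight' && typeof value === 'number') {
      resolved.instructionBoostWeight = value;
    } else {
      throw new Error(`config_override rejected for key "${key}"`);
    }
  }
  return resolved;
}

// Two arms of an A/B run against the same server, no restart needed.
const control = resolveConfig(DEFAULTS);
const treatment = resolveConfig(DEFAULTS, { instructionBoostEnabled: true });
```

The base config is copied rather than mutated, so an override on one request cannot leak into the next.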

Files changed

  • src/services/instruction-boost.ts — new (55 LOC)
  • src/services/__tests__/instruction-boost.test.ts — new (156 LOC, 8 cases)
  • src/services/extraction-enrichment.ts — +49 LOC (markers, tagging, detector)
  • src/services/__tests__/extraction-enrichment.test.ts — +73 LOC (16 new cases)
  • src/services/extraction.ts — metadata on ExtractedFact
  • src/services/memory-service-types.ts — metadata on FactInput
  • src/services/memory-storage.ts — mergeStoreMetadata helper
  • src/services/memcell-projection.ts — propagate metadata into atomic projection
  • src/services/search-pipeline.ts — applyInstructionBoostStage between current-state and conciseness
  • src/db/repository-types.ts — 'fact_role' reserved
  • src/config.ts, src/app/runtime-container.ts — config keys + allowlist

Test plan

  • npx tsc --noEmit — clean.
  • npx vitest run src/services/__tests__/instruction-boost.test.ts — 8/8 pass.
  • npx vitest run src/services/__tests__/extraction-enrichment.test.ts — 19/19 pass (3 pre-existing + 16 new).
  • npx vitest run src/__tests__/reserved-metadata-keys.test.ts — pass (drift guard accepts the new fact_role key).
  • fallow pre-commit hook — clean.
  • Full DB-backed suite (npm test) — requires .env.test with Postgres+pgvector; not run in this worktree per task instructions (server uses the main checkout's DB).
  • BEAM IF synthetic slice — pending Phase-2 sprint scoring run; the harness will A/B instructionBoostEnabled: true vs false via config_override.

How to test locally

# Ingest with imperative phrasing (tagging is on by default).
curl -XPOST localhost:3050/v1/memories/ingest -H 'content-type: application/json' -d '{
  "user_id": "alice",
  "conversation": "Always cite sources for every claim."
}'

# Search with the boost enabled per-request.
curl -XPOST localhost:3050/v1/memories/search -H 'content-type: application/json' -d '{
  "user_id": "alice",
  "query": "what is my answering style?",
  "config_override": { "instructionBoostEnabled": true, "instructionBoostWeight": 0.15 }
}'

Constraints honored

  • All new TS files <400 lines; new functions <40 lines.
  • No any. No direct process.env access (single touchpoint in config.ts).
  • No silent error catches; no fallback modes.
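  • The instructionBoostEnabled flag defaults to false (Sprint 2 defaults-off rule); instructionBoostWeight defaults to 0.15 and has no effect while the flag is off.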
  • AUDN flow untouched — boost is read-side only.

Tags facts that look like explicit instructions (markers such as 'always X',
'never Y', 'do not forget', 'remember to W', 'from now on', 'going forward')
during extraction enrichment, by setting metadata.fact_role='instruction'
and flooring importance to 0.95.

Adds an applyInstructionBoost retrieval stage that adds a configurable
weight to facts with metadata.fact_role='instruction' during search
ranking, after current-state-ranking and before final RRF rerank.

New config keys (the boost is off by default):
- instructionBoostEnabled: false
- instructionBoostWeight: 0.15

Both are config-override-allowlisted so the BEAM adapter can A/B per
ability via per-request config_override without server restart.

Targets BEAM IF ability per Sprint 2 EXP-05. Memory currently hurts IF
relative to no-memory baseline because instruction memories get diluted
in mixed retrieval; boosting them recovers the signal.

Behind feature flags. Defaults preserve current behavior.
@moralespanitz moralespanitz requested a review from ethanj as a code owner April 29, 2026 19:12
@moralespanitz moralespanitz marked this pull request as draft April 30, 2026 05:19