Reject missing relation endpoints #5

Open
WarGloom wants to merge 1 commit into dimknaf:main from WarGloom:fix/relation-missing-endpoints

@WarGloom

Summary

  • validate relation endpoint entity IDs before insert
  • return 404 for missing relation endpoints instead of surfacing a database traceback
  • add regression coverage for missing relation endpoint IDs

Tests

  • BRAINDB_TEST_URL=http://localhost:8100 python -m pytest tests/test_relations.py -q
  • python -m py_compile braindb/routers/relations.py tests/test_relations.py

WarGloom pushed a commit to WarGloom/braindb that referenced this pull request May 24, 2026
…rrow-query strategy

This is the second leg of the recall overhaul (the first leg, d4b9288,
fixed the silent embedding-zero bug and widened the scoring pool). Two
new things land here, plus one prompt nudge.

## A.6 — fuzzy now goes through keywords too (symmetric retrieval)

Before: the embedding pathway in assemble_context was keyword-mediated
(after d4b9288), but the fuzzy pathway still ran pg_trgm + fulltext
directly against entity content / title via fuzzy_search. The result
was structurally unfair: a fact saved with keywords ["Petros", ...]
got text_score ~0.06 against a multi-word query like
"Petros person identity profile" because pg_trgm similarity is diluted
when a query is compared against a long entity body. Half of the
recall pipeline was bypassing the keyword index.

After: a new helper find_fuzzy_keywords runs pg_trgm
similarity(content, query) over entity_type='keyword' rows (short
keyword content → no dilution), and assemble_context's text pathway
fans out via the existing find_entities_for_keywords. Both pathways
now produce a per-entity score equal to the best matched-keyword
similarity over that entity's tagged_with neighbours. The
geometric-mean merge and missing_signal_penalty are unchanged but
become meaningful: they combine two signals about the SAME thing
(how well the query matches this entity's keywords), one via trigrams
and one via embeddings.
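
To make the dilution argument concrete, here is a minimal pure-Python sketch of trigram similarity. It only approximates pg_trgm (ASCII word splitting, two leading and one trailing pad space per word, shared trigrams divided by the union) and is not the code in find_fuzzy_keywords; the helper names `trigrams`, `similarity`, and `best_keyword_score` are hypothetical.

```python
import re


def trigrams(text: str) -> set[str]:
    """Approximate pg_trgm: lowercase, split into words, pad each word."""
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by the union of both trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_keyword_score(query: str, keywords: list[str]) -> float:
    """Per-entity score: the best similarity over the entity's keywords."""
    return max((similarity(kw, query) for kw in keywords), default=0.0)


if __name__ == "__main__":
    body = "Petros is a colleague who manages the fish farm operations"
    print(similarity("Petros", body))             # diluted by the long body
    print(best_keyword_score("Petros", ["Petros", "fish farm"]))
```

Under these assumptions a bare "Petros" query scores a full match against the "Petros" keyword, while the same query against a long body, or a long query against the short keyword, scores much lower because the union of trigrams grows.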

fuzzy_search itself is intentionally left alone — it still serves the
"arbitrary content matching" use-cases (quick_search agent tool,
/memory/search). A discoverability backup in assemble_context still
calls fuzzy_search and applies a heavy 0.2 discount as a pure fallback
(only adds entities the keyword path didn't already cover; never
overrides a keyword-path score).
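
The fallback rule can be sketched as a one-way merge. This is an illustration only: the function name `merge_with_fallback` and the dict-of-scores shape are assumptions, and the real logic lives in assemble_context.

```python
FALLBACK_DISCOUNT = 0.2  # the "heavy 0.2 discount" from the text above


def merge_with_fallback(
    keyword_scores: dict[str, float],
    fuzzy_scores: dict[str, float],
    discount: float = FALLBACK_DISCOUNT,
) -> dict[str, float]:
    """Add discounted fuzzy_search hits only for entities the keyword
    path did not cover; keyword-path scores are never overridden."""
    merged = dict(keyword_scores)
    for entity_id, score in fuzzy_scores.items():
        if entity_id not in merged:
            merged[entity_id] = score * discount
    return merged
```

An entity found by both paths keeps its keyword-path score even when the raw fuzzy score is higher, which is what makes the fallback purely additive.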

Design principle being restored (user-stated): keywords are the
indexing hub. tagged_with relations are created automatically when an
entity is saved, so the keyword graph alone is enough for retrieval
connectivity. Explicit elaborates / refers_to edges are editorial
nuance, not required for findability.

## A.7 — two-level diversity quota (per-search-term + per-keyword)

When A.6 went live the top recall results for narrow-subject queries
were dominated by a few popular hub keywords (CityFalcon ~42 entities,
user-profile ~30, BrainDB ~12, ...). Each of those keywords was
strongly matched by the broad multi-word queries the LLM was issuing,
so their entities crowded top-N at near-identical scores; the
narrow-subject fact (e.g. Petros, only 1 entity tagged) fell below
the cut. Two complementary mechanisms, sharing ONE counter, fix this:

  L1 — per-search-term reservation: each query in queries[] gets
       ceil(max_results × per_query_share / num_queries) reserved
       slots filled from that query's OWN top-ranked entities. So
       a focused narrow query ALWAYS surfaces something in the
       result, no matter how broad the other queries are.

  L2 — per-keyword quota (geometric decay): walking the remaining
       (open) slots in final_rank-desc order, each new dominant
       matched keyword gets a halving allowance (50% / 25% / 12.5%
       ... of max_results, floor 1). Stops a popular keyword from
       monopolising the open portion.

They share one bookkeeping dict (seen: kw_id -> remaining), so a
keyword's allowance is decremented by BOTH L1 reservations and L2
walks — no double-spending, no conflict. The full coexistence rules
are documented in the docstring of _apply_two_level_quota in
braindb/services/context.py. Please read that block before touching
the function; the no-conflict property depends on the shared counter.
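
For orientation, a simplified sketch of the two levels sharing one counter follows. It is not _apply_two_level_quota: the function name, the argument shapes, the truncating rounding of allowances, and the choice that L1 picks are always admitted (while still decrementing the shared allowance) are all assumptions made for illustration.

```python
import math


def two_level_quota_sketch(
    ranked: list[str],                  # entity ids, final_rank descending
    per_query_ranked: list[list[str]],  # per query: its own top entities
    dominant_kw: dict[str, str],        # entity id -> dominant keyword id
    max_results: int,
    per_query_share: float = 0.5,
    halving: float = 0.5,
) -> list[str]:
    seen: dict[str, int] = {}  # kw_id -> remaining allowance (shared)

    def remaining(kw: str) -> int:
        if kw not in seen:
            # n-th new keyword gets max_results * halving**n, floor 1
            n = len(seen) + 1
            seen[kw] = max(1, int(max_results * halving ** n))
        return seen[kw]

    selected: list[str] = []

    # L1: reserved slots per query, filled from that query's own ranking.
    if per_query_share > 0 and per_query_ranked:
        reserve = math.ceil(max_results * per_query_share / len(per_query_ranked))
        for entities in per_query_ranked:
            taken = 0
            for eid in entities:
                if taken >= reserve or len(selected) >= max_results:
                    break
                if eid in selected:
                    continue
                kw = dominant_kw[eid]
                seen[kw] = max(0, remaining(kw) - 1)  # spend shared allowance
                selected.append(eid)
                taken += 1

    # L2: open slots, walked in final_rank order under the keyword quota.
    for eid in ranked:
        if len(selected) >= max_results:
            break
        if eid in selected:
            continue
        kw = dominant_kw[eid]
        if remaining(kw) <= 0:
            continue  # this keyword already used up its allowance
        seen[kw] -= 1
        selected.append(eid)

    # Return in final_rank order; ids missing from `ranked` sort last.
    position = {eid: i for i, eid in enumerate(ranked)}
    return sorted(selected, key=lambda e: position.get(e, len(ranked)))
```

Because L1 decrements the same `seen` entry that L2 later checks, a hub keyword that already filled reserved slots has less room left in the open portion, which is the no-double-spending property described above.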

assemble_context now also tracks per-query scores (text_scores_by_q,
embedding_scores_by_q) alongside the existing max-aggregated dicts,
so L1 can rank entities by THAT query's own combined score (using
the same geometric-mean / missing_signal_penalty merge per query).
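
A rough sketch of the per-query merge, assuming each per-query dict maps a query string to {entity_id: score}. The exact formula is not spelled out here, so the geometric mean for two signals and a penalised single signal are assumptions, as are the names `combine` and `per_query_combined`.

```python
import math

MISSING_SIGNAL_PENALTY = 0.5  # value quoted later in this message


def combine(text_score, embedding_score, penalty=MISSING_SIGNAL_PENALTY):
    """Geometric mean when both signals exist; otherwise the single
    available signal scaled by the penalty (assumed behaviour)."""
    if text_score and embedding_score:
        return math.sqrt(text_score * embedding_score)
    present = text_score or embedding_score or 0.0
    return present * penalty


def per_query_combined(text_scores_by_q, embedding_scores_by_q):
    """For every query, merge its own text and embedding scores per entity."""
    combined = {}
    for query in set(text_scores_by_q) | set(embedding_scores_by_q):
        text = text_scores_by_q.get(query, {})
        emb = embedding_scores_by_q.get(query, {})
        combined[query] = {
            eid: combine(text.get(eid), emb.get(eid))
            for eid in set(text) | set(emb)
        }
    return combined
```

L1 can then sort each query's inner dict by score to pick that query's reserved entities, independently of the max-aggregated scores used for final_rank.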

## Prompt nudge — recall_memory docstring teaches narrow-query strategy

A multi-word query like "Petros person identity profile" matches the
short "Petros" keyword at only ~0.4 fuzzy (trigram dilution). The
1-word query "Petros" matches it at ~1.0 and surfaces the Petros
fact at the top. To exploit this, the recall_memory tool's
docstring (which the LLM reads as the tool description) now
explicitly tells the model:

  - prefer 2-4 short focused queries over one long phrase
  - include bare subject names as standalone queries
  - example: ["Petros", "Selonda Saronikos fish farm", ...]
  - the per-search-term quota guarantees each angle gets
    representation, so adding the bare keyword is free

The narrow strategy + L1 reservation together unlock the
narrow-subject case: the LLM issues a single-keyword query for the
subject, that query reserves slots in the result, the subject's
fact tops those slots.

Also bumped: agent recall_memory default max_results 15 → 30 (via
new settings.recall_default_max_results). The /memory/context API
schema default was already 30; this brings the agent tool in line.

## Verification (live, deepinfra/Gemma-4-31B)

| Query | Petros position | final_rank |
|-------|-----------------|------------|
| ["Petros"] (narrow) | #1 | 0.838 |
| ["Petros", "Selonda Saronikos fish farm", "Dimitrios manager"] | #1 | 0.839 |
| ["Petros person identity profile", "Petros relation to Dimitris", "Petros CityFalcon"] (broad-only) | #5 | (was: NOT in top-30) |

Dimitrios Koutsoumpos /agent/query regression: 49.9s, 1362-char
structured grounded answer. Tool sequence intact.

## Files

 braindb/agent/tools.py              |  33 ++++- (docstring + default 30)
 braindb/config.py                   |  28 ++++  (3 new settings)
 braindb/services/context.py         | 288 ++++++++++++ (the bulk: A.6 + A.7)
 braindb/services/keyword_service.py |  32 ++++  (find_fuzzy_keywords)
 4 files changed, 342 insertions(+), 39 deletions(-)

## Knobs (three new settings plus two carried over from d4b9288; defaults are the shipping values)

  scoring_pool_keyword_neighbors: int = 500
    Already shipped in d4b9288; unchanged here.

  scoring_pool_fuzzy: int = 500
    Already shipped in d4b9288; unchanged here. The fuzzy scoring
    pool now applies to fuzzy_keyword matches (A.6).

  per_query_share: float = 0.5
    L1 quota: fraction of max_results reserved across per-query slots.
    Set to 0 to disable L1.

  keyword_quota_halving: float = 0.5
    L2 quota: each new dominant keyword's slot allowance shrinks
    geometrically. Set to 1.0 to disable L2.

  recall_default_max_results: int = 30
    Default max_results the agent's recall_memory tool exposes to
    the LLM (and the /memory/context API).
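
As a worked example of how the defaults interact, the sketch below mirrors the five knobs in a plain dataclass and computes the L1 reservation and the first few L2 allowances. The class `RecallKnobs` and both helper functions are hypothetical stand-ins for the real settings object, and truncating fractional allowances is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RecallKnobs:
    scoring_pool_keyword_neighbors: int = 500
    scoring_pool_fuzzy: int = 500
    per_query_share: float = 0.5
    keyword_quota_halving: float = 0.5
    recall_default_max_results: int = 30


def l1_reserved_slots(knobs: RecallKnobs, num_queries: int) -> int:
    """ceil(max_results * per_query_share / num_queries), per query."""
    total = knobs.recall_default_max_results * knobs.per_query_share
    return math.ceil(total / num_queries)


def l2_allowances(knobs: RecallKnobs, count: int) -> list[int]:
    """Allowance for the 1st, 2nd, ... new dominant keyword (floor 1)."""
    return [
        max(1, int(knobs.recall_default_max_results
                   * knobs.keyword_quota_halving ** n))
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    knobs = RecallKnobs()
    print(l1_reserved_slots(knobs, num_queries=3))
    print(l2_allowances(knobs, count=5))
```

With the defaults and three queries, each query reserves 5 slots, and successive dominant keywords are allowed 15, 7, 3, 1, 1 slots under the truncation assumption. Setting keyword_quota_halving to 1.0 gives every keyword the full max_results, which is why that value disables L2.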

## What is explicitly NOT touched

- missing_signal_penalty (still 0.5)
- effective_importance / temporal decay
- graph_expand
- the geometric-mean seed_score merge
- fuzzy_search itself (still keyword-blind for quick_search /
  /memory/search consumers)
- the agent loop, the typed final-answer contract, the wiki pipeline,
  the scheduler

No IDF was added. The two-level quota plus the prompt nudge are
sufficient for narrow-subject surfacing in our data; adding IDF on
top would be bloat.