recall: Full-name entity disambiguation (Alex Panagis vs Alex Beck problem) #71

@jack-arturo

Problem

When recalling memories for a specific person by full name (e.g. "Alex Panagis"), the system returns memories about different people who share a first name (e.g. "Alex Beck"). This is the #1 quality issue in person-based recall.

Root Cause

Entity extraction creates overly generic entity tags such as entity:people:alex, which match every person named Alex in the graph. When entity expansion is enabled, such a generic tag bridges unrelated people:

Alex Panagis → entity:people:alex → Alex Beck → EchoDash → Jack → everything
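
The bridging chain above can be reproduced on a toy graph. This is only a sketch: the memory IDs, the `entity:organizations:echodash` and `entity:people:jack` tags, and the `expand` helper are hypothetical and are not taken from the actual store or expansion code.

```python
from collections import defaultdict

# Hypothetical toy data: memory id -> entity tags (not the real store).
MEMORIES = {
    "m1": {"entity:people:alex", "entity:people:alex-panagis"},
    "m2": {"entity:people:alex", "entity:people:alex-beck"},
    "m3": {"entity:people:alex-beck", "entity:organizations:echodash"},
    "m4": {"entity:organizations:echodash", "entity:people:jack"},
}

def expand(seed_ids, memories, hops):
    """Return every memory reachable from the seeds via shared entity tags."""
    # Invert the mapping so each tag points at the memories that carry it.
    by_tag = defaultdict(set)
    for mem_id, tags in memories.items():
        for tag in tags:
            by_tag[tag].add(mem_id)

    reached = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(hops):
        next_frontier = set()
        for mem_id in frontier:
            for tag in memories[mem_id]:
                next_frontier |= by_tag[tag]
        frontier = next_frontier - reached
        reached |= frontier
    return reached

# Same data with the bare first-name tag removed.
WITHOUT_GENERIC = {
    mem_id: tags - {"entity:people:alex"} for mem_id, tags in MEMORIES.items()
}
```

With the generic tag present, `expand({"m1"}, MEMORIES, 3)` reaches all four memories. With it removed, the same call on `WITHOUT_GENERIC` stays on `m1`.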

Test Case

Query: recall_memory(query="Alex Panagis", expand_entities=true, limit=15)

Expected: 5-6 memories specifically about Alex Panagis and ScaleMath
Actual: 16 results — 5 about Alex Panagis, 6 about Alex Beck, 5 completely unrelated (Zack Katz, Luka, Andrew Haberman, Mastermind crew)

Even with expand_min_importance=0.7 and expand_min_strength=0.5, the Alex Beck contamination persists because those memories have high importance (0.8-1.0).
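
One way to turn this test case into a regression metric is to track the fraction of results that are about the queried person. The helper below is a hypothetical sketch; it assumes each result can be labeled with the person it is about, which the current recall output may not provide directly.

```python
def person_precision(result_subjects, target):
    """Fraction of results whose subject is the target person."""
    if not result_subjects:
        return 0.0
    hits = sum(1 for subject in result_subjects if subject == target)
    return hits / len(result_subjects)

# Counts reported above: 5 Alex Panagis, 6 Alex Beck, 5 unrelated.
reported = ["Alex Panagis"] * 5 + ["Alex Beck"] * 6 + ["other"] * 5
precision = person_precision(reported, "Alex Panagis")  # 5 / 16
```

On the reported counts this gives 5/16, about 0.31. The expected behavior would put it at or near 1.0.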

Proposed Solutions

  1. Full-name entity preference: When a query appears to contain a full name (two or more capitalized words), heavily penalize or exclude matches on partial-name entities. entity:people:alex-panagis should score much higher than entity:people:alex.

  2. Entity normalization during extraction: Merge entity:people:alex-panagis and entity:people:alex-panagis-founder into a single canonical entity. Stop creating bare first-name entities when a full name is available.

  3. Query-time entity filtering: Add an optional entity_filter parameter to recall that constrains results to memories containing a specific entity before expansion begins.

  4. Bi-gram entity matching: When expanding entities, prefer multi-token entity matches over single-token ones. "alex-panagis" (2 tokens) should be strongly preferred over "alex" (1 token).
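
Solutions 1, 3, and 4 could be sketched roughly as follows. All names here (`slugify`, `looks_like_full_name`, `entity_match_weight`, `filter_by_entity`) and the `partial_penalty` default of 0.1 are assumptions for illustration, not existing functions or parameters. Treating a longer tag such as `alex-panagis-founder` as a full match is also an assumption, loosely following the normalization idea in solution 2.

```python
import re

def slugify(name):
    """Lowercase a name and join its alphanumeric words with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))

def looks_like_full_name(query):
    """Heuristic from solution 1: two or more words, each capitalized."""
    words = query.split()
    return len(words) >= 2 and all(w[0].isupper() for w in words)

def entity_match_weight(tag, query, partial_penalty=0.1):
    """Score how well an entity tag matches the query (0.0 to 1.0)."""
    slug = tag.rsplit(":", 1)[-1]
    query_slug = slugify(query)
    # Exact multi-token match, or a longer tag that starts with the full
    # name (assumed to be the same person, e.g. "alex-panagis-founder").
    if slug == query_slug or slug.startswith(query_slug + "-"):
        return 1.0
    # Single-token tag that covers only part of the query.
    if slug in query_slug.split("-"):
        return partial_penalty if looks_like_full_name(query) else 1.0
    return 0.0

def filter_by_entity(memories, entity_filter):
    """Solution 3: keep only memories tagged with the entity, before expansion."""
    return {
        mem_id: tags for mem_id, tags in memories.items() if entity_filter in tags
    }
```

Under these assumptions, the query "Alex Panagis" gives `entity:people:alex-panagis` a weight of 1.0, `entity:people:alex` a weight of 0.1, and `entity:people:alex-beck` a weight of 0.0. Setting `partial_penalty=0.0` would exclude partial matches outright rather than penalize them.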

Impact

This affects any use case where users have contacts/people with shared first names — which is essentially everyone. Person lookup is one of the most common recall patterns.

Labels: enhancement
