Skip to content

Code review: 3 bugs found (head truncation, empty-result retry tax, hybrid no circuit breaker) + 4 hardening items #574

@ether-btc

Description

@ether-btc

Bug reports

Found during a code review of the current master branch. All findings are source-verified against raw source files.


🔴 BUG 1: slice(-maxChars) silently discards early conversation content

File: src/smart-extractor.ts, extractCandidates()
Severity: High

When conversationText.length > extractMaxChars (default 8000), the code does:

const truncated =
  conversationText.length > maxChars
    ? conversationText.slice(-maxChars)   // ← always keeps TAIL
    : conversationText;

The effect: the extractor always keeps the last 8000 chars and discards everything earlier. A user who establishes context at the start of a 2-hour session loses that context entirely. The README explicitly markets this as learning "from every conversation" — but the implementation systematically loses the first half of long sessions.

Fix (1 line):

const truncated = conversationText.slice(0, maxChars);

🟡 BUG 2: 75ms retry on every empty recall result

File: src/tools.ts, retrieveWithRetry()
Severity: Medium

async function retrieveWithRetry(retriever, params) {
  let results = await retriever.retrieve(params);
  if (results.length === 0) {
    await sleep(75);                    // ← fires on EVERY empty result
    results = await retriever.retrieve(params);
  }
  return results;
}

In a fresh install with no memories stored, every memory_recall call — including the agent's startup context loads — pays a mandatory 75ms delay on a genuinely empty result set. This is a systematic latency tax on the most common case during the learning phase.

Fix: Only retry on observable transient errors, not on empty-by-design results:

if (results.length === 0 && retriever.lastError?.transient) {
  await sleep(75);
  results = await retriever.retrieve(params);
}

🟡 BUG 3: Promise.all in hybrid retrieval has no circuit breaker

File: src/retriever.ts, hybridRetrieval()
Severity: Medium

const [vectorResults, bm25Results] = await Promise.all([
  this.runVectorSearch(...).catch(e => { throw attachFailureStage(e, "hybrid.vectorSearch"); }),
  this.runBM25Search(...).catch(e => { throw attachFailureStage(e, "hybrid.bm25Search"); }),
]);

If the vector search is healthy but BM25 has a transient failure (network timeout, FTS index corruption), the entire retrieval throws — the user gets zero results instead of a graceful vector-only fallback.

Fix: Use Promise.allSettled for graceful degradation:

const [vectorResult, bm25Result] = await Promise.allSettled([
  this.runVectorSearch(...),
  this.runBM25Search(...),
]);
// Log failures, continue with whichever succeeded

Hardening / improvement items

🟡 Missing config: SIMILARITY_THRESHOLD hardcoded at 0.7

File: src/smart-extractor.ts
Severity: Medium

The dedup similarity threshold (0.7) is a magic number in source. This is a critical tuning knob: too low → false merges, too high → duplicates. It should be exposed in SmartExtractorConfig so users can tune it without code changes.


🟡 Missing config: MAX_MEMORIES_PER_EXTRACTION = 5 silently caps extraction

File: src/smart-extractor.ts
Severity: Low-Medium

A long conversation can produce more than 5 candidate memories. The cap silently discards important memories with no warning, no log, and no pagination.


🟢 Nit: Rerank failure not recorded in diagnostics

File: src/retriever.ts, cosineSimilarity() fallback path
Severity: Low

When the cross-encoder reranker fails and the code falls back to cosine similarity, no failure is recorded in diagnostics. Users debugging retrieval quality see no indication that reranking was bypassed.


🟢 Nit: batchDedup failures silently swallowed

File: src/smart-extractor.ts, extractAndPersist()
Severity: Low

} catch (err) {
  this.log(`batchDedup failed, proceeding without batch dedup: ${String(err)}`);
  // continues as if nothing happened
}

If batchDedup fails for a non-transient reason (e.g. bug, not API error), near-duplicate candidates bypass dedup and flood the LLM dedup stage. Should surface a metric or increment a counter.


Summary table

# Item Severity Fix complexity
1 slice(-maxChars) drops early conversation 🔴 High 1 line
2 75ms retry on empty results 🟡 Medium 1 line + condition
3 Hybrid Promise.all no circuit breaker 🟡 Medium Medium refactor
4 SIMILARITY_THRESHOLD not configurable 🟡 Medium Medium
5 MAX_MEMORIES_PER_EXTRACTION hardcoded cap 🟡 Low-Medium Trivial
6 Rerank failure silent in diagnostics 🟢 Low 1-2 lines
7 batchDedup silent swallow 🟢 Low Medium

Review performed by Charon (OpenClaw agent) on 2026-04-09.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions