fix: Use ID-filtered graph projection in COT and Context Extension retrievers by Vasilije1990 · Pull Request #2229 · topoteretes/cognee

Vasilije1990 · 2026-02-24T07:39:01Z

Summary

COT and Context Extension retrievers always called get_triplets(query_batch=...) even for single queries, forcing batch mode in brute_force_triplet_search. Batch mode sets wide_search_limit=None, which bypasses node ID extraction and causes full graph projection instead of ID-filtered projection.
Added get_triplets_batch() helper to GraphCompletionRetriever that delegates to single-query mode (query=) when len(queries)==1, enabling ID-filtered graph projection. Both subclass retrievers now use this helper.
After this fix, single-query searches in GRAPH_COMPLETION_COT and GRAPH_COMPLETION_CONTEXT_EXTENSION log "Retrieving ID-filtered graph from database" instead of "Retrieving full graph", matching GRAPH_COMPLETION behavior.
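
The dispatch rule described above can be sketched in isolation. This is a toy model, not the cognee code: `StubRetriever`, its `calls` list, and the string "edges" are assumptions invented for illustration. Only the `query=` / `query_batch=` keyword names and the `len(queries) == 1` rule come from the PR description.

```python
import asyncio
from typing import List, Optional, Union


class StubRetriever:
    """Hypothetical stand-in for GraphCompletionRetriever (not the real class)."""

    def __init__(self) -> None:
        # Records which mode each get_triplets call used.
        self.calls: List[str] = []

    async def get_triplets(
        self,
        query: Optional[str] = None,
        query_batch: Optional[List[str]] = None,
    ) -> Union[List[str], List[List[str]]]:
        # Strings stand in for Edge objects in this sketch.
        if query_batch is not None:
            self.calls.append("batch")
            return [[f"edge-for:{q}"] for q in query_batch]
        self.calls.append("single")
        return [f"edge-for:{query}"]

    async def get_triplets_batch(self, queries: List[str]) -> List[List[str]]:
        # Same rule as the PR: a single query goes through single-query mode.
        if len(queries) == 1:
            return [await self.get_triplets(query=queries[0])]
        return await self.get_triplets(query_batch=queries)


retriever = StubRetriever()
single = asyncio.run(retriever.get_triplets_batch(["who founded X?"]))
multi = asyncio.run(retriever.get_triplets_batch(["q1", "q2"]))
```

In both cases the caller receives one inner list per query, so call sites that previously consumed the `query_batch=` result should not need to change shape handling.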

Changes

| File | Change |
|---|---|
| `graph_completion_retriever.py` | Added `get_triplets_batch()` helper: uses single-query mode for 1 query, batch mode for multiple |
| `graph_completion_cot_retriever.py` | `_fetch_initial_triplets_and_context` and `_merge_followup_triplets` now use `get_triplets_batch()` |
| `graph_completion_context_extension_retriever.py` | `get_retrieved_objects` and `_run_extension_round` now use `get_triplets_batch()` |

Test plan

Run single-query search with GRAPH_COMPLETION_COT — verify logs show "Retrieving ID-filtered graph from database"
Run single-query search with GRAPH_COMPLETION_CONTEXT_EXTENSION — verify logs show "Retrieving ID-filtered graph from database"
Run single-query search with GRAPH_COMPLETION — verify behavior unchanged
Run batch-query search with COT/Context Extension — verify batch mode still works (full graph projection for multi-query batches)
Run existing retrieval tests
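
The manual log checks above could be complemented by a unit-level check of which keyword the helper passes on. The sketch below is an assumption-laden illustration: `RetrieverUnderTest` is a hypothetical stand-in that copies the helper's branching, and `get_triplets` is replaced with `unittest.mock.AsyncMock`, so it does not exercise `brute_force_triplet_search` or the log messages themselves.

```python
import asyncio
from typing import List
from unittest.mock import AsyncMock


class RetrieverUnderTest:
    """Hypothetical stand-in; a real test would import the cognee retriever."""

    def __init__(self) -> None:
        # Mocked so the check only observes how get_triplets is called.
        self.get_triplets = AsyncMock(return_value=[])

    async def get_triplets_batch(self, queries: List[str]):
        if len(queries) == 1:
            return [await self.get_triplets(query=queries[0])]
        return await self.get_triplets(query_batch=queries)


def check_single_query_uses_query_kwarg() -> None:
    retriever = RetrieverUnderTest()
    result = asyncio.run(retriever.get_triplets_batch(["only one"]))
    # Single-query mode: query= is used, never query_batch=.
    retriever.get_triplets.assert_awaited_once_with(query="only one")
    assert result == [[]]


def check_multi_query_uses_batch_kwarg() -> None:
    retriever = RetrieverUnderTest()
    asyncio.run(retriever.get_triplets_batch(["a", "b"]))
    # Batch mode: the whole list is forwarded as query_batch=.
    retriever.get_triplets.assert_awaited_once_with(query_batch=["a", "b"])
```

Renamed with a `test_` prefix, these functions would be collected by pytest; against the real class, the same assertions could be made by patching `get_triplets` on an instance.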

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

Refactor
- Internal optimization of retrieval API to use batch-oriented processing methods, improving consistency across graph completion retrieval modules.

…trievers

COT and Context Extension retrievers always called get_triplets(query_batch=...) even for single queries, forcing batch mode in brute_force_triplet_search. Batch mode sets wide_search_limit=None, bypassing node ID extraction and causing full graph projection instead of ID-filtered projection.

Add get_triplets_batch() helper that delegates to single-query mode (query=) when len(queries)==1, enabling ID-filtered graph projection. Both retrievers now use this helper so logs show "Retrieving ID-filtered graph from database" for single queries, matching Graph Completion behavior.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: vasilije <[email protected]>

coderabbitai · 2026-02-24T07:39:30Z

Walkthrough

Consolidates the triplet retrieval API by introducing and switching to a batch-oriented get_triplets_batch method across retrieval modules, replacing parameterized get_triplets(query_batch=...) calls while preserving existing logic and behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| API Addition: `cognee/modules/retrieval/graph_completion_retriever.py` | Introduces new public method `get_triplets_batch(queries: List[str])` that delegates to single-query processing for single inputs and uses batch processing for multiple queries, returning wrapped results. |
| API Call Updates: `cognee/modules/retrieval/graph_completion_context_extension_retriever.py`, `cognee/modules/retrieval/graph_completion_cot_retriever.py` | Updates four call sites from `get_triplets(query_batch=...)` to `get_triplets_batch(...)` across `get_retrieved_objects`, `_run_extension_round`, `_fetch_initial_triplets_and_context`, and `_merge_followup_triplets` methods. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

test: add integration tests for batch queries in graph completion #2153 — Directly related as the main PR adds and switches call sites to the new get_triplets_batch method, supporting integration tests for batch triplet retrieval.
feat: enable batch queries in all graph completion retrievers #2002 — Related through the introduction of batch triplet API and batch-enabled retriever changes that this PR now consumes.

Suggested labels

run-checks

Suggested reviewers

hajdul88
lxobr

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: implementing ID-filtered graph projection in COT and Context Extension retrievers, which is the core purpose of this pull request. |
| Description check | ✅ Passed | The description provides clear human-generated context on the problem, solution, affected files, and test plan. It covers the key issue (batch mode forcing full graph projection) and the fix (adding get_triplets_batch helper), meeting the template requirements. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

cognee/modules/retrieval/graph_completion_retriever.py (1)

167-177: Missing Parameters section in docstring

The docstring describes the return value but omits a Parameters section for the queries argument. Per coding guidelines, function definitions without complete documentation are considered incomplete.

✏️ Suggested docstring addition

     """
     Retrieves triplets for a list of queries, using single-query mode when
     possible to enable ID-filtered graph projection.

+    Parameters:
+    -----------
+        - queries (List[str]): One or more query strings. A single-element list
+          uses single-query mode to enable ID-filtered graph projection; multiple
+          queries use batch mode.
+
     When there is only one query, delegates to single-query mode (query=)
     which computes relevant node IDs and filters the graph projection.
     For multiple queries, uses batch mode (query_batch=).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/modules/retrieval/graph_completion_retriever.py` around lines 167 -
177, Add a Parameters section to the docstring of the retrieval function in
graph_completion_retriever.py (the function that "Retrieves triplets for a list
of queries" and switches between single-query mode and batch mode) documenting
the queries argument and its type/shape and any other important parameters
(e.g., queries: List[str] — one query per requested result; clarify expected
element type and whether None/empty lists are allowed), and include any relevant
parameter behavior (single-query triggers ID-filtered graph projection). Keep
wording consistent with the existing Returns section and project docstring
style.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 0e7ce15 and 7c9ee35.

📒 Files selected for processing (3)

cognee/modules/retrieval/graph_completion_context_extension_retriever.py
cognee/modules/retrieval/graph_completion_cot_retriever.py
cognee/modules/retrieval/graph_completion_retriever.py

coderabbitai · 2026-02-24T07:44:59Z

cognee/modules/retrieval/graph_completion_retriever.py

+    async def get_triplets_batch(
+        self,
+        queries: List[str],
+    ) -> List[List[Edge]]:
+        """
+        Retrieves triplets for a list of queries, using single-query mode when
+        possible to enable ID-filtered graph projection.
+
+        When there is only one query, delegates to single-query mode (query=)
+        which computes relevant node IDs and filters the graph projection.
+        For multiple queries, uses batch mode (query_batch=).
+
+        Returns:
+            List[List[Edge]]: One list of edges per query.
+        """
+        if len(queries) == 1:
+            triplets = await self.get_triplets(query=queries[0])
+            return [triplets]
+        return await self.get_triplets(query_batch=queries)


⚠️ Potential issue | 🟡 Minor

Type annotation mismatch and missing empty-list guard in get_triplets_batch

Two issues:

1. mypy incompatibility: get_triplets is typed -> Union[List[Edge], List[List[Edge]]], so both return paths fail mypy's return-type check against the declared -> List[List[Edge]]:
   - Single-query branch: [triplets] has inferred type List[Union[List[Edge], List[List[Edge]]]].
   - Multi-query branch: direct return of Union[List[Edge], List[List[Edge]]].
2. No guard for empty queries: when len(queries) == 0, the call falls through to get_triplets(query_batch=[]) whose behaviour with an empty batch is undefined in brute_force_triplet_search. As a public method, this edge case should be defended.

🛠️ Proposed fix

+from typing import cast

     async def get_triplets_batch(
         self,
         queries: List[str],
     ) -> List[List[Edge]]:
+        if not queries:
+            return []
         if len(queries) == 1:
-            triplets = await self.get_triplets(query=queries[0])
+            triplets = cast(List[Edge], await self.get_triplets(query=queries[0]))
             return [triplets]
-        return await self.get_triplets(query_batch=queries)
+        return cast(List[List[Edge]], await self.get_triplets(query_batch=queries))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cognee/modules/retrieval/graph_completion_retriever.py` around lines 163 - 181, The method get_triplets_batch must guard the empty-input case and normalize the heterogeneous return shape of get_triplets so the declared return type List[List[Edge]] is satisfied for mypy: first, if queries is empty return an empty list immediately; second, when len(queries)==1 call get_triplets(query=...) and normalize its result so you always return a List[List[Edge]] (if get_triplets returns List[Edge] wrap it as [result], if it returns List[List[Edge]] use it but ensure you return exactly one inner list); third, when calling get_triplets(query_batch=...) assert/coerce the batch result to List[List[Edge]] (if the call yields a flat List[Edge] wrap it into a single-item list-per-query mapping) so both branches have the same concrete type and mypy passes. Ensure you reference get_triplets and get_triplets_batch while making these checks and conversions.
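
As an alternative to the `cast` calls in the proposed fix, the `Union` return type could in principle be narrowed at the source with `typing.overload`. The sketch below is not taken from the PR and assumes `get_triplets` could be given keyword-only `query` and `query_batch` parameters; the real signature in cognee may differ. `Edge` and `SketchRetriever` are placeholders, and the method body returns dummy data.

```python
import asyncio
from typing import List, Optional, Union, overload


class Edge:
    """Placeholder for cognee's Edge type (assumption for this sketch)."""


class SketchRetriever:
    @overload
    async def get_triplets(self, *, query: str) -> List[Edge]: ...

    @overload
    async def get_triplets(self, *, query_batch: List[str]) -> List[List[Edge]]: ...

    async def get_triplets(
        self,
        *,
        query: Optional[str] = None,
        query_batch: Optional[List[str]] = None,
    ) -> Union[List[Edge], List[List[Edge]]]:
        # Dummy body: one Edge per query, just to have something to return.
        if query_batch is not None:
            return [[Edge()] for _ in query_batch]
        return [Edge()]

    async def get_triplets_batch(self, queries: List[str]) -> List[List[Edge]]:
        if not queries:
            return []  # empty-input guard, as the review suggests
        if len(queries) == 1:
            # With the overloads, this call is typed as List[Edge].
            return [await self.get_triplets(query=queries[0])]
        # And this one as List[List[Edge]], so no cast is needed.
        return await self.get_triplets(query_batch=queries)


empty = asyncio.run(SketchRetriever().get_triplets_batch([]))
one = asyncio.run(SketchRetriever().get_triplets_batch(["q"]))
two = asyncio.run(SketchRetriever().get_triplets_batch(["q1", "q2"]))
```

The trade-off is that overloads change the annotations on `get_triplets` itself, whereas `cast` keeps the change local to the new helper.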

Vasilije1990 wants to merge 1 commit into dev from fix/id-filtered-graph-cot-context-extension