refactor(knowledge_runtime): switch RAG protocol from value mode to reference mode #1030

qdaxb merged 20 commits into wecode-ai:main
Conversation
Commits (Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>):

- … resolution
- …avior: Conditionally set vector_weight/keyword_weight to None when retrieval_mode is not "hybrid", matching Backend RagRuntimeResolver behavior.
- Refactor IndexExecutor, QueryExecutor, and AdminExecutor to resolve runtime configs (retriever, embedding model, splitter) from the database via ConfigResolver instead of receiving them directly in request payloads (value mode -> reference mode). Key changes:
  - IndexExecutor: uses ConfigResolver.resolve_index_config() to get retriever_config, embedding_model_config, index_owner_user_id, and splitter_config from the DB
  - QueryExecutor: iterates over knowledge_base_ids instead of knowledge_base_configs, uses ConfigResolver.resolve_query_config() for each KB
  - AdminExecutor: uses ConfigResolver.resolve_retriever_config() and _get_knowledge_base() to get the retriever config and index owner user_id
  - All executors: manage DB session lifecycle with proper cleanup (get_session -> next -> try/finally close)
  - Tests: updated to mock get_session and ConfigResolver methods on executor instances; added tests for config-resolution error handling and DB session cleanup
- Add SQLAlchemy session management, the KnowledgeDocumentReadOnly model, and database initialization in the main.py lifespan. Config now supports the DATABASE_URL environment variable.
- RemoteRagGateway now passes only knowledge_base_id + user_id instead of full configs. ConnectionTestRuntimeSpec changed to use knowledge_base_id + user_id. The retriever test-connection endpoint is simplified to test directly.
- …or DB session: Replace the manual get_session()/try/finally/close pattern with FastAPI's Depends(get_db) injection. Executors now receive db in the constructor; the session lifecycle is managed by the framework. Add resolve_admin_config() to ConfigResolver to avoid accessing the private _get_knowledge_base.
- …perations: Replace the resolve_retriever_config + _get_knowledge_base combination with a single resolve_admin_config call. Remove the resolve_retriever_config public method since it is no longer used outside ConfigResolver.
- Move process_custom_headers_placeholders and supporting functions from Backend's model_resolver.py and KR's config_resolver.py to a shared module. Both Backend and KR now use the same implementation.
- …t-connection: Revert ConnectionTestRuntimeSpec back to retriever_config (not modifying runtime_specs.py). Restore test_connection_local to use spec.retriever_config directly (no DB query needed). Remove test_connection from RemoteRagGateway since the retrievers endpoint tests storage directly.
- Move RuntimeRetrieverConfig, RuntimeEmbeddingModelConfig, and RuntimeRetrievalConfig from knowledge_runtime_protocol.py to a dedicated runtime_config.py module. Re-export from both the protocol module and __init__.py for backward compatibility.
- …e mode: RemotePurgeKnowledgeIndexRequest and RemoteDropKnowledgeIndexRequest no longer have index_owner_user_id and retriever_config fields. Switch from build_internal_* to build_public_* runtime spec builders that resolve configs from the DB. Add DATABASE_URL to docker-compose for KR.
- Key changes based on a comparison of the zy_kb_runtime and wegent/refactor/kr-config-reference-mode branches:
  - Switch the KR DB layer to shared.db.sync_session (reduces duplication)
  - Use KnowledgeDocument inheriting the shared Base (id + attachment_id + splitter_config)
  - Rename AdminConfig to AdminResolvedConfig (FETCH_HEAD pattern)
  - Remove the KR test-connection endpoint (Backend retrievers test directly)
  - Remove the RemoteTestConnectionRequest protocol model
  - Restore Backend /internal/rag Bearer Token auth (verify_internal_service_token)
  - Keep HEAD choices: ConfigResolutionError, required user_id, single query config, group retriever fallback, removal of retrieval_policy/index_families
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments during an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Refactors runtime configuration from request-embedded "value mode" to database-backed "reference mode" via a new ConfigResolver. Adds internal service-token auth for RAG routes, removes remote test-connection RPCs in favor of direct storage backend testing, and updates executors, endpoints, and tests to resolve configs from the DB.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant API as Knowledge Runtime Endpoint
    participant Exec as Executor
    participant Resolver as ConfigResolver
    participant DB as Database
    participant Storage as Storage Backend

    Client->>API: HTTP request (user_id, knowledge_base_id)
    API->>Exec: instantiate Executor(db_session)
    Exec->>Resolver: resolve_*_config(db, knowledge_base_id, user_id)
    Resolver->>DB: load KB, kinds, user, documents
    DB-->>Resolver: KB/kind/user/document records
    Resolver->>Resolver: decrypt creds, process placeholders
    Resolver-->>Exec: RuntimeRetrieverConfig, RuntimeEmbeddingModelConfig, RetrievalConfig
    Exec->>Storage: create_storage_backend_from_config(resolved_retriever_config)
    Storage-->>Exec: storage backend instance
    Exec->>Storage: perform operation (index/query/purge)
    Storage-->>Exec: results
    Exec-->>API: response payload
    API-->>Client: HTTP response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docker-compose.yml (1)
287-294: ⚠️ Potential issue | 🟠 Major

Default `DATABASE_URL` does not match the `mysql` service defaults; the rag profile will fail to connect on a fresh setup.

The default `mysql+pymysql://root:password@mysql:3306/wegent` mismatches the `mysql` service in this same compose file, where the defaults are `MYSQL_ROOT_PASSWORD=123456` (not `password`), and the application database/user are `task_manager`/`task_user`/`task_password` (no `wegent` DB is created). The `backend` service correctly uses `${MYSQL_USER:-task_user}:${MYSQL_PASSWORD:-task_password}@mysql:3306/${MYSQL_DATABASE:-task_manager}`. With the defaults shipped here, `knowledge_runtime` (rag profile) will fail to start once `init_db(settings.database_url)` runs.

Additionally, `depends_on` only lists `backend`; since this service now requires DB connectivity at startup, it should also wait for `mysql: service_healthy` (matching what `backend` does).

🛠 Proposed fix to align defaults with mysql service and add health gate
```diff
       # Database connection for config resolution (reference-mode)
-      - DATABASE_URL=${DATABASE_URL:-mysql+pymysql://root:password@mysql:3306/wegent}
+      - DATABASE_URL=${DATABASE_URL:-mysql+pymysql://${MYSQL_USER:-task_user}:${MYSQL_PASSWORD:-task_password}@mysql:3306/${MYSQL_DATABASE:-task_manager}}
       # OpenTelemetry Configuration (uncomment to enable)
       # - OTEL_ENABLED=true
       # - OTEL_SERVICE_NAME=wegent-knowledge-runtime
       # - OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
     depends_on:
-      - backend
+      mysql:
+        condition: service_healthy
+      backend:
+        condition: service_started
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yml` around lines 287 - 294, The default DATABASE_URL value and startup ordering are wrong: update the DATABASE_URL default to match the compose mysql defaults used by the backend (use MYSQL_USER/MYSQL_PASSWORD/MYSQL_DATABASE defaults — e.g. mysql+pymysql://task_user:task_password@mysql:3306/task_manager or interpolate with ${MYSQL_USER:-task_user}:${MYSQL_PASSWORD:-task_password}@mysql:3306/${MYSQL_DATABASE:-task_manager}) so init_db(settings.database_url) can connect, and add mysql to the depends_on for this service with the same healthcheck gating used by backend (depends_on: - backend and - mysql with condition: service_healthy) so the service waits for the database to be healthy before startup.

knowledge_runtime/knowledge_runtime/services/admin_executor.py (1)
35-43: ⚠️ Potential issue | 🟡 Minor

Stale docstring entry: `test_connection`.

The class docstring still lists `test_connection: Test storage backend connection` even though this PR removes the `test_connection` method from `AdminExecutor`. Update the operation list to match.

📝 Proposed fix

```diff
     Operations:
     - delete_document_index: Delete a specific document's index
     - purge_knowledge_index: Delete all chunks for a knowledge base
     - drop_knowledge_index: Physically drop the index/collection
     - list_chunks: List all chunks in a knowledge base
-    - test_connection: Test storage backend connection
     """
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/knowledge_runtime/services/admin_executor.py` around lines 35 - 43, The class docstring for AdminExecutor lists an outdated operation "test_connection"; update the docstring inside the AdminExecutor class to remove the `test_connection: Test storage backend connection` bullet so the operations list only contains the current methods (delete_document_index, purge_knowledge_index, drop_knowledge_index, list_chunks), ensuring the docstring text matches the actual implemented methods in AdminExecutor.
🧹 Nitpick comments (9)
shared/models/runtime_config.py (1)
39-46: Consider validating weight presence vs. `retrieval_mode`.

`vector_weight`/`keyword_weight` are bounded to `[0,1]` individually, but nothing enforces that they are provided (and ideally sum to ~1.0) when `retrieval_mode == "hybrid"`, or rejected/ignored otherwise. Given this is the strict reference-mode contract (`extra="forbid"`), a `model_validator(mode="after")` would catch misconfigured KB specs at the boundary instead of silently producing wrong retrieval results downstream.

♻️ Suggested validator

```diff
-from pydantic import BaseModel, ConfigDict, Field
+from pydantic import BaseModel, ConfigDict, Field, model_validator
 @@
 class RuntimeRetrievalConfig(_RuntimeConfigModel):
     """Normalized retrieval config for a single knowledge base target."""

     top_k: int = Field(default=20, gt=0)
     score_threshold: float = Field(default=0.7, ge=0.0, le=1.0)
     retrieval_mode: RetrievalMode = "vector"
     vector_weight: float | None = Field(default=None, ge=0.0, le=1.0)
     keyword_weight: float | None = Field(default=None, ge=0.0, le=1.0)
+
+    @model_validator(mode="after")
+    def _check_hybrid_weights(self) -> "RuntimeRetrievalConfig":
+        if self.retrieval_mode == "hybrid":
+            if self.vector_weight is None or self.keyword_weight is None:
+                raise ValueError(
+                    "hybrid retrieval_mode requires both vector_weight and keyword_weight"
+                )
+        return self
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@shared/models/runtime_config.py` around lines 39 - 46, RuntimeRetrievalConfig currently allows vector_weight and keyword_weight independently but does not enforce presence/relationship vs. retrieval_mode; add a pydantic model_validator(mode="after") on RuntimeRetrievalConfig to: when retrieval_mode == "hybrid" require both vector_weight and keyword_weight are not None and their sum is approximately 1.0 (or within a small epsilon), and when retrieval_mode != "hybrid" require both weights are None (or raise) so extra="forbid" surfaces bad specs; raise ValueError with clear messages on violations to block invalid configs.

knowledge_runtime/knowledge_runtime/models/knowledge_document.py (1)
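The boundary check suggested for `RuntimeRetrievalConfig` above can be sketched without pydantic as a dependency-free dataclass. All names here are illustrative stand-ins, not the project's real model; the `__post_init__` plays the role of `model_validator(mode="after")`:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetrievalConfigSketch:
    # illustrative mirror of the fields discussed in the review comment
    retrieval_mode: str = "vector"
    vector_weight: Optional[float] = None
    keyword_weight: Optional[float] = None

    def __post_init__(self) -> None:
        # same intent as the suggested model_validator(mode="after")
        if self.retrieval_mode == "hybrid":
            if self.vector_weight is None or self.keyword_weight is None:
                raise ValueError("hybrid retrieval_mode requires both weights")
            if abs(self.vector_weight + self.keyword_weight - 1.0) > 1e-6:
                raise ValueError("hybrid weights should sum to ~1.0")


ok = RetrievalConfigSketch(retrieval_mode="hybrid", vector_weight=0.7, keyword_weight=0.3)
print(ok.retrieval_mode)  # hybrid

try:
    RetrievalConfigSketch(retrieval_mode="hybrid", vector_weight=0.7)
except ValueError as e:
    print(e)  # hybrid retrieval_mode requires both weights
```

The point of validating at construction time is that a misconfigured KB spec fails loudly at the boundary instead of silently degrading retrieval quality downstream.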
26-26: Use a callable (`default=dict`) instead of a mutable literal `{}`.

`Column(JSON, ..., default={})` binds the same `{}` instance as the column-level Python default; if anywhere in the codebase that resolved value is mutated before flush (e.g. `doc.splitter_config["foo"] = ...` on a freshly created instance) the mutation will leak across rows. The idiomatic and safe form is to pass a callable, which SQLAlchemy invokes per insert to produce a fresh dict.

🛡️ Proposed fix

```diff
-    splitter_config = Column(JSON, nullable=False, default={})
+    splitter_config = Column(JSON, nullable=False, default=dict)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/knowledge_runtime/models/knowledge_document.py` at line 26, The Column definition for splitter_config uses a mutable literal default={} which can be shared across instances; change the Column(JSON, nullable=False, default={}) to use a callable default (e.g., default=dict) so SQLAlchemy creates a fresh dict per row; update the splitter_config Column declaration in the KnowledgeDocument model to use default=dict to avoid shared-mutable-state bugs.

backend/app/api/endpoints/adapter/retrievers.py (1)
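The shared-instance hazard behind the `splitter_config` finding above can be demonstrated in plain Python, where a mutable default argument behaves like SQLAlchemy's column-level Python default (evaluated once, shared thereafter). The function names are illustrative only:

```python
def make_row_shared(config={}):  # mutable literal: one dict shared by every call
    return config


def make_row_fresh(config=None):
    # callable-style default: a new dict per call, like SQLAlchemy's default=dict
    return dict() if config is None else config


a = make_row_shared()
b = make_row_shared()
a["chunk_size"] = 512
print(b)  # {'chunk_size': 512}  <- the mutation leaked into a "different" row

c = make_row_fresh()
d = make_row_fresh()
c["chunk_size"] = 512
print(d)  # {}  <- each call gets its own dict
```

With `default=dict`, SQLAlchemy calls the callable per insert, so every row starts from its own empty dict.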
257-276: Minor: prefer `logger.exception` and consider sanitizing the surfaced error string.

Two small things in the new error path:

- `logger.error(f"Retriever connection test failed: {str(e)}")` discards the traceback. Using `logger.exception(...)` here would preserve the stack trace — important for diagnosing backend driver failures (qdrant/elasticsearch client errors often wrap the underlying exception).
- Returning `f"Connection failed: {str(e)}"` to the client can echo back library-internal details (and sometimes the URL/credential context embedded in the exception by some clients). Since this endpoint accepts user-supplied credentials, consider surfacing a generic message to the client and logging the full exception detail server-side only.

♻️ Suggested change

```diff
     except Exception as e:
-        logger.error(f"Retriever connection test failed: {str(e)}")
-        return {"success": False, "message": f"Connection failed: {str(e)}"}
+        logger.exception("Retriever connection test failed")
+        return {"success": False, "message": "Connection failed"}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/app/api/endpoints/adapter/retrievers.py` around lines 257 - 276, The error handler in the retriever connection test currently logs without a traceback and returns the full exception string to the client; change the except Exception block to call logger.exception("Retriever connection test failed") to preserve the stack trace and replace the client-facing message with a sanitized generic string (e.g. "Connection failed") while logging the full exception server-side; keep the ValueError branch unchanged and ensure this affects the code path around create_storage_backend_from_config and the awaited storage_backend.test_connection call.

knowledge_runtime/tests/test_admin_executor.py (1)
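The `logger.exception` behavior recommended in the retrievers comment above is standard library behavior and easy to verify in isolation: called inside an `except` block, it logs at ERROR level and appends the active traceback. A minimal self-contained check (logger name and messages are illustrative):

```python
import io
import logging

buf = io.StringIO()
logger = logging.getLogger("retriever_test_demo")
logger.addHandler(logging.StreamHandler(buf))
logger.setLevel(logging.ERROR)

try:
    raise ConnectionError("connect refused")
except Exception:
    # logs the message at ERROR level *and* the traceback of the active exception
    logger.exception("Retriever connection test failed")

out = buf.getvalue()
print("Traceback (most recent call last):" in out)  # True
print("ConnectionError" in out)                     # True
```

The client-facing payload can then stay generic (`"Connection failed"`) while the full detail lives only in server logs.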
62-71: Optional: extract the repeated `resolve_admin_config` mock into a fixture/helper.

The same `admin_executor._config_resolver.resolve_admin_config = MagicMock(return_value=AdminResolvedConfig(index_owner_user_id=7, retriever_config=mock_retriever_config))` block is duplicated across all five test methods. A small helper or autouse fixture parameterized by `mock_retriever_config` would remove ~25 lines of duplication and make future changes to the resolver contract a single-point edit.

♻️ Example helper

```diff
+@pytest.fixture
+def patch_resolve_admin_config(admin_executor, mock_retriever_config):
+    admin_executor._config_resolver.resolve_admin_config = MagicMock(
+        return_value=AdminResolvedConfig(
+            index_owner_user_id=7,
+            retriever_config=mock_retriever_config,
+        )
+    )
+    return admin_executor._config_resolver.resolve_admin_config
```

Then each test can simply request `patch_resolve_admin_config` instead of repeating the inline assignment.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_admin_executor.py` around lines 62 - 71, Duplicate test setup: extract the repeated mock assignment of admin_executor._config_resolver.resolve_admin_config into a reusable pytest fixture/helper (e.g., patch_resolve_admin_config) that returns or applies MagicMock(return_value=AdminResolvedConfig(index_owner_user_id=7, retriever_config=mock_retriever_config)); update each test to accept this fixture (or call the helper) instead of repeating the inline assignment; keep the existing patch of create_storage_backend_from_runtime_config and reference the same mock_retriever_config in the fixture so changes to the resolver contract are centralized.

knowledge_runtime/knowledge_runtime/main.py (1)
36-43: Consider fail-fast when DB-backed flows are mandatory.

With `database_url` empty, the service starts successfully but every endpoint that depends on `ConfigResolver` (index/query/admin) will fail at request time with no early signal. If DB-backed config resolution is now the only supported mode in production, raising at startup (or at least making this conditional on a `STANDALONE_MODE`-like flag) would surface misconfiguration immediately rather than after the first failed request. The current warning-only behavior is fine for dev but is easy to miss in container logs.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/knowledge_runtime/main.py` around lines 36 - 43, The startup currently only warns when settings.database_url is missing but several endpoints rely on DB-backed ConfigResolver (init_db); change startup to fail-fast in production by raising an error when settings.database_url is empty unless an explicit opt-out flag is set (e.g., STANDALONE_MODE or settings.standalone_mode). Locate the block that checks settings.database_url and either call init_db(settings.database_url) as today or, if missing and not in standalone mode, raise a RuntimeError (or exit with non-zero) with a clear message indicating DB-backed config resolution is required; keep the current warning behavior only when standalone mode is enabled or when a dev flag is present so dev runs remain unaffected.

knowledge_runtime/knowledge_runtime/services/index_executor.py (1)
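The fail-fast startup policy suggested for `main.py` above can be sketched as a small guard. `SettingsSketch`, `standalone_mode`, and the return strings are hypothetical stand-ins for the service's real settings object and lifespan code:

```python
class SettingsSketch:
    """Hypothetical stand-in for the service settings discussed above."""

    def __init__(self, database_url: str = "", standalone_mode: bool = False):
        self.database_url = database_url
        self.standalone_mode = standalone_mode


def startup(settings: SettingsSketch) -> str:
    # Fail fast when the DB is mandatory; warn only under an explicit opt-out flag.
    if not settings.database_url:
        if settings.standalone_mode:
            return "warning: DB-backed config resolution disabled (standalone mode)"
        raise RuntimeError(
            "DATABASE_URL is required: reference-mode config resolution needs the DB"
        )
    return "db initialized"


print(startup(SettingsSketch(database_url="mysql+pymysql://u:p@mysql/db")))  # db initialized
print(startup(SettingsSketch(standalone_mode=True)))
```

This surfaces misconfiguration at container start, where orchestrators can restart or alert, instead of at the first failed request.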
36-112: LGTM!

The DB-backed resolution flow is wired cleanly: `ConfigResolver.resolve_index_config` is invoked first, and downstream calls correctly consume `config.retriever_config`, `config.embedding_model_config`, `config.index_owner_user_id`, and `config.splitter_config` instead of request-supplied fields. Conversion of log statements to parameterized logging is a nice ancillary improvement.

Minor optional note: `execute` is now ~70 lines, slightly over the preferred 50-line guideline. Splitting the "resolve config + fetch content" prelude from the "build pipeline + index" portion into small private helpers could improve readability, but is non-blocking.

As per coding guidelines: "Maximum function length of 50 lines preferred for all functions".
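The decomposition hinted at above (prelude vs. pipeline) can be shown in miniature. Everything here is a hypothetical sketch; the real `IndexExecutor` resolves configs via `ConfigResolver` and indexes through a storage backend, neither of which appears in this toy:

```python
class IndexExecutorSketch:
    """Illustrative split of a long orchestration method into two helpers."""

    def execute(self, request: dict) -> dict:
        config, content = self._resolve_and_fetch(request)
        return self._build_and_index(config, content)

    def _resolve_and_fetch(self, request: dict) -> tuple[dict, str]:
        # prelude: config resolution + content fetching (stubbed here)
        config = {"kb_id": request["knowledge_base_id"], "splitter": {"chunk_size": 512}}
        content = f"content-for-{request['document_id']}"
        return config, content

    def _build_and_index(self, config: dict, content: str) -> dict:
        # pipeline build + indexing: chunk by the resolved splitter config
        size = config["splitter"]["chunk_size"]
        chunks = [content[i:i + size] for i in range(0, len(content), size)]
        return {"kb_id": config["kb_id"], "chunks_indexed": len(chunks)}


result = IndexExecutorSketch().execute({"knowledge_base_id": 1, "document_id": 42})
print(result)  # {'kb_id': 1, 'chunks_indexed': 1}
```

Each helper stays well under the 50-line guideline and can be unit-tested in isolation.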
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/knowledge_runtime/services/index_executor.py` around lines 36 - 112, The execute method exceeds the preferred 50-line guideline; refactor it by extracting the prelude (config resolution and content fetching) into a private helper (e.g., _resolve_config_and_fetch that calls ConfigResolver.resolve_index_config and ContentFetcher.fetch and returns config, binary_data, source_file, file_extension) and the pipeline build + indexing into another helper (e.g., _build_and_index that creates storage via create_storage_backend_from_runtime_config, embedding model via create_embedding_model_from_runtime_config, constructs DocumentService and calls DocumentService.index_document_from_binary), then have execute call these two helpers and return the result so execute remains concise while preserving all existing behavior and logging.

knowledge_runtime/tests/test_config_resolver.py (1)
668-671: Mock target is the wrong module — these patches are no-ops.

`config_resolver.py` does `from shared.utils.placeholder import process_custom_headers_placeholders` at module load, binding a local reference inside `knowledge_runtime.services.config_resolver`. Patching `shared.utils.placeholder.process_custom_headers_placeholders` does not rebind that local reference, so production code keeps calling the real function. In these particular tests, `custom_headers` happens to be empty (or the assertion doesn't depend on the substitution), so the tests still pass — but the patches are misleading and won't help if a future test relies on them.

If you intend to mock the helper used by `_build_resolved_embedding_model_config`, patch the imported name where it's used:

♻️ Suggested change

```diff
-    patch(
-        "shared.utils.placeholder.process_custom_headers_placeholders",
-        side_effect=lambda h, u: h,
-    ),
+    patch(
+        "knowledge_runtime.services.config_resolver."
+        "process_custom_headers_placeholders",
+        side_effect=lambda h, u: h,
+    ),
```

Also applies to: 711-714, 746-749
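The "patch where it's used" rule above can be reproduced end to end with two fabricated modules. The module and function names (`defining_mod`, `consumer_mod`, `process_headers`) are stand-ins, not the project's real modules:

```python
import sys
import types
from unittest.mock import patch

# a "defining" module, playing the role of shared.utils.placeholder
defining = types.ModuleType("defining_mod")
def process_headers(headers, user_id):
    return {"X-Source": "real"}
defining.process_headers = process_headers
sys.modules["defining_mod"] = defining

# a "consumer" module that binds the helper at import time (a from-import)
consumer = types.ModuleType("consumer_mod")
consumer.process_headers = defining.process_headers
def resolve():
    return consumer.process_headers({}, 1)
consumer.resolve = resolve
sys.modules["consumer_mod"] = consumer

# Patching the defining module does NOT rebind the consumer's local name:
with patch("defining_mod.process_headers", return_value={"X-Source": "mock"}):
    print(consumer.resolve())  # {'X-Source': 'real'}

# Patching the name where it is used takes effect:
with patch("consumer_mod.process_headers", return_value={"X-Source": "mock"}):
    print(consumer.resolve())  # {'X-Source': 'mock'}
```

This is exactly why the suggested change targets `knowledge_runtime.services.config_resolver.process_custom_headers_placeholders` rather than the defining module.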
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 668 - 671, The test patches are targeting the helper in its original module, which is a no-op because config_resolver.py imports process_custom_headers_placeholders at module load; update the patches to target the name where it is imported and used (the bound symbol in the config resolver module) — i.e., replace patches of "shared.utils.placeholder.process_custom_headers_placeholders" with "knowledge_runtime.services.config_resolver.process_custom_headers_placeholders" so that calls inside _build_resolved_embedding_model_config (and any other functions in that module) are properly mocked; apply this change to the three patch sites flagged in the tests.

knowledge_runtime/knowledge_runtime/services/config_resolver.py (1)
433-441: Redundant blanket `except Exception` (ruff BLE001).

`shared.utils.crypto.decrypt_api_key` already returns the original value when the input is not encrypted and otherwise propagates only crypto-specific failures from `decrypt_sensitive_data`. Catching every `Exception` here both hides real bugs (e.g., a `TypeError` from passing a non-string into `value.startswith` etc.) and is redundant for the documented "backward-compatibility for plaintext" path. Either narrow the `except` to the specific decryption errors raised by `decrypt_sensitive_data`, or rely on `decrypt_api_key` and drop the try/except.

♻️ Proposed change

```diff
     @staticmethod
     def _decrypt_optional_value(value: Any) -> Any:
         """Decrypt an optional encrypted value. Returns original if decryption fails."""
         if not value:
             return value
         try:
             return decrypt_api_key(value)
-        except Exception:
+        except Exception:  # noqa: BLE001  # best-effort: keep ciphertext on decrypt failure
+            logger.warning("Failed to decrypt sensitive value; using original.")
             return value
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/knowledge_runtime/services/config_resolver.py` around lines 433 - 441, The helper _decrypt_optional_value currently masks all exceptions with a blanket except Exception which hides real bugs; remove the try/except and simply return decrypt_api_key(value) (keeping the early "if not value: return value" check) so decryption errors from decrypt_sensitive_data propagate correctly; alternatively, if you must swallow only crypto failures, import and catch the specific exception thrown by decrypt_sensitive_data (e.g., DecryptionError) instead of Exception, referencing decrypt_api_key and decrypt_sensitive_data in your change.

knowledge_runtime/knowledge_runtime/services/query_executor.py (1)
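The narrow-except alternative above can be illustrated in isolation. `DecryptionError` and `decrypt_value` here are hypothetical stand-ins for the project's crypto helpers, using an `enc:` prefix purely for the demo:

```python
class DecryptionError(Exception):
    """Stand-in for a crypto-specific failure from the decryption helper."""


def decrypt_value(value: str) -> str:
    if not value.startswith("enc:"):
        return value  # plaintext passthrough (backward compatibility)
    if value == "enc:badkey":
        raise DecryptionError("bad key")
    return value.removeprefix("enc:")


def decrypt_optional(value):
    if not value:
        return value
    try:
        return decrypt_value(value)
    except DecryptionError:  # narrow: only crypto failures fall back to ciphertext
        return value


print(decrypt_optional("plain"))       # plain
print(decrypt_optional("enc:s3cr3t"))  # s3cr3t
print(decrypt_optional("enc:badkey"))  # enc:badkey

# A non-string input is a real bug, and now surfaces instead of being swallowed:
try:
    decrypt_optional(123)
except AttributeError as e:
    print(type(e).__name__)  # AttributeError
```

A blanket `except Exception` would have silently returned `123` in the last case, hiding the caller's type bug.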
52-60: Optional: parallelize per-KB queries for multi-KB requests.

`execute()` resolves config and runs each KB query strictly sequentially. With a multi-KB request, latency scales linearly with `len(knowledge_base_ids)`, even though each KB query is independent and the resolution / underlying retrieval are I/O-bound. Consider running them concurrently with `asyncio.gather` (still bounded by storage backend connection limits) to reduce tail latency for queries spanning several knowledge bases.

⚡ Sketch

```diff
-    for knowledge_base_id in request.knowledge_base_ids:
-        records = await self._query_knowledge_base(
-            request=request,
-            knowledge_base_id=knowledge_base_id,
-        )
-        all_records.extend(records)
+    results = await asyncio.gather(
+        *(
+            self._query_knowledge_base(
+                request=request,
+                knowledge_base_id=kb_id,
+            )
+            for kb_id in request.knowledge_base_ids
+        )
+    )
+    for records in results:
+        all_records.extend(records)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/knowledge_runtime/services/query_executor.py` around lines 52 - 60, execute() currently iterates request.knowledge_base_ids and awaits _query_knowledge_base sequentially, causing linear latency; change it to run per-KB queries concurrently using asyncio.gather on tasks created from calling self._query_knowledge_base(request=request, knowledge_base_id=kb_id) for each kb_id, then flatten/extend the results into all_records; optionally wrap coroutine creation with a bounded semaphore or an async pool to limit concurrent connections to the storage backend if needed.
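A runnable miniature of the gather-plus-semaphore pattern suggested above; `query_kb` is a stand-in for `_query_knowledge_base`, and the semaphore size is an assumed connection limit:

```python
import asyncio


async def query_kb(kb_id: int, sem: asyncio.Semaphore) -> list[str]:
    # stand-in for the per-KB query; names here are illustrative
    async with sem:                # bound concurrency toward the storage backend
        await asyncio.sleep(0.01)  # simulated I/O
        return [f"kb{kb_id}-rec1", f"kb{kb_id}-rec2"]


async def execute(kb_ids: list[int]) -> list[str]:
    sem = asyncio.Semaphore(4)     # assumed backend connection limit
    per_kb = await asyncio.gather(*(query_kb(k, sem) for k in kb_ids))
    # gather preserves input order, so flattening keeps results deterministic
    return [rec for records in per_kb for rec in records]


records = asyncio.run(execute([1, 2, 3]))
print(records)
# ['kb1-rec1', 'kb1-rec2', 'kb2-rec1', 'kb2-rec2', 'kb3-rec1', 'kb3-rec2']
```

Because `asyncio.gather` returns results in argument order, the merged record list stays deterministic even though the underlying queries overlap in time.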
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@knowledge_runtime/knowledge_runtime/services/config_resolver.py`:
- Around line 355-379: The current fallback query hard-codes Kind.namespace ==
"default", which can leak a public retriever into a tenant namespace; update the
fallback in the retriever lookup to preserve the requested namespace (use
Kind.namespace == namespace instead of "default") so lookups remain scoped to
the requested tenant for the Kind with Kind.name == name and Kind.kind ==
"Retriever"; also ensure this change is applied where the retriever lookup logic
(the Kind query using variables name and namespace) occurs and add no-op
behavior otherwise (leave user_id==0 fallback as-is or remove if not needed by
design).
In `@knowledge_runtime/tests/test_config_resolver.py`:
- Around line 119-129: The sentinel pattern in _make_document is broken because
the default argument splitter_config=object() creates a new object at definition
time and the check splitter_config is object() creates a different object at
call time, so the identity test always fails and the default never applies; fix
by introducing a module-level sentinel (e.g., _SENTINEL = object()) or use None
as the default and then in _make_document check if splitter_config is _SENTINEL
(or is None) and only then set doc.splitter_config = {"chunk_size": 512}; update
the function signature and the conditional that assigns doc.splitter_config
accordingly so omitted calls receive the proper default.
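The sentinel fix described in the prompt above hinges on identity: `object()` in a default and `object()` in a comparison are two different instances, so `splitter_config is object()` is always False. A working module-level sentinel, sketched with a plain dict in place of the test's document object (`make_document` and the field values are illustrative):

```python
# module-level sentinel: created once, so its identity is stable across calls;
# by contrast, `splitter_config is object()` compares against a brand-new object
_SENTINEL = object()


def make_document(splitter_config=_SENTINEL):
    doc = {}
    if splitter_config is _SENTINEL:  # identity check against the shared sentinel
        doc["splitter_config"] = {"chunk_size": 512}
    else:
        doc["splitter_config"] = splitter_config
    return doc


print(make_document())                               # default applies
print(make_document(splitter_config=None))           # explicit None is preserved
print(make_document(splitter_config={"chunk_size": 64}))
```

The sentinel (rather than `None` as the default) lets callers pass `splitter_config=None` explicitly and still distinguish it from "argument omitted".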
---
Outside diff comments:
In `@docker-compose.yml`:
- Around line 287-294: The default DATABASE_URL value and startup ordering are
wrong: update the DATABASE_URL default to match the compose mysql defaults used
by the backend (use MYSQL_USER/MYSQL_PASSWORD/MYSQL_DATABASE defaults — e.g.
mysql+pymysql://task_user:task_password@mysql:3306/task_manager or interpolate
with
${MYSQL_USER:-task_user}:${MYSQL_PASSWORD:-task_password}@mysql:3306/${MYSQL_DATABASE:-task_manager})
so init_db(settings.database_url) can connect, and add mysql to the depends_on
for this service with the same healthcheck gating used by backend (depends_on: -
backend and - mysql with condition: service_healthy) so the service waits for
the database to be healthy before startup.
In `@knowledge_runtime/knowledge_runtime/services/admin_executor.py`:
- Around line 35-43: The class docstring for AdminExecutor lists an outdated
operation "test_connection"; update the docstring inside the AdminExecutor class
to remove the `test_connection: Test storage backend connection` bullet so the
operations list only contains the current methods (delete_document_index,
purge_knowledge_index, drop_knowledge_index, list_chunks), ensuring the
docstring text matches the actual implemented methods in AdminExecutor.
---
Nitpick comments:
In `@backend/app/api/endpoints/adapter/retrievers.py`:
- Around line 257-276: The error handler in the retriever connection test
currently logs without a traceback and returns the full exception string to the
client; change the except Exception block to call logger.exception("Retriever
connection test failed") to preserve the stack trace and replace the
client-facing message with a sanitized generic string (e.g. "Connection failed")
while logging the full exception server-side; keep the ValueError branch
unchanged and ensure this affects the code path around
create_storage_backend_from_config and the awaited
storage_backend.test_connection call.
In `@knowledge_runtime/knowledge_runtime/main.py`:
- Around line 36-43: The startup currently only warns when settings.database_url
is missing but several endpoints rely on DB-backed ConfigResolver (init_db);
change startup to fail-fast in production by raising an error when
settings.database_url is empty unless an explicit opt-out flag is set (e.g.,
STANDALONE_MODE or settings.standalone_mode). Locate the block that checks
settings.database_url and either call init_db(settings.database_url) as today
or, if missing and not in standalone mode, raise a RuntimeError (or exit with
non-zero) with a clear message indicating DB-backed config resolution is
required; keep the current warning behavior only when standalone mode is enabled
or when a dev flag is present so dev runs remain unaffected.
In `@knowledge_runtime/knowledge_runtime/models/knowledge_document.py`:
- Line 26: The Column definition for splitter_config uses a mutable literal
default={} which can be shared across instances; change the Column(JSON,
nullable=False, default={}) to use a callable default (e.g., default=dict) so
SQLAlchemy creates a fresh dict per row; update the splitter_config Column
declaration in the KnowledgeDocument model to use default=dict to avoid
shared-mutable-state bugs.
In `@knowledge_runtime/knowledge_runtime/services/config_resolver.py`:
- Around line 433-441: The helper _decrypt_optional_value currently masks all
exceptions with a blanket except Exception which hides real bugs; remove the
try/except and simply return decrypt_api_key(value) (keeping the early "if not
value: return value" check) so decryption errors from decrypt_sensitive_data
propagate correctly; alternatively, if you must swallow only crypto failures,
import and catch the specific exception thrown by decrypt_sensitive_data (e.g.,
DecryptionError) instead of Exception, referencing decrypt_api_key and
decrypt_sensitive_data in your change.
In `@knowledge_runtime/knowledge_runtime/services/index_executor.py`:
- Around line 36-112: The execute method exceeds the preferred 50-line
guideline; refactor it by extracting the prelude (config resolution and content
fetching) into a private helper (e.g., _resolve_config_and_fetch that calls
ConfigResolver.resolve_index_config and ContentFetcher.fetch and returns config,
binary_data, source_file, file_extension) and the pipeline build + indexing into
another helper (e.g., _build_and_index that creates storage via
create_storage_backend_from_runtime_config, embedding model via
create_embedding_model_from_runtime_config, constructs DocumentService and calls
DocumentService.index_document_from_binary), then have execute call these two
helpers and return the result so execute remains concise while preserving all
existing behavior and logging.
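The shape of that refactor might look like this; every body below is a stub standing in for the real resolver, fetcher, and pipeline calls, and only the two-helper structure is the point:

```python
class IndexExecutor:
    """Structural sketch: execute() delegates to two private helpers."""

    def execute(self, request: dict) -> dict:
        config, binary_data, source_file, file_extension = (
            self._resolve_config_and_fetch(request)
        )
        return self._build_and_index(config, binary_data, source_file, file_extension)

    def _resolve_config_and_fetch(self, request: dict):
        # stand-in for ConfigResolver.resolve_index_config + ContentFetcher.fetch
        config = {"knowledge_base_id": request["knowledge_base_id"]}
        return config, b"file bytes", "doc.md", ".md"

    def _build_and_index(self, config, binary_data, source_file, file_extension):
        # stand-in for storage/embedding construction and
        # DocumentService.index_document_from_binary
        return {"indexed": source_file, "bytes": len(binary_data), "config": config}
```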
In `@knowledge_runtime/knowledge_runtime/services/query_executor.py`:
- Around line 52-60: execute() currently iterates request.knowledge_base_ids and
awaits _query_knowledge_base sequentially, causing linear latency; change it to
run per-KB queries concurrently using asyncio.gather on tasks created from
calling self._query_knowledge_base(request=request, knowledge_base_id=kb_id) for
each kb_id, then flatten/extend the results into all_records; optionally wrap
coroutine creation with a bounded semaphore or an async pool to limit concurrent
connections to the storage backend if needed.
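A sketch of the concurrent fan-out with a bounded semaphore; `_query_knowledge_base` below is a stub for the real per-KB retrieval call:

```python
import asyncio


async def _query_knowledge_base(kb_id: int) -> list[str]:
    await asyncio.sleep(0)  # real code would hit the storage backend here
    return [f"record-kb{kb_id}"]


async def execute(knowledge_base_ids: list[int], max_concurrency: int = 4) -> list[str]:
    sem = asyncio.Semaphore(max_concurrency)  # bound concurrent backend connections

    async def bounded(kb_id: int) -> list[str]:
        async with sem:
            return await _query_knowledge_base(kb_id)

    # gather preserves input order, so results line up with knowledge_base_ids
    per_kb = await asyncio.gather(*(bounded(kb) for kb in knowledge_base_ids))
    all_records: list[str] = []
    for records in per_kb:
        all_records.extend(records)
    return all_records
```

Latency becomes roughly the slowest single KB query rather than the sum across all KBs.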
In `@knowledge_runtime/tests/test_admin_executor.py`:
- Around line 62-71: Duplicate test setup: extract the repeated mock assignment
of admin_executor._config_resolver.resolve_admin_config into a reusable pytest
fixture/helper (e.g., patch_resolve_admin_config) that returns or applies
MagicMock(return_value=AdminResolvedConfig(index_owner_user_id=7,
retriever_config=mock_retriever_config)); update each test to accept this
fixture (or call the helper) instead of repeating the inline assignment; keep
the existing patch of create_storage_backend_from_runtime_config and reference
the same mock_retriever_config in the fixture so changes to the resolver
contract are centralized.
In `@knowledge_runtime/tests/test_config_resolver.py`:
- Around line 668-671: The test patches are targeting the helper in its original
module, which is a no-op because config_resolver.py imports
process_custom_headers_placeholders at module load; update the patches to target
the name where it is imported and used (the bound symbol in the config resolver
module) — i.e., replace patches of
"shared.utils.placeholder.process_custom_headers_placeholders" with
"knowledge_runtime.services.config_resolver.process_custom_headers_placeholders"
so that calls inside _build_resolved_embedding_model_config (and any other
functions in that module) are properly mocked; apply this change to the three
patch sites flagged in the tests.
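The defining-module vs usage-site distinction can be demonstrated with two throwaway modules (all names here are hypothetical, not the project's):

```python
import sys
import types
from unittest.mock import patch

# "cr_helpers" defines the function; "cr_resolver" imports it by name at
# module load, exactly like config_resolver.py imports the placeholder helper.
helpers = types.ModuleType("cr_helpers")
helpers.process_headers = lambda headers, user: headers
sys.modules["cr_helpers"] = helpers

resolver = types.ModuleType("cr_resolver")
exec(
    "from cr_helpers import process_headers\n"
    "def build(headers, user):\n"
    "    return process_headers(headers, user)\n",
    resolver.__dict__,
)
sys.modules["cr_resolver"] = resolver

# Patching the defining module is a no-op for the name already bound in cr_resolver:
with patch("cr_helpers.process_headers", return_value="mocked"):
    untouched = resolver.build({"X-User": "{user}"}, "alice")

# Patching the usage site intercepts the call:
with patch("cr_resolver.process_headers", return_value="mocked"):
    intercepted = resolver.build({"X-User": "{user}"}, "alice")

print(untouched)    # {'X-User': '{user}'}
print(intercepted)  # mocked
```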
In `@shared/models/runtime_config.py`:
- Around line 39-46: RuntimeRetrievalConfig currently allows vector_weight and
keyword_weight independently but does not enforce presence/relationship vs.
retrieval_mode; add a pydantic model_validator(mode="after") on
RuntimeRetrievalConfig to: when retrieval_mode == "hybrid" require both
vector_weight and keyword_weight are not None and their sum is approximately 1.0
(or within a small epsilon), and when retrieval_mode != "hybrid" require both
weights are None (or raise) so extra="forbid" surfaces bad specs; raise
ValueError with clear messages on violations to block invalid configs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4c40c1de-e68a-430e-9166-e5a85687496e
⛔ Files ignored due to path filters (2)
knowledge_runtime/uv.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (32)
backend/app/api/endpoints/adapter/retrievers.py
backend/app/api/endpoints/internal/rag.py
backend/app/services/auth/__init__.py
backend/app/services/auth/internal_service_token.py
backend/app/services/rag/embedding/factory.py
backend/app/services/rag/remote_gateway.py
backend/app/services/rag/runtime_resolver.py
backend/tests/api/endpoints/test_retrievers_api.py
backend/tests/services/rag/test_local_gateway.py
backend/tests/services/rag/test_remote_gateway.py
docker-compose.yml
knowledge_runtime/knowledge_runtime/api/endpoints/admin.py
knowledge_runtime/knowledge_runtime/api/endpoints/index.py
knowledge_runtime/knowledge_runtime/api/endpoints/query.py
knowledge_runtime/knowledge_runtime/config.py
knowledge_runtime/knowledge_runtime/main.py
knowledge_runtime/knowledge_runtime/models/__init__.py
knowledge_runtime/knowledge_runtime/models/knowledge_document.py
knowledge_runtime/knowledge_runtime/services/admin_executor.py
knowledge_runtime/knowledge_runtime/services/config_resolver.py
knowledge_runtime/knowledge_runtime/services/index_executor.py
knowledge_runtime/knowledge_runtime/services/query_executor.py
knowledge_runtime/pyproject.toml
knowledge_runtime/tests/test_admin_executor.py
knowledge_runtime/tests/test_config_resolver.py
knowledge_runtime/tests/test_index_executor.py
knowledge_runtime/tests/test_query_executor.py
shared/models/__init__.py
shared/models/knowledge_runtime_protocol.py
shared/models/runtime_config.py
shared/tests/test_knowledge_runtime_protocol.py
shared/utils/placeholder.py
…ument
The default argument `splitter_config=object()` creates a new object at definition time, but `splitter_config is object()` creates a different object at call time, so the identity check always failed. Replace with a module-level `_SENTINEL = object()` so the default applies correctly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (3)
knowledge_runtime/tests/test_config_resolver.py (3)
122-131: Type hint inconsistent with sentinel default.
`splitter_config: dict | None = _SENTINEL` is incorrect: `_SENTINEL` is `object()`, which is neither `dict` nor `None`. Type checkers will flag this. Use `Any` to match the broader type:
♻️ Suggested change
-from unittest.mock import MagicMock, patch
+from typing import Any
+from unittest.mock import MagicMock, patch
@@
-def _make_document(
-    document_id: int = 100, splitter_config: dict | None = _SENTINEL
-) -> MagicMock:
+def _make_document(
+    document_id: int = 100, splitter_config: Any = _SENTINEL
+) -> MagicMock:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 122 - 131, The type hint on _make_document's parameter is too narrow: `splitter_config: dict | None = _SENTINEL` is inconsistent because _SENTINEL is an object, not dict or None; update the annotation to a broader type (e.g., typing.Any or Any) so the default sentinel is valid, adjust the import to include Any if needed, and keep the function logic using _SENTINEL to detect the default branch (refer to function _make_document and the sentinel _SENTINEL).
1-1104: File exceeds the 1000-line guideline.
This file is 1104 lines. Consider splitting along the existing section banners (e.g., test_resolve_index_config.py, test_parse_kb_retrieval_config.py, test_build_resolved_*.py, test_helpers.py) and moving the shared `_make_*` factories into a conftest.py to keep each module focused.
As per coding guidelines: "if a file exceeds 1000 lines, split it into multiple sub-modules".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 1 - 1104, This test module is over 1000 lines; split it into focused files and centralize shared factories: create separate test files (e.g., tests for resolve_index_config, parse_kb_retrieval_config, build_resolved_* and helper functions) and move the shared factory helpers (_make_kb_kind, _make_retriever_kind, _make_model_kind, _make_document, _make_user, _SENTINEL) plus common fixtures (resolver, mock_db) into conftest.py; update imports in each new test file to use the fixtures and factories and keep each test file under 1000 lines while preserving existing test class and function names (e.g., TestResolveIndexConfig, TestParseKbRetrievalConfig, TestBuildResolvedRetrieverConfig, TestBuildResolvedEmbeddingModelConfig, TestProcessCustomHeadersPlaceholders) so tests run unchanged.
821-866: Tests don't actually validate the user-priority / fallback behavior.
These tests only assert `result is not None` after wiring `mock_db.query...first.return_value` to a fake. They don't verify which filter clauses were applied, the ordering by `user_id` (so user-specific records win over `user_id=0`), or that the public fallback kicks in only when the first lookup misses. The `test_group_namespace_fallback_to_public` case at least exercises `side_effect=[None, public_retriever]` but doesn't assert that two queries happened or that the second one used `user_id=0`.
Consider asserting on the calls made against `mock_db` (e.g., `mock_db.query.return_value.filter.call_args_list`) or capturing the `filter` arguments to confirm the resolver applies `user_id` priority and falls back to `user_id=0` for group namespaces. As per coding guidelines, "Kind resource is uniquely identified by THREE fields: `namespace`, `name`, and `user_id`; always filter by all three fields" — these tests should pin that contract.
Also applies to: 877-909
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 821 - 866, Update the three failing tests to assert the actual DB filter calls and ordering instead of only checking result non-null: in test_default_namespace_queries_with_user_priority, assert that mock_db.query.return_value.filter.return_value.order_by was called and inspect mock_db.query.return_value.filter.call_args_list to verify filters included name="test-retriever", namespace="default" and that order_by was invoked to prefer user_id=42 over user_id=0; in test_group_namespace_queries_without_user_id, assert the single filter call includes namespace="team-ns", name="test-retriever" and user_id=42 (or absence if intended) to match the three-field identification rule; in test_group_namespace_fallback_to_public, assert two separate query/filter sequences occurred (e.g., len(filter.call_args_list) == 2) and that the first used user_id=42 while the second used user_id=0 (public fallback) by inspecting the call args for user_id values. Ensure you reference resolver._get_retriever_kind and mock_db.query.return_value.filter.call_args_list when adding these assertions.
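A self-contained illustration of asserting on the chained-mock call list, as the prompts above suggest. `fake_get_retriever_kind` and its string "filter expressions" are stand-ins (a real resolver passes SQLAlchemy column comparisons); the point is that `MagicMock` records whatever arguments were passed, so tests can pin them:

```python
from unittest.mock import MagicMock

mock_db = MagicMock()


def fake_get_retriever_kind(db, name, namespace, user_id):
    # stand-in resolver: strings keep the demo dependency-free
    query = db.query("Kind").filter(f"name == {name!r}", f"namespace == {namespace!r}")
    if namespace == "default":
        # user-priority branch: include user_id filter and ordering
        query = query.filter(f"user_id in ({user_id}, 0)").order_by("user_id desc")
    return query.first()


result = fake_get_retriever_kind(mock_db, "test-retriever", "default", 42)
assert result is not None  # the weak assertion the review flags

# Stronger: inspect what was actually asked of the session.
filter_calls = mock_db.query.return_value.filter.call_args_list
assert any("name == 'test-retriever'" in str(c) for c in filter_calls)

chained = mock_db.query.return_value.filter.return_value.filter
assert any("user_id" in str(c) for c in chained.call_args_list)
```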
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@knowledge_runtime/tests/test_config_resolver.py`:
- Around line 670-673: The test patches currently target
"shared.utils.placeholder.process_custom_headers_placeholders" but
config_resolver.py imports the function directly, so the patch must target the
usage site; update all three patches to mock
"knowledge_runtime.services.config_resolver.process_custom_headers_placeholders"
(the same symbol imported into that module) and keep the side_effect=lambda h,
u: h to preserve passthrough behavior; ensure you change the three occurrences
that correspond to the tests referencing config_resolver's call site.
---
Nitpick comments:
In `@knowledge_runtime/tests/test_config_resolver.py`:
- Around line 122-131: The type hint on _make_document's parameter is too
narrow: `splitter_config: dict | None = _SENTINEL` is inconsistent because
_SENTINEL is an object, not dict or None; update the annotation to a broader
type (e.g., typing.Any or Any) so the default sentinel is valid, adjust the
import to include Any if needed, and keep the function logic using _SENTINEL to
detect the default branch (refer to function _make_document and the sentinel
_SENTINEL).
- Around line 1-1104: This test module is over 1000 lines; split it into focused
files and centralize shared factories: create separate test files (e.g., tests
for resolve_index_config, parse_kb_retrieval_config, build_resolved_* and helper
functions) and move the shared factory helpers (_make_kb_kind,
_make_retriever_kind, _make_model_kind, _make_document, _make_user, _SENTINEL)
plus common fixtures (resolver, mock_db) into conftest.py; update imports in
each new test file to use the fixtures and factories and keep each test file
under 1000 lines while preserving existing test class and function names (e.g.,
TestResolveIndexConfig, TestParseKbRetrievalConfig,
TestBuildResolvedRetrieverConfig, TestBuildResolvedEmbeddingModelConfig,
TestProcessCustomHeadersPlaceholders) so tests run unchanged.
- Around line 821-866: Update the three failing tests to assert the actual DB
filter calls and ordering instead of only checking result non-null: in
test_default_namespace_queries_with_user_priority, assert that
mock_db.query.return_value.filter.return_value.order_by was called and inspect
mock_db.query.return_value.filter.call_args_list to verify filters included
name="test-retriever", namespace="default" and that order_by was invoked to
prefer user_id=42 over user_id=0; in
test_group_namespace_queries_without_user_id, assert the single filter call
includes namespace="team-ns", name="test-retriever" and user_id=42 (or absence
if intended) to match the three-field identification rule; in
test_group_namespace_fallback_to_public, assert two separate query/filter
sequences occurred (e.g., len(filter.call_args_list) == 2) and that the first
used user_id=42 while the second used user_id=0 (public fallback) by inspecting
the call args for user_id values. Ensure you reference
resolver._get_retriever_kind and
mock_db.query.return_value.filter.call_args_list when adding these assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c03ddc0-753b-40f1-8afe-f954219ce2fe
📒 Files selected for processing (1)
knowledge_runtime/tests/test_config_resolver.py
- Fix broken sentinel pattern in _make_document (module-level _SENTINEL instead of per-call object()) - Update internal rag test payloads: replace old fields (index_owner_user_id, retriever_config) with reference-mode fields (knowledge_base_id, user_id) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py (1)
117-175: ⚠️ Potential issue | 🟠 Major
The endpoint ignores `user_id` from the request; the test assertion should verify it is forwarded to the builder, and the endpoint implementation should pass it.
`RemoteListChunksRequest` requires `user_id` (shared/models/knowledge_runtime_protocol.py:126), and the test correctly includes it in the payload (line 120). However:
- Endpoint implementation (backend/app/api/endpoints/internal/rag.py:655–661) does not forward `user_id` when calling `build_internal_list_chunks_runtime_spec`, unlike the parallel `purge` and `drop` endpoints which correctly pass `user_id=request.user_id` to their respective builders.
- Test assertion (lines 168–174) does not verify `user_id` is forwarded, masking the omission.
Update the endpoint to forward `user_id` (matching the purge/drop pattern), and add `user_id=9` to the assertion to pin the contract.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py` around lines 117 - 175, The endpoint is not forwarding user_id to the runtime builder; update the handler in rag.py to call build_internal_list_chunks_runtime_spec(..., user_id=request.user_id) (same pattern as purge/drop handlers) so the RemoteListChunksRequest contract is satisfied, and update the test test_internal_all_chunks_routes_protocol_request_through_local_gateway to assert mock_build_spec.assert_called_once_with(..., knowledge_base_id=7, max_chunks=1000, query="list_index_chunks", metadata_condition=payload["metadata_condition"], user_id=9) so the test verifies the user_id is forwarded when LocalRagGateway.list_chunks is awaited with the returned runtime_spec.
🧹 Nitpick comments (3)
knowledge_runtime/tests/test_config_resolver.py (3)
119-131: Type hint on `splitter_config` is inaccurate.
`_SENTINEL = object()` does not satisfy the declared `dict | None` annotation, which will trip type checkers (mypy/pyright) and is misleading for callers. Use `Any` to reflect the sentinel-based contract:
♻️ Proposed fix
+from typing import Any
+
 _SENTINEL = object()

 def _make_document(
-    document_id: int = 100, splitter_config: dict | None = _SENTINEL
+    document_id: int = 100, splitter_config: Any = _SENTINEL
 ) -> MagicMock:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 119 - 131, The type hint for splitter_config in _make_document is incorrect because _SENTINEL is an object and doesn't match dict | None; update the signature to use typing.Any (import Any) so the parameter can accept the sentinel, dict, or None without type errors, e.g. change "splitter_config: dict | None = _SENTINEL" to "splitter_config: Any = _SENTINEL" and keep the sentinel logic inside the function; reference symbols: _SENTINEL and _make_document.
160-167: Dead mock-db plumbing in `test_success_with_document_id`.
`_get_knowledge_base`, `_get_user_name`, `_get_splitter_config`, and the two builder methods are all patched directly on `resolver`, so the chained `mock_db.query.return_value...first.side_effect = [kb, user, doc]` (and the `order_by.first.return_value = retriever` setup just above) are never consulted. This is misleading for readers and risks masking regressions if the patches are later removed.
♻️ Suggested cleanup
-    mock_db.query.return_value.filter.return_value.filter.return_value.order_by.return_value.first.return_value = (
-        retriever
-    )
-    mock_db.query.return_value.filter.return_value.first.side_effect = [
-        kb,
-        user,
-        doc,
-    ]
-
     with (
         patch.object(resolver, "_get_knowledge_base", return_value=kb),
The `retriever`, `model`, `doc`, and `user` locals can also be dropped if no longer referenced.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 160 - 167, The test_success_with_document_id contains dead mock-db plumbing: remove the unused chained mocks on mock_db.query (the order_by.first.return_value and the .first.side_effect = [kb, user, doc]) because _get_knowledge_base, _get_user_name, _get_splitter_config and the two builder methods are patched directly on resolver and those DB mocks are never used; update the test to rely only on the patched resolver methods (resolver._get_knowledge_base, resolver._get_user_name, resolver._get_splitter_config, and the two builder methods) and delete unused local variables retriever, model, doc, and user if they are no longer referenced to avoid misleading setup and maskable regressions.
821-866: `_get_*_kind` tests don't actually verify namespace/user_id priority semantics.
These tests are named after important resolution rules (user-priority in `default`, no user filter in group namespaces, fallback to `user_id=0`) but only assert that the result is non-None. With the chained `MagicMock`, any query path returns a truthy mock, so an implementation regression (e.g., dropping the `user_id` filter in `default`, or skipping the `user_id=0` fallback) would not fail these tests.
Consider strengthening at least one assertion per test by inspecting the recorded `filter`/`order_by` calls, e.g.:
# verify user_id appears in default-namespace filter chain
filter_calls = mock_db.query.return_value.filter.call_args_list
assert any("user_id" in str(c) for c in filter_calls)
or by spying on the underlying SQLAlchemy column comparisons. Otherwise the "priority" / "fallback" phrasing in the test names is aspirational rather than verified.
Per coding guidelines: "Kind resource is uniquely identified by THREE fields: `namespace`, `name`, and `user_id`; always filter by all three fields" — these tests are the natural place to lock that contract in.
Also applies to: 877-909
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver.py` around lines 821 - 866, Tests for _get_retriever_kind only assert non-None and don't verify the required filters/priority semantics; update each test (e.g., test_default_namespace_queries_with_user_priority, test_group_namespace_queries_without_user_id, test_group_namespace_fallback_to_public) to assert the query filter chain includes the expected comparisons for namespace, name and user_id by inspecting mock_db.query.return_value.filter.call_args_list (or order_by.call_args_list for user-priority cases) and for the fallback test assert that the first call returned None and a subsequent call included user_id=0 in its filter calls; ensure you reference _get_retriever_kind when adding these assertions so the tests fail if the implementation drops required filters or fallback behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@knowledge_runtime/tests/test_config_resolver.py`:
- Around line 1-1104: The test module is too large (1104 lines); split it into
smaller files along class boundaries and extract shared fixtures into
conftest.py: create test_config_resolver_resolve.py containing
TestResolveIndexConfig and TestResolveQueryConfig,
test_config_resolver_builders.py containing TestBuildResolvedRetrieverConfig and
TestBuildResolvedEmbeddingModelConfig, test_config_resolver_helpers.py
containing TestGetKnowledgeBase, TestParseKbRetrievalConfig,
TestGetSplitterConfig, TestGetUserName, TestDecryptOptionalValue,
TestGetRetrieverKind, TestGetModelKind, and test_config_resolver_errors.py
containing TestConfigResolutionError and TestProcessCustomHeadersPlaceholders;
move shared fixtures/factories (resolver, mock_db, _make_kb_kind,
_make_retriever_kind, _make_model_kind, _make_document, _make_user) into a new
conftest.py so each new test module imports them automatically and update
imports/patch targets (e.g., references to ConfigResolver,
ConfigResolutionError, IndexConfig, QueryConfig, RuntimeRetrieverConfig,
RuntimeEmbeddingModelConfig, process_custom_headers_placeholders) accordingly to
keep tests passing.
---
Outside diff comments:
In `@backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py`:
- Around line 117-175: The endpoint is not forwarding user_id to the runtime
builder; update the handler in rag.py to call
build_internal_list_chunks_runtime_spec(..., user_id=request.user_id) (same
pattern as purge/drop handlers) so the RemoteListChunksRequest contract is
satisfied, and update the test
test_internal_all_chunks_routes_protocol_request_through_local_gateway to assert
mock_build_spec.assert_called_once_with(..., knowledge_base_id=7,
max_chunks=1000, query="list_index_chunks",
metadata_condition=payload["metadata_condition"], user_id=9) so the test
verifies the user_id is forwarded when LocalRagGateway.list_chunks is awaited
with the returned runtime_spec.
---
Nitpick comments:
In `@knowledge_runtime/tests/test_config_resolver.py`:
- Around line 119-131: The type hint for splitter_config in _make_document is
incorrect because _SENTINEL is an object and doesn't match dict | None; update
the signature to use typing.Any (import Any) so the parameter can accept the
sentinel, dict, or None without type errors, e.g. change "splitter_config: dict
| None = _SENTINEL" to "splitter_config: Any = _SENTINEL" and keep the sentinel
logic inside the function; reference symbols: _SENTINEL and _make_document.
- Around line 160-167: The test_success_with_document_id contains dead mock-db
plumbing: remove the unused chained mocks on mock_db.query (the
order_by.first.return_value and the .first.side_effect = [kb, user, doc])
because _get_knowledge_base, _get_user_name, _get_splitter_config and the two
builder methods are patched directly on resolver and those DB mocks are never
used; update the test to rely only on the patched resolver methods
(resolver._get_knowledge_base, resolver._get_user_name,
resolver._get_splitter_config, and the two builder methods) and delete unused
local variables retriever, model, doc, and user if they are no longer referenced
to avoid misleading setup and maskable regressions.
- Around line 821-866: Tests for _get_retriever_kind only assert non-None and
don't verify the required filters/priority semantics; update each test (e.g.,
test_default_namespace_queries_with_user_priority,
test_group_namespace_queries_without_user_id,
test_group_namespace_fallback_to_public) to assert the query filter chain
includes the expected comparisons for namespace, name and user_id by inspecting
mock_db.query.return_value.filter.call_args_list (or order_by.call_args_list for
user-priority cases) and for the fallback test assert that the first call
returned None and a subsequent call included user_id=0 in its filter calls;
ensure you reference _get_retriever_kind when adding these assertions so the
tests fail if the implementation drops required filters or fallback behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 06f019d4-0100-40d5-bbe6-f88a333af953
📒 Files selected for processing (2)
backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py
knowledge_runtime/tests/test_config_resolver.py
…er files Split the 1104-line monolithic test file into 4 focused files along class boundaries and extract shared fixtures/factories into conftest.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🧹 Nitpick comments (4)
knowledge_runtime/tests/test_config_resolver_helpers.py (2)
242-325: Tests verify return value but not query semantics.
`test_default_namespace_queries_with_user_priority` and `test_group_namespace_queries_without_user_id` only assert `result is not None`, which passes as long as the configured mock returns anything. They don't verify the actual SQLAlchemy behavior under test (user_id filter, `order_by(Kind.user_id.desc())`, namespace branching). Consider asserting on the calls, e.g., capturing `mock_db.query.return_value.filter.call_args_list` and verifying the filter expressions, or at minimum that the user-priority branch uses `order_by` while the group branch doesn't. Otherwise a regression that swaps the two branches would still let the tests pass.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver_helpers.py` around lines 242 - 325, Tests currently only assert non-None results; update the two pairs of tests for _get_retriever_kind and _get_model_kind to also assert the expected query behavior: inspect mock_db.query.return_value.filter.call_args_list to ensure the user-priority branch used order_by (verify that mock_db.query.return_value.filter.return_value.order_by was called at least once for "default" namespace tests) and that the group-namespace branch did not call order_by (or only called filter, not order_by) for "team-ns" tests; additionally assert the filter call arguments contain the expected user_id or namespace expressions to validate the correct branch was taken.
229-236: Test asserts mock identity, not real `decrypt_api_key` plaintext pass-through.
`test_plain_text_api_key` patches `decrypt_api_key` to return `"sk-plain-key"`, so the assertion only verifies that `_decrypt_optional_value` returns whatever the patched decrypt returns — i.e., it's effectively a duplicate of `test_successful_decrypt`. To meaningfully cover the documented behavior ("plaintext sk- keys pass through"), call the real `decrypt_api_key` (no patch) on a `sk-` plaintext, or patch it to mimic real plaintext-detection behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver_helpers.py` around lines 229 - 236, The test test_plain_text_api_key is incorrectly patching knowledge_runtime.services.config_resolver.decrypt_api_key so it only verifies the mock rather than the real plaintext-pass-through behavior; update the test to exercise ConfigResolver._decrypt_optional_value with an actual "sk-..." plaintext API key (do not patch decrypt_api_key) or adjust the patch to mimic real plaintext detection (e.g., have the mock raise or return the input for "sk-" values) so that the assertion verifies that ConfigResolver._decrypt_optional_value returns the plaintext sk- key without delegating to a mocked decrypt function.
knowledge_runtime/tests/test_config_resolver_resolve.py (1)
40-50: Dead mock-DB setup — all paths are patched at the resolver level.
Lines 40-50 wire up `mock_db.query.return_value.filter…` chains and a `MagicMock` user, but the test then patches `_get_knowledge_base`, `_get_user_name`, `_build_resolved_retriever_config`, `_build_resolved_embedding_model_config`, and `_get_splitter_config` directly on the resolver, so none of those `mock_db` query chains are exercised. The `user` MagicMock and the `mock_db.query` setup can be removed without changing what the test verifies.
♻️ Suggested cleanup
     kb = _make_kb_kind(knowledge_base_id=1, user_id=42)
-    retriever = _make_retriever_kind()
-    model = _make_model_kind()
-    user = MagicMock()
-    user.id = 42
-    user.user_name = "testuser"
-
-    mock_db.query.return_value.filter.return_value.filter.return_value.order_by.return_value.first.return_value = (
-        retriever
-    )
-    mock_db.query.return_value.filter.return_value.first.side_effect = [
-        kb,
-        user,
-    ]

     with (
         patch.object(resolver, "_get_knowledge_base", return_value=kb),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/test_config_resolver_resolve.py` around lines 40 - 50, Remove the dead DB and user mocks that are not used because the resolver methods are patched: delete the MagicMock user assignment (user = MagicMock(); user.id = 42; user.user_name = "testuser") and the mock_db.query... filter... order_by...first chain setup; keep the existing patches on resolver methods (_get_knowledge_base, _get_user_name, _build_resolved_retriever_config, _build_resolved_embedding_model_config, _get_splitter_config) and any assertions that depend on those patched methods so the test logic remains unchanged.
knowledge_runtime/tests/conftest.py (1)
59-157: Underscore-prefixed helpers are imported across test modules.
`_make_kb_kind`, `_make_retriever_kind`, `_make_model_kind`, `_make_document`, and `_make_user` are imported by sibling test files (e.g., test_config_resolver_helpers.py, test_config_resolver_builders.py, test_config_resolver_resolve.py). The leading underscore conventionally signals module-private helpers, which conflicts with cross-module use. Consider either dropping the underscore prefix or extracting the factories into a dedicated tests/_factories.py (or similar) so the public/private intent is clear and conftest.py stays focused on fixtures. Importing from conftest is also somewhat fragile under pytest collection.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@knowledge_runtime/tests/conftest.py` around lines 59 - 157, These helper factory functions (_make_kb_kind, _make_retriever_kind, _make_model_kind, _make_document, _make_user) are defined with a leading underscore yet are imported by sibling tests; either remove the underscore prefix to make them public (rename to make_kb_kind, make_retriever_kind, make_model_kind, make_document, make_user and update all imports/usages), or extract them into a new dedicated test factory module (e.g., tests/_factories.py) and update imports in test_config_resolver_helpers.py, test_config_resolver_builders.py, and test_config_resolver_resolve.py to import from that module; also leave conftest.py focused on pytest fixtures only.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@knowledge_runtime/tests/conftest.py`:
- Around line 59-157: These helper factory functions (_make_kb_kind,
_make_retriever_kind, _make_model_kind, _make_document, _make_user) are defined
with a leading underscore yet are imported by sibling tests; either remove the
underscore prefix to make them public (rename to make_kb_kind,
make_retriever_kind, make_model_kind, make_document, make_user and update all
imports/usages), or extract them into a new dedicated test factory module (e.g.,
tests/_factories.py) and update imports in test_config_resolver_helpers.py,
test_config_resolver_builders.py, and test_config_resolver_resolve.py to import
from that module; also leave conftest.py focused on pytest fixtures only.
In `@knowledge_runtime/tests/test_config_resolver_helpers.py`:
- Around line 242-325: Tests currently only assert non-None results; update the
two pairs of tests for _get_retriever_kind and _get_model_kind to also assert
the expected query behavior: inspect
mock_db.query.return_value.filter.call_args_list to ensure the user-priority
branch used order_by (verify that
mock_db.query.return_value.filter.return_value.order_by was called at least once
for "default" namespace tests) and that the group-namespace branch did not call
order_by (or only called filter, not order_by) for "team-ns" tests; additionally
assert the filter call arguments contain the expected user_id or namespace
expressions to validate the correct branch was taken.
- Around line 229-236: The test test_plain_text_api_key is incorrectly patching
knowledge_runtime.services.config_resolver.decrypt_api_key so it only verifies
the mock rather than the real plaintext-pass-through behavior; update the test
to exercise ConfigResolver._decrypt_optional_value with an actual "sk-..."
plaintext API key (do not patch decrypt_api_key) or adjust the patch to mimic
real plaintext detection (e.g., have the mock raise or return the input for
"sk-" values) so that the assertion verifies that
ConfigResolver._decrypt_optional_value returns the plaintext sk- key without
delegating to a mocked decrypt function.
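The unpatched pass-through test might look like the following sketch. The `_decrypt_optional_value` body here is a standalone stand-in so the example runs by itself, and the `"sk-"`-prefix check is an assumption about how the real `ConfigResolver` detects plaintext keys:

```python
# Stand-in for ConfigResolver._decrypt_optional_value; the real method lives
# on the resolver and would fall back to decrypt_api_key for encrypted input.
def _decrypt_optional_value(value):
    if value is None or value.startswith("sk-"):
        return value  # plaintext passes through without calling decrypt_api_key
    raise RuntimeError("encrypted path not modeled in this sketch")


def test_plain_text_api_key():
    # No patch on decrypt_api_key: the plaintext branch must never reach it,
    # so the assertion exercises real behavior rather than a mock.
    assert _decrypt_optional_value("sk-abc123") == "sk-abc123"
    assert _decrypt_optional_value(None) is None
```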
In `@knowledge_runtime/tests/test_config_resolver_resolve.py`:
- Around line 40-50: Remove the dead DB and user mocks that are not used because
the resolver methods are patched: delete the MagicMock user assignment (user =
MagicMock(); user.id = 42; user.user_name = "testuser") and the mock_db.query...
filter... order_by...first chain setup; keep the existing patches on resolver
methods (_get_knowledge_base, _get_user_name, _build_resolved_retriever_config,
_build_resolved_embedding_model_config, _get_splitter_config) and any assertions
that depend on those patched methods so the test logic remains unchanged.
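A trimmed version of that resolve test could look like the sketch below. The `ConfigResolver` class here is a minimal stand-in so the example runs standalone; the real class and its method signatures live in `knowledge_runtime.services.config_resolver` and may differ:

```python
from unittest.mock import MagicMock, patch


class ConfigResolver:
    """Minimal stand-in for the real resolver (assumed shape)."""

    def __init__(self, db):
        self.db = db

    def _get_knowledge_base(self, kb_id):
        raise NotImplementedError

    def _get_user_name(self, user_id):
        raise NotImplementedError

    def resolve_query_config(self, kb_id, user_id):
        kb = self._get_knowledge_base(kb_id)
        return {"owner": self._get_user_name(kb.owner_user_id)}


def test_resolve_without_dead_mocks():
    # No `user = MagicMock()` block and no query/filter/order_by/first chain
    # setup: patching the resolver methods makes that scaffolding dead code.
    mock_db = MagicMock()
    resolver = ConfigResolver(mock_db)
    with patch.object(resolver, "_get_knowledge_base") as get_kb, \
         patch.object(resolver, "_get_user_name", return_value="testuser"):
        get_kb.return_value = MagicMock(owner_user_id=42)
        config = resolver.resolve_query_config(1, user_id=42)
    assert config == {"owner": "testuser"}
    mock_db.query.assert_not_called()  # confirms the DB chain setup was unused
```

The final `assert_not_called()` makes the cleanup self-enforcing: if someone reintroduces a query-chain setup that the patched methods bypass, the test flags it as unused.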
📒 Files selected for processing (5)
- knowledge_runtime/tests/conftest.py
- knowledge_runtime/tests/test_config_resolver_builders.py
- knowledge_runtime/tests/test_config_resolver_errors.py
- knowledge_runtime/tests/test_config_resolver_helpers.py
- knowledge_runtime/tests/test_config_resolver_resolve.py
# Conflicts:
#   backend/app/services/rag/embedding/factory.py