Fixed critical issues - auth, docker setup, db indexing, error handling, sync calls#41
Fixed critical issues - auth, docker setup, db indexing, error handling, sync calls#41rkritika1508 merged 6 commits intomainfrom
Conversation
📝 WalkthroughWalkthroughRemoved tests from the runtime Docker image, converted several FastAPI handlers and related tests from async to sync, added SHA‑256 hashing and constant‑time auth token comparison, introduced Alembic revision adding log indexes, added safe error-message handling, bumped guardrails-ai, and adjusted CI/install/docker scripts and docs. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
backend/app/core/exception_handlers.py (1)
74-79:⚠️ Potential issue | 🟠 MajorConsider logging the actual exception before returning the safe message.
In production,
_safe_error_messagehides the real error from the response (correctly), but the actual exception should still be logged server-side for debugging. Without logging here, 500 errors in production become very difficult to diagnose.Proposed fix
+import logging + +logger = logging.getLogger(__name__) + ... `@app.exception_handler`(Exception) async def generic_error_handler(request: Request, exc: Exception): + logger.exception("Unhandled exception on %s %s", request.method, request.url) return JSONResponse( status_code=HTTP_500_INTERNAL_SERVER_ERROR, content=APIResponse.failure_response(_safe_error_message(exc)).model_dump(), )docker-compose.yml (1)
70-77:⚠️ Potential issue | 🔴 CriticalCritical: Port mismatch — uvicorn listens on port 80 but compose expects port 8000.
The port mapping (
"8000:8000") and healthcheck (localhost:8000) both target container port 8000, but thecommandon Line 77 starts uvicorn with--port 80. Nothing will be listening on container port 8000, so the service will be unreachable from the host and the healthcheck will always fail.Either update the command to
--port 8000or revert the port mapping and healthcheck to use port 80.🐛 Proposed fix — align uvicorn port with compose config
command: > - uv run uvicorn app.main:app --host 0.0.0.0 --port 80 --reload + uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reloadbackend/app/api/routes/guardrails.py (1)
160-164:⚠️ Potential issue | 🟡 Minor
result.errormay beNone, producing the string"None"as the error message.If
result.validated_output is Noneandresult.erroris alsoNone,str(result.error)yields the literal string"None", which would be persisted in the request log and returned to the caller. Consider a fallback:Proposed fix
# Case 2: validation failed without a fix return _finalize( status=RequestStatus.ERROR, - error_message=str(result.error), + error_message=str(result.error) if result.error is not None else "Validation failed", )backend/app/tests/test_validate_with_guard.py (1)
62-78:⚠️ Potential issue | 🟡 MinorTest assertion on Line 78 is implicitly coupled to
settings.ENVIRONMENT.
_safe_error_messagereturns"An unexpected error occurred."whenENVIRONMENT == "production", soresponse.error == "Invalid config"will fail in that case. This is fine if tests always run in a non-production environment, but the coupling is implicit. Consider either:
- Patching
settings.ENVIRONMENTexplicitly in the test, or- Asserting with
inor a broader check.Proposed fix — make the environment explicit
def test_validate_with_guard_exception(): - with patch( + with patch("app.api.routes.guardrails.settings") as mock_settings, patch( "app.api.routes.guardrails.build_guard", side_effect=Exception("Invalid config"), ): + mock_settings.ENVIRONMENT = "development" response = _validate_with_guard(
🤖 Fix all issues with AI agents
In `@backend/app/api/deps.py`:
- Around line 39-43: The AUTH_TOKEN is expected to be a SHA-256 hex digest but
there is no startup validation or documentation and deps.py checks it
per-request (provided_hash = _hash_token(...); expected_hash =
settings.AUTH_TOKEN) raising RuntimeError at runtime; fix by adding a
startup-time validator in your configuration initialization (where settings is
constructed) that asserts settings.AUTH_TOKEN matches a 64-character hex regex
(e.g. r'^[0-9a-f]{64}$') and raises a clear error if not, update .env.example
and docs to state AUTH_TOKEN must be a SHA-256 hex digest, and remove/replace
the per-request RuntimeError in deps.py so deps uses the validated
settings.AUTH_TOKEN and only performs secrets.compare_digest(_hash_token(...),
settings.AUTH_TOKEN) without re-checking configuration on every request.
🧹 Nitpick comments (8)
backend/app/alembic/env.py (1)
10-11: Redundant but harmless explicit import.Line 8's
from app.models import SQLModelalready executesapp.models.__init__, which imports all models. The additionalimport app.modelson line 10 is redundant. Consider adding a comment like# noqa: F401 — ensure all models are registeredto clarify intent, or remove it since line 8 already serves this purpose.backend/app/core/exception_handlers.py (1)
52-55: Good addition for error message safety.The production guard and fallback are solid. One concern: the string comparison
settings.ENVIRONMENT == "production"is brittle — ensure the config value is normalized (e.g., lowercased) to avoid silent mismatches like"Production"or"PRODUCTION".backend/app/api/deps.py (1)
23-24: Unsalted SHA-256 hash provides limited additional security over plaintext comparison.The real improvement here is
secrets.compare_digestfor constant-time comparison (line 45), which mitigates timing attacks. The SHA-256 hash without a salt doesn't add meaningful security — if an attacker can readsettings.AUTH_TOKENfrom the environment, having it as a hash vs. plaintext doesn't help since the hash is deterministic and unsalted.If the goal is to avoid storing the raw token in config, consider using a proper key derivation function (e.g.,
bcryptwhich is already a dependency) or at minimum document the threat model this addresses.backend/app/alembic/versions/004_added_log_indexes.py (2)
8-8: Unused imports from Alembic template boilerplate.
SequenceandUnionare imported but never used in this migration. The static analysis tool also flags thatSequenceshould be imported fromcollections.abc(ruff UP035). Since neither is used, simply remove the import.Proposed fix
-from typing import Sequence, Union
20-28: Index additions look appropriate.The indexes on
request_id,status,inserted_at,outcome, andnamealign well with typical log query and filtering patterns.One consideration: if
request_idis frequently queried alongsideinserted_at(e.g., for time-bounded lookups per request), a composite index might be more efficient than two separate single-column indexes. But single-column indexes are a solid starting point.backend/pyproject.toml (1)
25-25: Consider adding an upper bound forguardrails-ai.While guardrails-ai currently pins to v0.8.0 (the latest available), leaving it open-ended could pull in future major versions with breaking changes once the package reaches 1.0.0. Many other dependencies in this file include upper bounds for stability (e.g.,
fastapi<1.0.0,tenacity<9.0.0).docker-compose.yml (1)
41-41: Clarify the purpose ofentrypoint: [].This overrides the image's default
ENTRYPOINTwith an empty list, which means thecommand(bash scripts/prestart.sh) runs directly without any entrypoint wrapper. This is fine if the Dockerfile sets an entrypoint that would conflict, but it's worth a brief inline comment explaining why it's needed for future maintainers.backend/app/tests/test_validate_with_guard.py (1)
11-13: Module-level sharedMagicMockinstances accumulate state across tests.
mock_request_log_crudandmock_validator_log_crudare created once and reused across all test functions. Call counts, call args, and return values persist between tests, which can cause order-dependent failures. Consider usingpytestfixtures or callingreset_mock()in each test.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/app/core/config.py`:
- Around line 85-89: CI is failing because startup validation in
_enforce_non_default_secrets (which calls _validate_auth_token_hash) rejects
AUTH_TOKEN set to the placeholder "<ADD-KEY>"; update the CI environment (or
.env.test used by CI) to provide a valid 64-character hex SHA-256 digest for
AUTH_TOKEN (generate with e.g. echo -n "your-secret-token" | sha256sum and paste
the resulting hex string into AUTH_TOKEN) so _validate_auth_token_hash passes
during tests.
🧹 Nitpick comments (1)
backend/app/core/config.py (1)
79-83: Redundant anchors inre.fullmatchpattern.
re.fullmatchalready matches the entire string, so^and$are unnecessary. Harmless but misleading — it suggests the author may have intendedre.matchorre.search.♻️ Suggested fix
def _validate_auth_token_hash(self) -> None: - if not re.fullmatch(r"^[0-9a-f]{64}$", self.AUTH_TOKEN): + if not re.fullmatch(r"[0-9a-f]{64}", self.AUTH_TOKEN): raise ValueError( "AUTH_TOKEN must be a SHA-256 hex digest (64 lowercase hex characters)." )
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
docker-compose.yml (1)
70-77:⚠️ Potential issue | 🔴 CriticalCritical: Port mismatch — container will be unreachable and healthcheck will always fail.
The
commandon line 77 starts uvicorn on--port 80, but the port mapping (8000:8000) and healthcheck (localhost:8000) both reference port 8000 inside the container. Nothing listens on container port 8000, so the healthcheck will perpetually fail and the backend will be unreachable from the host.Change the uvicorn port to match:
🐛 Proposed fix
command: > - uv run uvicorn app.main:app --host 0.0.0.0 --port 80 --reload + uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reloadbackend/app/api/routes/guardrails.py (1)
51-78:⚠️ Potential issue | 🟡 MinorInconsistent response format: success path returns raw dict, error path returns
APIResponse.On line 78, the success path returns
{"validators": validators}(a plain dict), while the error path on lines 71–76 returnsAPIResponse.failure_response(...). This means the response schema differs between success and failure for the same endpoint, which is confusing for API consumers.🐛 Proposed fix
- return {"validators": validators} + return APIResponse.success_response(data={"validators": validators})backend/app/tests/test_validate_with_guard.py (1)
62-78:⚠️ Potential issue | 🟡 MinorTest assertion on line 78 is coupled to non-production
ENVIRONMENTsetting.
_safe_error_messagereturns"An unexpected error occurred."whensettings.ENVIRONMENT == "production", but this test assertsresponse.error == "Invalid config". The test will silently break ifENVIRONMENTis ever set to"production"in the test env. Consider explicitly patchingsettings.ENVIRONMENTto a non-production value, or mocking_safe_error_message.
🧹 Nitpick comments (8)
backend/app/alembic/versions/004_added_log_indexes.py (1)
8-8: Unused imports (Alembic template boilerplate).
SequenceandUnionare never referenced in this file. Safe to remove.🧹 Proposed cleanup
-from typing import Sequence, Union - from alembic import opbackend/app/alembic/env.py (1)
10-11: Redundant import, but acceptable as an explicit signal.
from app.models import SQLModelon line 8 already executesapp.models.__init__, which imports all models. The bareimport app.modelson line 10 is a no-op at runtime. It does serve as a readable hint that all models must be loaded for autogenerate, which is a common Alembic convention.backend/app/core/config.py (1)
79-83:re.fullmatchalready anchors the match —^and$are redundant.
re.fullmatchmatches the entire string by definition. The^and$anchors in the pattern do nothing here.Suggested fix
def _validate_auth_token_hash(self) -> None: - if not re.fullmatch(r"^[0-9a-f]{64}$", self.AUTH_TOKEN): + if not re.fullmatch(r"[0-9a-f]{64}", self.AUTH_TOKEN): raise ValueError( "AUTH_TOKEN must be a SHA-256 hex digest (64 lowercase hex characters)." )backend/README.md (1)
207-217: Clarify the digest generation command output.
shasum -a 256(andsha256sumon Linux) outputs the hash followed by a filename/dash. Developers unfamiliar with this may paste the entire output including the trailing-into.env. Consider showing how to extract just the hash, or noting the expected output format.Suggested improvement
-echo -n "your-plain-text-token" | shasum -a 256 +echo -n "your-plain-text-token" | sha256sum | awk '{print $1}'backend/app/core/exception_handlers.py (1)
52-55: Private naming convention on a cross-module function.
_safe_error_messageis prefixed with_(Python convention for module-private), but it's imported and used inbackend/app/api/routes/guardrails.py. Consider renaming tosafe_error_message(no underscore) to signal it's part of the module's public API.docker-compose.yml (1)
41-41: What doesentrypoint: []achieve here?Setting
entrypoint: []overrides anyENTRYPOINTfrom the Dockerfile with an empty list, so onlycommandruns. This is fine if the Dockerfile has an entrypoint that conflicts withprestart.sh, but it's worth a brief inline comment explaining the intent for future maintainers.backend/scripts/install_guardrails_from_hub.sh (1)
25-28: Quote command substitutions to prevent word splitting (SC2046).While the current outputs are single-token strings, unquoted
$(...)is a shellcheck warning and a latent risk if the expressions ever change.♻️ Proposed fix
guardrails configure \ --token "$GUARDRAILS_HUB_API_KEY" \ - $( [[ "$ENABLE_METRICS" == "true" ]] && echo "--enable-metrics" || echo "--disable-metrics" ) \ - $( [[ "$ENABLE_REMOTE_INFERENCING" == "true" ]] && echo "--enable-remote-inferencing" || echo "--disable-remote-inferencing" ) + "$( [[ "$ENABLE_METRICS" == "true" ]] && echo "--enable-metrics" || echo "--disable-metrics" )" \ + "$( [[ "$ENABLE_REMOTE_INFERENCING" == "true" ]] && echo "--enable-remote-inferencing" || echo "--disable-remote-inferencing" )"backend/app/tests/test_validate_with_guard.py (1)
11-13: SharedMagicMockinstances across tests may leak call state.
mock_request_log_crudandmock_validator_log_crudare module-level singletons. Call history (e.g.,.update.call_args,.create.call_count) accumulates across tests. This doesn't break the current assertions, but it makes the tests fragile if you later add assertions on mock interactions.Consider using
pytestfixtures withautousereset, or instantiate fresh mocks per test.♻️ Proposed fix using fixtures
-mock_request_log_crud = MagicMock() -mock_validator_log_crud = MagicMock() -mock_request_log_id = uuid4() +@pytest.fixture(autouse=True) +def reset_mocks(): + mock_request_log_crud.reset_mock() + mock_validator_log_crud.reset_mock()Or alternatively, create fresh mocks per test via fixtures.
Summary
Target issue is #43, #44, #45, #46, #47.
Explain the motivation for making this change. What existing problem does the pull request solve?
Here are the fixes I have made -
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit
Security
Performance
Bug Fixes
Improvements
Scripts