
Harden CI with security, governance, and release gates#23

Merged
liteshperumalla merged 10 commits into main from codex/sdlc-hardening
Apr 24, 2026

Conversation

@liteshperumalla (Owner) commented Apr 24, 2026

Summary by CodeRabbit

  • New Features

    • Public discovery endpoints (/robots.txt, /sitemap.xml) and adjusted caching/security headers for responses; expanded CI/CD security and contract checks.
  • Documentation

    • Extensive SDLC docs, ADR templates, runbooks, contributing guide, and PR/issue templates.
  • Tests

    • Backend security/header tests, backend proxy contract tests, and frontend test setup improvements.
  • Chores

    • Editor/Node configs (.editorconfig, .nvmrc), Makefile, pre-commit hooks, env-drift/migration scripts, Docker run-as-nonroot.

@vercel

vercel Bot commented Apr 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: smart-ai-tutor | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Apr 24, 2026 10:04pm

@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Walkthrough

Adds repository governance, CI/CD and security pipelines, editor and tooling configs, GitHub issue/PR templates, developer and operational documentation, env/migration validation scripts, container/user hardening, JWT payload parsing helpers, frontend contract tests, and various workflow refinements.

Changes

Cohort / File(s) Summary
Repo config & editor
/.editorconfig, /.nvmrc, /.pre-commit-config.yaml, Makefile
Adds editor rules (UTF‑8, LF, trim trailing whitespace, final newline), enforces 2-space indent globally, 4-space for *.py, tabs for Makefile; pins Node to 20.20.2; registers an env-drift pre-commit hook; and adds Make targets for bootstrap, test, lint, security, contract, migrate-validate, and env-check.
GitHub templates & PR guidance
.github/ISSUE_TEMPLATE/*, .github/PULL_REQUEST_TEMPLATE.md
Adds structured issue templates (bug/change/feature), issue template config to disable blank issues and link runbooks, and a standardized PR template.
CI/CD workflows & security scans
.github/workflows/ci-cd.yml, .github/workflows/codeql.yml, .github/workflows/* (deploy-*, database-migrations.yml, coderabbit.yml, _slack-notify.yml, rollback-production.yml, etc.), .github/actions/send-slack-notification/action.yml
Introduces a comprehensive CI pipeline (env-contract check, Semgrep SAST with SARIF upload, DAST via OWASP ZAP, migration validation, frontend/backend/contract tests, and gating aggregation jobs), new CodeQL workflow, and refactors many workflow steps to export/use env vars rather than inline expressions; updates composite Slack action wiring.
Developer & governance docs
CONTRIBUTING.md, docs/adr/...
Adds CONTRIBUTING and ADR template, ADR baseline (SDLC governance) and ADR README describing ADR process and required artifacts.
Operational runbooks & SDLC pages
docs/runbooks/*, docs/sdlc/*
Adds runbooks (backup/restore drill, db-migration-failure, deploy-failure) and SDLC guidance pages covering deployment safety, environment discipline, testing pyramid, data lifecycle, ops maturity, change control, and developer workflow.
Frontend tests & setup
frontend/jest.setup.js, frontend/src/app/api/backend/__tests__/route-contract.test.ts
Makes Jest browser mocks conditional on window presence and adds route-contract tests validating path traversal rejection, auth enforcement, proxy behavior, cookie forwarding, and CSRF header extraction.
Environment & migration checks
scripts/check-env-drift.py, scripts/validate-migrations.sh
Adds env-contract drift checker (Python) comparing declared keys across example/target env files and an Alembic migration validation script (bash) that builds a DB URL safely and runs upgrade/current checks.
JWT parsing & security helpers
backend/jwt_blacklist.py, backend/jwt_service.py, scripts/verify_security.py
Replaces uses of unverified PyJWT decode with manual base64url decode + JSON parse helpers to extract JWT payload claims (for JTI and exp), broadens parsing error handling, and centralizes decoding logic.
Backend headers & routes
backend/api/main.py, backend/tests/test_security.py
Adds COEP/COOP/CORP headers in middleware, dynamic caching rules (public short cache for /, /robots.txt, /sitemap.xml, no-store otherwise), and new GET /robots.txt and GET /sitemap.xml handlers; adds tests validating headers and cache behavior.
Containerization
Dockerfile
Creates a non-root appuser, adjusts ownership of /app and the entrypoint, and switches execution to that user.
Misc workflow & action refinements
.github/workflows/* (various), .github/actions/send-slack-notification/action.yml
Standardizes step-level env variable usage across multiple workflows and composite actions to avoid inline ${{ ... }} interpolation inside shell scripts and to improve step portability.
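The path-traversal rejection exercised by the new route-contract tests amounts to validating the proxied path before forwarding it. A minimal Python sketch of that contract (the real check lives in the TypeScript route handler; `is_safe_proxy_path` is a hypothetical name used here for illustration):

```python
def is_safe_proxy_path(path: str) -> bool:
    """Reject absolute paths and any '..' traversal segment.

    Hypothetical sketch of the contract the tests enforce; the actual
    validation lives in the Next.js backend proxy route.
    """
    # Absolute paths would let a caller escape the proxy prefix outright.
    if path.startswith(("/", "\\")):
        return False
    # Splitting on '/' catches 'a/../b' as well as a bare '..' segment.
    return ".." not in path.split("/")
```

A safe relative path like `api/v1/sessions` passes, while `../etc/passwd` or any path containing a `..` segment is rejected.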
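The env-contract drift check described for scripts/check-env-drift.py reduces to comparing the sets of keys declared in an example file against a target env file. A rough stdlib sketch under that assumption (function names are hypothetical, not the script's actual API):

```python
import re
from pathlib import Path

# Matches 'KEY=value' and 'export KEY=value' lines in dotenv-style files.
_ENV_LINE = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=")

def declared_keys(env_file: Path) -> set:
    """Collect the variable names declared in a dotenv-style file."""
    keys = set()
    for line in env_file.read_text().splitlines():
        match = _ENV_LINE.match(line)
        if match:
            keys.add(match.group(1))
    return keys

def missing_keys(example: Path, target: Path) -> set:
    """Keys the example contract declares that the target file lacks."""
    return declared_keys(example) - declared_keys(target)
```

A non-empty result from `missing_keys` would be the drift signal that fails the pre-commit hook or CI gate.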
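The dynamic caching rule added to the backend middleware is essentially a path-to-header mapping: a short public cache for the discovery endpoints, no-store for everything else. Sketched below with an illustrative max-age (the 300-second value is an assumption for this example; backend/api/main.py defines the authoritative policy):

```python
# Paths served with a short public cache; everything else is no-store.
PUBLIC_SHORT_CACHE_PATHS = {"/", "/robots.txt", "/sitemap.xml"}

def cache_control_for(path: str) -> str:
    """Choose a Cache-Control header per the policy described above.

    max-age=300 is an assumed illustrative value, not taken from the PR.
    """
    if path in PUBLIC_SHORT_CACHE_PATHS:
        return "public, max-age=300"
    return "no-store"
```

In a middleware, this function would set the response header after the route handler runs, which is what the new backend tests assert against.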

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 57.89%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
  • Description Check ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title 'Harden CI with security, governance, and release gates' directly reflects the main change: comprehensive CI hardening through security scanning, governance documentation, and deployment release gates.
  • Linked Issues Check ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check ✅ Passed: check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/sdlc-hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Apr 24, 2026

AI Code Review Report

Pull Request

  • Title: Harden CI with security, governance, and release gates
  • Base branch: main
  • Head branch: codex/sdlc-hardening
  • Affected areas:

Description

Summary by CodeRabbit

  • New Features

    • Expanded CI/CD with security scanning, DAST, CodeQL, migration validation, and contract testing; new Make targets and pre-commit checks.
  • Documentation

    • Extensive SDLC docs, ADR templates, and operational runbooks added.
  • Tests

    • Backend proxy contract tests and frontend test setup improvements.
  • Chores

    • Standardized editor/node config (.editorconfig, .nvmrc), new PR/issue templates, environment-drift and migration validation scripts, and Docker run-as-nonroot adjustment.

Review Scope

This automated review checks for:

  • Security: hardcoded secrets, SQL injection, XSS, broken auth, eval usage
  • Performance: N+1 queries, inefficient loops, missing pagination, memory issues
  • Code quality: bare except clauses, missing error handling, antipatterns
  • Best practices: logging, observability, proper error messages, type safety

Backend Issues

  • service.py: eval() — potential code injection

Generated by Smart AI Tutor code review workflow

@github-advanced-security
Contributor

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

@github-actions

github-actions Bot commented Apr 24, 2026

Backend Test Results

105 tests  +3   105 ✅ +3   33s ⏱️ -4s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 54dbcc8. ± Comparison against base commit 57e4ee9.

♻️ This comment has been updated with latest results.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (7)
frontend/src/app/api/backend/__tests__/route-contract.test.ts (2)

7-11: Restore global.fetch instead of leaving a persistent mock leak.

beforeEach assigns a fresh jest.fn() to global.fetch but doesn't restore the original in afterAll/afterEach. Since jest.setup.js already sets global.fetch = jest.fn() globally, this is fine for isolation in this file but will leak a mocked fetch into any subsequent imports that expect real fetch in the same Jest worker. Prefer jest.spyOn(global, 'fetch') with mockRestore() in afterEach for stronger isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/app/api/backend/__tests__/route-contract.test.ts` around lines 7
- 11, The test suite currently overwrites global.fetch directly in the
beforeEach block (global.fetch = jest.fn()) causing a persistent mock leak;
change the setup to use jest.spyOn(global, "fetch") in the beforeEach so you
create a spyable mock, and add an afterEach that calls (global.fetch as
jest.SpyInstance).mockRestore() (or mockRestore() on the spy) to restore the
original fetch; reference the existing describe block and its beforeEach to
replace the direct assignment and add an afterEach that restores the spy.

60-60: Hardcoded upstream target couples the test to environment config.

http://localhost:8010/chat presumably reflects the default BACKEND_INTERNAL_URL (or similar) resolved by the proxy. If CI or a future contributor sets that env var before the test runs, this assertion will fail in an unhelpful way. Consider either:

  • explicitly setting the env var at the top of the suite (process.env.BACKEND_INTERNAL_URL = 'http://localhost:8010'), or
  • asserting with expect(target).toMatch(/\/chat$/) and checking the base separately.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/app/api/backend/__tests__/route-contract.test.ts` at line 60,
The test asserts a hardcoded URL ("http://localhost:8010/chat") which couples it
to environment config; update the test in route-contract.test.ts to avoid
brittle coupling by either setting process.env.BACKEND_INTERNAL_URL =
"http://localhost:8010" at the top of the suite before computing target, or
change the assertion to only verify the path (e.g.,
expect(target).toMatch(/\/chat$/)) and separately assert the base if needed;
reference the variable under test (target) and the env var name
(BACKEND_INTERNAL_URL/process.env.BACKEND_INTERNAL_URL) when making the change.
.github/workflows/codeql.yml (1)

45-47: Autobuild step is unreachable given the current matrix.

Both matrix entries use build-mode: none, so the Autobuild step can never run. Either drop it to reduce noise in the workflow, or add a short comment clarifying it's retained for future compiled-language additions (e.g., Go/Java).

🧹 Suggested cleanup
-      - name: Autobuild
-        if: ${{ matrix.build-mode == 'autobuild' }}
-        uses: github/codeql-action/autobuild@v3
-
       - name: Perform CodeQL analysis
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/codeql.yml around lines 45 - 47, The workflow contains an
unreachable "Autobuild" step conditioned on matrix.build-mode == 'autobuild'
while the matrix only defines build-mode: none; update
.github/workflows/codeql.yml by either removing the "Autobuild" job block (name:
Autobuild, uses: github/codeql-action/autobuild@v3) to reduce noise, or make it
intentionally reachable by adding an explicit matrix entry with build-mode:
autobuild (or add a short explanatory comment above the step noting it is kept
for future compiled-language builds) so the condition on matrix.build-mode
actually makes sense.
scripts/validate-migrations.sh (1)

16-16: Avoid echoing the full DB endpoint if you later add credentials to the log line.

The current echo is safe (host/port/db only). Worth a one-line note or lint rule so that future edits don't accidentally interpolate ${db_url} (which contains the password) into a log statement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/validate-migrations.sh` at line 16, The current echo in
scripts/validate-migrations.sh prints the DB endpoint using POSTGRES_HOST,
POSTGRES_PORT and POSTGRES_DB; to avoid accidentally leaking credentials later,
ensure you never interpolate the full connection string (e.g., db_url) in
logs—keep the echo limited to those three variables or a redacted form and add a
one-line inline comment or small lint check in the script warning "do not log
DB_URL / credentials" so future edits do not replace
POSTGRES_HOST/POSTGRES_PORT/POSTGRES_DB with a sensitive variable like db_url.
.pre-commit-config.yaml (1)

65-69: Consider scoping this hook to env-contract file changes.

The env-contract-drift hook currently runs on every commit regardless of whether env-contract files changed, since there is no files pattern and pass_filenames: false. For a fast check this is fine, but consider scoping to paths that can actually cause drift to avoid unnecessary work on unrelated commits.

♻️ Optional tweak
       - id: env-contract-drift
         name: Check environment contract drift
         entry: python scripts/check-env-drift.py
         language: system
         pass_filenames: false
+        files: '(^|/)\.env(\.example|\.sample|\.template)?$'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.pre-commit-config.yaml around lines 65 - 69, The env-contract-drift
pre-commit hook (id: env-contract-drift, entry: python
scripts/check-env-drift.py) is currently unconditional; narrow it by adding a
files pattern that matches only environment contract files (e.g., regex matching
your env-contract directory or filenames) so the hook runs only when those files
change, and set pass_filenames to true if the script can accept a list of
changed files (or leave it false if it must run unconditionally for matched
files).
docs/sdlc/deployment-safety.md (1)

13-18: Consider de-duplicating repeated “Confirm …” checklist wording.

This is readable as-is, but rephrasing the successive bullets can reduce repetition and improve scanability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/sdlc/deployment-safety.md` around lines 13 - 18, The checklist repeats
the word “Confirm” on each bullet—refactor to reduce duplication by introducing
a single leading prompt (e.g., “Confirm:” or “Verify the following:”) and then
list the five items (linked issue & PR verification; migration impact;
smoke/health endpoints; dashboards & alerts; rollback workflow and last known
good version), or alternatively rephrase each bullet with different verbs
(Verify, Assess, Check, Monitor, Document) to improve scanability while keeping
the same checklist entries.
.github/workflows/ci-cd.yml (1)

139-152: Consider adding --severity ERROR to Semgrep to avoid blocking unrelated PRs on low-severity findings.

semgrep scan --error without --severity will exit non-zero on any finding from the loaded rulesets, which typically trips on INFO/WARNING-level rules early in adoption and can block unrelated PRs. Adding --severity ERROR limits rule execution to ERROR-severity rules only, preventing lower-severity findings from triggering build failures. Combined with --error, this is a hardened default for CI pipelines.

♻️ Suggested change
           semgrep scan \
             --config p/owasp-top-ten \
             --config p/python \
             --config p/typescript \
             --exclude node_modules \
             --exclude .next \
             --exclude coverage \
+            --severity ERROR \
             --error \
             --sarif \
             --output semgrep.sarif \
             .
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci-cd.yml around lines 139 - 152, The Semgrep SAST step
("Run Semgrep SAST scan") uses the semgrep scan invocation and currently uses
--error which fails on any finding; update the semgrep scan command to include
--severity ERROR so only ERROR-level rules cause a non-zero exit (i.e., add the
--severity ERROR flag to the existing semgrep scan invocation).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/ci-cd.yml:
- Around line 227-239: The workflow step "Run OWASP ZAP baseline scan" passes
the -I flag to zap-baseline.py which prevents failures on warnings, so remove
the -I flag from the zap-baseline.py invocation (or, if you intend DAST to be
informational only, explicitly document that behavior in the runbook/SDLC docs
and add a comment in the workflow) and optionally add a configuration file via
-c zap.conf to tune rules so High-risk findings cause a non-zero exit code;
target the zap-baseline.py call in the workflow step to make this change.
- Around line 230-240: The ZAP Docker scan fails to write artifacts because
ghcr.io/zaproxy/zaproxy:stable runs as non-root and ${GITHUB_WORKSPACE} is owned
by the runner; update the CI step that runs zap-baseline.py to ensure writable
mounts by either (A) adding a pre-scan step that runs chmod a+w
"$GITHUB_WORKSPACE" so /zap/wrk/zap-report.html, /zap/wrk/zap-report.json and
/zap/wrk/zap-warnings.md can be created, or (B) adding the user-mapping flag -u
"$(id -u):$(id -g)" to the docker run line that invokes
ghcr.io/zaproxy/zaproxy:stable zap-baseline.py; modify the docker run command
accordingly (refer to the docker run invocation and zap-baseline.py invocation)
so the scan can successfully write the artifacts.

In @.nvmrc:
- Line 1: Update the pinned Node version in .nvmrc from "20.9.0" to "20.20.2"
(the latest 20.x LTS patch) and also add or update the "engines" field in
frontend/package.json to `"node": ">=20.20.2 <21"` to detect version drift in
dev/CI environments; locate the .nvmrc entry and the "engines" section in
frontend/package.json (or add it if missing) and ensure the CI config that reads
.nvmrc continues to reference the updated value.

In `@scripts/check-env-drift.py`:
- Around line 44-50: The loop currently treats a missing source OR target as
failure; change it so only present target files are validated against their
source contract: in the loop over targets (variables: target, source, method
exists()) skip and record pairs where target.exists() is False (append a new
skipped list) and only perform comparison / add to failures when target.exists()
is True (if source is missing, keep that as a failure), then at the end print
both failures and the skipped list as informational output so
missing-but-expected gitignored targets do not fail CI.

In `@scripts/validate-migrations.sh`:
- Line 14: The db_url construction in validate-migrations.sh currently
interpolates POSTGRES_USER and POSTGRES_PASSWORD raw, which will break if they
contain URL-reserved characters; update the script to URL-encode those
credentials before building db_url (e.g., call Python's urllib.parse.quote or
another URL-encoding utility to encode POSTGRES_USER and POSTGRES_PASSWORD and
use the encoded variables in the db_url assignment). Locate the db_url
assignment and replace direct ${POSTGRES_USER}/${POSTGRES_PASSWORD} usage with
the encoded versions so the final connection string is safe when credentials
contain reserved characters.
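The credential-encoding fix described above can be sketched in Python (the actual script is bash; `build_db_url` is a hypothetical helper name for illustration):

```python
from urllib.parse import quote

def build_db_url(user: str, password: str, host: str, port: str, db: str) -> str:
    """Assemble a Postgres URL with percent-encoded credentials.

    quote(..., safe="") encodes every reserved character, so a password
    containing '@', ':' or '/' can no longer corrupt the URL structure.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )
```

A bash script could obtain the same result by shelling out to `python3 -c` with `urllib.parse.quote` before assembling the connection string.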

---

Nitpick comments:
In @.github/workflows/ci-cd.yml:
- Around line 139-152: The Semgrep SAST step ("Run Semgrep SAST scan") uses the
semgrep scan invocation and currently uses --error which fails on any finding;
update the semgrep scan command to include --severity ERROR so only ERROR-level
rules cause a non-zero exit (i.e., add the --severity ERROR flag to the existing
semgrep scan invocation).

In @.github/workflows/codeql.yml:
- Around line 45-47: The workflow contains an unreachable "Autobuild" step
conditioned on matrix.build-mode == 'autobuild' while the matrix only defines
build-mode: none; update .github/workflows/codeql.yml by either removing the
"Autobuild" job block (name: Autobuild, uses: github/codeql-action/autobuild@v3)
to reduce noise, or make it intentionally reachable by adding an explicit matrix
entry with build-mode: autobuild (or add a short explanatory comment above the
step noting it is kept for future compiled-language builds) so the condition on
matrix.build-mode actually makes sense.

In @.pre-commit-config.yaml:
- Around line 65-69: The env-contract-drift pre-commit hook (id:
env-contract-drift, entry: python scripts/check-env-drift.py) is currently
unconditional; narrow it by adding a files pattern that matches only environment
contract files (e.g., regex matching your env-contract directory or filenames)
so the hook runs only when those files change, and set pass_filenames to true if
the script can accept a list of changed files (or leave it false if it must run
unconditionally for matched files).

In `@docs/sdlc/deployment-safety.md`:
- Around line 13-18: The checklist repeats the word “Confirm” on each
bullet—refactor to reduce duplication by introducing a single leading prompt
(e.g., “Confirm:” or “Verify the following:”) and then list the five items
(linked issue & PR verification; migration impact; smoke/health endpoints;
dashboards & alerts; rollback workflow and last known good version), or
alternatively rephrase each bullet with different verbs (Verify, Assess, Check,
Monitor, Document) to improve scanability while keeping the same checklist
entries.

In `@frontend/src/app/api/backend/__tests__/route-contract.test.ts`:
- Around line 7-11: The test suite currently overwrites global.fetch directly in
the beforeEach block (global.fetch = jest.fn()) causing a persistent mock leak;
change the setup to use jest.spyOn(global, "fetch") in the beforeEach so you
create a spyable mock, and add an afterEach that calls (global.fetch as
jest.SpyInstance).mockRestore() (or mockRestore() on the spy) to restore the
original fetch; reference the existing describe block and its beforeEach to
replace the direct assignment and add an afterEach that restores the spy.
- Line 60: The test asserts a hardcoded URL ("http://localhost:8010/chat") which
couples it to environment config; update the test in route-contract.test.ts to
avoid brittle coupling by either setting process.env.BACKEND_INTERNAL_URL =
"http://localhost:8010" at the top of the suite before computing target, or
change the assertion to only verify the path (e.g.,
expect(target).toMatch(/\/chat$/)) and separately assert the base if needed;
reference the variable under test (target) and the env var name
(BACKEND_INTERNAL_URL/process.env.BACKEND_INTERNAL_URL) when making the change.

In `@scripts/validate-migrations.sh`:
- Line 16: The current echo in scripts/validate-migrations.sh prints the DB
endpoint using POSTGRES_HOST, POSTGRES_PORT and POSTGRES_DB; to avoid
accidentally leaking credentials later, ensure you never interpolate the full
connection string (e.g., db_url) in logs—keep the echo limited to those three
variables or a redacted form and add a one-line inline comment or small lint
check in the script warning "do not log DB_URL / credentials" so future edits do
not replace POSTGRES_HOST/POSTGRES_PORT/POSTGRES_DB with a sensitive variable
like db_url.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c5f78c4d-7f76-4966-ad35-b9286542f164

📥 Commits

Reviewing files that changed from the base of the PR and between 57e4ee9 and 182cfac.

📒 Files selected for processing (29)
  • .editorconfig
  • .github/ISSUE_TEMPLATE/bug-report.yml
  • .github/ISSUE_TEMPLATE/change-request.yml
  • .github/ISSUE_TEMPLATE/config.yml
  • .github/ISSUE_TEMPLATE/feature-request.yml
  • .github/PULL_REQUEST_TEMPLATE.md
  • .github/workflows/ci-cd.yml
  • .github/workflows/codeql.yml
  • .nvmrc
  • .pre-commit-config.yaml
  • CONTRIBUTING.md
  • Makefile
  • docs/adr/0000-template.md
  • docs/adr/0001-sdlc-governance-baseline.md
  • docs/adr/README.md
  • docs/runbooks/backup-restore-drill.md
  • docs/runbooks/db-migration-failure.md
  • docs/runbooks/deploy-failure.md
  • docs/sdlc/data-db-lifecycle.md
  • docs/sdlc/deployment-safety.md
  • docs/sdlc/developer-workflow.md
  • docs/sdlc/environment-discipline.md
  • docs/sdlc/ops-maturity.md
  • docs/sdlc/requirements-change-control.md
  • docs/sdlc/testing-pyramid.md
  • frontend/jest.setup.js
  • frontend/src/app/api/backend/__tests__/route-contract.test.ts
  • scripts/check-env-drift.py
  • scripts/validate-migrations.sh

Comment thread .github/workflows/ci-cd.yml Outdated
Comment thread .github/workflows/ci-cd.yml
Comment thread .nvmrc Outdated
@@ -0,0 +1,2 @@
20.9.0


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

What is the latest Node.js 20 LTS release version?

💡 Result:

The latest Node.js 20 LTS release version is 20.20.2.



🌐 Web query:

Node.js 20.9.0 release date and security patches

💡 Result:

Node.js 20.9.0 release date: 2023-10-24. Security patches (what’s in v20.9.0): The official Node.js 20.9.0 release post is primarily an LTS-transition release (Node.js 20.x enters LTS “Iron”). The release page excerpt does not list any specific CVE fixes for v20.9.0 itself; it highlights the LTS transition and notes a known issue (NODE_V8_COVERAGE causing a hang). Security patches after v20.9.0 (relevant for staying secure on Node.js 20 LTS): Node.js publishes periodic LTS security releases for the 20.x line. For example: - 2026-01-13: Node.js 20.20.0 is explicitly a “security release” and lists multiple CVE fixes. - 2026-03-24: Node.js 20.20.2 is explicitly a “security release” and lists multiple CVE fixes. Recommendation (as of 2026-04-24): If you are running Node.js 20, upgrade to the latest available 20.x LTS patch release rather than staying on 20.9.0; the security fixes are delivered in later patch releases (20.x “Iron” line).



🌐 Web query:

Node.js 20.x LTS CVE fixes OpenSSL HTTP parser 20.9 to 20.20

💡 Result:

No specific CVE fixes for OpenSSL or the HTTP parser (llhttp/http-parser) were identified in Node.js 20.9 to 20.20 that match the query exactly. Node.js 20.x uses OpenSSL 3.0.x (e.g., updated to 3.0.15, 3.0.17, 3.0.19 in later releases like 20.19.x and 20.20.x), and the HTTP parser is llhttp (successor to http-parser). Key security updates in this range: - OpenSSL CVEs (low-moderate severity, PKCS#12 related): CVE-2025-69421, CVE-2026-22795 fixed in regular releases post-Jan 2026 (v20.x uses OpenSSL 3.0, affected by some) [1,9,24]. - 20.20.0 (Jan 2026 security release): Fixed multiple Node.js CVEs (permissions, TLS, buffers, etc.), no OpenSSL/HTTP parser mentions [2,3]. - llhttp upgrade to v9 fixed CVE-2025-23167 (HTTP/1 header termination, request smuggling) prior to 20.19.2 [27]. - No evidence of HTTP parser version changes labeled as 20.9 to 20.20 or direct OpenSSL CVEs tied to those version numbers. Node.js 20.x LTS receives ongoing security patches via regular releases until April 2026 EOL. Upgrade to latest 20.20.2 (March 2026) for all fixes [31]. Check process.versions.openssl and official changelogs for exact deps per version.



Bump Node version to the latest 20.x LTS patch.

Node 20.9.0 (Oct 2023) is now 2.5+ years old. Upgrade to 20.20.2, which is the final security patch for Node.js 20 LTS (EOL April 2026). Since this PR hardens CI, pinning to the latest patch ensures inherited security fixes. Consider also setting engines in frontend/package.json to detect version drift.

🔧 Suggested change
-20.9.0
+20.20.2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.nvmrc at line 1, Update the pinned Node version in .nvmrc from "20.9.0" to
"20.20.2" (the latest 20.x LTS patch) and also add or update the "engines" field
in frontend/package.json to `"node": ">=20.20.2 <21"` to detect version drift in
dev/CI environments; locate the .nvmrc entry and the "engines" section in
frontend/package.json (or add it if missing) and ensure the CI config that reads
.nvmrc continues to reference the updated value.

Comment thread scripts/check-env-drift.py
Comment thread scripts/validate-migrations.sh Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (6)
backend/jwt_blacklist.py (2)

135-140: Broad except Exception will mask bugs beyond decoding errors.

_decode_unverified_payload raises well-defined errors (ValueError, json.JSONDecodeError, binascii.Error, UnicodeDecodeError). Catching everything — including KeyboardInterrupt-class-adjacent or programmer errors in future helpers — and logging at ERROR for any malformed token will also be noisy if callers pass garbage. Narrow the exception list and drop log level to debug/warning for expected parse failures, consistent with jwt_service.get_token_expiry.

♻️ Proposed refactor
         try:
             payload = _decode_unverified_payload(token)
             return payload.get("jti")
-        except Exception as e:
-            logger.error(f"Failed to extract JTI from token: {e}")
+        except (ValueError, json.JSONDecodeError, UnicodeDecodeError) as e:
+            logger.debug(f"Failed to extract JTI from token: {e}")
             return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/jwt_blacklist.py` around lines 135 - 140, The current broad except in
the JTI extraction masks unrelated errors; update the try/except in the function
that calls _decode_unverified_payload to catch only the known decoding errors
(ValueError, json.JSONDecodeError, binascii.Error, UnicodeDecodeError), log
those expected parse failures at a lower level (debug or warning consistent with
jwt_service.get_token_expiry), and continue to return None for malformed
tokens—leave other exceptions to propagate so programmer/critical errors are not
swallowed.

193-206: Duplicate helper with jwt_service.py — consolidate to a single source.

This _decode_unverified_payload is byte-for-byte identical to the one in backend/jwt_service.py (lines 266–279). Two copies of the same JWT-parsing primitive will drift over time (e.g., if one adds size limits, segment validation, or header parsing for security hardening but the other doesn't). Consider extracting it to a shared module (e.g., backend/jwt_utils.py) and importing it from both call sites.

♻️ Proposed refactor
# backend/jwt_utils.py (new)
import base64
import json
from typing import Any, Dict


def decode_unverified_payload(token: str) -> Dict[str, Any]:
    """Parse the JWT payload segment without trusting it for authentication."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Invalid JWT format")

    payload_segment = parts[1]
    padding = "=" * (-len(payload_segment) % 4)
    decoded = base64.urlsafe_b64decode(payload_segment + padding)
    payload = json.loads(decoded.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("Invalid JWT payload")
    return payload

Then in backend/jwt_blacklist.py:

-import base64
-import json
+from .jwt_utils import decode_unverified_payload
...
-            payload = _decode_unverified_payload(token)
+            payload = decode_unverified_payload(token)
...
-def _decode_unverified_payload(token: str) -> dict:
-    """Parse the JWT payload segment without trusting it for authentication."""
-
-    parts = token.split(".")
-    if len(parts) != 3:
-        raise ValueError("Invalid JWT format")
-
-    payload_segment = parts[1]
-    padding = "=" * (-len(payload_segment) % 4)
-    decoded = base64.urlsafe_b64decode(payload_segment + padding)
-    payload = json.loads(decoded.decode("utf-8"))
-    if not isinstance(payload, dict):
-        raise ValueError("Invalid JWT payload")
-    return payload

Apply the same swap in backend/jwt_service.py and (optionally) in scripts/verify_security.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/jwt_blacklist.py` around lines 193 - 206, Duplicate JWT payload
decoder _decode_unverified_payload exists in backend/jwt_blacklist.py and
backend/jwt_service.py; extract a single shared function (e.g.,
decode_unverified_payload) into a new module backend/jwt_utils.py that
implements the same logic (validate three segments, pad base64, decode,
json.loads and ensure dict) and then replace the local
_decode_unverified_payload definitions by importing the shared
decode_unverified_payload in backend/jwt_blacklist.py and backend/jwt_service.py
(and optionally scripts/verify_security.py), updating any call sites to use the
imported function and removing the duplicated code.
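As a sanity check on the shared helper sketched above, here is a quick round-trip (the decoder is duplicated so the snippet is self-contained; the token values are made up for illustration):

```python
import base64
import json


def decode_unverified_payload(token: str) -> dict:
    """Parse the JWT payload segment without trusting it for authentication."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Invalid JWT format")
    payload_segment = parts[1]
    padding = "=" * (-len(payload_segment) % 4)
    decoded = base64.urlsafe_b64decode(payload_segment + padding)
    payload = json.loads(decoded.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("Invalid JWT payload")
    return payload


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


# Build a structurally valid (but unsigned, untrusted) token.
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
claims = b64url(json.dumps({"jti": "abc123", "exp": 1700000000}).encode())
token = f"{header}.{claims}.not-a-real-signature"

print(decode_unverified_payload(token)["jti"])  # → abc123
```

Note the signature segment is never inspected; this helper is only safe for non-authentication uses such as JTI extraction and expiry hints.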
Dockerfile (1)

33-39: Consider COPY --chown to avoid a bloated ownership layer.

chown -R appuser:appuser /app in a separate RUN layer duplicates every file's metadata and can materially increase final image size when /app is large (this repo includes backend sources, ML deps, and the RAG dataset). Switching to COPY --chown=appuser:appuser on the source COPYs avoids the extra layer and makes intent explicit.

♻️ Proposed refactor
-COPY requirements.txt /tmp/requirements.txt
+COPY --chown=appuser:appuser requirements.txt /tmp/requirements.txt
 RUN python -m pip install --upgrade pip \
     && pip install --no-cache-dir -r /tmp/requirements.txt

-COPY . /app
+COPY --chown=appuser:appuser . /app

-COPY docker/entrypoint.sh /entrypoint.sh
-RUN chmod +x /entrypoint.sh \
-    && chown -R appuser:appuser /app /entrypoint.sh
+COPY --chown=appuser:appuser docker/entrypoint.sh /entrypoint.sh
+RUN chmod +x /entrypoint.sh
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` around lines 33 - 39, The Dockerfile currently runs chown -R
appuser:appuser /app in a RUN layer which creates a redundant ownership layer
and bloats the image; instead, change the COPY instructions (the COPY . /app and
COPY docker/entrypoint.sh /entrypoint.sh) to use COPY --chown=appuser:appuser so
files are owned correctly at copy time and remove the chown -R invocation from
the RUN that currently does chmod +x /entrypoint.sh && chown -R appuser:appuser
/app /entrypoint.sh, keeping the chmod step and then set USER appuser as before.
backend/jwt_service.py (1)

237-244: JWTError in the except tuple is now unreachable.

After switching to _decode_unverified_payload, this block no longer calls anything that raises JWTError (the only JWT-library call left in JWTService is jwt.encode / jwt.decode in other methods). You can drop it for clarity. Also consider OverflowError — a malicious exp far outside the time_t range would surface from datetime.fromtimestamp and escape this handler.

♻️ Proposed refactor
-        except (JWTError, ValueError, TypeError, json.JSONDecodeError) as exc:
+        except (ValueError, TypeError, OverflowError, json.JSONDecodeError) as exc:
             logger.debug(f"Failed to decode token expiry: {exc}")
         return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/jwt_service.py` around lines 237 - 244, The except block in the
expiry-parsing code around _decode_unverified_payload should drop JWTError (it's
no longer raised here) and instead catch OverflowError so
datetime.fromtimestamp(exp, tz=timezone.utc) failures are handled; update the
except tuple from (JWTError, ValueError, TypeError, json.JSONDecodeError) to
(ValueError, TypeError, json.JSONDecodeError, OverflowError) and keep the
logger.debug(f"Failed to decode token expiry: {exc}") and the subsequent return
None logic unchanged.
.github/workflows/ci-cd.yml (1)

592-642: Aggregator treats skipped jobs as passing — consider tightening.

The loop only flags failure/cancelled, so if any of secret-scanning, dependency-scanning, sast, dast, backend-tests, or frontend-tests is skipped (e.g., a future if: guard on one of those jobs, or a matrix change, or the needed upstream secret-scanning being skipped on a tag push), security-management still reports green and delivery-readiness follows suit. For a release-gate aggregator, the safer default is "only success passes":

♻️ Suggested tightening
-            if [ "${result}" = "failure" ] || [ "${result}" = "cancelled" ]; then
-              security_failed=1
-            fi
+            if [ "${result}" != "success" ]; then
+              security_failed=1
+            fi

Same consideration applies to delivery-readiness (lines 662-673). If some of the needs are legitimately allowed to skip, explicitly allow-list them in the loop instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci-cd.yml around lines 592 - 642, The aggregator currently
treats only "failure" and "cancelled" as failures, which lets "skipped" be
considered passing; update the logic in the security-management job (the for
loop that iterates over "${{ needs.secret-scanning.result }}", "${{
needs.dependency-scanning.result }}", "${{ needs.sast.result }}", "${{
needs.dast.result }}", "${{ needs.backend-tests.result }}", "${{
needs.frontend-tests.result }}" and the security_failed variable) to consider
only "success" as passing (i.e., treat any result not equal to "success" as a
failure), or alternatively implement an allow-list of results that may be
skipped and only treat other non-success results as failures; apply the same
tightening to the delivery-readiness aggregator loop referenced in the review
comment.
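The allow-list variant could look like the following shell sketch (job names and the allow-list are assumptions for illustration, not the workflow's actual configuration):

```shell
# Sketch: only "success" passes; explicitly allow-listed jobs may be "skipped".
allowed_skips="dast"   # assumption: dast is permitted to skip on some triggers
security_failed=0

check() {
  job="$1"
  result="$2"
  if [ "${result}" = "success" ]; then
    return 0
  fi
  if [ "${result}" = "skipped" ] && printf '%s\n' "${allowed_skips}" | grep -qw "${job}"; then
    return 0
  fi
  echo "gate: ${job} result=${result} -> failing"
  security_failed=1
}

check secret-scanning success
check dast skipped        # allow-listed, still passes
check sast skipped        # not allow-listed, fails the gate
echo "security_failed=${security_failed}"
```

This keeps "skipped means failed" as the default while documenting, in one place, exactly which jobs are permitted to skip.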
scripts/check-env-drift.py (1)

9-18: Minor: load_keys does not handle export KEY=value lines.

Not a blocker because the repo's own .env.example is unlikely to use export prefixes, but if any contracted env file later adopts that convention (common in some shell-sourced env files), the extracted key will be "export KEY" and the drift check will produce false positives. A cheap hardening:

♻️ Optional robustness tweak
         key = line.split("=", 1)[0].strip()
+        if key.startswith("export "):
+            key = key[len("export "):].strip()
         if key:
             keys.add(key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/check-env-drift.py` around lines 9 - 18, The load_keys function
treats lines like "export KEY=value" as keys named "export KEY"; update
load_keys to strip an optional leading "export" token and any following
whitespace before splitting on "=" so the extracted key becomes "KEY". Locate
load_keys and adjust the line normalization (after line = raw_line.strip()) to
remove a leading "export" plus whitespace (or use a small regex) before
computing key, then continue the existing checks and adds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/ci-cd.yml:
- Around line 228-265: The DAST step currently runs zap-baseline.py with -m 3
which can exit non-zero and, because of set -euo pipefail, prevents the three cp
commands (copying from ${RUNNER_TEMP}/zap to ${GITHUB_WORKSPACE}) from executing
so reports are never uploaded; either ensure the copy-out always runs by making
the scan command not fail the script (e.g., run zap-baseline.py ... ||
scan_exit=$?; true; then always cp the files and optionally exit with scan_exit)
while preserving the cp lines, or remove the cp lines and change the Upload DAST
artifacts `path:` to point to "${RUNNER_TEMP}/zap/*" so actions/upload-artifact
picks up zap-report.html, zap-report.json and zap-warnings.md even on failures;
reference zap-baseline.py -m 3, the cp commands that copy from
"${RUNNER_TEMP}/zap", and the Upload DAST artifacts `path:` when applying the
fix.
- Around line 139-153: Update the "Run Semgrep SAST scan" step so the scan
reports WARNING+ findings to SARIF while still failing the job only on ERRORs:
change the semgrep invocation (the semgrep scan command that currently includes
"--severity ERROR") to use "--severity WARNING" (or remove the --severity flag
entirely) and keep the "--error" flag so the workflow still fails on ERROR-level
findings; if ERROR-only gating is intentional instead, add an inline comment
above the semgrep scan explaining that choice for reviewers and runbooks.

In @.github/workflows/database-migrations.yml:
- Around line 150-169: The summary currently prints SUMMARY_POSTGRES_HOST,
SUMMARY_POSTGRES_PORT and SUMMARY_POSTGRES_DB which are populated from secrets
and therefore get masked as "***" in $GITHUB_STEP_SUMMARY; remove or replace
those rows so the summary remains useful — either delete the "Database
host/port/name" and "Database name" lines or swap them to non-secret identifiers
(e.g., use inputs.environment_name or another non-secret env) instead of
SUMMARY_POSTGRES_HOST/SUMMARY_POSTGRES_PORT/SUMMARY_POSTGRES_DB when emitting
the summary.

---

Nitpick comments:
In @.github/workflows/ci-cd.yml:
- Around line 592-642: The aggregator currently treats only "failure" and
"cancelled" as failures, which lets "skipped" be considered passing; update the
logic in the security-management job (the for loop that iterates over "${{
needs.secret-scanning.result }}", "${{ needs.dependency-scanning.result }}",
"${{ needs.sast.result }}", "${{ needs.dast.result }}", "${{
needs.backend-tests.result }}", "${{ needs.frontend-tests.result }}" and the
security_failed variable) to consider only "success" as passing (i.e., treat any
result not equal to "success" as a failure), or alternatively implement an
allow-list of results that may be skipped and only treat other non-success
results as failures; apply the same tightening to the delivery-readiness
aggregator loop referenced in the review comment.

In `@backend/jwt_blacklist.py`:
- Around line 135-140: The current broad except in the JTI extraction masks
unrelated errors; update the try/except in the function that calls
_decode_unverified_payload to catch only the known decoding errors (ValueError,
json.JSONDecodeError, binascii.Error, UnicodeDecodeError), log those expected
parse failures at a lower level (debug or warning consistent with
jwt_service.get_token_expiry), and continue to return None for malformed
tokens—leave other exceptions to propagate so programmer/critical errors are not
swallowed.
- Around line 193-206: Duplicate JWT payload decoder _decode_unverified_payload
exists in backend/jwt_blacklist.py and backend/jwt_service.py; extract a single
shared function (e.g., decode_unverified_payload) into a new module
backend/jwt_utils.py that implements the same logic (validate three segments,
pad base64, decode, json.loads and ensure dict) and then replace the local
_decode_unverified_payload definitions by importing the shared
decode_unverified_payload in backend/jwt_blacklist.py and backend/jwt_service.py
(and optionally scripts/verify_security.py), updating any call sites to use the
imported function and removing the duplicated code.

In `@backend/jwt_service.py`:
- Around line 237-244: The except block in the expiry-parsing code around
_decode_unverified_payload should drop JWTError (it's no longer raised here) and
instead catch OverflowError so datetime.fromtimestamp(exp, tz=timezone.utc)
failures are handled; update the except tuple from (JWTError, ValueError,
TypeError, json.JSONDecodeError) to (ValueError, TypeError,
json.JSONDecodeError, OverflowError) and keep the logger.debug(f"Failed to
decode token expiry: {exc}") and the subsequent return None logic unchanged.

In `@Dockerfile`:
- Around line 33-39: The Dockerfile currently runs chown -R appuser:appuser /app
in a RUN layer which creates a redundant ownership layer and bloats the image;
instead, change the COPY instructions (the COPY . /app and COPY
docker/entrypoint.sh /entrypoint.sh) to use COPY --chown=appuser:appuser so
files are owned correctly at copy time and remove the chown -R invocation from
the RUN that currently does chmod +x /entrypoint.sh && chown -R appuser:appuser
/app /entrypoint.sh, keeping the chmod step and then set USER appuser as before.

In `@scripts/check-env-drift.py`:
- Around line 9-18: The load_keys function treats lines like "export KEY=value"
as keys named "export KEY"; update load_keys to strip an optional leading
"export" token and any following whitespace before splitting on "=" so the
extracted key becomes "KEY". Locate load_keys and adjust the line normalization
(after line = raw_line.strip()) to remove a leading "export" plus whitespace (or
use a small regex) before computing key, then continue the existing checks and
adds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 64627a2a-ac70-4692-aba1-2340e3685980

📥 Commits

Reviewing files that changed from the base of the PR and between 182cfac and 15bc10c.

📒 Files selected for processing (19)
  • .github/actions/send-slack-notification/action.yml
  • .github/workflows/_slack-notify.yml
  • .github/workflows/ci-cd.yml
  • .github/workflows/codeql.yml
  • .github/workflows/coderabbit.yml
  • .github/workflows/database-migrations.yml
  • .github/workflows/deploy-production.yml
  • .github/workflows/deploy-staging.yml
  • .github/workflows/rollback-production.yml
  • .pre-commit-config.yaml
  • Dockerfile
  • backend/alembic/env.py
  • backend/jwt_blacklist.py
  • backend/jwt_service.py
  • docs/sdlc/deployment-safety.md
  • frontend/src/app/api/backend/__tests__/route-contract.test.ts
  • scripts/check-env-drift.py
  • scripts/validate-migrations.sh
  • scripts/verify_security.py
✅ Files skipped from review due to trivial changes (2)
  • .github/workflows/codeql.yml
  • docs/sdlc/deployment-safety.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/validate-migrations.sh

Comment on lines +139 to +153
      - name: Run Semgrep SAST scan
        run: |
          set -euo pipefail
          semgrep scan \
            --config p/owasp-top-ten \
            --config p/python \
            --config p/typescript \
            --exclude node_modules \
            --exclude .next \
            --exclude coverage \
            --severity ERROR \
            --error \
            --sarif \
            --output semgrep.sarif \
            .

⚠️ Potential issue | 🟡 Minor

Semgrep --severity ERROR ignores WARNING/INFO findings — gate may miss real issues.

--severity ERROR filters out WARNING and INFO rules from both the run and the SARIF upload. Many OWASP Top Ten and Python/TypeScript rules in the configured rule packs ship as WARNING (e.g., taint rules flagged at medium severity), so they won't be uploaded to GitHub code scanning or fail the build. If the intent of this gate is "fail only on ERROR but still report WARNING+ findings to Security tab", consider scanning at WARNING and only failing on ERROR:

🔧 Suggested change — report broader, fail narrow
           semgrep scan \
             --config p/owasp-top-ten \
             --config p/python \
             --config p/typescript \
             --exclude node_modules \
             --exclude .next \
             --exclude coverage \
-            --severity ERROR \
-            --error \
+            --severity WARNING \
+            --error-on-findings-severity ERROR \
             --sarif \
             --output semgrep.sarif \
             .

If ERROR-only gating is intentional, please leave a comment in the workflow (and ADR/runbook) calling that out so reviewers don't assume the SARIF is comprehensive.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci-cd.yml around lines 139 - 153, Update the "Run Semgrep
SAST scan" step so the scan reports WARNING+ findings to SARIF while still
failing the job only on ERRORs: change the semgrep invocation (the semgrep scan
command that currently includes "--severity ERROR") to use "--severity WARNING"
(or remove the --severity flag entirely) and keep the "--error" flag so the
workflow still fails on ERROR-level findings; if ERROR-only gating is
intentional instead, add an inline comment above the semgrep scan explaining
that choice for reviewers and runbooks.

Comment on lines +228 to +265
      - name: Run OWASP ZAP baseline scan
        run: |
          set -euo pipefail
          mkdir -p "${RUNNER_TEMP}/zap"
          chmod 777 "${RUNNER_TEMP}/zap"
          cat > "${RUNNER_TEMP}/zap/baseline.conf" <<'EOF'
          # ZAP baseline rule configuration for CI.
          # Keep DAST blocking on FAILs while ignoring the currently accepted
          # passive warnings below until the app hardening work lands.
          10049  IGNORE  (Storable and Cacheable Content)
          90004  IGNORE  (Cross-Origin-Resource-Policy Header Missing or Invalid)
          EOF
          docker run --rm --network host \
            -v "${RUNNER_TEMP}/zap:/zap/wrk:rw" \
            ghcr.io/zaproxy/zaproxy:stable \
            zap-baseline.py \
              -t http://127.0.0.1:8010 \
              -c baseline.conf \
              -r zap-report.html \
              -J zap-report.json \
              -w zap-warnings.md \
              -m 3
          cp "${RUNNER_TEMP}/zap"/zap-report.html "${GITHUB_WORKSPACE}/zap-report.html"
          cp "${RUNNER_TEMP}/zap"/zap-report.json "${GITHUB_WORKSPACE}/zap-report.json"
          cp "${RUNNER_TEMP}/zap"/zap-warnings.md "${GITHUB_WORKSPACE}/zap-warnings.md"

      - name: Upload DAST artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: dast-report
          path: |
            zap-report.html
            zap-report.json
            zap-warnings.md
            logs/dast-backend.log
          if-no-files-found: warn
          retention-days: 30

⚠️ Potential issue | 🟠 Major

DAST artifacts are lost exactly when you need them.

zap-baseline.py -m 3 returns non-zero when the scan detects FAILs. With set -euo pipefail, the three cp commands at lines 250-252 never run, so zap-report.html, zap-report.json, and zap-warnings.md remain in ${RUNNER_TEMP}/zap/ and are never picked up by the subsequent upload-artifact step (which only looks in ${GITHUB_WORKSPACE}). Net effect: on the failure paths where the DAST report matters most, only logs/dast-backend.log gets uploaded.

🔧 Suggested fix — guarantee copy-out regardless of scan exit code
       - name: Run OWASP ZAP baseline scan
         run: |
           set -euo pipefail
           mkdir -p "${RUNNER_TEMP}/zap"
           chmod 777 "${RUNNER_TEMP}/zap"
           cat > "${RUNNER_TEMP}/zap/baseline.conf" <<'EOF'
           # ZAP baseline rule configuration for CI.
           # Keep DAST blocking on FAILs while ignoring the currently accepted
           # passive warnings below until the app hardening work lands.
           10049  IGNORE  (Storable and Cacheable Content)
           90004  IGNORE  (Cross-Origin-Resource-Policy Header Missing or Invalid)
           EOF
-          docker run --rm --network host \
+          scan_rc=0
+          docker run --rm --network host \
             -v "${RUNNER_TEMP}/zap:/zap/wrk:rw" \
             ghcr.io/zaproxy/zaproxy:stable \
             zap-baseline.py \
               -t http://127.0.0.1:8010 \
               -c baseline.conf \
               -r zap-report.html \
               -J zap-report.json \
               -w zap-warnings.md \
-              -m 3
-          cp "${RUNNER_TEMP}/zap"/zap-report.html "${GITHUB_WORKSPACE}/zap-report.html"
-          cp "${RUNNER_TEMP}/zap"/zap-report.json "${GITHUB_WORKSPACE}/zap-report.json"
-          cp "${RUNNER_TEMP}/zap"/zap-warnings.md "${GITHUB_WORKSPACE}/zap-warnings.md"
+              -m 3 || scan_rc=$?
+          cp "${RUNNER_TEMP}/zap"/zap-report.html  "${GITHUB_WORKSPACE}/zap-report.html"  || true
+          cp "${RUNNER_TEMP}/zap"/zap-report.json  "${GITHUB_WORKSPACE}/zap-report.json"  || true
+          cp "${RUNNER_TEMP}/zap"/zap-warnings.md  "${GITHUB_WORKSPACE}/zap-warnings.md"  || true
+          exit "${scan_rc}"

Alternatively, change the artifact path: to point into ${RUNNER_TEMP}/zap/* and drop the copy step altogether.
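For reviewers unfamiliar with the `|| scan_rc=$?` pattern in the suggested fix: under `set -e`, a command that is the left side of an `||` list does not abort the script, which is what lets the exit code be captured while the copy-out still runs. A minimal standalone demonstration (paths and commands here are illustrative only):

```shell
set -euo pipefail
scan_rc=0
false || scan_rc=$?    # set -e does not abort here: `false` is part of an || list
echo "captured rc=${scan_rc}"

# Best-effort copy: `|| true` keeps a missing file from killing the script.
cp /nonexistent-report.html /tmp/report.html 2>/dev/null || true
echo "copy-out attempted, script still alive"
```

The real step would end with `exit "${scan_rc}"` so the job still fails on scan findings after the reports are copied out.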

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci-cd.yml around lines 228 - 265, The DAST step currently
runs zap-baseline.py with -m 3 which can exit non-zero and, because of set -euo
pipefail, prevents the three cp commands (copying from ${RUNNER_TEMP}/zap to
${GITHUB_WORKSPACE}) from executing so reports are never uploaded; either ensure
the copy-out always runs by making the scan command not fail the script (e.g.,
run zap-baseline.py ... || scan_exit=$?; true; then always cp the files and
optionally exit with scan_exit) while preserving the cp lines, or remove the cp
lines and change the Upload DAST artifacts `path:` to point to
"${RUNNER_TEMP}/zap/*" so actions/upload-artifact picks up zap-report.html,
zap-report.json and zap-warnings.md even on failures; reference zap-baseline.py
-m 3, the cp commands that copy from "${RUNNER_TEMP}/zap", and the Upload DAST
artifacts `path:` when applying the fix.

Comment thread .github/workflows/database-migrations.yml Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/jwt_blacklist.py (1)

6-12: ⚠️ Potential issue | 🟡 Minor

Remove the unused jwt import on line 6.

The jwt module is no longer used in this file after refactoring _get_jti to decode payloads manually. It appears only as an import statement with no other references, so it should be removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/jwt_blacklist.py` around lines 6 - 12, Remove the unused top-level
import "jwt" from the imports list (it currently appears alongside logging,
Optional, datetime, base64, binascii, json) since _get_jti now decodes payloads
manually and there are no references to jwt; update the import block to exclude
"jwt" and run linters/tests to ensure no remaining references to the jwt symbol
remain.
🧹 Nitpick comments (2)
backend/jwt_blacklist.py (2)

136-141: Consider narrowing jti to str before returning.

payload.get("jti") returns whatever JSON yields, which could be a non-string (number, bool, list). Per RFC 7519 the jti claim SHOULD be a string, but a malformed token could violate that and cause a downstream TypeError on the jti[:8] slicing in the various log statements (lines 65, 73, 103, 112). The outer try/except in blacklist_token/is_blacklisted would still catch it and fail safely, but treating a non-string jti as missing is cleaner.

🛡️ Suggested narrowing
         try:
             payload = _decode_unverified_payload(token)
-            return payload.get("jti")
+            jti = payload.get("jti")
+            return jti if isinstance(jti, str) else None
         except (ValueError, json.JSONDecodeError, binascii.Error, UnicodeDecodeError) as e:
             logger.error(f"Failed to extract JTI from token: {e}")
             return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/jwt_blacklist.py` around lines 136 - 141, The helper that extracts
the JTI currently returns payload.get("jti") which can be non-string; change the
logic in the JTI extractor (the code using _decode_unverified_payload that
returns payload.get("jti")) to coerce/validate the claim and only return a str:
if the claim is missing or not an instance of str, return None. Update the
extractor used by blacklist_token and is_blacklisted so downstream slices like
jti[:8] are never called on non-strings.

194-207: Extract duplicated payload decoder into a shared utility.

This implementation is identical to _decode_unverified_payload in backend/jwt_service.py (lines 266–279) and decode_unverified_payload in scripts/verify_security.py (line 22). Three copies of the same JWT-parsing logic will drift over time (e.g., bug fixes, hardening of size limits, error handling). Consider extracting to a shared module such as backend/jwt_utils.py and importing from all call sites.

Also note the return-type annotation differs across implementations: this declares -> dict while jwt_service.py declares -> Dict[str, Any]. A single helper would eliminate that inconsistency and establish a canonical type signature.

♻️ Sketch of shared utility
# backend/jwt_utils.py
import base64
import json
from typing import Any, Dict


def decode_unverified_payload(token: str) -> Dict[str, Any]:
    """Parse the JWT payload segment without trusting it for authentication."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Invalid JWT format")

    payload_segment = parts[1]
    padding = "=" * (-len(payload_segment) % 4)
    decoded = base64.urlsafe_b64decode(payload_segment + padding)
    payload = json.loads(decoded.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("Invalid JWT payload")
    return payload

Then in jwt_blacklist.py, jwt_service.py, and scripts/verify_security.py:

from backend.jwt_utils import decode_unverified_payload
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/jwt_blacklist.py` around lines 194 - 207, Extract the duplicated JWT
payload parsing into a single helper (e.g.,
backend.jwt_utils.decode_unverified_payload) and update all call sites to import
it instead of keeping copies in jwt_blacklist.py (_decode_unverified_payload),
jwt_service.py (decode_unverified_payload) and scripts/verify_security.py;
implement the helper to perform the same split/base64/json checks but use a
canonical signature like -> Dict[str, Any] (import typing.Dict and Any), remove
the local copies, and update imports so every module calls the shared
backend.jwt_utils.decode_unverified_payload.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/api/main.py`:
- Around line 183-193: The sitemap endpoint currently returns a hardcoded .local
URL; update the sitemap() handler to read the canonical public URL from
configuration (e.g., config.PUBLIC_BASE_URL or an env var) instead of the
literal "https://smart-ai-tutor.local/", and use that value when building the
<loc> entry passed into Response (keep media_type="application/xml"); if no
public URL is configured, return an empty <urlset> (no <url> children) to avoid
exposing a non-resolvable placeholder. Ensure you update the sitemap() function
and reference config.PUBLIC_BASE_URL (or the chosen env var) when composing the
XML.
- Around line 114-116: The COEP/COOP headers are being applied unconditionally
in the response middleware and break Swagger/ReDoc assets; update the middleware
that sets response.headers["Cross-Origin-Embedder-Policy"] and
response.headers["Cross-Origin-Opener-Policy"] (same area as the enforce_https
middleware) to skip adding those two headers for documentation/openapi routes
(at minimum: "/docs", "/redoc", "/docs/oauth2-redirect", "/openapi.json" — or
any path starting with "/docs") or alternatively remove COEP entirely for this
service; keep Cross-Origin-Resource-Policy if desired but ensure the conditional
check runs before setting response.headers so docs and openapi JSON are not
served with COEP/COOP.

In `@backend/tests/test_security.py`:
- Around line 113-116: In test_dynamic_not_found_includes_no_store_headers
remove the leftover call to test_client.get("/robots.txt") that is immediately
overwritten; keep only the request exercising the dynamic 404
(test_client.get("/missing")) so the test asserts the intended dynamic-not-found
behavior and no-store headers, i.e., delete the redundant GET to "/robots.txt"
in the test.

---

Outside diff comments:
In `@backend/jwt_blacklist.py`:
- Around line 6-12: Remove the unused top-level import "jwt" from the imports
list (it currently appears alongside logging, Optional, datetime, base64,
binascii, json) since _get_jti now decodes payloads manually and there are no
references to jwt; update the import block to exclude "jwt" and run
linters/tests to ensure no remaining references to the jwt symbol remain.

---

Nitpick comments:
In `@backend/jwt_blacklist.py`:
- Around line 136-141: The helper that extracts the JTI currently returns
payload.get("jti") which can be non-string; change the logic in the JTI
extractor (the code using _decode_unverified_payload that returns
payload.get("jti")) to coerce/validate the claim and only return a str: if the
claim is missing or not an instance of str, return None. Update the extractor
used by blacklist_token and is_blacklisted so downstream slices like jti[:8] are
never called on non-strings.
- Around line 194-207: Extract the duplicated JWT payload parsing into a single
helper (e.g., backend.jwt_utils.decode_unverified_payload) and update all call
sites to import it instead of keeping copies in jwt_blacklist.py
(_decode_unverified_payload), jwt_service.py (decode_unverified_payload) and
scripts/verify_security.py; implement the helper to perform the same
split/base64/json checks but use a canonical signature like -> Dict[str, Any]
(import typing.Dict and Any), remove the local copies, and update imports so
every module calls the shared backend.jwt_utils.decode_unverified_payload.
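For the first nitpick, the coercion can be as small as a guard on the claim's type. `extract_jti` is a hypothetical name used here for illustration; the real code keeps this logic inside the blacklist helpers:

```python
from typing import Any, Dict, Optional


def extract_jti(payload: Dict[str, Any]) -> Optional[str]:
    """Return the jti claim only when it is a string, else None.

    Keeps downstream code such as logging jti[:8] from ever slicing
    a non-string JSON value (number, list, null, ...).
    """
    jti = payload.get("jti")
    return jti if isinstance(jti, str) else None
```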

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc8bfc7a-e10f-4fb6-b072-1ffa0854168a

📥 Commits

Reviewing files that changed from the base of the PR and between 15bc10c and 54dbcc8.

📒 Files selected for processing (7)
  • .github/workflows/ci-cd.yml
  • .github/workflows/database-migrations.yml
  • .nvmrc
  • backend/api/main.py
  • backend/jwt_blacklist.py
  • backend/tests/test_security.py
  • scripts/validate-migrations.sh
🚧 Files skipped from review as they are similar to previous changes (3)
  • .nvmrc
  • scripts/validate-migrations.sh
  • .github/workflows/ci-cd.yml

Comment thread backend/api/main.py
Comment on lines +114 to +116
response.headers["Cross-Origin-Resource-Policy"] = "same-origin"
response.headers["Cross-Origin-Opener-Policy"] = "same-origin"
response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does cdn.jsdelivr.net serve assets with Cross-Origin-Resource-Policy headers compatible with COEP require-corp, and does FastAPI's default Swagger UI load cleanly under COEP: require-corp?

💡 Result:

Yes, COEP: require-corp requires that every cross-origin subresource loaded in no-cors mode be allowed via the resource’s Cross-Origin-Resource-Policy (CORP) header; jsDelivr generally serves assets with CORP: cross-origin, which is compatible with COEP require-corp. However, whether FastAPI’s default Swagger UI loads cleanly under COEP: require-corp depends on whether all Swagger UI subresources actually get CORP headers (and also on edge cases like intermediate redirects that may lack CORP).


🏁 Script executed:

# Search for any GitHub issues or documentation about FastAPI Swagger UI with COEP headers
curl -s "https://api.github.com/search/issues?q=FastAPI+Swagger+COEP+OR+FastAPI+docs+COEP&sort=updated" 2>/dev/null | head -100

Repository: liteshperumalla/Smart-Tutor-AI-AI-Driven-Personalized-Teaching-Support

Length of output: 7205


🌐 Web query:

FastAPI Swagger UI COEP require-corp compatibility issues practical examples

💡 Result:

FastAPI's Swagger UI at /docs loads JavaScript (swagger-ui-bundle.js), CSS (swagger-ui.css), a favicon, and the OpenAPI schema (/openapi.json) from external CDNs and relative paths by default. When a FastAPI app sets Cross-Origin-Embedder-Policy: require-corp (often with Cross-Origin-Opener-Policy: same-origin for cross-origin isolation), the browser blocks cross-origin resources loaded in no-cors mode unless they carry Cross-Origin-Resource-Policy (CORP) headers or are fetched in CORS mode.

Practical compatibility issues:
- External CDN resources (e.g. cdn.jsdelivr.net for JS/CSS, fastapi.tiangolo.com for the favicon) are blocked when they lack CORP: cross-origin in no-cors fetches (the default for <script src> and <link> stylesheet loads). The browser console shows COEP violation errors like "blocked due to CORP/COEP".
- /openapi.json is loaded same-origin via a relative URL, so it is unaffected.
- The favicon served from an external domain is also blocked.

Solutions (prioritize self-hosting for air-gapped/offline compatibility):
1. Self-host the Swagger UI assets (recommended):
   - Download swagger-ui-bundle.js, swagger-ui.css, and favicon.png from the CDN or GitHub releases into a static/ directory.
   - Mount StaticFiles: app.mount("/static", StaticFiles(directory="static"), name="static")
   - Override /docs:

         from fastapi.openapi.docs import get_swagger_ui_html

         @app.get("/docs", include_in_schema=False)
         async def custom_swagger_ui():
             return get_swagger_ui_html(
                 openapi_url=app.openapi_url,
                 title=app.title + " - Swagger UI",
                 swagger_js_url="/static/swagger-ui-bundle.js",
                 swagger_css_url="/static/swagger-ui.css",
                 swagger_favicon_url="/static/favicon.png",
             )

   All resources are then same-origin, bypassing COEP blocks.
2. Use COEP: credentialless if supported (Chrome 96+): cross-origin resources load without credentials or CORP, but cookies are stripped.
3. Custom middleware that adds CORP headers to your own responses does not help assets served by external CDNs.
4. Set docs_url=None to disable the docs if isolation is not needed for them, while keeping isolation for the API.

Example browser-console error: "Did not allow loading cross-origin resource via no-cors due to COEP." Self-hosting also resolves CDN downtime and firewall issues, as noted in the FastAPI docs and GitHub issues.


🏁 Script executed:

# Check if FastAPI has documentation or examples about conditional header setting in middleware
rg "request.url.path|response.headers" --type py | head -20

Repository: liteshperumalla/Smart-Tutor-AI-AI-Driven-Personalized-Teaching-Support

Length of output: 2065


🏁 Script executed:

cat -n backend/api/main.py | sed -n '100,130p'

Repository: liteshperumalla/Smart-Tutor-AI-AI-Driven-Personalized-Teaching-Support

Length of output: 1789


🏁 Script executed:

# Get more context around the security headers
cat -n backend/api/main.py | sed -n '80,140p'

Repository: liteshperumalla/Smart-Tutor-AI-AI-Driven-Personalized-Teaching-Support

Length of output: 3170


Scope Cross-Origin-Embedder-Policy and Cross-Origin-Opener-Policy away from documentation paths to avoid breaking /docs and /redoc.

Lines 114–116 apply COEP/COOP headers unconditionally to every response, including Swagger UI and ReDoc—which are served when config.ENVIRONMENT != "production". Swagger UI loads JavaScript and CSS from cdn.jsdelivr.net in no-cors mode. With COEP: require-corp on the HTML response, browsers will refuse to load these cross-origin sub-resources and throw "blocked due to COEP" errors in the console. This causes /docs and /redoc to silently fail to render in non-production environments—a regression that developers will only discover when they need to test the API.

Exclude /docs, /redoc, /docs/oauth2-redirect, and /openapi.json from COEP/COOP (already a pattern in enforce_https middleware at line 91):

🔧 Suggested fix
     response.headers["Cross-Origin-Resource-Policy"] = "same-origin"
-    response.headers["Cross-Origin-Opener-Policy"] = "same-origin"
-    response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"
+    if request.url.path not in {"/docs", "/redoc", "/docs/oauth2-redirect", "/openapi.json"}:
+        response.headers["Cross-Origin-Opener-Policy"] = "same-origin"
+        response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"

Alternatively, drop Cross-Origin-Embedder-Policy entirely; an API backend does not embed cross-origin resources, and require-corp is primarily intended for contexts using SharedArrayBuffer or cross-origin isolation—neither applicable here.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     response.headers["Cross-Origin-Resource-Policy"] = "same-origin"
-    response.headers["Cross-Origin-Opener-Policy"] = "same-origin"
-    response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"
+    if request.url.path not in {"/docs", "/redoc", "/docs/oauth2-redirect", "/openapi.json"}:
+        response.headers["Cross-Origin-Opener-Policy"] = "same-origin"
+        response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/api/main.py` around lines 114 - 116, The COEP/COOP headers are being
applied unconditionally in the response middleware and break Swagger/ReDoc
assets; update the middleware that sets
response.headers["Cross-Origin-Embedder-Policy"] and
response.headers["Cross-Origin-Opener-Policy"] (same area as the enforce_https
middleware) to skip adding those two headers for documentation/openapi routes
(at minimum: "/docs", "/redoc", "/docs/oauth2-redirect", "/openapi.json" — or
any path starting with "/docs") or alternatively remove COEP entirely for this
service; keep Cross-Origin-Resource-Policy if desired but ensure the conditional
check runs before setting response.headers so docs and openapi JSON are not
served with COEP/COOP.
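The path check in the suggested fix can also be kept as a small pure function, which makes it unit-testable without spinning up the app. `isolation_headers` and `DOC_PATHS` are hypothetical names for this sketch; the suggested fix inlines the same logic in the middleware:

```python
from typing import Dict

# Documentation routes that must stay loadable with CDN-hosted assets.
DOC_PATHS = {"/docs", "/redoc", "/docs/oauth2-redirect", "/openapi.json"}


def isolation_headers(path: str) -> Dict[str, str]:
    """Cross-origin isolation headers to add for a given request path.

    COOP/COEP are skipped on documentation routes so Swagger UI and
    ReDoc can still pull their cross-origin JS/CSS.
    """
    headers = {"Cross-Origin-Resource-Policy": "same-origin"}
    if path not in DOC_PATHS and not path.startswith("/docs"):
        headers["Cross-Origin-Opener-Policy"] = "same-origin"
        headers["Cross-Origin-Embedder-Policy"] = "require-corp"
    return headers
```

The middleware would then merge `isolation_headers(request.url.path)` into `response.headers` in one place.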

Comment thread backend/api/main.py
Comment on lines +183 to +193
@app.get("/sitemap.xml")
async def sitemap():
return Response(
content=(
'<?xml version="1.0" encoding="UTF-8"?>'
'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
"<url><loc>https://smart-ai-tutor.local/</loc></url>"
"</urlset>"
),
media_type="application/xml",
)

⚠️ Potential issue | 🟡 Minor

Hardcoded placeholder URL in sitemap.xml.

https://smart-ai-tutor.local/ is a .local mDNS-style placeholder that will be served verbatim in every environment, including production. Search crawlers ingesting this sitemap from production will index a non‑resolvable URL.

Drive the canonical URL from configuration (e.g. config.PUBLIC_BASE_URL or an env var), and consider returning an empty <urlset> when no public URL is configured.

🔧 Suggested change
 @app.get("/sitemap.xml")
 async def sitemap():
+    public_base_url = os.getenv("PUBLIC_BASE_URL", "").rstrip("/")
+    url_entry = (
+        f"<url><loc>{public_base_url}/</loc></url>" if public_base_url else ""
+    )
     return Response(
         content=(
             '<?xml version="1.0" encoding="UTF-8"?>'
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
-            "<url><loc>https://smart-ai-tutor.local/</loc></url>"
+            f"{url_entry}"
             "</urlset>"
         ),
         media_type="application/xml",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/api/main.py` around lines 183 - 193, The sitemap endpoint currently
returns a hardcoded .local URL; update the sitemap() handler to read the
canonical public URL from configuration (e.g., config.PUBLIC_BASE_URL or an env
var) instead of the literal "https://smart-ai-tutor.local/", and use that value
when building the <loc> entry passed into Response (keep
media_type="application/xml"); if no public URL is configured, return an empty
<urlset> (no <url> children) to avoid exposing a non-resolvable placeholder.
Ensure you update the sitemap() function and reference config.PUBLIC_BASE_URL
(or the chosen env var) when composing the XML.
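The XML composition can be factored into a pure helper along the lines of the suggested fix. `build_sitemap` is a hypothetical name used for illustration; the fix inlines this in the endpoint:

```python
from typing import Optional


def build_sitemap(public_base_url: Optional[str]) -> str:
    """Build sitemap XML; emit an empty <urlset> when no public URL is set."""
    base = (public_base_url or "").rstrip("/")
    # Only advertise a URL when a resolvable public base is configured.
    url_entry = f"<url><loc>{base}/</loc></url>" if base else ""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        f"{url_entry}"
        "</urlset>"
    )
```

The endpoint then becomes `Response(content=build_sitemap(os.getenv("PUBLIC_BASE_URL")), media_type="application/xml")`, and no placeholder ever reaches crawlers.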

Comment on lines +113 to +116
def test_dynamic_not_found_includes_no_store_headers(self, test_client):
response = test_client.get("/robots.txt")

response = test_client.get("/missing")

⚠️ Potential issue | 🟡 Minor

Dead/leftover request in test_dynamic_not_found_includes_no_store_headers.

Line 114 issues a GET to /robots.txt whose response is immediately overwritten on line 116 and never asserted on. This looks like copy/paste residue and should be removed so the test only exercises the dynamic 404 path it's documenting.

🧹 Proposed cleanup
     def test_dynamic_not_found_includes_no_store_headers(self, test_client):
-        response = test_client.get("/robots.txt")
-
         response = test_client.get("/missing")
📝 Committable suggestion

Suggested change
     def test_dynamic_not_found_includes_no_store_headers(self, test_client):
-        response = test_client.get("/robots.txt")
-
         response = test_client.get("/missing")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_security.py` around lines 113 - 116, In
test_dynamic_not_found_includes_no_store_headers remove the leftover call to
test_client.get("/robots.txt") that is immediately overwritten; keep only the
request exercising the dynamic 404 (test_client.get("/missing")) so the test
asserts the intended dynamic-not-found behavior and no-store headers, i.e.,
delete the redundant GET to "/robots.txt" in the test.

@liteshperumalla liteshperumalla merged commit 966deb2 into main Apr 24, 2026
22 checks passed
@liteshperumalla liteshperumalla deleted the codex/sdlc-hardening branch April 24, 2026 22:28