Skip to content

Repo setup : Adding CI and pre commit#39

Merged
AkhileshNegi merged 9 commits intomainfrom
enhancement/CI_in_repo
Feb 10, 2026
Merged

Repo setup : Adding CI and pre commit#39
AkhileshNegi merged 9 commits intomainfrom
enhancement/CI_in_repo

Conversation

@nishika26
Copy link
Copy Markdown
Contributor

@nishika26 nishika26 commented Feb 10, 2026

Summary

Target issue is #40

Notes

  • CI workflow and pre-commit checks added.
  • File formatted with the pre-commit

Note: please run uv run pre-commit install after this pr is merged to main

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 10, 2026

📝 Walkthrough

Walkthrough

Adds DB session and bearer-token dependencies, settings initialization, APIResponse utilities, new enum members and validator config helpers, minor model/schema tweaks, test fixtures, Alembic formatting edits, CI and pre-commit configs, plus formatting changes across validator modules and evaluation scripts.

Changes

Cohort / File(s) Summary
Alembic
backend/app/alembic/env.py, backend/app/alembic/versions/*.py
Added get_url() in env; migration files reformatted (quoting/style) with no schema changes.
API deps & routes
backend/app/api/deps.py, backend/app/api/routes/guardrails.py, backend/app/api/routes/validator_configs.py, backend/app/api/routes/utils.py
Added get_db/SessionDep, verify_bearer_token/AuthDep; many endpoints now accept session: SessionDep and _: AuthDep; minor signature/default tweaks (e.g., suppress_pass_logs).
Core config & startup
backend/app/core/config.py, backend/app/main.py
Added get_settings() + module-level settings; added custom_generate_unique_id and conditional Sentry init.
Guardrail controller & validators
backend/app/core/guardrail_controller.py, backend/app/core/validators/..., backend/app/core/enum.py
Added build_guard and get_validator_config_models; added ValidatorType members (GenderAssumptionBias, BanList); many validators reformatted.
Models & schemas
backend/app/models/config/validator_config.py, backend/app/models/logging/*.py, backend/app/schemas/guardrail_config.py
ValidatorConfig.is_enabled defaulted to True; updated_at uses onupdate=now; RequestStatus gains ERROR/WARNING; added RequestLogUpdate; GuardrailResponse adds optional safe_text.
CRUD & utils
backend/app/crud/*.py, backend/app/core/exception_handlers.py, backend/app/utils.py
RequestLogCrud.update now sets updated_at and status and persists; ValidatorConfigCrud.update improves error handling; added now(), split_validator_payload(), and generic APIResponse utilities; refined validation error messages.
Evaluation
backend/app/evaluation/pii/run.py, backend/app/evaluation/*
Added run_pii() and anonymized column; other evaluation scripts received formatting changes.
Tests & fixtures
backend/app/tests/conftest.py, backend/app/tests/**
Added session-scoped test DB setup, DB-clean fixture, dependency override fixture binding SessionDep to test session; multiple test formatting tweaks (one gender-bias test scope changed).
CI & tooling
.github/workflows/continuous_integration.yml, .pre-commit-config.yaml
Added GitHub Actions CI workflow (Postgres, Python matrix, migrations, tests) and pre-commit hooks (black, checks).

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant API as FastAPI
  participant Auth as AuthDep
  participant DB as Database\n(Session)
  participant Guard as GuardrailController
  participant Validator as Validator(s)

  Client->>API: POST /guardrails (payload, Authorization)
  API->>Auth: verify_bearer_token(header)
  Auth-->>API: valid / raise 401
  API->>DB: get_db() -> SessionDep
  API->>Guard: build_guard(validator_configs)
  API->>Validator: run validations (use DB session/configs)
  Validator-->>API: results (including safe_text)
  API->>DB: persist RequestLog / ValidatorLog
  API-->>Client: 200 OK (APIResponse with data, safe_text)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

enhancement, ready-for-review

Suggested reviewers

  • AkhileshNegi
  • dennyabrain

Poem

"I hopped through code with nimble feet,
added guards to make outputs neat,
sessions, tokens, logs in a line,
CI sings while tests align.
— Rabbit 🐇"

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.91% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Repo setup : Adding CI and pre commit' is specific and directly related to the main changes: introducing a GitHub Actions CI workflow (.github/workflows/continuous_integration.yml) and a pre-commit configuration (.pre-commit-config.yaml).
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch enhancement/CI_in_repo

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
backend/app/core/config.py (2)

89-94: ⚠️ Potential issue | 🟡 Minor

staging environment maps to .env.test — is this intentional?

staging uses .env.test which seems like a copy-paste oversight. Staging environments typically have their own configuration distinct from test environments.

     env_files = {
         "testing": ROOT_DIR / ".env.test",
-        "staging": ROOT_DIR / ".env.test",
+        "staging": ROOT_DIR / ".env.staging",
         "development": ROOT_DIR / ".env",
         "production": ROOT_DIR / ".env",
     }

33-33: ⚠️ Potential issue | 🟡 Minor

"local" is not a valid ENVIRONMENT value, but is checked in main.py Line 21.

The ENVIRONMENT literal type allows "development" | "testing" | "staging" | "production", so pydantic will reject "local". However, backend/app/main.py Line 21 checks settings.ENVIRONMENT != "local" to gate Sentry initialization — that condition can never be False, making it dead logic. Either add "local" to the Literal or remove the check in main.py.

backend/app/alembic/env.py (1)

43-45: ⚠️ Potential issue | 🟡 Minor

Offline and online migrations use different URL sources.

run_migrations_offline() reads the URL from alembic.ini via config.get_main_option("sqlalchemy.url"), while run_migrations_online() uses get_url() which derives it from application settings. This inconsistency could cause offline-generated migrations to target a different database.

Consider using get_url() in both paths:

 def run_migrations_offline() -> None:
-    url = config.get_main_option("sqlalchemy.url")
+    url = get_url()
     context.configure(

Also applies to: 62-63

backend/app/main.py (1)

7-13: ⚠️ Potential issue | 🟠 Major

settings is constructed at import time, before load_environment() runs.

from app.core.config import settings (Line 7) triggers settings = get_settings() at module load — before load_environment() on Line 13 executes. If load_environment() sets environment variables that Settings depends on (e.g., POSTGRES_SERVER, AUTH_TOKEN), those values won't be available when Settings is instantiated.

Either call load_environment() inside get_settings() before constructing Settings, or restructure so that settings is lazily initialized.

backend/app/core/exception_handlers.py (1)

67-73: ⚠️ Potential issue | 🟠 Major

Avoid leaking internal error details in generic 500 responses.

str(exc) can expose internal implementation details (e.g., database errors, file paths, stack fragments) to the client. For unhandled exceptions, return a generic message and log the actual error server-side.

Proposed fix
     `@app.exception_handler`(Exception)
     async def generic_error_handler(request: Request, exc: Exception):
+        import logging
+        logging.getLogger(__name__).exception("Unhandled error")
         return JSONResponse(
             status_code=HTTP_500_INTERNAL_SERVER_ERROR,
             content=APIResponse.failure_response(
-                str(exc) or "An unexpected error occurred."
+                "An unexpected error occurred."
             ).model_dump(),
         )
backend/app/crud/validator_config.py (1)

34-44: ⚠️ Potential issue | 🟡 Minor

create doesn't roll back on non-IntegrityError commit failures.

The update and delete methods both catch generic Exception to rollback, but create only handles IntegrityError. If commit fails for another reason (e.g., connection error), the session is left in a broken state.

Proposed fix for consistency
         try:
             session.commit()
         except IntegrityError:
             session.rollback()
             raise HTTPException(
                 400,
                 "Validator already exists for this type and stage",
             )
+        except Exception:
+            session.rollback()
+            raise
backend/app/api/deps.py (1)

33-39: ⚠️ Potential issue | 🟠 Major

Use constant-time comparison for token validation.

token != settings.AUTH_TOKEN is vulnerable to timing side-channel attacks. Use secrets.compare_digest instead.

Proposed fix
+import secrets
+
 ...
-    if token != settings.AUTH_TOKEN:
+    if not secrets.compare_digest(token, settings.AUTH_TOKEN):
         raise HTTPException(
             status_code=status.HTTP_401_UNAUTHORIZED,
             detail="Invalid token",
         )
backend/app/tests/test_validator_configs_integration.py (1)

62-88: ⚠️ Potential issue | 🟡 Minor

Inconsistent URL construction in update_validator vs. get_validator/delete_validator.

get_validator (line 69) and delete_validator (line 88) include a / between validator_id and DEFAULT_QUERY_PARAMS, but update_validator (line 83) does not. This could cause issues depending on how the framework handles trailing slashes.

     def update_validator(self, client, validator_id, payload):
         """Helper to update a validator."""
         return client.patch(
-            f"{BASE_URL}{validator_id}{DEFAULT_QUERY_PARAMS}", json=payload
+            f"{BASE_URL}{validator_id}/{DEFAULT_QUERY_PARAMS}", json=payload
         )
backend/app/tests/conftest.py (1)

41-49: ⚠️ Potential issue | 🔴 Critical

Override SessionDep with the underlying dependency callable get_db, not the Annotated type alias.

FastAPI's dependency_overrides dict uses the original dependency callable as keys, not Annotated type aliases. The current override on line 45 (app.dependency_overrides[SessionDep] = override_session) silently does nothing—endpoints continue using the production database session. Tests are likely polluting production data.

Fix by importing and overriding the underlying function:

Suggested fix
-from app.api.deps import SessionDep, verify_bearer_token
+from app.api.deps import SessionDep, verify_bearer_token, get_db
 ...
-    app.dependency_overrides[SessionDep] = override_session
+    app.dependency_overrides[get_db] = override_session
🤖 Fix all issues with AI agents
In `@backend/app/alembic/versions/001_added_request_log.py`:
- Around line 42-44: The downgrade currently drops the request_log table but
leaves the PostgreSQL enum type used for the RequestStatus column (the
`requeststatus` enum created via sa.Enum(...)) orphaned; update the downgrade()
implementation (the block containing op.drop_table("request_log")) to explicitly
drop the enum type as well—either by invoking the corresponding SQL via
op.execute to DROP TYPE IF EXISTS requeststatus or by using
sa.Enum(...).drop(op.get_bind(), checkfirst=True) / the SA construct that
mirrors how the enum was created—so the enum is removed when the migration is
rolled back.
🧹 Nitpick comments (17)
backend/app/core/guardrail_controller.py (2)

8-10: Consider adding type hints for build_guard.

Adding parameter and return type annotations would improve discoverability and IDE support.

✏️ Suggested type hints
-def build_guard(validator_items):
+def build_guard(validator_items: list) -> Guard:
     validators = [v_item.build() for v_item in validator_items]
     return Guard().use_many(*validators)

13-16: get_validator_config_models assumes a specific type structure — fragile if ValidatorConfigItem changes.

The function hardcodes annotated_args[0] to reach the inner Union type. If ValidatorConfigItem's definition changes (e.g., drops Annotated, reorders args), this silently breaks. A guard or comment clarifying the expected structure would help future maintainers.

backend/app/alembic/versions/001_added_request_log.py (1)

28-33: default="PROCESSING" is a Python-side default, not a DB-level one.

This means the default is only applied for inserts going through SQLAlchemy. If any direct SQL inserts or other tools are used, the status column won't have a default. If all inserts are ORM-managed this is fine; otherwise consider using server_default=sa.text("'PROCESSING'").

backend/app/core/config.py (1)

84-101: get_settings() runs at import time with no error handling for missing env files.

If the resolved .env file doesn't exist, Settings(...) may silently proceed without any environment variables loaded (depending on pydantic-settings behavior), which could lead to confusing validation errors downstream. Consider logging a warning or validating the file exists.

backend/app/crud/request_log.py (1)

23-43: Missing return type annotation on update().

The create and get methods have return type annotations, but update does not. For consistency:

     def update(
         self,
         request_log_id: UUID,
         request_status: RequestStatus,
         request_log_update: RequestLogUpdate,
-    ):
+    ) -> RequestLog:
backend/app/evaluation/pii/run.py (2)

12-24: Module-level side effects make this file non-importable without triggering heavy I/O and model loading.

Lines 12–14 read a CSV and instantiate PIIRemover (which loads NLP models) at import time. Line 24 applies the transformation eagerly. If this module is ever imported by tests or other scripts, all of this runs immediately.

Consider wrapping the script logic in a if __name__ == "__main__": guard or a main() function.


17-21: Use the public validate() method instead of private _validate().

In guardrails-ai, _validate() is the protected method you override when implementing a custom validator; validate() is the public API that handles orchestration and framework integration. Other evaluation scripts in the codebase (e.g., lexical_slur/run.py) use validator.validate() consistently. Call validate(text) instead of _validate(text) to align with the framework's intended usage pattern.

backend/app/utils.py (1)

14-15: now() returns a timezone-naive datetime — document the implicit UTC contract.

replace(tzinfo=None) strips the timezone, producing a naive datetime that is implicitly UTC. This is fine if the database layer expects naive UTC, but consumers have no programmatic way to know the timezone. A brief docstring would prevent future misuse.

backend/app/tests/test_guardrails_api_integration.py (1)

124-127: Long assertion line not reformatted, inconsistent with the rest of the PR.

Line 126 exceeds typical line-length limits and wasn't wrapped like the assertion at lines 196–199. Consider wrapping for consistency with the formatting applied elsewhere.

♻️ Suggested wrap
     assert (
         body["data"][SAFE_TEXT_FIELD]
-        == "this sentence contains [REDACTED_SLUR] cause i want a s done to kill the female foetus."
+        == "this sentence contains [REDACTED_SLUR] cause i want a s"
+        " done to kill the female foetus."
     )
backend/app/tests/validators/test_gender_assumption_bias.py (1)

64-71: Patch scope is correct but could be clearer.

The read_csv patch exits before _validate is called (Line 68), which works because GenderAssumptionBias.__init__ eagerly loads the bias list. However, this differs from the validator fixture pattern (Lines 22-24) where the validator is returned from inside the patch. Consider keeping the _validate call inside the with block for consistency and to avoid confusion for future readers.

♻️ Optional: move _validate inside the patch block for consistency
 def test_category_filtering(mock_gender_bias_df):
     with patch("pandas.read_csv", return_value=mock_gender_bias_df):
         validator = GenderAssumptionBias(categories=[BiasCategories.Generic])
-
-    result = validator._validate("He is a policeman.")
+        result = validator._validate("He is a policeman.")
     # Only profession bias should apply
     assert isinstance(result, FailResult)
     assert result.fix_value == "He is a police officer."
backend/app/schemas/guardrail_config.py (1)

22-31: Consider modern union syntax (X | Y) per Ruff UP007.

Ruff flags this as a style improvement for Python 3.10+. Low priority since the current Union[...] syntax works fine.

backend/app/tests/validators/test_lexical_slurs.py (1)

94-109: patch_slur_load fixture is redundant in severity-filtering tests.

Tests test_severity_low_includes_all, test_severity_medium_includes_m_and_h, and test_severity_high_includes_only_h each accept patch_slur_load but immediately re-monkeypatch load_slur_list with their own fake. Since they already depend on slur_csv directly, the patch_slur_load parameter serves no purpose here.

♻️ Example for test_severity_low_includes_all (apply similarly to the other two)
-def test_severity_low_includes_all(patch_slur_load, monkeypatch, slur_csv):
+def test_severity_low_includes_all(monkeypatch, slur_csv):
backend/app/models/config/validator_config.py (1)

52-61: Unnecessary parentheses around the description string.

Minor nit — the parentheses on Line 60 are redundant.

♻️ Suggested fix
-        description=("Configuration for the validator"),
+        description="Configuration for the validator",
backend/app/core/validators/gender_assumption_bias.py (1)

57-83: Pre-existing: duplicate entries possible when BiasCategories.All is combined with specific categories.

Not introduced by this PR, but worth noting: if categories contains BiasCategories.All alongside other values (e.g., [All, Healthcare]), load_gender_bias_list will add all entries once for All and again for Healthcare, causing duplicate detections. The default [BiasCategories.All] is safe, but callers passing mixed lists would hit this.

backend/app/api/routes/guardrails.py (1)

49-73: list_validators returns a raw dict while other endpoints use APIResponse.

This endpoint returns {"validators": validators} directly (line 73) instead of wrapping in APIResponse.success_response(...), which is inconsistent with run_guardrails and the validator config routes. Consider standardizing for a uniform API contract.

backend/app/tests/conftest.py (2)

33-38: clean_db fixture lacks a yield — runs only as setup, no teardown.

This is fine if the intent is pre-test cleanup only (since the next test's setup will clean again). However, the clear_database fixture in test_validator_configs_integration.py (lines 46–56) performs both pre- and post-test cleanup and is now redundant given this autouse fixture. Consider removing clear_database from the integration tests to avoid confusion.


21-23: override_session generator doesn't commit or rollback — changes accumulate.

Since clean_db truncates tables before each test, this works in practice. However, if a test causes a DB error, the session may be left in a broken state. Consider adding a rollback in the teardown:

 def override_session():
     with Session(test_engine) as session:
-        yield session
+        try:
+            yield session
+        finally:
+            session.rollback()

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @.github/workflows/continuous_integration.yml:
- Around line 57-61: The pre-commit step fails because .pre-commit-config.yaml
lives at the repo root but the job sets working-directory: backend and also
unnecessarily sources a venv; update the "Run pre-commit" step by removing the
working-directory directive (so pre-commit can find the root config) and remove
the redundant "source .venv/bin/activate" line, leaving the invocation (uv run
pre-commit run --all-files) to run from the repository root so pre-commit uses
the root .pre-commit-config.yaml.
- Around line 23-26: The CI workflow's strategy.matrix declares redis-version
but it's never used; either remove the redis-version entry from strategy.matrix
to avoid confusion or add a Redis service definition and use the matrix value
(e.g., reference redis-version) so tests can connect to Redis; update the matrix
key "redis-version" or the job's services block (and any env or host/port
references) accordingly to ensure the matrix value is actually consumed by the
workflow.
- Around line 36-39: The workflow step named "Making env file" currently runs
the commands `cp .env.test.example .env` and `cp .env.test.example .env.test`,
but `.env.test.example` is missing so the job will fail; either add a repository
file named `.env.test.example` with the required variables, or modify this step
to generate the env files at runtime (for example, copy from an existing
`.env.example` or create `.env` and `.env.test` by echoing the required
key=value pairs into files) so the `cp` commands no longer reference a
nonexistent path.
- Line 29: The workflow references a non-existent action version; update the
step that uses astral-sh/setup-uv@v6 to astral-sh/setup-uv@v7 (use the latest
v7.x, e.g., v7.1.6) so the CI action resolves correctly—leave
actions/checkout@v6 and actions/setup-python@v6 unchanged; locate the
astral-sh/setup-uv usage in the CI YAML and bump its tag to v7.
🧹 Nitpick comments (4)
.github/workflows/continuous_integration.yml (2)

51-55: Redundant source .venv/bin/activate when using uv run elsewhere.

Lines 53 and 75 manually activate the venv, but other steps (e.g., Lines 64, 60) use uv run which handles the venv automatically. For consistency, prefer uv run throughout or document why manual activation is needed for Alembic and coverage.

Also applies to: 73-77


67-71: Codecov upload may run even if tests fail.

If the "Run tests" step fails, coverage artifacts may be incomplete or missing, yet the Codecov upload step will still execute (unless the job already stopped). Consider adding if: success() or relying on the default behavior, and note that fail_ci_if_error: true will cause a second failure if the report is missing.

.pre-commit-config.yaml (2)

1-15: Consider adding a linter (e.g., Ruff) alongside Black.

Black handles formatting, but the config lacks a linter for catching code quality issues (unused imports, undefined names, etc.). Ruff is a popular, fast alternative that covers both linting and import sorting.


5-5: Hook versions are outdated.

pre-commit-hooks is pinned to v4.4.0 (latest: v6.0.0, released August 2025) and black to 23.3.0 (latest: 26.1.0, released January 2026). Update to recent versions via pre-commit autoupdate.

@nishika26 nishika26 self-assigned this Feb 10, 2026
@nishika26 nishika26 added the enhancement New feature or request label Feb 10, 2026
@nishika26 nishika26 linked an issue Feb 10, 2026 that may be closed by this pull request
@AkhileshNegi AkhileshNegi merged commit 3213be8 into main Feb 10, 2026
1 check passed
@AkhileshNegi AkhileshNegi deleted the enhancement/CI_in_repo branch February 10, 2026 14:29
rkritika1508 pushed a commit that referenced this pull request Feb 10, 2026
rkritika1508 pushed a commit that referenced this pull request Feb 11, 2026
@coderabbitai coderabbitai bot mentioned this pull request Mar 16, 2026
2 tasks
@coderabbitai coderabbitai bot mentioned this pull request Apr 10, 2026
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Repo setup : Adding CI and pre-commit

2 participants