
fix: patch 4 critical security vulnerabilities#497

Open
gn00295120 wants to merge 2 commits into openai:main from gn00295120:fix/critical-security-vulnerabilities

Conversation

@gn00295120

Summary

This PR fixes 4 security vulnerabilities identified in the codebase:

  • PLUGIN-01 — SQL Injection in postgres_datastore.py delete_by_filters
  • PLUGIN-06 — Zip Slip path traversal in process_zip.py
  • PLUGIN-03 — Hardcoded temp file path race condition in services/file.py
  • PLUGIN-07 — Timing side-channel in bearer token comparison in server/main.py

Changes

1. SQL Injection in delete_by_filters (datastore/providers/postgres_datastore.py)

Before (vulnerable): Filter values interpolated directly into SQL via f-strings.

filters += f" document_id = '{filter.document_id}' AND"
# ...
cur.execute(f"DELETE FROM {table} {filters}")

After (fixed): All filter values use parameterized %s placeholders passed via a params list.

filters += " document_id = %s AND"
params.append(filter.document_id)
# ...
cur.execute(f"DELETE FROM {table} {filters}", params)

The table name remains from config (not user input). All six user-controllable filter fields (document_id, source, source_id, author, start_date, end_date) are now parameterized.
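To see why parameter binding closes the hole, here is a standalone sketch. It uses sqlite3 (stdlib, "?" placeholders) rather than psycopg2 ("%s" placeholders) so it runs without a Postgres instance; the binding principle is identical.

```python
import sqlite3

# The driver passes the bound value out-of-band, so it is never parsed
# as SQL, no matter what characters it contains.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (document_id TEXT)")
conn.executemany("INSERT INTO documents VALUES (?)", [("doc-1",), ("doc-2",)])

# A classic payload: interpolated via f-string this would have produced
#   DELETE FROM documents WHERE document_id = '' OR '1'='1'
# and wiped the table.
malicious = "' OR '1'='1"
conn.execute("DELETE FROM documents WHERE document_id = ?", (malicious,))

# Bound as a parameter, the payload matches no document_id; both rows survive.
remaining = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
print(remaining)  # → 2
```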

2. Zip Slip path traversal (scripts/process_zip/process_zip.py)

Before (vulnerable): zip_file.extractall("dump") with no member path validation.

After (fixed): Added _safe_extract() that resolves each member's real path and rejects any that escape the target directory:

def _safe_extract(zip_file, target_dir):
    target_dir = os.path.realpath(target_dir)
    for member in zip_file.namelist():
        member_path = os.path.realpath(os.path.join(target_dir, member))
        if not member_path.startswith(target_dir + os.sep) and member_path != target_dir:
            raise ValueError(f"Attempted path traversal in zip: {member}")
    zip_file.extractall(target_dir)
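A quick self-contained repro shows the helper rejecting a traversal attempt. The helper is restated verbatim so this runs standalone; the archive is built in memory and "dump_test" is a throwaway directory name for the demo.

```python
import io
import os
import zipfile

def _safe_extract(zip_file, target_dir):
    # Restated from the PR so this repro runs standalone.
    target_dir = os.path.realpath(target_dir)
    for member in zip_file.namelist():
        member_path = os.path.realpath(os.path.join(target_dir, member))
        if not member_path.startswith(target_dir + os.sep) and member_path != target_dir:
            raise ValueError(f"Attempted path traversal in zip: {member}")
    zip_file.extractall(target_dir)

# Craft a malicious archive whose member climbs out of the target directory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("../../evil.txt", "pwned")

try:
    _safe_extract(zipfile.ZipFile(buf), "dump_test")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # → True
```

Because every member is validated before any extraction happens, nothing is written to disk when the check fails.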

3. Hardcoded temp file path (services/file.py)

Before (vulnerable): All uploads wrote to the same /tmp/temp_file, creating a race condition (TOCTOU) and predictable path.

temp_file_path = "/tmp/temp_file"
with open(temp_file_path, "wb") as f:
    f.write(file_stream)

After (fixed): Uses tempfile.NamedTemporaryFile for unique, unpredictable temp files with correct extension:

with tempfile.NamedTemporaryFile(
    delete=False, suffix=os.path.splitext(file.filename or "")[1]
) as tmp:
    tmp.write(file_stream)
    temp_file_path = tmp.name
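A small sketch of the property the fix buys: each call yields a distinct, unpredictable path, unlike the shared fixed /tmp/temp_file (the ".pdf" suffix here is illustrative).

```python
import os
import tempfile

# Simulate two concurrent uploads landing in their own temp files.
paths = []
for payload in (b"upload-a", b"upload-b"):
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(payload)
        paths.append(tmp.name)

print(paths[0] != paths[1])  # → True

# delete=False leaves the files on disk, so remove them explicitly here.
for p in paths:
    os.remove(p)
```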

4. Timing side-channel in token comparison (server/main.py)

Before (vulnerable): Direct != comparison leaks token length/content via timing.

credentials.credentials != BEARER_TOKEN

After (fixed): Constant-time comparison via secrets.compare_digest:

not secrets.compare_digest(credentials.credentials, BEARER_TOKEN)
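In context, a minimal sketch of the check (the BEARER_TOKEN value here is a placeholder; the real server loads it from the environment):

```python
import secrets

BEARER_TOKEN = "s3cr3t-token"  # demo value only

def token_ok(presented: str) -> bool:
    # compare_digest takes time independent of where the inputs first
    # differ, closing the byte-by-byte timing oracle that "!=" leaks.
    return secrets.compare_digest(presented, BEARER_TOKEN)

print(token_ok("s3cr3t-token"))  # → True
print(token_ok("wrong"))         # → False
```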

Test plan

  • Verify delete_by_filters works with various filter combinations against a Postgres instance
  • Verify zip extraction rejects archives containing ../../etc/passwd style paths
  • Verify concurrent file uploads don't overwrite each other's temp files
  • Verify valid bearer tokens still authenticate successfully
  • Verify invalid bearer tokens are still rejected

Copilot AI review requested due to automatic review settings March 22, 2026 07:15
- Fix SQL injection in postgres_datastore.py delete_by_filters
  by replacing f-string interpolation with parameterized queries
- Fix Zip Slip path traversal in process_zip.py by validating
  member paths before extraction
- Fix race condition with hardcoded temp file path in file.py
  by using tempfile.NamedTemporaryFile
- Fix timing side-channel in bearer token comparison in main.py
  by using secrets.compare_digest
gn00295120 force-pushed the fix/critical-security-vulnerabilities branch from abb2299 to 7ad2c87 on March 22, 2026 07:15

Copilot AI left a comment


Pull request overview

This PR addresses multiple security vulnerabilities across the Postgres datastore, zip processing script, upload temp-file handling, and bearer token validation.

Changes:

  • Parameterizes delete_by_filters SQL predicates to prevent SQL injection.
  • Adds a safe zip extraction helper to block Zip Slip path traversal.
  • Replaces a fixed /tmp upload path with NamedTemporaryFile for safer temp storage.
  • Uses secrets.compare_digest to reduce timing side-channel leakage in token checks.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Reviewed files:

  • datastore/providers/postgres_datastore.py: Switches delete_by_filters to parameterized predicates via %s placeholders and a params list.
  • scripts/process_zip/process_zip.py: Introduces _safe_extract() to validate zip member paths before extraction.
  • services/file.py: Uses tempfile.NamedTemporaryFile instead of a hardcoded temp filepath for uploads.
  • server/main.py: Replaces direct token inequality with constant-time comparison (compare_digest).


Comment on lines +115 to +138
  filters = "WHERE"
+ params = []
  if filter.document_id:
-     filters += f" document_id = '{filter.document_id}' AND"
+     filters += " document_id = %s AND"
+     params.append(filter.document_id)
  if filter.source:
-     filters += f" source = '{filter.source}' AND"
+     filters += " source = %s AND"
+     params.append(filter.source)
  if filter.source_id:
-     filters += f" source_id = '{filter.source_id}' AND"
+     filters += " source_id = %s AND"
+     params.append(filter.source_id)
  if filter.author:
-     filters += f" author = '{filter.author}' AND"
+     filters += " author = %s AND"
+     params.append(filter.author)
  if filter.start_date:
-     filters += f" created_at >= '{filter.start_date}' AND"
+     filters += " created_at >= %s AND"
+     params.append(filter.start_date)
  if filter.end_date:
-     filters += f" created_at <= '{filter.end_date}' AND"
+     filters += " created_at <= %s AND"
+     params.append(filter.end_date)
  filters = filters[:-4]

  with self.client.cursor() as cur:
-     cur.execute(f"DELETE FROM {table} {filters}")
+     cur.execute(f"DELETE FROM {table} {filters}", params)

Copilot AI Mar 22, 2026


filters = filters[:-4] assumes at least one predicate was appended (ending in " AND"). If the caller passes a DocumentMetadataFilter with all fields unset, filters becomes "W" and the resulting SQL (DELETE FROM <table> W) will fail. Consider explicitly handling the empty-filter case (e.g., raise a clear exception / return early) rather than issuing an invalid query (and definitely avoid turning it into an unbounded DELETE).
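One way to avoid the string-slicing edge case is the conditions-list approach the follow-up commit describes. A hypothetical sketch (build_delete_filters and the SimpleNamespace stand-in are illustrative, not the PR's actual code):

```python
from types import SimpleNamespace

def build_delete_filters(filter):
    # Collect predicates and parameters in parallel lists, then join
    # once; no trailing-" AND" slicing, and an empty filter is rejected
    # instead of producing invalid (or unbounded) SQL.
    conditions, params = [], []
    if filter.document_id:
        conditions.append("document_id = %s")
        params.append(filter.document_id)
    if filter.source:
        conditions.append("source = %s")
        params.append(filter.source)
    # ... source_id, author, start_date, end_date follow the same pattern
    if not conditions:
        raise ValueError("delete_by_filters requires at least one filter")
    return "WHERE " + " AND ".join(conditions), params

f = SimpleNamespace(document_id="doc-1", source=None)
where, params = build_delete_filters(f)
print(where)   # → WHERE document_id = %s
print(params)  # → ['doc-1']
```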

Comment on lines 110 to 139
  async def delete_by_filters(self, table: str, filter: DocumentMetadataFilter):
      """
      Deletes rows in the table that match the filter.
      """

      filters = "WHERE"
+     params = []
      if filter.document_id:
-         filters += f" document_id = '{filter.document_id}' AND"
+         filters += " document_id = %s AND"
+         params.append(filter.document_id)
      if filter.source:
-         filters += f" source = '{filter.source}' AND"
+         filters += " source = %s AND"
+         params.append(filter.source)
      if filter.source_id:
-         filters += f" source_id = '{filter.source_id}' AND"
+         filters += " source_id = %s AND"
+         params.append(filter.source_id)
      if filter.author:
-         filters += f" author = '{filter.author}' AND"
+         filters += " author = %s AND"
+         params.append(filter.author)
      if filter.start_date:
-         filters += f" created_at >= '{filter.start_date}' AND"
+         filters += " created_at >= %s AND"
+         params.append(filter.start_date)
      if filter.end_date:
-         filters += f" created_at <= '{filter.end_date}' AND"
+         filters += " created_at <= %s AND"
+         params.append(filter.end_date)
      filters = filters[:-4]

      with self.client.cursor() as cur:
-         cur.execute(f"DELETE FROM {table} {filters}")
+         cur.execute(f"DELETE FROM {table} {filters}", params)
          self.client.commit()

Copilot AI Mar 22, 2026


This change materially alters delete behavior and is security-sensitive, but there’s no test exercising delete_by_filters (especially for combinations of filters and the empty-filter case). Adding a targeted unit/integration test in the existing Postgres datastore test suite would help prevent regressions and confirm correct parameter binding.

Comment thread: server/main.py
Comment on lines 28 to 30
  def validate_token(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)):
-     if credentials.scheme != "Bearer" or credentials.credentials != BEARER_TOKEN:
+     if credentials.scheme != "Bearer" or not secrets.compare_digest(credentials.credentials, BEARER_TOKEN):
          raise HTTPException(status_code=401, detail="Invalid or missing token")

Copilot AI Mar 22, 2026


secrets.compare_digest will raise a TypeError if BEARER_TOKEN is None. Today that’s prevented via assert BEARER_TOKEN is not None, but asserts can be disabled (e.g., running Python with -O), which would turn a missing env var into a 500 instead of a deterministic startup failure or a 401. Prefer an explicit runtime check (raise a clear exception during startup / module import, or guard in validate_token) rather than relying on assert for required configuration.
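A minimal sketch of the explicit check the review suggests (load_bearer_token is a hypothetical helper name; the real module may structure this differently):

```python
import os

def load_bearer_token() -> str:
    # Unlike "assert", this check still fires under "python -O", so a
    # missing env var becomes a clear startup failure rather than a
    # TypeError (and a 500) at request time.
    token = os.environ.get("BEARER_TOKEN")
    if not token:
        raise ValueError("BEARER_TOKEN environment variable must be set")
    return token

os.environ["BEARER_TOKEN"] = "example-token"  # demo value only
print(load_bearer_token())  # → example-token
```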

- Refactor delete_by_filters to use conditions list instead of string
  slicing, preventing empty-filter edge case
- Replace assert with explicit ValueError for BEARER_TOKEN check,
  ensuring it works with python -O