fix: patch 4 critical security vulnerabilities #497
gn00295120 wants to merge 2 commits into openai:main
Conversation
- Fix SQL injection in postgres_datastore.py delete_by_filters by replacing f-string interpolation with parameterized queries
- Fix Zip Slip path traversal in process_zip.py by validating member paths before extraction
- Fix race condition with hardcoded temp file path in file.py by using tempfile.NamedTemporaryFile
- Fix timing side-channel in bearer token comparison in main.py by using secrets.compare_digest
Force-pushed from abb2299 to 7ad2c87
Pull request overview
This PR addresses multiple security vulnerabilities across the Postgres datastore, zip processing script, upload temp-file handling, and bearer token validation.
Changes:
- Parameterizes `delete_by_filters` SQL predicates to prevent SQL injection.
- Adds a safe zip extraction helper to block Zip Slip path traversal.
- Replaces a fixed `/tmp` upload path with `NamedTemporaryFile` for safer temp storage.
- Uses `secrets.compare_digest` to reduce timing side-channel leakage in token checks.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| datastore/providers/postgres_datastore.py | Switches delete_by_filters to parameterized predicates via %s + params list. |
| scripts/process_zip/process_zip.py | Introduces _safe_extract() to validate zip member paths before extraction. |
| services/file.py | Uses tempfile.NamedTemporaryFile instead of a hardcoded temp filepath for uploads. |
| server/main.py | Replaces direct token inequality with constant-time comparison (compare_digest). |
```diff
         filters = "WHERE"
+        params = []
         if filter.document_id:
-            filters += f" document_id = '{filter.document_id}' AND"
+            filters += " document_id = %s AND"
+            params.append(filter.document_id)
         if filter.source:
-            filters += f" source = '{filter.source}' AND"
+            filters += " source = %s AND"
+            params.append(filter.source)
         if filter.source_id:
-            filters += f" source_id = '{filter.source_id}' AND"
+            filters += " source_id = %s AND"
+            params.append(filter.source_id)
         if filter.author:
-            filters += f" author = '{filter.author}' AND"
+            filters += " author = %s AND"
+            params.append(filter.author)
         if filter.start_date:
-            filters += f" created_at >= '{filter.start_date}' AND"
+            filters += " created_at >= %s AND"
+            params.append(filter.start_date)
         if filter.end_date:
-            filters += f" created_at <= '{filter.end_date}' AND"
+            filters += " created_at <= %s AND"
+            params.append(filter.end_date)
         filters = filters[:-4]

         with self.client.cursor() as cur:
-            cur.execute(f"DELETE FROM {table} {filters}")
+            cur.execute(f"DELETE FROM {table} {filters}", params)
```
filters = filters[:-4] assumes at least one predicate was appended (ending in " AND"). If the caller passes a DocumentMetadataFilter with all fields unset, filters becomes "W" and the resulting SQL (DELETE FROM <table> W) will fail. Consider explicitly handling the empty-filter case (e.g., raise a clear exception / return early) rather than issuing an invalid query (and definitely avoid turning it into an unbounded DELETE).
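One way to address this is to build the clause from a list of conditions instead of slicing a string, refusing to run when no filter is set. The helper below is an illustrative sketch under that idea; the function name and its dict argument are assumptions, not the PR's actual code:

```python
def build_delete_filters(filter_fields: dict):
    """Build a parameterized WHERE clause from optional filter fields.

    Returns (where_sql, params). Raises ValueError when no field is set,
    so the caller never issues an invalid or unbounded DELETE.
    """
    predicates = {
        "document_id": "document_id = %s",
        "source": "source = %s",
        "source_id": "source_id = %s",
        "author": "author = %s",
        "start_date": "created_at >= %s",
        "end_date": "created_at <= %s",
    }
    conditions, params = [], []
    for field, predicate in predicates.items():
        value = filter_fields.get(field)
        if value is not None:
            conditions.append(predicate)
            params.append(value)
    if not conditions:
        raise ValueError("refusing to DELETE without any filter conditions")
    return "WHERE " + " AND ".join(conditions), params
```

Joining with `" AND "` removes the need for the `[:-4]` slice entirely, and the empty-filter case fails loudly instead of producing `DELETE FROM <table> W`.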
```diff
     async def delete_by_filters(self, table: str, filter: DocumentMetadataFilter):
         """
         Deletes rows in the table that match the filter.
         """
         filters = "WHERE"
+        params = []
         if filter.document_id:
-            filters += f" document_id = '{filter.document_id}' AND"
+            filters += " document_id = %s AND"
+            params.append(filter.document_id)
         if filter.source:
-            filters += f" source = '{filter.source}' AND"
+            filters += " source = %s AND"
+            params.append(filter.source)
         if filter.source_id:
-            filters += f" source_id = '{filter.source_id}' AND"
+            filters += " source_id = %s AND"
+            params.append(filter.source_id)
         if filter.author:
-            filters += f" author = '{filter.author}' AND"
+            filters += " author = %s AND"
+            params.append(filter.author)
         if filter.start_date:
-            filters += f" created_at >= '{filter.start_date}' AND"
+            filters += " created_at >= %s AND"
+            params.append(filter.start_date)
         if filter.end_date:
-            filters += f" created_at <= '{filter.end_date}' AND"
+            filters += " created_at <= %s AND"
+            params.append(filter.end_date)
         filters = filters[:-4]

         with self.client.cursor() as cur:
-            cur.execute(f"DELETE FROM {table} {filters}")
+            cur.execute(f"DELETE FROM {table} {filters}", params)
         self.client.commit()
```
This change materially alters delete behavior and is security-sensitive, but there’s no test exercising delete_by_filters (especially for combinations of filters and the empty-filter case). Adding a targeted unit/integration test in the existing Postgres datastore test suite would help prevent regressions and confirm correct parameter binding.
```diff
 def validate_token(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)):
-    if credentials.scheme != "Bearer" or credentials.credentials != BEARER_TOKEN:
+    if credentials.scheme != "Bearer" or not secrets.compare_digest(credentials.credentials, BEARER_TOKEN):
         raise HTTPException(status_code=401, detail="Invalid or missing token")
```
secrets.compare_digest will raise a TypeError if BEARER_TOKEN is None. Today that’s prevented via assert BEARER_TOKEN is not None, but asserts can be disabled (e.g., running Python with -O), which would turn a missing env var into a 500 instead of a deterministic startup failure or a 401. Prefer an explicit runtime check (raise a clear exception during startup / module import, or guard in validate_token) rather than relying on assert for required configuration.
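An explicit startup check along these lines would survive `python -O`; the helper name and signature below are illustrative, not the PR's code:

```python
import os


def load_bearer_token(env=os.environ) -> str:
    """Read BEARER_TOKEN at startup, failing fast with a clear error.

    Unlike `assert BEARER_TOKEN is not None`, this check is not stripped
    when Python runs with -O, so a missing env var cannot surface later
    as a TypeError inside compare_digest.
    """
    token = env.get("BEARER_TOKEN")
    if not token:
        raise ValueError("BEARER_TOKEN environment variable is not set")
    return token
```

Calling this once at module import turns a missing token into a deterministic startup failure rather than a 500 on the first request.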
- Refactor delete_by_filters to use a conditions list instead of string slicing, preventing the empty-filter edge case
- Replace assert with explicit ValueError for the BEARER_TOKEN check, ensuring it works with python -O
Summary
This PR fixes 4 security vulnerabilities identified in the codebase:

- `datastore/providers/postgres_datastore.py` (`delete_by_filters`)
- `scripts/process_zip/process_zip.py`
- `services/file.py`
- `server/main.py`

Changes
1. SQL injection in `delete_by_filters` (`datastore/providers/postgres_datastore.py`)

Before (vulnerable): Filter values interpolated directly into SQL via f-strings.

After (fixed): All filter values use parameterized `%s` placeholders passed via a `params` list. The table name still comes from config (not user input). All six user-controllable filter fields (`document_id`, `source`, `source_id`, `author`, `start_date`, `end_date`) are now parameterized.
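To illustrate why the change matters, here is a standalone sketch (not the datastore code) contrasting the two patterns with a classic injection payload:

```python
def unsafe_predicate(document_id: str) -> str:
    # Vulnerable pattern from before the fix: the value becomes part of
    # the SQL text, so quotes in the input rewrite the query.
    return f"WHERE document_id = '{document_id}'"


def safe_predicate(document_id: str):
    # Fixed pattern: the SQL text is a constant; the driver binds the
    # value separately, so the payload stays inert data.
    return "WHERE document_id = %s", [document_id]


payload = "x' OR '1'='1"
```

With the f-string version, the payload turns the predicate into `document_id = 'x' OR '1'='1'`, matching every row; with placeholders it can only ever match a literal `document_id`.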
2. Zip Slip path traversal (`scripts/process_zip/process_zip.py`)

Before (vulnerable): `zip_file.extractall("dump")` with no member path validation.

After (fixed): Added `_safe_extract()` that resolves each member's real path and rejects any that escape the target directory:
3. Hardcoded temp file path (`services/file.py`)

Before (vulnerable): All uploads wrote to the same `/tmp/temp_file`, creating a race condition (TOCTOU) and a predictable path.

After (fixed): Uses `tempfile.NamedTemporaryFile` for unique, unpredictable temp files with the correct extension:
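A minimal sketch of the pattern; the function name and signature are illustrative, not the service's actual code:

```python
import os
import tempfile


def save_upload(data: bytes, original_filename: str) -> str:
    """Write an upload to a unique, unpredictable temp file.

    Keeping the original extension via `suffix` preserves downstream
    type detection; `delete=False` leaves the file for later processing,
    so the caller is responsible for removing it.
    """
    suffix = os.path.splitext(original_filename)[1]
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name
```

Because `NamedTemporaryFile` creates the file atomically with a random name, concurrent uploads can no longer collide on a shared `/tmp/temp_file` path.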
4. Timing side-channel in token comparison (`server/main.py`)

Before (vulnerable): Direct `!=` comparison leaks token length/content via timing.

After (fixed): Constant-time comparison via `secrets.compare_digest`:
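A standalone sketch of the pattern; the PR applies `compare_digest` inline in `validate_token`, so the helper below is only illustrative:

```python
import secrets


def token_matches(presented: str, expected: str) -> bool:
    # compare_digest runs in time independent of where the inputs differ.
    # Encoding to bytes avoids the TypeError it raises for non-ASCII str inputs.
    return secrets.compare_digest(presented.encode(), expected.encode())
```

Unlike `!=`, which returns as soon as the first differing byte is found, this comparison gives an attacker no timing signal about how much of a guessed token is correct.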
Test plan

- Verified `delete_by_filters` works with various filter combinations against a Postgres instance.
- Verified zip extraction rejects `../../etc/passwd`-style member paths.