Codex/fix sandbox gc expiry#1016
Conversation
📝 WalkthroughWalkthroughGC logic changed: sandboxes are now evaluated for inactivity using persisted lifecycle timestamps ( Changes
Sequence DiagramsequenceDiagram
autonumber
participant GC as Garbage Collector\n(_collect_expired_sandboxes)
participant Repo as Repository
participant Redis as Redis
participant Sandbox as Sandbox Object
rect rgba(100, 150, 200, 0.5)
Note over GC,Repo: Old flow (TTL-based ZSET query)
GC->>Repo: get_expired_sandbox_ids(redis_ttl)
Repo->>Redis: ZRANGEBYSCORE (by activity age)
Redis-->>Repo: expired_ids
Repo-->>GC: expired_ids
end
rect rgba(150, 200, 100, 0.5)
Note over GC,Repo: New flow (per-sandbox expires_at / inactivity)
GC->>Repo: get_active_sandbox_ids()
Repo->>Redis: SMEMBERS active_set
Redis-->>Repo: active_ids
Repo-->>GC: active_ids
loop For each active id
GC->>Repo: load_sandbox(id)
Repo->>Redis: HGET session_hash
Redis-->>Repo: sandbox_data
Repo-->>Sandbox: construct sandbox (includes last_activity_at, expires_at)
GC->>Sandbox: compute idle_seconds / is_expired(sandbox_inactive_timeout)
Sandbox-->>GC: expired? (true/false)
alt orphan (load_sandbox -> None)
GC->>Repo: remove_from_active_set(id)
end
alt expired
GC->>GC: terminate_sandbox(id)
end
end
end
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d85b483e3b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if sandbox.is_expired(): | ||
| expired_task_ids.append(task_id_str) |
There was a problem hiding this comment.
Handle sandboxes without expires_at during GC
The new GC path only expires entries when sandbox.is_expired() is true, but is_expired() returns false when expires_at is missing. Any active sandbox records that were written without expires_at (for example, pre-deploy data or mixed-version writers) will now be skipped forever, so they are never passed to terminate_sandbox and their containers can keep running until manual cleanup.
Useful? React with 👍 / 👎.
| for task_id_str in active_task_ids: | ||
| sandbox = self._repository.load_sandbox(task_id_str) | ||
| if sandbox is None: |
There was a problem hiding this comment.
Use async repository reads in the GC scan loop
_collect_expired_sandboxes is an async scheduled job, but it now iterates every active sandbox and calls synchronous load_sandbox for each ID. With many active sandboxes this introduces N blocking Redis round-trips on the event loop, which can delay other async jobs (like heartbeat checks) and API responsiveness during GC runs.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
executor_manager/tests/services/test_sandbox_manager.py (1)
990-1017:⚠️ Potential issue | 🟠 MajorAdd
@pytest.mark.asynciodecorator to this async test.This async test is missing
@pytest.mark.asyncio, unlike the neighboring async tests in the same class (test_handle_executor_dead_terminates_sandboxat line 104 andtest_collect_expired_sandboxes_cleans_orphanedat line 169). Without the decorator, it may not run correctly under pytest-asyncio.Proposed fix
+ `@pytest.mark.asyncio` async def test_collect_expired_sandboxes_terminates_old( self, sandbox_manager_with_mock_redis,Note: The initial verification also identified 20 other async tests across the codebase missing this decorator (in
backend/tests/,executor/tests/, and other test files). Consider running a systematic fix across all test files.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor_manager/tests/services/test_sandbox_manager.py` around lines 990 - 1017, The async test function test_collect_expired_sandboxes_terminates_old is missing the pytest-asyncio decorator; add `@pytest.mark.asyncio` above the test definition so pytest runs the coroutine correctly when invoking manager._collect_expired_sandboxes; ensure the import for pytest is present in the test module if not already, and apply the same `@pytest.mark.asyncio` pattern to any other async tests missing it (e.g., other tests in this file and the noted tests across the codebase).
🧹 Nitpick comments (2)
executor_manager/services/sandbox/manager.py (1)
977-1027: Consider splitting GC code out of this oversized module.This file is now over the 1000-line project limit; moving GC/background-task logic into a dedicated module would keep
SandboxManagerfocused. As per coding guidelines, "File size limit: if a file exceeds 1000 lines, split it into multiple sub-modules".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor_manager/services/sandbox/manager.py` around lines 977 - 1027, The _collect_expired_sandboxes method and its GC loop should be moved out of the oversized SandboxManager file into a new dedicated module (e.g., sandbox_gc.py) to keep SandboxManager focused; refactor by creating a SandboxGC class or standalone async function that owns the logic currently in _collect_expired_sandboxes (including acquiring get_distributed_lock("sandbox_gc"), calling self._repository.get_active_sandbox_ids(), load_sandbox, remove_from_active_set, checking sandbox.is_expired(), and invoking _terminate_expired_sandbox for each expired id), inject or pass the same repository and logger dependencies and expose a simple start/stop or run method, update SandboxManager to delegate to the new module (call the new SandboxGC.run or similar) and remove the GC code from manager.py so the original file drops below the 1000-line limit.executor_manager/tests/services/test_sandbox_manager.py (1)
15-1096: Consider splitting this oversized test module.The test file exceeds the project’s 1000-line limit; separating GC/repository round-trip tests into dedicated modules would keep future sandbox tests easier to maintain. As per coding guidelines, "File size limit: if a file exceeds 1000 lines, split it into multiple sub-modules".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor_manager/tests/services/test_sandbox_manager.py` around lines 15 - 1096, The file is over 1000 lines; split the large TestSandboxManager module by moving the Redis/repository and GC-related tests into one or more new test modules (e.g., tests for repository persistence and garbage-collection logic). Specifically, extract tests referencing repository and GC behavior such as test_save_sandbox_success, test_save_sandbox_round_trip_preserves_timing_fields, test_save_sandbox_missing_task_id, test_load_sandbox_*, test_save_execution_*, test_load_execution_*, test_dict_to_execution_conversion, and _collect_expired_sandboxes tests into a new module; keep SandboxManager API/behavior tests (create_sandbox, get_sandbox, create_execution, scheduler/heartbeat integration) in the original file. Ensure shared fixtures/reset logic (reset_singletons fixture and sandbox_manager_with_mock_redis/mock_redis_client) are available to both modules (move them into conftest.py if necessary) and update imports/pytest discovery so tests run unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@executor_manager/services/sandbox/manager.py`:
- Around line 998-1009: The GC loop must handle legacy sandboxes with
sandbox.expires_at == None so they aren't skipped; in the loop that loads
sandboxes (use _repository.load_sandbox and the sandbox variable) change the
expiry check to first detect expires_at is None and then apply a fallback TTL
(e.g. compute fallback_expired = sandbox.created_at + DEFAULT_SANDBOX_TTL < now)
or treat None as expired immediately depending on intended policy, otherwise
call sandbox.is_expired(); add or reuse a DEFAULT_SANDBOX_TTL constant (or
Sandbox.default_ttl) and update the expired_task_ids append logic to use this
fallback branch.
---
Outside diff comments:
In `@executor_manager/tests/services/test_sandbox_manager.py`:
- Around line 990-1017: The async test function
test_collect_expired_sandboxes_terminates_old is missing the pytest-asyncio
decorator; add `@pytest.mark.asyncio` above the test definition so pytest runs the
coroutine correctly when invoking manager._collect_expired_sandboxes; ensure the
import for pytest is present in the test module if not already, and apply the
same `@pytest.mark.asyncio` pattern to any other async tests missing it (e.g.,
other tests in this file and the noted tests across the codebase).
---
Nitpick comments:
In `@executor_manager/services/sandbox/manager.py`:
- Around line 977-1027: The _collect_expired_sandboxes method and its GC loop
should be moved out of the oversized SandboxManager file into a new dedicated
module (e.g., sandbox_gc.py) to keep SandboxManager focused; refactor by
creating a SandboxGC class or standalone async function that owns the logic
currently in _collect_expired_sandboxes (including acquiring
get_distributed_lock("sandbox_gc"), calling
self._repository.get_active_sandbox_ids(), load_sandbox, remove_from_active_set,
checking sandbox.is_expired(), and invoking _terminate_expired_sandbox for each
expired id), inject or pass the same repository and logger dependencies and
expose a simple start/stop or run method, update SandboxManager to delegate to
the new module (call the new SandboxGC.run or similar) and remove the GC code
from manager.py so the original file drops below the 1000-line limit.
In `@executor_manager/tests/services/test_sandbox_manager.py`:
- Around line 15-1096: The file is over 1000 lines; split the large
TestSandboxManager module by moving the Redis/repository and GC-related tests
into one or more new test modules (e.g., tests for repository persistence and
garbage-collection logic). Specifically, extract tests referencing repository
and GC behavior such as test_save_sandbox_success,
test_save_sandbox_round_trip_preserves_timing_fields,
test_save_sandbox_missing_task_id, test_load_sandbox_*, test_save_execution_*,
test_load_execution_*, test_dict_to_execution_conversion, and
_collect_expired_sandboxes tests into a new module; keep SandboxManager
API/behavior tests (create_sandbox, get_sandbox, create_execution,
scheduler/heartbeat integration) in the original file. Ensure shared
fixtures/reset logic (reset_singletons fixture and
sandbox_manager_with_mock_redis/mock_redis_client) are available to both modules
(move them into conftest.py if necessary) and update imports/pytest discovery so
tests run unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aad11176-c945-4bd7-b895-f7358d807350
📒 Files selected for processing (4)
executor_manager/common/config.pyexecutor_manager/services/sandbox/manager.pyexecutor_manager/services/sandbox/repository.pyexecutor_manager/tests/services/test_sandbox_manager.py
| expired_task_ids = [] | ||
| for task_id_str in active_task_ids: | ||
| sandbox = self._repository.load_sandbox(task_id_str) | ||
| if sandbox is None: | ||
| self._repository.remove_from_active_set(task_id_str) | ||
| logger.debug( | ||
| f"[SandboxManager] Cleaned orphaned ZSet entry: {task_id_str}" | ||
| ) | ||
| continue | ||
|
|
||
| if sandbox.is_expired(): | ||
| expired_task_ids.append(task_id_str) |
There was a problem hiding this comment.
Add a legacy fallback for sandboxes without expires_at.
Records saved before this change can load with expires_at=None; sandbox.is_expired() then returns False, so GC will skip those running containers until the Redis hash expires and the metadata needed to terminate them is gone.
🛠️ Suggested fallback for legacy records
expired_task_ids = []
+ legacy_cutoff = time.time() - self._config.timeout.redis_ttl
for task_id_str in active_task_ids:
sandbox = self._repository.load_sandbox(task_id_str)
if sandbox is None:
self._repository.remove_from_active_set(task_id_str)
logger.debug(
f"[SandboxManager] Cleaned orphaned ZSet entry: {task_id_str}"
)
continue
- if sandbox.is_expired():
+ if sandbox.expires_at is None:
+ if sandbox.last_activity_at <= legacy_cutoff:
+ expired_task_ids.append(task_id_str)
+ continue
+
+ if sandbox.is_expired():
expired_task_ids.append(task_id_str)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@executor_manager/services/sandbox/manager.py` around lines 998 - 1009, The GC
loop must handle legacy sandboxes with sandbox.expires_at == None so they aren't
skipped; in the loop that loads sandboxes (use _repository.load_sandbox and the
sandbox variable) change the expiry check to first detect expires_at is None and
then apply a fallback TTL (e.g. compute fallback_expired = sandbox.created_at +
DEFAULT_SANDBOX_TTL < now) or treat None as expired immediately depending on
intended policy, otherwise call sandbox.is_expired(); add or reuse a
DEFAULT_SANDBOX_TTL constant (or Sandbox.default_ttl) and update the
expired_task_ids append logic to use this fallback branch.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
executor_manager/tests/services/test_sandbox_manager.py (1)
990-1018:⚠️ Potential issue | 🔴 CriticalMissing
@pytest.mark.asyncio— this test will be skipped/pass trivially.
test_collect_expired_sandboxes_terminates_oldis defined asasync defbut no@pytest.mark.asynciodecorator is applied (unlike the siblingtest_collect_expired_sandboxes_cleans_orphanedat line 1020 and the newtest_collect_expired_sandboxes_skips_unexpiredat line 1035). Depending on pytest-asyncio mode, this test will either emit a warning and be skipped, or return an un-awaited coroutine somock_terminate.assert_called_once_with("12345")is never reached — meaning the core assertion for the fixed idle-based GC path is effectively not exercised.🐛 Proposed fix
+ `@pytest.mark.asyncio` async def test_collect_expired_sandboxes_terminates_old( self, sandbox_manager_with_mock_redis, mock_redis_client, mocker, sample_sandbox, ): """Test terminates sandboxes idle for more than two hours."""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor_manager/tests/services/test_sandbox_manager.py` around lines 990 - 1018, The async test test_collect_expired_sandboxes_terminates_old is missing the pytest async marker, so pytest may skip or not run the coroutine; add the `@pytest.mark.asyncio` decorator above test_collect_expired_sandboxes_terminates_old (and ensure pytest is imported in the test module if not already) so the coroutine runs and the core assertion that manager._collect_expired_sandboxes triggers manager.terminate_sandbox("12345") is actually executed.
🧹 Nitpick comments (1)
executor_manager/tests/services/test_sandbox_manager.py (1)
1000-1001: Couple the idle threshold toTimeoutConfig.sandbox_inactive_timeoutinstead of hard-coding 2h.Hard-coding
2 * 3600here duplicates the default ofsandbox_inactive_timeoutand will silently go stale if the default is ever changed (e.g., to 1h or 4h), making the test pass for the wrong reason. Prefer sourcing the threshold from config so the test stays aligned with production behavior.♻️ Proposed refactor
- mock_redis_client.zrange.return_value = ["12345"] - sample_sandbox.last_activity_at = time.time() - (2 * 3600) - 60 - sample_sandbox.expires_at = time.time() + 3600 + from executor_manager.common.config import get_config + + inactive_timeout = get_config().timeout.sandbox_inactive_timeout + mock_redis_client.zrange.return_value = ["12345"] + sample_sandbox.last_activity_at = time.time() - inactive_timeout - 60 + sample_sandbox.expires_at = time.time() + 3600🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor_manager/tests/services/test_sandbox_manager.py` around lines 1000 - 1001, The test sets sample_sandbox.last_activity_at using a hard-coded 2*3600; replace that with the configured idle threshold by importing and using TimeoutConfig.sandbox_inactive_timeout (e.g., set last_activity_at = time.time() - TimeoutConfig.sandbox_inactive_timeout - 60) and ensure TimeoutConfig is imported into executor_manager/tests/services/test_sandbox_manager.py so the test remains aligned with the production timeout logic used by the sandbox expiration code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@executor_manager/tests/services/test_sandbox_manager.py`:
- Around line 990-1018: The async test
test_collect_expired_sandboxes_terminates_old is missing the pytest async
marker, so pytest may skip or not run the coroutine; add the
`@pytest.mark.asyncio` decorator above
test_collect_expired_sandboxes_terminates_old (and ensure pytest is imported in
the test module if not already) so the coroutine runs and the core assertion
that manager._collect_expired_sandboxes triggers
manager.terminate_sandbox("12345") is actually executed.
---
Nitpick comments:
In `@executor_manager/tests/services/test_sandbox_manager.py`:
- Around line 1000-1001: The test sets sample_sandbox.last_activity_at using a
hard-coded 2*3600; replace that with the configured idle threshold by importing
and using TimeoutConfig.sandbox_inactive_timeout (e.g., set last_activity_at =
time.time() - TimeoutConfig.sandbox_inactive_timeout - 60) and ensure
TimeoutConfig is imported into
executor_manager/tests/services/test_sandbox_manager.py so the test remains
aligned with the production timeout logic used by the sandbox expiration code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8bf5ca37-0f4a-4e1a-b115-79cbf57daabb
📒 Files selected for processing (6)
executor_manager/common/config.pyexecutor_manager/routers/sandbox.pyexecutor_manager/schemas/sandbox.pyexecutor_manager/services/sandbox/manager.pyexecutor_manager/services/sandbox/repository.pyexecutor_manager/tests/services/test_sandbox_manager.py
✅ Files skipped from review due to trivial changes (2)
- executor_manager/routers/sandbox.py
- executor_manager/schemas/sandbox.py
🚧 Files skipped from review as they are similar to previous changes (2)
- executor_manager/services/sandbox/repository.py
- executor_manager/services/sandbox/manager.py
Summary by CodeRabbit
New Features
Bug Fixes
Documentation