fix: route himalaya chat commands to device backend #818
icycrystal4 wants to merge 4 commits into wecode-ai:main
Conversation
📝 Walkthrough

Adds device-backed sandbox capabilities: new public/internal API endpoints, a DeviceSandboxService for selecting/dispatching to online devices via Socket.IO, device-side sandbox handlers and client routing in chat-shell tools, plus extensive tests and config updates for device-routed command behavior.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as User / Tool
    participant API as Backend API
    participant DB as Database
    participant Service as DeviceSandboxService
    participant SIO as Socket.IO Server
    participant Device as Device Executor
    Client->>API: POST /sandbox/exec\n(command, working_dir, timeout, ...)
    API->>API: authenticate user
    API->>Service: execute_command(user_id, ...)
    Service->>DB: query online devices / sticky binding
    DB-->>Service: devices / binding
    Service->>Service: filter by device_id/capability\napply priority / persist binding
    Service->>SIO: call(namespace="/local-executor", event="sandbox:exec", payload, timeout)
    SIO->>Device: deliver sandbox:exec event
    Device->>Device: perform exec / file / attachment op
    Device-->>SIO: return dict result (success, stdout, stderr, exit_code, execution_time, device_id)
    SIO-->>Service: response dict
    Service->>Service: validate & normalize response
    Service-->>API: result dict
    API-->>Client: HTTP JSON response
```
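The request body in the first step of the diagram can be sketched as a small payload builder. The field names below are inferred from the diagram and the service signature (`command`, `working_dir`, `timeout_seconds`, `required_capability`, `device_id`) and are not confirmed against the actual `InternalDeviceSandboxExecRequest` schema:

```python
import json


def build_sandbox_exec_payload(command, working_dir="/home/user",
                               timeout_seconds=300,
                               required_capability=None, device_id=None):
    """Assemble a POST /sandbox/exec body (field names are inferred,
    not taken from the PR's Pydantic model)."""
    payload = {
        "command": command,
        "working_dir": working_dir,
        "timeout_seconds": timeout_seconds,
    }
    # Optional routing hints: only include them when the caller provides them.
    if required_capability is not None:
        payload["required_capability"] = required_capability
    if device_id is not None:
        payload["device_id"] = device_id
    return payload


body = build_sandbox_exec_payload("himalaya envelope list",
                                  required_capability="himalaya_mail")
print(json.dumps(body))
```

Omitting `device_id` lets the service fall back to its own selection logic (sticky binding / priority), which matches the optional-filter step in the diagram.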
Estimated Code Review Effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Important: Merge conflicts detected (Beta)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/api/api.py (1)
56-64: ⚠️ Potential issue | 🟡 Minor

Fix isort import ordering.
The pipeline indicates an isort check failure. The imports in this block need to be reordered.
🔧 Run isort to fix import ordering
```shell
isort --write app/api/api.py
```

Or manually ensure alphabetical ordering within the import block.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/api.py` around lines 56 - 64, Import ordering in the import block is failing isort; reorder the names alphabetically within the tuple so the entries are sorted (callback_router, chat_storage_router, devices_router as internal_devices_router, services_router, skills_router, subscriptions_router, tables_router) or run isort --write on the file to apply the fix; update the import statement in api.py to reflect the corrected alphabetical order for those symbols.
🧹 Nitpick comments (7)
backend/app/api/ws/device_namespace.py (1)
168-181: Docstring missing `capabilities` parameter.

The `capabilities` parameter was added to `_register_device`, but the docstring (lines 170-181) does not document it.

📝 Add capabilities to docstring
```diff
 Args:
     user_id: Device owner user ID
     device_id: Device unique identifier (stored in Kind.name)
     name: Device display name
     client_ip: Device's client IP address
     device_type: Device type ('local' or 'cloud')
     bind_shell: Shell runtime binding ('claudecode' or 'openclaw')
+    capabilities: Optional capability tags for the device

 Returns (success, error_message).
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/ws/device_namespace.py` around lines 168 - 181, The docstring for _register_device is missing documentation for the new capabilities parameter; update the function _register_device's docstring to include a short description of capabilities (type Optional[list[str]]), what it represents (e.g., list of device capability strings or features), and its role in registration, and add it to the "Args:" section consistent with the existing style and formatting.

backend/app/services/device_service.py (1)
381-381: Consider the difference between create and update behavior for `capabilities`.

When updating an existing device (lines 325-326), `capabilities` is only written if not `None`, preserving the existing value. However, when creating a new device, `capabilities` is set directly even if `None`, resulting in `"capabilities": None` in the spec.

This asymmetry may be intentional, but if you want consistency, consider omitting the field entirely when `None`:

🔧 Optional: Omit capabilities from spec when None
```diff
 "spec": {
     "deviceId": device_id,
     "displayName": name,
     "deviceType": resolved_device_type,
     "connectionMode": "websocket",
     "bindShell": resolved_bind_shell,
     "isDefault": is_first_device,
-    "capabilities": capabilities,
     "clientIp": client_ip,
 },
```

Then conditionally add it after `device_json` creation:

```python
if capabilities is not None:
    device_json["spec"]["capabilities"] = capabilities
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/device_service.py` at line 381, The creation path currently unconditionally sets "capabilities": capabilities on the new device spec which can produce "capabilities": None; make create behave like the update path by omitting the field when capabilities is None. After building device_json (the new device spec), only add device_json["spec"]["capabilities"] if the local variable capabilities is not None (mirroring the update logic that writes capabilities only when not None), so existing devices keep their value and new specs don't contain a None capability entry.

chat_shell/chat_shell/tools/sandbox/_base.py (1)
33-35: Regex pattern may miss some valid command invocations.

The `HIMALAYA_COMMAND_PATTERN` pattern won't match commands like `./himalaya`, `/usr/bin/himalaya`, or subshell invocations like `$(himalaya ...)`. This is acceptable for the current scope but may need expansion if more invocation patterns are expected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/tools/sandbox/_base.py` around lines 33 - 35, HIMALAYA_COMMAND_PATTERN currently only matches bare "himalaya" or "which command -v himalaya" forms and will miss invocations like "./himalaya", "/usr/bin/himalaya" and subshell forms "$(himalaya ...)" — update the HIMALAYA_COMMAND_PATTERN regex to allow optional relative ("./") and absolute path prefixes and to detect subshell invocations (e.g. starting with "$(") or parentheses, ensuring you still anchor on word boundaries (use \b or equivalent) so you don't overmatch; update tests covering Sandbox._base behavior to include examples like "./himalaya", "/usr/bin/himalaya", and "$(himalaya args)".

backend/tests/services/test_device_sandbox_service.py (1)
69-72: Consider asserting the actual Socket.IO call arguments.

The test asserts `mock_sio.call.assert_awaited_once()` but doesn't verify the call arguments. Adding argument assertions would strengthen the test by ensuring the correct event name, payload, and namespace are used.

♻️ Suggested enhancement
```diff
 assert result["success"] is True
 assert result["device_id"] == "cloud-device"
 mock_online_info.assert_awaited_once_with(1, "cloud-device")
-mock_sio.call.assert_awaited_once()
+mock_sio.call.assert_awaited_once()
+call_args = mock_sio.call.call_args
+assert call_args.args[0] == "sandbox:exec"
+assert call_args.kwargs["namespace"] == "/local-executor"
```

executor/tests/test_local_sandbox_handler.py (1)
36-39: Consider adding assertions for `execution_time` field.

Both tests verify the result dict structure but don't assert that `execution_time` exists and has a valid value (>= 0). This would provide more complete coverage of the return contract.

♻️ Suggested enhancement
```diff
 assert result["success"] is True
 assert result["stdout"] == "hello\n"
 assert result["stderr"] == ""
 assert result["exit_code"] == 0
+assert "execution_time" in result
+assert result["execution_time"] >= 0
```

And similarly for the timeout test:
```diff
 assert result["success"] is False
 assert result["stdout"] == "partial"
 assert "timed out" in result["stderr"]
 assert result["exit_code"] == -1
+assert "execution_time" in result
+assert result["execution_time"] >= 0
```

Also applies to: 60-63
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor/tests/test_local_sandbox_handler.py` around lines 36 - 39, Add assertions in both tests that the returned result dict contains an execution_time field and that its value is a non-negative number: assert "execution_time" in result and assert isinstance(result["execution_time"], (int, float)) and result["execution_time"] >= 0; do this for the success case (where result["success"] is True) and for the timeout test (where exit_code/timeout behavior is asserted) so the contract guarantees a present, valid execution_time in all code paths.backend/app/services/device_sandbox_service.py (2)
27-131: Split `execute_command` into smaller helpers.

This method is significantly over the preferred function-size limit and mixes selection, dispatch, normalization, and logging in one block.
As per coding guidelines, "Function length: Max 50 lines per function (preferred)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/device_sandbox_service.py` around lines 27 - 131, The execute_command function is too large and mixes device selection, socket dispatch, response normalization, and logging; split it into smaller helpers: keep execute_command as the orchestrator that calls (1) _select_target_device and device_service.get_device_online_info to resolve target_device and socket_id, (2) a new helper (e.g., _dispatch_to_device or send_sandbox_exec) that wraps get_sio().call and error handling, and (3) a new helper (e.g., _normalize_sandbox_response) that validates and normalizes response fields (execution_time, exit_code, stdout, stderr, success). Move logging into either the orchestrator and/or each helper where appropriate so each function stays small and focused, and update execute_command to assemble and return the final dict.
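A possible shape for the decomposition described above, using the helper names from the prompt. The bodies below are placeholders under stated assumptions, not the PR's actual selection/dispatch logic (the real methods also take a db session and talk to Socket.IO):

```python
from typing import Any, Optional


class DeviceSandboxService:
    """Sketch of the suggested orchestrator split (simplified)."""

    async def execute_command(
        self,
        user_id: int,
        command: str,
        timeout_seconds: int = 300,
        device_id: Optional[str] = None,
    ) -> dict[str, Any]:
        # Orchestrate only: select -> dispatch -> normalize.
        target_device_id, socket_id = await self._select_target_device(user_id, device_id)
        response = await self._dispatch_to_device(socket_id, command, timeout_seconds)
        return self._normalize_sandbox_response(response, target_device_id)

    async def _select_target_device(self, user_id: int, device_id: Optional[str]):
        # Placeholder: the real helper queries online devices / sticky binding.
        return device_id or "default-device", "socket-1"

    async def _dispatch_to_device(self, socket_id: str, command: str, timeout_seconds: int):
        # Placeholder: the real helper wraps get_sio().call(...) with error handling.
        return {"success": True, "stdout": "", "stderr": "", "exit_code": 0}

    def _normalize_sandbox_response(self, response: dict, target_device_id: str) -> dict[str, Any]:
        # Validate/normalize untyped fields from the device payload.
        exit_code = response.get("exit_code", -1)
        return {
            "success": bool(response.get("success", exit_code == 0)),
            "stdout": response.get("stdout", "") or "",
            "stderr": response.get("stderr", "") or "",
            "exit_code": exit_code,
            "device_id": target_device_id,
        }
```

Each helper stays well under the 50-line guideline, and the normalization step becomes independently unit-testable.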
83-83: Extract routing and timeout literals into named constants.

The timeout buffer/minimum, fallback exit code, and priority ranks are magic numbers right now, which makes tuning the behavior error-prone.
As per coding guidelines, "Python: Extract magic numbers to constants".
Also applies to: 102-107, 157-166
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/device_sandbox_service.py` at line 83, Extract the magic numbers into module-level constants (e.g. TIMEOUT_BUFFER = 5, TIMEOUT_MIN = 30, FALLBACK_EXIT_CODE = <value>, and PRIORITY_RANKS = {...}) and use them where the literal values appear (replace timeout=max(timeout_seconds + 5, 30) with timeout=max(timeout_seconds + TIMEOUT_BUFFER, TIMEOUT_MIN) and replace the fallback exit code and hard-coded priority ranks in the same file). Add the new constants near the top of device_sandbox_service.py and update all occurrences referenced in the comment (the timeout computation using timeout_seconds, the fallback exit code usage, and the priority rank values) to use the constants instead of literals.
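A minimal sketch of the constants extraction, assuming the values visible in the review (`timeout_seconds + 5`, floor of `30`); FALLBACK_EXIT_CODE and the priority ranks below are illustrative stand-ins, not values taken from the PR:

```python
# Module-level routing/timeout constants (illustrative values where noted).
TIMEOUT_BUFFER_SECONDS = 5    # headroom over the command timeout for the Socket.IO call
TIMEOUT_MIN_SECONDS = 30      # floor for the Socket.IO call timeout
FALLBACK_EXIT_CODE = -1       # hypothetical: used when the device reports no exit code
DEVICE_PRIORITY_RANKS = {"local": 0, "cloud": 1}  # hypothetical: lower rank wins


def socketio_call_timeout(timeout_seconds: int) -> int:
    """Replaces the inline max(timeout_seconds + 5, 30) expression."""
    return max(timeout_seconds + TIMEOUT_BUFFER_SECONDS, TIMEOUT_MIN_SECONDS)
```

With the constants named, changing the buffer or floor becomes a one-line edit instead of a hunt for literals.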
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/endpoints/internal/devices.py`:
- Around line 43-47: The async handler execute_device_sandbox_command_internal
is missing OpenTelemetry tracing; add the trace_async decorator above the
function using the project's convention: `@trace_async`(span_name, tracer_name,
extract_attributes). Import trace_async if needed and choose a clear span_name
like "execute_device_sandbox_command_internal" and the appropriate tracer_name
and extract_attributes function used elsewhere (match other endpoints' usage) so
the entire async function is traced.
- Around line 42-46: The execute_device_sandbox_command_internal endpoint
currently allows caller-controlled request.user_id; protect it by adding an
explicit internal auth dependency (e.g., add a
Depends(get_internal_service_auth) or Depends(verify_internal_service) to the
execute_device_sandbox_command_internal signature) and enforce an impersonation
check before running sandbox commands: use a policy function (e.g.,
can_impersonate(calling_service, request.user_id) or
enforce_impersonation_policy(auth_principal, request.user_id)) and return 403 if
not allowed; keep the existing db: Session = Depends(get_db) and request:
InternalDeviceSandboxExecRequest but ensure the new auth dependency supplies the
caller identity used by the impersonation check.
In `@backend/app/services/device_sandbox_service.py`:
- Around line 27-37: The async method execute_command should be instrumented
with the OpenTelemetry async tracer: add the `@trace_async` decorator above its
definition using a descriptive span name (e.g.,
"device_sandbox.execute_command"), the appropriate tracer_name, and an
extract_attributes function that captures relevant args (user_id, device_id,
command); similarly add `@trace_async` or `@trace_sync` (as appropriate) to the
other service methods referenced in the review (the methods covering the ranges
around lines 133-140 and 194-200) using span names like
"device_sandbox.<method_name>" and the same tracer_name/extract_attributes
pattern so the entire call is traced end-to-end. Ensure you import
trace_async/trace_sync and the attribute extractor and place the decorator
immediately above each function definition (preserve async vs sync choice per
method signature).
- Around line 109-126: Normalize the Socket.IO response fields before using
truthiness or len(): compute a normalized_success by checking
response.get("success") — if it's a bool use it, if it's a str compare lower()
to "true"/"1" (or "false"/"0") to derive a boolean, otherwise fall back to
exit_code == 0; compute normalized_stdout and normalized_stderr by pulling
response.get("stdout"/"stderr"), treating None as "", if the value is bytes
decode to str, if it's not a sized string/bytes convert to str(value) so len()
is safe; then use normalized_success, normalized_stdout, normalized_stderr for
logger.info and the returned dict (referencing response, exit_code, logger.info
and the DeviceSandboxService command handling code).
In `@backend/init_data/skills/sandbox/command_tool.py`:
- Around line 198-221: The device-backend branch calling
sandbox_manager.execute_command_via_device lacks exception handling and can
raise httpx.HTTPStatusError or connection errors; wrap the call in a try/except
that catches httpx.HTTPStatusError and httpx.RequestError (or Exception as a
fallback), log the exception, call self._emit_tool_status("failed", ...) with a
clear message and error details (include response-like dict with
error/message/exit_code), and return a json.dumps(...) of that error dict so
error formatting is consistent with the existing success/failure handling in the
should_use_device_backend_for_command branch.
---
Outside diff comments:
In `@backend/app/api/api.py`:
- Around line 56-64: Import ordering in the import block is failing isort;
reorder the names alphabetically within the tuple so the entries are sorted
(callback_router, chat_storage_router, devices_router as
internal_devices_router, services_router, skills_router, subscriptions_router,
tables_router) or run isort --write on the file to apply the fix; update the
import statement in api.py to reflect the corrected alphabetical order for those
symbols.
---
Nitpick comments:
In `@backend/app/api/ws/device_namespace.py`:
- Around line 168-181: The docstring for _register_device is missing
documentation for the new capabilities parameter; update the function
_register_device's docstring to include a short description of capabilities
(type Optional[list[str]]), what it represents (e.g., list of device capability
strings or features), and its role in registration, and add it to the "Args:"
section consistent with the existing style and formatting.
In `@backend/app/services/device_sandbox_service.py`:
- Around line 27-131: The execute_command function is too large and mixes device
selection, socket dispatch, response normalization, and logging; split it into
smaller helpers: keep execute_command as the orchestrator that calls (1)
_select_target_device and device_service.get_device_online_info to resolve
target_device and socket_id, (2) a new helper (e.g., _dispatch_to_device or
send_sandbox_exec) that wraps get_sio().call and error handling, and (3) a new
helper (e.g., _normalize_sandbox_response) that validates and normalizes
response fields (execution_time, exit_code, stdout, stderr, success). Move
logging into either the orchestrator and/or each helper where appropriate so
each function stays small and focused, and update execute_command to assemble
and return the final dict.
- Line 83: Extract the magic numbers into module-level constants (e.g.
TIMEOUT_BUFFER = 5, TIMEOUT_MIN = 30, FALLBACK_EXIT_CODE = <value>, and
PRIORITY_RANKS = {...}) and use them where the literal values appear (replace
timeout=max(timeout_seconds + 5, 30) with timeout=max(timeout_seconds +
TIMEOUT_BUFFER, TIMEOUT_MIN) and replace the fallback exit code and hard-coded
priority ranks in the same file). Add the new constants near the top of
device_sandbox_service.py and update all occurrences referenced in the comment
(the timeout computation using timeout_seconds, the fallback exit code usage,
and the priority rank values) to use the constants instead of literals.
In `@backend/app/services/device_service.py`:
- Line 381: The creation path currently unconditionally sets "capabilities":
capabilities on the new device spec which can produce "capabilities": None; make
create behave like the update path by omitting the field when capabilities is
None. After building device_json (the new device spec), only add
device_json["spec"]["capabilities"] if the local variable capabilities is not
None (mirroring the update logic that writes capabilities only when not None),
so existing devices keep their value and new specs don't contain a None
capability entry.
In `@chat_shell/chat_shell/tools/sandbox/_base.py`:
- Around line 33-35: HIMALAYA_COMMAND_PATTERN currently only matches bare
"himalaya" or "which command -v himalaya" forms and will miss invocations like
"./himalaya", "/usr/bin/himalaya" and subshell forms "$(himalaya ...)" — update
the HIMALAYA_COMMAND_PATTERN regex to allow optional relative ("./") and
absolute path prefixes and to detect subshell invocations (e.g. starting with
"$(") or parentheses, ensuring you still anchor on word boundaries (use \b or
equivalent) so you don't overmatch; update tests covering Sandbox._base behavior
to include examples like "./himalaya", "/usr/bin/himalaya", and "$(himalaya
args)".
In `@executor/tests/test_local_sandbox_handler.py`:
- Around line 36-39: Add assertions in both tests that the returned result dict
contains an execution_time field and that its value is a non-negative number:
assert "execution_time" in result and assert
isinstance(result["execution_time"], (int, float)) and result["execution_time"]
>= 0; do this for the success case (where result["success"] is True) and for the
timeout test (where exit_code/timeout behavior is asserted) so the contract
guarantees a present, valid execution_time in all code paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 46ff243b-45c0-4acc-97e7-e834249cf4af
📒 Files selected for processing (18)
backend/app/api/api.py
backend/app/api/endpoints/devices.py
backend/app/api/endpoints/internal/__init__.py
backend/app/api/endpoints/internal/devices.py
backend/app/api/ws/device_namespace.py
backend/app/services/device_sandbox_service.py
backend/app/services/device_service.py
backend/init_data/skills/sandbox/command_tool.py
backend/tests/services/test_device_sandbox_service.py
chat_shell/chat_shell/skills/registry.py
chat_shell/chat_shell/tools/sandbox/_base.py
chat_shell/tests/test_skill_registry.py
executor/modes/local/__init__.py
executor/modes/local/events.py
executor/modes/local/handlers.py
executor/modes/local/runner.py
executor/modes/local/websocket_client.py
executor/tests/test_local_sandbox_handler.py
```python
@router.post("/sandbox/exec", response_model=DeviceSandboxExecResponse)
async def execute_device_sandbox_command_internal(
    request: InternalDeviceSandboxExecRequest,
    db: Session = Depends(get_db),
) -> DeviceSandboxExecResponse:
```
Protect this internal exec endpoint with explicit internal authz/authn.
Line 42 exposes remote command execution with caller-controlled request.user_id. Combined with backend/app/api/api.py (Lines 178-181) including this router without route-level dependencies, this is a privilege-escalation path if /internal is reachable. Add a strict internal-service auth dependency and enforce impersonation policy before executing.
🔐 Example hardening sketch

```diff
-from fastapi import APIRouter, Depends, HTTPException, status
+from fastapi import APIRouter, Depends, HTTPException, status
+# from app.api.dependencies import require_internal_service_auth
 ...
-router = APIRouter(prefix="/devices", tags=["internal-devices"])
+router = APIRouter(
+    prefix="/devices",
+    tags=["internal-devices"],
+    # dependencies=[Depends(require_internal_service_auth)],
+)
```

🧰 Tools
🪛 Ruff (0.15.6)
[warning] 45-45: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/endpoints/internal/devices.py` around lines 42 - 46, The
execute_device_sandbox_command_internal endpoint currently allows
caller-controlled request.user_id; protect it by adding an explicit internal
auth dependency (e.g., add a Depends(get_internal_service_auth) or
Depends(verify_internal_service) to the execute_device_sandbox_command_internal
signature) and enforce an impersonation check before running sandbox commands:
use a policy function (e.g., can_impersonate(calling_service, request.user_id)
or enforce_impersonation_policy(auth_principal, request.user_id)) and return 403
if not allowed; keep the existing db: Session = Depends(get_db) and request:
InternalDeviceSandboxExecRequest but ensure the new auth dependency supplies the
caller identity used by the impersonation check.
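The impersonation policy the prompt asks for can be sketched independently of FastAPI. The function names (`can_impersonate`, `enforce_impersonation_policy`) follow the prompt's suggestions, and the service allow-list is a hypothetical placeholder, not the project's actual policy:

```python
# Hypothetical allow-list of internal services permitted to act on behalf of users.
ALLOWED_IMPERSONATORS = {"chat-shell", "scheduler"}


def can_impersonate(calling_service: str, target_user_id: int) -> bool:
    """Return True if this internal service may run commands as target_user_id."""
    return calling_service in ALLOWED_IMPERSONATORS and target_user_id > 0


def enforce_impersonation_policy(calling_service: str, target_user_id: int) -> None:
    """Raise if impersonation is not allowed; in the FastAPI endpoint this
    would instead raise HTTPException(status_code=403, ...)."""
    if not can_impersonate(calling_service, target_user_id):
        raise PermissionError(
            f"{calling_service} may not impersonate user {target_user_id}"
        )
```

The key point is that the caller identity comes from the internal auth dependency, never from the request body, so a compromised caller cannot pick an arbitrary `user_id`.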
```python
async def execute_device_sandbox_command_internal(
    request: InternalDeviceSandboxExecRequest,
    db: Session = Depends(get_db),
) -> DeviceSandboxExecResponse:
    """Execute a command on a user's device for internal trusted services."""
```
🛠️ Refactor suggestion | 🟠 Major
Add OpenTelemetry tracing on the async endpoint handler.
Line 43 defines an async API handler without the required trace decorator/span metadata, which leaves this execution path under-instrumented.
As per coding guidelines, "Use @trace_async(span_name, tracer_name, extract_attributes) decorator to trace entire async functions in OpenTelemetry".
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 45-45: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/endpoints/internal/devices.py` around lines 43 - 47, The
async handler execute_device_sandbox_command_internal is missing OpenTelemetry
tracing; add the trace_async decorator above the function using the project's
convention: `@trace_async`(span_name, tracer_name, extract_attributes). Import
trace_async if needed and choose a clear span_name like
"execute_device_sandbox_command_internal" and the appropriate tracer_name and
extract_attributes function used elsewhere (match other endpoints' usage) so the
entire async function is traced.
```python
async def execute_command(
    self,
    db: Session,
    user_id: int,
    command: str,
    working_dir: str = "/home/user",
    timeout_seconds: int = 300,
    required_capability: Optional[str] = None,
    device_id: Optional[str] = None,
) -> dict[str, Any]:
    """Execute a command on a user's online device via Socket.IO."""
```
🛠️ Refactor suggestion | 🟠 Major
Instrument service methods with OpenTelemetry decorators.
The new async/sync methods are not decorated for tracing, so this critical routing/dispatch path loses standard observability coverage.
As per coding guidelines, "Use @trace_async(span_name, tracer_name, extract_attributes) decorator to trace entire async functions in OpenTelemetry" and "Use @trace_sync(span_name, tracer_name, extract_attributes) decorator to trace entire sync functions in OpenTelemetry".
Also applies to: 133-140, 194-200
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/services/device_sandbox_service.py` around lines 27 - 37, The
async method execute_command should be instrumented with the OpenTelemetry async
tracer: add the `@trace_async` decorator above its definition using a descriptive
span name (e.g., "device_sandbox.execute_command"), the appropriate tracer_name,
and an extract_attributes function that captures relevant args (user_id,
device_id, command); similarly add `@trace_async` or `@trace_sync` (as appropriate)
to the other service methods referenced in the review (the methods covering the
ranges around lines 133-140 and 194-200) using span names like
"device_sandbox.<method_name>" and the same tracer_name/extract_attributes
pattern so the entire call is traced end-to-end. Ensure you import
trace_async/trace_sync and the attribute extractor and place the decorator
immediately above each function definition (preserve async vs sync choice per
method signature).
```python
logger.info(
    "[DeviceSandboxService] Device command completed: user_id=%s, device_id=%s, "
    "socket_id=%s, success=%s, exit_code=%s, execution_time=%.2fs, "
    "stdout_len=%s, stderr_len=%s",
    user_id,
    target_device_id,
    socket_id,
    bool(response.get("success", exit_code == 0)),
    exit_code,
    execution_time,
    len(response.get("stdout", "") or ""),
    len(response.get("stderr", "") or ""),
)

return {
    "success": bool(response.get("success", exit_code == 0)),
    "stdout": response.get("stdout", "") or "",
    "stderr": response.get("stderr", "") or "",
```
Avoid unsafe truthiness and len() on untyped Socket.IO payload fields.
Line 116 can turn "false" into True (bool("false")), and Lines 119-120 can raise TypeError if stdout/stderr are non-sized values. This can produce incorrect success reporting or a 500 on malformed device responses.
🛠️ Safer normalization before logging/return
```diff
+    success_raw = response.get("success")
+    if isinstance(success_raw, bool):
+        success = success_raw
+    elif isinstance(success_raw, str):
+        success = success_raw.strip().lower() in {"1", "true", "yes"}
+    else:
+        success = exit_code == 0
+
+    stdout_raw = response.get("stdout", "")
+    stderr_raw = response.get("stderr", "")
+    stdout = stdout_raw if isinstance(stdout_raw, str) else ("" if stdout_raw is None else str(stdout_raw))
+    stderr = stderr_raw if isinstance(stderr_raw, str) else ("" if stderr_raw is None else str(stderr_raw))
+
     logger.info(
         "[DeviceSandboxService] Device command completed: user_id=%s, device_id=%s, "
         "socket_id=%s, success=%s, exit_code=%s, execution_time=%.2fs, "
         "stdout_len=%s, stderr_len=%s",
         user_id,
         target_device_id,
         socket_id,
-        bool(response.get("success", exit_code == 0)),
+        success,
         exit_code,
         execution_time,
-        len(response.get("stdout", "") or ""),
-        len(response.get("stderr", "") or ""),
+        len(stdout),
+        len(stderr),
     )
     return {
-        "success": bool(response.get("success", exit_code == 0)),
-        "stdout": response.get("stdout", "") or "",
-        "stderr": response.get("stderr", "") or "",
+        "success": success,
+        "stdout": stdout,
+        "stderr": stderr,
         "exit_code": exit_code,
         "execution_time": execution_time,
         "device_id": target_device_id,
         "backend": "device",
     }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
success_raw = response.get("success")
if isinstance(success_raw, bool):
    success = success_raw
elif isinstance(success_raw, str):
    success = success_raw.strip().lower() in {"1", "true", "yes"}
else:
    success = exit_code == 0

stdout_raw = response.get("stdout", "")
stderr_raw = response.get("stderr", "")
stdout = stdout_raw if isinstance(stdout_raw, str) else ("" if stdout_raw is None else str(stdout_raw))
stderr = stderr_raw if isinstance(stderr_raw, str) else ("" if stderr_raw is None else str(stderr_raw))

logger.info(
    "[DeviceSandboxService] Device command completed: user_id=%s, device_id=%s, "
    "socket_id=%s, success=%s, exit_code=%s, execution_time=%.2fs, "
    "stdout_len=%s, stderr_len=%s",
    user_id,
    target_device_id,
    socket_id,
    success,
    exit_code,
    execution_time,
    len(stdout),
    len(stderr),
)
return {
    "success": success,
    "stdout": stdout,
    "stderr": stderr,
    "exit_code": exit_code,
    "execution_time": execution_time,
    "device_id": target_device_id,
    "backend": "device",
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/services/device_sandbox_service.py` around lines 109 - 126,
Normalize the Socket.IO response fields before using truthiness or len():
compute a normalized_success by checking response.get("success") — if it's a
bool use it, if it's a str compare lower() to "true"/"1" (or "false"/"0") to
derive a boolean, otherwise fall back to exit_code == 0; compute
normalized_stdout and normalized_stderr by pulling
response.get("stdout"/"stderr"), treating None as "", if the value is bytes
decode to str, if it's not a sized string/bytes convert to str(value) so len()
is safe; then use normalized_success, normalized_stdout, normalized_stderr for
logger.info and the returned dict (referencing response, exit_code, logger.info
and the DeviceSandboxService command handling code).
```python
if sandbox_manager.should_use_device_backend_for_command(raw_command):
    logger.info(
        "[SandboxCommandTool] Routing command to device backend: %s",
        raw_command[:100],
    )
    response = await sandbox_manager.execute_command_via_device(
        command=raw_command,
        working_dir=working_dir,
        timeout_seconds=effective_timeout,
        required_capability="himalaya_mail",
    )

    if response.get("success"):
        await self._emit_tool_status(
            "completed", "Command executed successfully", response
        )
    else:
        await self._emit_tool_status(
            "failed",
            f"Command failed with exit code {response.get('exit_code', -1)}",
            response,
        )

    return json.dumps(response, ensure_ascii=False, indent=2)
```
Missing exception handling for device backend HTTP calls.
The execute_command_via_device call (from _base.py) can raise httpx.HTTPStatusError or connection errors, but this block has no try/except. If the device backend is unreachable or returns a non-2xx status, the exception will propagate unhandled, bypassing the existing error formatting logic.
Consider wrapping in try/except to provide a consistent error response format.
🛡️ Proposed fix to add exception handling
```diff
 if sandbox_manager.should_use_device_backend_for_command(raw_command):
     logger.info(
         "[SandboxCommandTool] Routing command to device backend: %s",
         raw_command[:100],
     )
-    response = await sandbox_manager.execute_command_via_device(
-        command=raw_command,
-        working_dir=working_dir,
-        timeout_seconds=effective_timeout,
-        required_capability="himalaya_mail",
-    )
+    try:
+        response = await sandbox_manager.execute_command_via_device(
+            command=raw_command,
+            working_dir=working_dir,
+            timeout_seconds=effective_timeout,
+            required_capability="himalaya_mail",
+        )
+    except Exception as e:
+        logger.error(
+            "[SandboxCommandTool] Device backend execution failed: %s",
+            e,
+            exc_info=True,
+        )
+        error_msg = f"Device backend execution failed: {e}"
+        result = self._format_error(
+            error_message=error_msg,
+            stdout="",
+            stderr=str(e),
+            exit_code=-1,
+            execution_time=time.time() - start_time,
+            suggestion="Check if a compatible device is online.",
+        )
+        await self._emit_tool_status("failed", error_msg)
+        return result
 if response.get("success"):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/init_data/skills/sandbox/command_tool.py` around lines 198 - 221, The
device-backend branch calling sandbox_manager.execute_command_via_device lacks
exception handling and can raise httpx.HTTPStatusError or connection errors;
wrap the call in a try/except that catches httpx.HTTPStatusError and
httpx.RequestError (or Exception as a fallback), log the exception, call
self._emit_tool_status("failed", ...) with a clear message and error details
(include response-like dict with error/message/exit_code), and return a
json.dumps(...) of that error dict so error formatting is consistent with the
existing success/failure handling in the should_use_device_backend_for_command
branch.
Actionable comments posted: 5
♻️ Duplicate comments (1)
backend/app/api/endpoints/internal/devices.py (1)
154-183: ⚠️ Potential issue | 🟠 Major: Internal endpoint allows caller-controlled `user_id` without authentication. This endpoint accepts `user_id` from the request body without any authentication dependency. Per the context snippet from `backend/app/api/api.py`, the `/internal` router is included without route-level auth dependencies. If `/api/internal/*` is exposed to untrusted networks, an attacker could execute commands on any user's device by specifying an arbitrary `user_id`. Ensure either:
- Network-level isolation (internal endpoints only reachable from trusted services)
- An explicit internal service authentication dependency
🔐 Add internal auth dependency

```diff
+from app.api.dependencies import get_internal_service_auth
+
-router = APIRouter(prefix="/devices", tags=["internal-devices"])
+router = APIRouter(
+    prefix="/devices",
+    tags=["internal-devices"],
+    dependencies=[Depends(get_internal_service_auth)],
+)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/endpoints/internal/devices.py` around lines 154 - 183, The internal endpoint execute_device_sandbox_command_internal currently accepts a caller-controlled user_id from InternalDeviceSandboxExecRequest with no auth; add an explicit internal-auth dependency (e.g., Depends(get_internal_service) or Depends(require_internal_token)) to the endpoint signature and verify the dependency before using request.user_id, or alternatively remove user_id from the request and derive the target user from the authenticated internal service context; update the router/endpoint signature to include that dependency and fail with HTTP 401/403 if the internal auth check fails so only trusted services can invoke device_sandbox_service.execute_command.
🧹 Nitpick comments (8)
chat_shell/chat_shell/agents/graph_builder.py (2)
278-278: Consider iterable unpacking for cleaner list construction. Per static analysis (Ruff RUF005), prefer `[*lc_messages, item]` over `list(lc_messages) + [item]`.
♻️ Proposed fix

```diff
- final_messages = list(lc_messages) + [HumanMessage(content=system_notice)]
+ final_messages = [*lc_messages, HumanMessage(content=system_notice)]
```

Apply similarly at lines 300 and 316.
- final_messages = list(lc_messages) + [HumanMessage(content=system_notice)] + final_messages = [*lc_messages, HumanMessage(content=system_notice)]Apply similarly at lines 300 and 316.
Also applies to: 300-300, 316-316
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/agents/graph_builder.py` at line 278, Replace list concatenation using list(lc_messages) + [item] with iterable unpacking for clarity and to satisfy Ruff RUF005; specifically update the construction of final_messages where final_messages = list(lc_messages) + [HumanMessage(content=system_notice)] to use [*lc_messages, HumanMessage(content=system_notice)], and apply the same change at the two other occurrences referenced (the similar constructions at lines around 300 and 316) so each uses [*lc_messages, ...] instead of list(...) + [...] in the GraphBuilder/graph building code.
289-292: Inconsistent dict content extraction between helper methods. `_stream_final_response_without_tools` extracts text from any dict with a `"text"` key (line 290), while `_build_final_state_without_tools` requires `part.get("type") == "text"` (line 312). This could cause different behavior between the streaming and non-streaming recovery paths for the same tool-limit scenario.
Consider aligning both methods. The type-checked approach (matching the `get_final_content` and `on_chain_end` logic) is safer:
♻️ Proposed fix to align streaming helper with non-streaming

```diff
 elif isinstance(part, dict):
-    text = part.get("text", "")
-    if text:
-        yield text
+    if part.get("type") == "text":
+        text = part.get("text", "")
+        if text:
+            yield text
```

Also applies to: 312-313
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/agents/graph_builder.py` around lines 289 - 292, The streaming helper _stream_final_response_without_tools currently yields any dict with a "text" key, while _build_final_state_without_tools only accepts dicts where part.get("type") == "text"; update _stream_final_response_without_tools to mirror the type-checked behavior: when encountering a dict, check part.get("type") == "text" before extracting part.get("text", "") and yielding it so its behavior aligns with get_final_content / on_chain_end and the non-streaming recovery path.
backend/app/services/device_sandbox_service.py (2)
231-248: Socket.IO dispatch lacks retry logic for transient failures. The `sio.call()` invocation catches all exceptions and immediately raises `DeviceSandboxError`. For transient network issues, consider adding retry with exponential backoff before failing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/device_sandbox_service.py` around lines 231 - 248, The DeviceSandboxService currently calls sio.call(...) and immediately raises DeviceSandboxError on any Exception; add retry-with-exponential-backoff around the sio.call invocation to handle transient network errors: implement a small retry loop (e.g., 3 attempts) around sio.call in the function containing that code, catching only transient exceptions (network/timeouts) and sleeping with backoff between attempts, log each retry via logger.error/warn including user_id, target_device_id and event_name, and only raise DeviceSandboxError after all attempts fail (preserve original exception as the cause); keep existing timeout logic and ensure you still pass the same parameters (event_name, payload, to=socket_id, namespace="/local-executor", timeout=...) when retrying.
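A minimal retry sketch along these lines, assuming only timeout and connection errors are worth retrying. The `DeviceSandboxError` stand-in and the `dispatch` callable are hypothetical; in the service, `dispatch` would wrap the `sio.call(...)` invocation with its existing parameters.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger(__name__)


class DeviceSandboxError(Exception):
    """Stand-in for the service's error type (hypothetical here)."""


async def call_with_backoff(
    dispatch: Callable[[], Awaitable[Any]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> Any:
    """Retry a Socket.IO dispatch on transient errors with exponential backoff.

    Only timeout / connection errors are retried; anything else propagates
    immediately so validation failures still fail fast.
    """
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return await dispatch()
        except (asyncio.TimeoutError, ConnectionError, OSError) as exc:
            last_exc = exc
            if attempt < attempts:
                delay = base_delay * (2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
                logger.warning(
                    "dispatch failed (attempt %d/%d), retrying in %.1fs: %s",
                    attempt, attempts, delay, exc,
                )
                await asyncio.sleep(delay)
    raise DeviceSandboxError("device dispatch failed after retries") from last_exc
```

The wrapper preserves the original exception as the cause, so callers can still distinguish timeouts from connection drops.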
60-78: Parameter `format` shadows a Python builtin. The `format` parameter name in `read_file` and `write_file` shadows Python's built-in `format()` function (Ruff A002).
♻️ Rename to `file_format`

```diff
 async def read_file(
     self,
     db: Session,
     user_id: int,
     file_path: str,
-    format: str = "text",
+    file_format: str = "text",
     device_id: Optional[str] = None,
     task_id: Optional[int] = None,
 ) -> dict[str, Any]:
     """Read a file from the bound device."""
     return await self._execute_device_event(
         ...
-        payload={"file_path": file_path, "format": format},
+        payload={"file_path": file_path, "format": file_format},
```

Also applies to `write_file` at line 106.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/device_sandbox_service.py` around lines 60 - 78, The parameter name format in read_file and write_file shadows the built-in; rename it to file_format in both function signatures and all internal references (including type hints and default value), and update the payload construction passed to _execute_device_event to use the original "format" key set to file_format (e.g., payload={"file_path": file_path, "format": file_format}); also update any local variable uses and external callsites to use the new parameter name (functions: read_file, write_file, and any tests or callers).executor/modes/local/handlers.py (2)
769-775: Auth credentials may travel in cleartext over `http://` URLs. The `requests.get()` call uses default settings. If `download_url` starts with `http://`, credentials in the `Authorization` header are sent unencrypted. Consider logging a warning or enforcing HTTPS for production use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor/modes/local/handlers.py` around lines 769 - 775, The HTTP download call using requests.get with download_url, auth_token, timeout_seconds and stream=True may send Authorization in cleartext for http:// URLs; update the code around requests.get/response in executor/modes/local/handlers.py (where download_url and auth_token are used) to enforce or require HTTPS (e.g., validate download_url.startswith("https://") and raise or abort for non-HTTPS), or at minimum log a clear warning before making the request when the URL scheme is http://; ensure any decision leaves the existing response.raise_for_status() flow intact.
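A sketch of the up-front scheme check: the function name and the `allow_http` opt-in flag are illustrative assumptions, and the real handler would run this before calling `requests.get`.

```python
from urllib.parse import urlparse


def check_download_url(download_url: str, allow_http: bool = False) -> str:
    """Refuse (or explicitly allow) plaintext http:// before attaching an
    Authorization header. Illustrative sketch, not the handler's actual code."""
    scheme = urlparse(download_url).scheme.lower()
    if scheme == "https":
        return download_url
    if scheme == "http":
        if allow_http:
            # caller explicitly opted in; token will travel in cleartext
            return download_url
        raise ValueError("refusing to send auth token over plaintext http://")
    raise ValueError(f"unsupported URL scheme: {scheme!r}")
```

Keeping the check separate from the download keeps `response.raise_for_status()` and the streaming flow untouched.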
382-390: Store the task reference returned by `asyncio.create_task`. The fire-and-forget task created here is not stored, so if it raises, the exception is silently lost. Python's asyncio logs a warning when a task is garbage-collected with an unhandled exception.
♻️ Proposed fix

```diff
+# Track restart task to prevent GC and ensure exception visibility
+self._restart_task: Optional[asyncio.Task] = None
 # Schedule the restart without awaiting it
-asyncio.create_task(delayed_restart())
+self._restart_task = asyncio.create_task(delayed_restart())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@executor/modes/local/handlers.py` around lines 382 - 390, The fire-and-forget task created with asyncio.create_task(delayed_restart()) should be stored and monitored to avoid silent lost exceptions; assign the returned Task to a persistent variable (e.g., self._restart_task or a module-level _restart_task) where UpgradeHandler runs, and attach a done callback that logs any exception (task.add_done_callback(lambda t: _log_exception(t))). Ensure you replace the current bare asyncio.create_task call with storing the Task and adding the callback so the task isn't garbage-collected silently and exceptions get logged.chat_shell/chat_shell/tools/sandbox/_base.py (2)
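The strong-reference-plus-done-callback pattern suggested in the prompt can be sketched as follows; the `RestartScheduler` wrapper is hypothetical and only illustrates the mechanics, not the handler's actual structure.

```python
import asyncio
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class RestartScheduler:
    """Holds a strong reference to a fire-and-forget task and surfaces its
    exceptions via a done callback (illustrative sketch)."""

    def __init__(self) -> None:
        self._restart_task: Optional[asyncio.Task] = None
        self.last_error: Optional[BaseException] = None

    def schedule(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        task.add_done_callback(self._on_done)
        self._restart_task = task  # strong reference prevents premature GC
        return task

    def _on_done(self, task: asyncio.Task) -> None:
        if task.cancelled():
            return
        exc = task.exception()
        if exc is not None:
            self.last_error = exc
            logger.error("delayed restart failed: %s", exc)
```

Retrieving the exception in the callback also suppresses the "Task exception was never retrieved" warning.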
469-483: Parameter `format` shadows a Python builtin. The `format` parameter name shadows Python's built-in `format()` function. While it won't cause runtime issues here, it's flagged by static analysis (Ruff A002) and can confuse readers.
♻️ Rename to `file_format`

```diff
 async def read_file_via_device(
     self,
     file_path: str,
-    format: str = "text",
+    file_format: str = "text",
 ) -> dict[str, Any]:
     """Read a file through the backend's device sandbox bridge."""
     return await self._post_device_backend(
         endpoint=DEVICE_READ_FILE_ENDPOINT,
         payload={
             "file_path": file_path,
-            "format": format,
+            "format": file_format,
         },
```

Also applies to `write_file_via_device` at line 505.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/tools/sandbox/_base.py` around lines 469 - 483, Rename the parameter `format` to `file_format` in the methods read_file_via_device and write_file_via_device to avoid shadowing Python's built-in; update the function signatures and any internal references so the payload still sends the API field "format": use payload={"file_path": file_path, "format": file_format} (and for write_file_via_device include its other payload fields), and adjust the docstrings/parameter names to `file_format` so static analysis (Ruff A002) and readers are satisfied while the external API key remains "format".
413-448: Silent failure when loading device binding. The `ensure_device_binding_loaded` method catches all exceptions and only logs a warning (lines 434-441). If the backend is temporarily unavailable, the task silently falls back to the cloud sandbox instead of retrying or surfacing the error.
This may be intentional for resilience, but consider whether certain errors (e.g., network timeout vs. 404) should be handled differently.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/tools/sandbox/_base.py` around lines 413 - 448, ensure_device_binding_loaded currently swallows all exceptions and marks _binding_loaded true, causing silent fallback to cloud; change it to distinguish transient network errors from permanent/not-found responses: catch httpx.TimeoutException and httpx.NetworkError and retry a few times (or re-raise) without setting self._binding_loaded so transient failures can be retried, but for httpx.HTTPStatusError inspect response.status_code and if it's 404 set _binding_loaded=True and return, otherwise log and re-raise or propagate the error; ensure bind_device_backend is only called on successful payload parsing and only mark _binding_loaded=True on success or definitive 404, and include the original exception details in logs for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@chat_shell/tests/test_sandbox_manager.py`:
- Around line 1-3: Replace the incorrect copyright header in the
test_sandbox_manager module: change the SPDX header text that currently says
"Weibo, Inc." to "WeCode, Inc." so it matches other files in the PR; update the
top-of-file comment block in chat_shell/tests/test_sandbox_manager.py to use
"WeCode, Inc." while keeping the SPDX-License-Identifier unchanged.
In `@executor/modes/local/handlers.py`:
- Around line 896-907: The _normalize_path function currently returns absolute
paths unchanged, allowing access outside the user's home; change it to resolve
the final filesystem path (use os.path.realpath or equivalent) after mapping
"/home/user" prefixes and ensure the resolved path is within the user's home
directory by comparing with os.path.commonpath(home_dir, resolved_path); if the
resolved path falls outside home, either raise a ValueError/PermissionError or
sanitize it to a safe location under home (pick consistent behavior for
LocalDeviceSandbox), and ensure inputs with ".." are normalized before the
containment check to prevent traversal attacks.
- Around line 913-936: The recursive walk function (walk) currently calls
entry.stat(), entry.is_symlink(), and os.readlink(entry_path) without handling
PermissionError/OSError; update walk to catch PermissionError/OSError around the
per-entry operations (stat and readlink) so inaccessible entries are skipped
instead of crashing the whole listing: wrap the entry.stat() and any
os.readlink() calls in a try/except that skips the entry on
PermissionError/OSError (optionally continue to next iterator item), and ensure
the directory recursion (entry.is_dir(follow_symlinks=False) / walk(...)) only
runs when stat succeeded; refer to symbols walk, entry.stat, entry.is_symlink,
entry.is_dir, and os.readlink to locate and modify the code.
- Around line 547-557: Add an explicit clarifying comment immediately above the
subprocess.run call (the block invoking subprocess.run with shell=True,
cwd=resolved_working_dir, timeout=timeout_seconds) stating that shell=True is
intentional for device sandbox functionality, and document the authenticated
trust boundary: that command dispatch is gated by WebSocket connection
authentication (JWT or API key) and user-scoped device access control so
commands execute with the device owner's OS privileges; mention that maintainers
must not assume untrusted input and refer to the authentication/authorization
checks enforcing this boundary.
In `@executor/tests/test_local_sandbox_handler.py`:
- Around line 1-3: Update the copyright header at the top of the
test_local_sandbox_handler.py file to match the project's standard by replacing
"Weibo, Inc." with "WeCode, Inc." so it is consistent with the other test files;
ensure both SPDX-FileCopyrightText and SPDX-License-Identifier lines remain
intact and only the organization string is changed.
---
Duplicate comments:
In `@backend/app/api/endpoints/internal/devices.py`:
- Around line 154-183: The internal endpoint
execute_device_sandbox_command_internal currently accepts a caller-controlled
user_id from InternalDeviceSandboxExecRequest with no auth; add an explicit
internal-auth dependency (e.g., Depends(get_internal_service) or
Depends(require_internal_token)) to the endpoint signature and verify the
dependency before using request.user_id, or alternatively remove user_id from
the request and derive the target user from the authenticated internal service
context; update the router/endpoint signature to include that dependency and
fail with HTTP 401/403 if the internal auth check fails so only trusted services
can invoke device_sandbox_service.execute_command.
---
Nitpick comments:
In `@backend/app/services/device_sandbox_service.py`:
- Around line 231-248: The DeviceSandboxService currently calls sio.call(...)
and immediately raises DeviceSandboxError on any Exception; add
retry-with-exponential-backoff around the sio.call invocation to handle
transient network errors: implement a small retry loop (e.g., 3 attempts) around
sio.call in the function containing that code, catching only transient
exceptions (network/timeouts) and sleeping with backoff between attempts, log
each retry via logger.error/warn including user_id, target_device_id and
event_name, and only raise DeviceSandboxError after all attempts fail (preserve
original exception as the cause); keep existing timeout logic and ensure you
still pass the same parameters (event_name, payload, to=socket_id,
namespace="/local-executor", timeout=...) when retrying.
- Around line 60-78: The parameter name format in read_file and write_file
shadows the built-in; rename it to file_format in both function signatures and
all internal references (including type hints and default value), and update the
payload construction passed to _execute_device_event to use the original
"format" key set to file_format (e.g., payload={"file_path": file_path,
"format": file_format}); also update any local variable uses and external
callsites to use the new parameter name (functions: read_file, write_file, and
any tests or callers).
In `@chat_shell/chat_shell/agents/graph_builder.py`:
- Line 278: Replace list concatenation using list(lc_messages) + [item] with
iterable unpacking for clarity and to satisfy Ruff RUF005; specifically update
the construction of final_messages where final_messages = list(lc_messages) +
[HumanMessage(content=system_notice)] to use [*lc_messages,
HumanMessage(content=system_notice)], and apply the same change at the two other
occurrences referenced (the similar constructions at lines around 300 and 316)
so each uses [*lc_messages, ...] instead of list(...) + [...] in the
GraphBuilder/graph building code.
- Around line 289-292: The streaming helper _stream_final_response_without_tools
currently yields any dict with a "text" key, while
_build_final_state_without_tools only accepts dicts where part.get("type") ==
"text"; update _stream_final_response_without_tools to mirror the type-checked
behavior: when encountering a dict, check part.get("type") == "text" before
extracting part.get("text", "") and yielding it so its behavior aligns with
get_final_content / on_chain_end and the non-streaming recovery path.
In `@chat_shell/chat_shell/tools/sandbox/_base.py`:
- Around line 469-483: Rename the parameter `format` to `file_format` in the
methods read_file_via_device and write_file_via_device to avoid shadowing
Python's built-in; update the function signatures and any internal references so
the payload still sends the API field "format": use payload={"file_path":
file_path, "format": file_format} (and for write_file_via_device include its
other payload fields), and adjust the docstrings/parameter names to
`file_format` so static analysis (Ruff A002) and readers are satisfied while the
external API key remains "format".
- Around line 413-448: ensure_device_binding_loaded currently swallows all
exceptions and marks _binding_loaded true, causing silent fallback to cloud;
change it to distinguish transient network errors from permanent/not-found
responses: catch httpx.TimeoutException and httpx.NetworkError and retry a few
times (or re-raise) without setting self._binding_loaded so transient failures
can be retried, but for httpx.HTTPStatusError inspect response.status_code and
if it's 404 set _binding_loaded=True and return, otherwise log and re-raise or
propagate the error; ensure bind_device_backend is only called on successful
payload parsing and only mark _binding_loaded=True on success or definitive 404,
and include the original exception details in logs for clarity.
In `@executor/modes/local/handlers.py`:
- Around line 769-775: The HTTP download call using requests.get with
download_url, auth_token, timeout_seconds and stream=True may send Authorization
in cleartext for http:// URLs; update the code around requests.get/response in
executor/modes/local/handlers.py (where download_url and auth_token are used) to
enforce or require HTTPS (e.g., validate download_url.startswith("https://") and
raise or abort for non-HTTPS), or at minimum log a clear warning before making
the request when the URL scheme is http://; ensure any decision leaves the
existing response.raise_for_status() flow intact.
- Around line 382-390: The fire-and-forget task created with
asyncio.create_task(delayed_restart()) should be stored and monitored to avoid
silent lost exceptions; assign the returned Task to a persistent variable (e.g.,
self._restart_task or a module-level _restart_task) where UpgradeHandler runs,
and attach a done callback that logs any exception
(task.add_done_callback(lambda t: _log_exception(t))). Ensure you replace the
current bare asyncio.create_task call with storing the Task and adding the
callback so the task isn't garbage-collected silently and exceptions get logged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1d1b45e7-897d-4c2c-844b-ecc6518ecd7d
📒 Files selected for processing (21)
- backend/app/api/api.py
- backend/app/api/endpoints/internal/devices.py
- backend/app/services/device_sandbox_service.py
- backend/init_data/skills/sandbox/command_tool.py
- backend/init_data/skills/sandbox/download_attachment_tool.py
- backend/init_data/skills/sandbox/list_files_tool.py
- backend/init_data/skills/sandbox/read_file_tool.py
- backend/init_data/skills/sandbox/upload_attachment_tool.py
- backend/init_data/skills/sandbox/write_file_tool.py
- backend/tests/services/test_device_sandbox_service.py
- chat_shell/.env.example
- chat_shell/chat_shell/agent.py
- chat_shell/chat_shell/agents/graph_builder.py
- chat_shell/chat_shell/api/schemas.py
- chat_shell/chat_shell/core/config.py
- chat_shell/chat_shell/tools/sandbox/_base.py
- chat_shell/tests/test_sandbox_manager.py
- executor/modes/local/events.py
- executor/modes/local/handlers.py
- executor/modes/local/runner.py
- executor/tests/test_local_sandbox_handler.py
✅ Files skipped from review due to trivial changes (4)
- chat_shell/chat_shell/core/config.py
- chat_shell/chat_shell/agent.py
- chat_shell/.env.example
- chat_shell/chat_shell/api/schemas.py
🚧 Files skipped from review as they are similar to previous changes (4)
- executor/modes/local/runner.py
- backend/app/api/api.py
- executor/modes/local/events.py
- backend/tests/services/test_device_sandbox_service.py
```python
# SPDX-FileCopyrightText: 2025 Weibo, Inc.
#
# SPDX-License-Identifier: Apache-2.0
```
Copyright header inconsistency.
The copyright header shows "Weibo, Inc." but other files in this PR use "WeCode, Inc." This should be corrected for consistency.
🔧 Proposed fix
```diff
-# SPDX-FileCopyrightText: 2025 Weibo, Inc.
+# SPDX-FileCopyrightText: 2025 WeCode, Inc.
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
# SPDX-FileCopyrightText: 2025 WeCode, Inc.
#
# SPDX-License-Identifier: Apache-2.0
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@chat_shell/tests/test_sandbox_manager.py` around lines 1 - 3, Replace the
incorrect copyright header in the test_sandbox_manager module: change the SPDX
header text that currently says "Weibo, Inc." to "WeCode, Inc." so it matches
other files in the PR; update the top-of-file comment block in
chat_shell/tests/test_sandbox_manager.py to use "WeCode, Inc." while keeping the
SPDX-License-Identifier unchanged.
```python
try:
    result = subprocess.run(
        command,
        shell=True,
        capture_output=True,
        text=True,
        cwd=resolved_working_dir,
        timeout=timeout_seconds,
        encoding="utf-8",
        errors="replace",
    )
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify that SandboxHandler events are only dispatched from trusted backend sources
# Check how the WebSocket namespace is authenticated
rg -n "local-executor" --type py -C 3
```

Repository: wecode-ai/Wegent
Length of output: 11955

🏁 Script executed:

```shell
cat -n backend/app/api/ws/device_namespace.py | head -100
```

Repository: wecode-ai/Wegent
Length of output: 4087

🏁 Script executed:

```shell
grep -n "async def on_connect" backend/app/api/ws/device_namespace.py -A 30
```

Repository: wecode-ai/Wegent
Length of output: 1285

🏁 Script executed:

```shell
grep -n "async def on_connect" backend/app/api/ws/device_namespace.py -A 80 | head -100
```

Repository: wecode-ai/Wegent
Length of output: 3492

🏁 Script executed:

```shell
grep -n "task:execute" backend/app/api/ws/device_namespace.py -B 5 -A 30
```

Repository: wecode-ai/Wegent
Length of output: 1343

🏁 Script executed:

```shell
grep -n "def on_task_execute\|@self.on\|\.on('task:execute'" backend/app/api/ws/device_namespace.py
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
grep -n "task:execute" backend/app/api/ws/device_namespace.py
```

Repository: wecode-ai/Wegent
Length of output: 108

🏁 Script executed:

```shell
grep -n "task:execute\|emit.*task:execute" backend/app/services/device_sandbox_service.py -B 3 -A 3
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
rg "task:execute" -A 5 -B 5 --type py
```

Repository: wecode-ai/Wegent
Length of output: 5009

🏁 Script executed:

```shell
rg "device:.*device-1\|room.*device" backend/app/services/ -A 2 -B 2 | head -60
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
rg "emit.*task:execute\|sio.emit.*task" backend/app/ -B 3 -A 3 --type py
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
rg "to=.*room\|namespace.*local-executor" backend/app/services/ -B 5 -A 5 --type py | head -100
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
grep -rn "sio.emit\|socketio.*emit" backend/app/services/device_sandbox_service.py -B 3 -A 3
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
cat -n backend/app/services/device_sandbox_service.py | grep -A 20 "def execute_on_device\|def dispatch"
```

Repository: wecode-ai/Wegent
Length of output: 42

🏁 Script executed:

```shell
ls -la backend/app/services/device_sandbox_service.py && wc -l backend/app/services/device_sandbox_service.py
```

Repository: wecode-ai/Wegent
Length of output: 207

🏁 Script executed:

```shell
cat -n backend/app/services/device_sandbox_service.py
```

Repository: wecode-ai/Wegent
Length of output: 18529

🏁 Script executed:

```shell
grep -rn "sandbox:exec\|sandbox:read_file" backend/app/api/ws/device_namespace.py -A 20 | head -80
```

Repository: wecode-ai/Wegent
Length of output: 42
Add clarifying comment about authenticated trust boundary for shell=True usage.
Using subprocess.run(..., shell=True) is intentional for device sandbox functionality. However, the trust boundary relies on:
- WebSocket connection authentication (JWT token or API Key required at connect)
- User-scoped device access control (events dispatched only to authenticated user's devices)
While this authentication and access control properly restricts command dispatch to trusted users, adding an explicit code comment documenting the authentication/authorization boundary would improve clarity for future maintainers. Consider noting that commands run with the device owner's OS privileges and that access is gated by user authentication.
🧰 Tools
🪛 Ruff (0.15.6)
[error] 548-548: subprocess call with shell=True identified, security issue
(S602)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@executor/modes/local/handlers.py` around lines 547 - 557, Add an explicit
clarifying comment immediately above the subprocess.run call (the block invoking
subprocess.run with shell=True, cwd=resolved_working_dir,
timeout=timeout_seconds) stating that shell=True is intentional for device
sandbox functionality, and document the authenticated trust boundary: that
command dispatch is gated by WebSocket connection authentication (JWT or API
key) and user-scoped device access control so commands execute with the device
owner's OS privileges; mention that maintainers must not assume untrusted input
and refer to the authentication/authorization checks enforcing this boundary.
```python
def _normalize_path(self, path: str) -> str:
    """Map sandbox-style paths onto the local device home directory."""
    home_dir = os.path.expanduser("~")
    normalized = os.path.expanduser(path)
    if not os.path.isabs(normalized):
        return os.path.join(home_dir, normalized)
    if normalized == "/home/user":
        return home_dir
    if normalized.startswith("/home/user/"):
        suffix = normalized[len("/home/user/") :]
        return os.path.join(home_dir, suffix)
    return normalized
```
Path normalization may allow path traversal outside home directory.
The _normalize_path method only translates /home/user prefixes to the actual home directory. Absolute paths that don't start with /home/user are returned unchanged (line 907), allowing access to any filesystem location (e.g., /etc/passwd, /var/log).
If this is intentional (full device access), document this trust model. If device-sandbox should be confined to the user's home directory, add validation.
🛡️ Optional: Confine paths to home directory
def _normalize_path(self, path: str) -> str:
"""Map sandbox-style paths onto the local device home directory."""
home_dir = os.path.expanduser("~")
normalized = os.path.expanduser(path)
if not os.path.isabs(normalized):
return os.path.join(home_dir, normalized)
if normalized == "/home/user":
return home_dir
if normalized.startswith("/home/user/"):
suffix = normalized[len("/home/user/") :]
return os.path.join(home_dir, suffix)
- return normalized
+ # For full device access, allow any absolute path
+ # To confine: raise ValueError or return os.path.join(home_dir, normalized.lstrip('/'))
+ return normalized

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@executor/modes/local/handlers.py` around lines 896 - 907, The _normalize_path
function currently returns absolute paths unchanged, allowing access outside the
user's home; change it to resolve the final filesystem path (use
os.path.realpath or equivalent) after mapping "/home/user" prefixes and ensure
the resolved path is within the user's home directory by comparing with
os.path.commonpath([home_dir, resolved_path]); if the resolved path falls outside
home, either raise a ValueError/PermissionError or sanitize it to a safe
location under home (pick consistent behavior for LocalDeviceSandbox), and
ensure inputs with ".." are normalized before the containment check to prevent
traversal attacks.
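A minimal sketch of the realpath-plus-commonpath containment check described above. The function name `normalize_confined` and the raise-on-escape behavior are assumptions for illustration, not the handler's current code:

```python
import os

def normalize_confined(path: str) -> str:
    """Map sandbox-style paths under the home directory and reject escapes."""
    home_dir = os.path.realpath(os.path.expanduser("~"))
    candidate = os.path.expanduser(path)
    if not os.path.isabs(candidate):
        candidate = os.path.join(home_dir, candidate)
    elif candidate == "/home/user" or candidate.startswith("/home/user/"):
        candidate = os.path.join(home_dir, candidate[len("/home/user"):].lstrip("/"))
    # realpath collapses ".." segments and symlinks before the containment check
    resolved = os.path.realpath(candidate)
    if os.path.commonpath([home_dir, resolved]) != home_dir:
        raise PermissionError(f"path escapes home directory: {path}")
    return resolved
```

Resolving before comparing is what defeats `..`-based traversal; comparing raw strings with `startswith` would not.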
def walk(current_path: str, remaining_depth: int) -> None:
    with os.scandir(current_path) as iterator:
        for entry in iterator:
            entry_path = entry.path
            stat_result = entry.stat(follow_symlinks=False)
            entries.append(
                {
                    "name": entry.name,
                    "path": entry_path,
                    "type": self._entry_type(entry),
                    "size": stat_result.st_size,
                    "permissions": oct(stat_result.st_mode & 0o777),
                    "owner": self._resolve_owner(stat_result.st_uid),
                    "group": self._resolve_group(stat_result.st_gid),
                    "modified_time": self._iso_mtime(entry_path),
                    **(
                        {"symlink_target": os.readlink(entry_path)}
                        if entry.is_symlink()
                        else {}
                    ),
                }
            )
            if remaining_depth > 1 and entry.is_dir(follow_symlinks=False):
                walk(entry_path, remaining_depth - 1)
Recursive directory walk can raise PermissionError on inaccessible entries.
The walk function iterates over os.scandir() and calls entry.stat() without handling PermissionError. If a directory contains entries the user cannot stat (e.g., protected system files), this will raise an unhandled exception and fail the entire listing.
🛡️ Proposed fix to skip inaccessible entries
def walk(current_path: str, remaining_depth: int) -> None:
- with os.scandir(current_path) as iterator:
- for entry in iterator:
- entry_path = entry.path
- stat_result = entry.stat(follow_symlinks=False)
+ try:
+ scanner = os.scandir(current_path)
+ except PermissionError:
+ return
+ with scanner as iterator:
+ for entry in iterator:
+ entry_path = entry.path
+ try:
+ stat_result = entry.stat(follow_symlinks=False)
+ except (PermissionError, OSError):
+ continue # Skip inaccessible entries
entries.append(
{

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def walk(current_path: str, remaining_depth: int) -> None:
    try:
        scanner = os.scandir(current_path)
    except PermissionError:
        return
    with scanner as iterator:
        for entry in iterator:
            entry_path = entry.path
            try:
                stat_result = entry.stat(follow_symlinks=False)
            except (PermissionError, OSError):
                continue  # Skip inaccessible entries
            entries.append(
                {
                    "name": entry.name,
                    "path": entry_path,
                    "type": self._entry_type(entry),
                    "size": stat_result.st_size,
                    "permissions": oct(stat_result.st_mode & 0o777),
                    "owner": self._resolve_owner(stat_result.st_uid),
                    "group": self._resolve_group(stat_result.st_gid),
                    "modified_time": self._iso_mtime(entry_path),
                    **(
                        {"symlink_target": os.readlink(entry_path)}
                        if entry.is_symlink()
                        else {}
                    ),
                }
            )
            if remaining_depth > 1 and entry.is_dir(follow_symlinks=False):
                walk(entry_path, remaining_depth - 1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@executor/modes/local/handlers.py` around lines 913 - 936, The recursive walk
function (walk) currently calls entry.stat(), entry.is_symlink(), and
os.readlink(entry_path) without handling PermissionError/OSError; update walk to
catch PermissionError/OSError around the per-entry operations (stat and
readlink) so inaccessible entries are skipped instead of crashing the whole
listing: wrap the entry.stat() and any os.readlink() calls in a try/except that
skips the entry on PermissionError/OSError (optionally continue to next iterator
item), and ensure the directory recursion (entry.is_dir(follow_symlinks=False) /
walk(...)) only runs when stat succeeded; refer to symbols walk, entry.stat,
entry.is_symlink, entry.is_dir, and os.readlink to locate and modify the code.
# SPDX-FileCopyrightText: 2025 Weibo, Inc.
#
# SPDX-License-Identifier: Apache-2.0
Copyright header inconsistency.
Same as the other test file - the copyright header shows "Weibo, Inc." but should be "WeCode, Inc." for consistency.
🔧 Proposed fix
-# SPDX-FileCopyrightText: 2025 Weibo, Inc.
+# SPDX-FileCopyrightText: 2025 WeCode, Inc.

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# SPDX-FileCopyrightText: 2025 WeCode, Inc.
#
# SPDX-License-Identifier: Apache-2.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@executor/tests/test_local_sandbox_handler.py` around lines 1 - 3, Update the
copyright header at the top of the test_local_sandbox_handler.py file to match
the project's standard by replacing "Weibo, Inc." with "WeCode, Inc." so it is
consistent with the other test files; ensure both SPDX-FileCopyrightText and
SPDX-License-Identifier lines remain intact and only the organization string is
changed.
🧹 Nitpick comments (3)
chat_shell/chat_shell/tools/sandbox/_base.py (3)
19-19: Unused import: `base64`

The `base64` module is imported but does not appear to be used anywhere in this file. Consider removing it if it's not needed.

Proposed fix

 import asyncio
-import base64
 import logging

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/tools/sandbox/_base.py` at line 19, Remove the unused top-level import "base64" from chat_shell.tools.sandbox._base (the import statement 'import base64') since it's not referenced anywhere in this module; delete the import line and run the linter/tests to confirm nothing else relies on it, and if the module is needed in future usage, re-add it at the point of actual use.
494-508: Parameter `format` shadows Python builtin.

The `format` parameter in read_file_via_device (Line 497) and write_file_via_device (Line 530) shadows the Python builtin format() function. Consider renaming to file_format or content_format for clarity.

Proposed fix for `read_file_via_device`

 async def read_file_via_device(
     self,
     file_path: str,
-    format: str = "text",
+    file_format: str = "text",
 ) -> dict[str, Any]:
     """Read a file through the backend's device sandbox bridge."""
     return await self._post_device_backend(
         endpoint=DEVICE_READ_FILE_ENDPOINT,
         payload={
             "file_path": file_path,
-            "format": format,
+            "format": file_format,
         },
         timeout_seconds=60,
         unwrap_data=True,
     )

Proposed fix for `write_file_via_device`

 async def write_file_via_device(
     self,
     file_path: str,
     content: str,
-    format: str = "text",
+    file_format: str = "text",
     create_dirs: bool = True,
 ) -> dict[str, Any]:
     """Write a file through the backend's device sandbox bridge."""
     return await self._post_device_backend(
         endpoint=DEVICE_WRITE_FILE_ENDPOINT,
         payload={
             "file_path": file_path,
             "content": content,
-            "format": format,
+            "format": file_format,
             "create_dirs": create_dirs,
         },
         timeout_seconds=60,
         unwrap_data=True,
     )

Also applies to: 526-544
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@chat_shell/chat_shell/tools/sandbox/_base.py` around lines 494 - 508, The parameter name `format` in read_file_via_device and write_file_via_device shadows the Python builtin; rename it to file_format (or content_format) in both function signatures, update the docstrings and type hints accordingly, replace all internal uses of `format` with `file_format` (including the payload key value passed to _post_device_backend) and update any local callers/tests that call these methods to use the new parameter name; ensure the JSON payload still sends the expected "format" key name if the backend requires that (i.e., payload={"file_path": file_path, "format": file_format}) so only the Python parameter is renamed while the external contract remains unchanged.
438-473: Consider narrowing exception handling and adding tracing.
The broad Exception catch (Line 459) could mask unexpected errors. Consider catching specific exceptions like httpx.HTTPError or httpx.TimeoutException.

Per coding guidelines, async functions should use the `@trace_async` decorator for OpenTelemetry tracing.

Proposed refinement

+@trace_async("ensure_device_binding_loaded", "sandbox_manager")
 async def ensure_device_binding_loaded(self) -> None:
     """Load sticky device binding from backend once per task if present."""
     if self._binding_loaded or self.is_device_backend_bound():
         return
     backend_url = _get_backend_api_url()
     headers = {}
     if self.auth_token:
         headers["Authorization"] = f"Bearer {self.auth_token}"
     binding_url = f"{backend_url}{DEVICE_BINDING_ENDPOINT_TEMPLATE.format(task_id=self.task_id)}"
     try:
         async with httpx.AsyncClient(timeout=30) as client:
             response = await client.get(
                 binding_url,
                 params={"user_id": self.user_id},
                 headers=headers,
             )
             response.raise_for_status()
             payload = response.json()
-    except Exception as exc:
+    except (httpx.HTTPError, httpx.TimeoutException) as exc:
         logger.warning(
             "[SandboxManager] Failed to load task sandbox binding: task_id=%s, error=%s",
             self.task_id,
             exc,
         )
         self._binding_loaded = True
         return

As per coding guidelines: "Use `@trace_async(span_name, tracer_name, extract_attributes)` decorator to trace entire async functions in OpenTelemetry".

🤖 Prompt for AI Agents
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@chat_shell/chat_shell/tools/sandbox/_base.py`:
- Line 19: Remove the unused top-level import "base64" from
chat_shell.tools.sandbox._base (the import statement 'import base64') since it's
not referenced anywhere in this module; delete the import line and run the
linter/tests to confirm nothing else relies on it, and if the module is needed
in future usage, re-add it at the point of actual use.
- Around line 494-508: The parameter name `format` in read_file_via_device and
write_file_via_device shadows the Python builtin; rename it to file_format (or
content_format) in both function signatures, update the docstrings and type
hints accordingly, replace all internal uses of `format` with `file_format`
(including the payload key value passed to _post_device_backend) and update any
local callers/tests that call these methods to use the new parameter name;
ensure the JSON payload still sends the expected "format" key name if the
backend requires that (i.e., payload={"file_path": file_path, "format":
file_format}) so only the Python parameter is renamed while the external
contract remains unchanged.
- Around line 438-473: The try/except in ensure_device_binding_loaded is too
broad and the async function lacks OpenTelemetry tracing; narrow the exception
handling to httpx exceptions (e.g., httpx.HTTPError, httpx.TimeoutException)
when calling client.get/response.raise_for_status and handle other errors
separately, and add the `@trace_async` decorator (use trace_async with an
appropriate span name and tracer name and supply extract_attributes to include
task_id/user_id) above ensure_device_binding_loaded so the whole coroutine is
traced; keep the existing behavior of setting self._binding_loaded True on
handled backend-related network failures and still call
bind_device_backend(device_id) when payload indicates a device backend.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a916e41b-0d63-45fd-a06a-ad967966ff4c
📒 Files selected for processing (4)
- chat_shell/.env.example
- chat_shell/chat_shell/core/config.py
- chat_shell/chat_shell/tools/sandbox/_base.py
- chat_shell/tests/test_sandbox_manager.py
🚧 Files skipped from review as they are similar to previous changes (3)
- chat_shell/chat_shell/core/config.py
- chat_shell/tests/test_sandbox_manager.py
- chat_shell/.env.example
Summary
Testing
- uv run --project /Users/crystal/dev/git/Wegent/backend/pyproject.toml pytest /Users/crystal/dev/git/Wegent/backend/tests/services/test_device_sandbox_service.py
- uv run --project /Users/crystal/dev/git/Wegent/chat_shell/pyproject.toml pytest /Users/crystal/dev/git/Wegent/chat_shell/tests/test_skill_registry.py
- uv run --project /Users/crystal/dev/git/Wegent/executor/pyproject.toml pytest /Users/crystal/dev/git/Wegent/executor/tests/test_local_sandbox_handler.py (did not complete cleanly in this environment; existing runner behavior appears to hang during collection/execution)

Summary by CodeRabbit
New Features
Improvements
Tests