feat: verify if tokens exposed in sandbox#1857
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds Phase 2 credential-isolation checks that capture the sandbox environment, process cmdlines, and perform recursive filesystem searches for real Telegram/Discord tokens; introduces Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CI as Tester (CI)
participant SSH as SSH Proxy
participant Guest as Sandbox Guest
participant FS as Sandbox Filesystem
CI->>SSH: invoke sandbox_exec_stdin(cmd, token-pattern via STDIN)
SSH->>Guest: execute command (capture `env`, read /proc/*/cmdline, grep filesystem paths)
Guest->>FS: read files under /sandbox /home /etc /tmp /var
Guest-->>SSH: return env dump, process cmdlines, grep results
SSH-->>CI: stdout captured into sandbox_env_all, sandbox_ps, sandbox_fs_*
CI->>CI: evaluate outputs -> pass / fail / skip for real tokens & placeholders
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
8e8ca0e to
32a0d83
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/test-messaging-providers.sh`:
- Around line 286-287: The snapshot captures using sandbox_exec (variables
sandbox_env_all and sandbox_ps) may return empty strings on SSH/remote failure,
causing false PASS; modify the test logic so after calling sandbox_exec for
env/ps (and the other captures at 289-300, 323-334) you validate the returned
snapshot is non-empty and parseable before proceeding: if sandbox_env_all or
sandbox_ps (or other snapshot variables) are empty or fail basic parsing, mark
the check as failed/skipped (exit non-zero or emit a failing test result) and do
not run the token-leak assertions (M5a/M5b/M5e/M5f) until a confirmed successful
capture exists; use the sandbox_exec wrapper name and the snapshot variable
names to locate and gate the subsequent token checks.
- Around line 303-305: The grep call embeds TELEGRAM_TOKEN into the remote
command argv (variable sandbox_fs_tg via sandbox_exec), exposing the secret in
the sandbox process list; change the approach to send the token via stdin
instead of the command line (e.g., use grep -Ff - or an equivalent remote helper
that reads patterns from stdin) and pipe TELEGRAM_TOKEN into sandbox_exec so the
token never appears in the remote argv; apply the same change to the other
occurrence around lines 337-338 to ensure both searches use stdin-based pattern
input rather than embedding $TELEGRAM_TOKEN in the SSH/command string.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: d89a3554-9583-4c30-b7a7-1194c0fcccb3
📒 Files selected for processing (1)
test/e2e/test-messaging-providers.sh
32a0d83 to
00a5064
Compare
… vars and process cmdlines
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
test/e2e/test-messaging-providers.sh (2)
337-337: Add--max-count=1togrepfor early exit.Once the first match is found, the check fails regardless of additional matches. Adding
-m 1avoids scanning the entire filesystem unnecessarily.⚡ Proposed optimization
-sandbox_fs_tg=$(printf '%s' "$TELEGRAM_TOKEN" | sandbox_exec_stdin "grep -rFl -f /dev/stdin /sandbox /home /etc /tmp /var 2>/dev/null || true") +sandbox_fs_tg=$(printf '%s' "$TELEGRAM_TOKEN" | sandbox_exec_stdin "grep -rFl -m 1 -f /dev/stdin /sandbox /home /etc /tmp /var 2>/dev/null || true")Apply the same change to Line 374 for the Discord token search.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-messaging-providers.sh` at line 337, The grep calls that scan the filesystem for tokens (e.g., the command assigning sandbox_fs_tg) should exit after the first match to avoid unnecessary full scans; update the grep invocation inside sandbox_exec_stdin to include --max-count=1 (or -m 1) so it stops on the first hit, and apply the same change to the corresponding Discord token search (the variable/assignment for the Discord token) to mirror the optimization.
307-309: Consider using consistent capture methods.Line 307 uses
sandbox_execwhile Lines 308-309 useopenshell sandbox execdirectly. Both work, butsandbox_execprovides consistent timeout handling and error suppression. Using the same helper for both captures would improve maintainability.♻️ Suggested change for consistency
sandbox_env_all=$(sandbox_exec "env 2>/dev/null" 2>/dev/null || true) -sandbox_ps=$(openshell sandbox exec -n "$SANDBOX_NAME" -- \ - sh -c 'cat /proc/[0-9]*/cmdline 2>/dev/null | tr "\0" "\n"' 2>/dev/null || true) +sandbox_ps=$(sandbox_exec 'cat /proc/[0-9]*/cmdline 2>/dev/null | tr "\0" "\n"' 2>/dev/null || true)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-messaging-providers.sh` around lines 307 - 309, The two captures use different helpers: sandbox_env_all uses sandbox_exec but sandbox_ps calls openshell directly; change sandbox_ps to use the same sandbox_exec helper so timeout handling and error suppression are consistent. Replace the direct openshell invocation (the command generating sandbox_ps) with a sandbox_exec call running the same sh -c 'cat /proc/[0-9]*/cmdline ...' string, preserve the current 2>/dev/null || true behavior and assign the result back to sandbox_ps so both sandbox_env_all and sandbox_ps use the sandbox_exec helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/test-messaging-providers.sh`:
- Around line 102-121: The sandbox_exec_stdin helper uses remote grep with -f
/dev/stdin which can silently fail when /dev/stdin does not exist on the remote,
producing false negatives; update sandbox_exec_stdin to send the pattern via a
temporary file on the remote (or use a here-doc/process substitution that is
supported by the remote shell) instead of relying on /dev/stdin, ensuring the
temp file is created on the remote before running grep -f, cleaned up afterward,
and that callers (the M5c/M5g checks) continue to pass the token pattern into
sandbox_exec_stdin unchanged; reference the sandbox_exec_stdin function to
locate where to implement remote temp-file creation, grep invocation change, and
cleanup.
---
Nitpick comments:
In `@test/e2e/test-messaging-providers.sh`:
- Line 337: The grep calls that scan the filesystem for tokens (e.g., the
command assigning sandbox_fs_tg) should exit after the first match to avoid
unnecessary full scans; update the grep invocation inside sandbox_exec_stdin to
include --max-count=1 (or -m 1) so it stops on the first hit, and apply the same
change to the corresponding Discord token search (the variable/assignment for
the Discord token) to mirror the optimization.
- Around line 307-309: The two captures use different helpers: sandbox_env_all
uses sandbox_exec but sandbox_ps calls openshell directly; change sandbox_ps to
use the same sandbox_exec helper so timeout handling and error suppression are
consistent. Replace the direct openshell invocation (the command generating
sandbox_ps) with a sandbox_exec call running the same sh -c 'cat
/proc/[0-9]*/cmdline ...' string, preserve the current 2>/dev/null || true
behavior and assign the result back to sandbox_ps so both sandbox_env_all and
sandbox_ps use the sandbox_exec helper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: e278c7a8-aef5-410e-bee3-11f1c8fea4f2
📒 Files selected for processing (1)
test/e2e/test-messaging-providers.sh
…ss with false negatives
ericksoa
left a comment
There was a problem hiding this comment.
Looks good — approving. The core logic is correct, stdin-based pattern passing avoids leaking tokens in the remote argv, skip-on-empty prevents false greens, and the CodeRabbit issues have all been addressed.
A few minor items to clean up in a follow-up PR:
-
Inconsistent capture helper —
sandbox_env_allusessandbox_exec(SSH with timeout) butsandbox_pscallsopenshell sandbox execdirectly. Usesandbox_execfor both so timeout/error handling is consistent. -
Repetitive check blocks — The 8 checks follow 3 repeating patterns duplicated for Telegram/Discord. A small helper like
assert_token_absent "$surface_data" "$TOKEN" "M5a" "description"would cut ~60 lines and make adding future providers easier. -
Guard against empty token variable — If
$TELEGRAM_TOKENor$DISCORD_TOKENwere somehow unset,grep -f -with empty stdin matches nothing and M5c/M5g pass trivially. A[ -z "$TOKEN" ] && skipguard before the filesystem greps would add a safety net for a security test. -
Temp file cleanup on signal —
sandbox_exec_stdinleaks the ssh-config temp file if the script is killed mid-SSH. A trap would be more robust (low priority since this runs on ephemeral infra).
None of these block merge. Nice work on the credential isolation coverage. 👍
Summary
Add deep credential exposure checks (M5a–M5h) to the messaging provider E2E test.
These verify that real bot tokens never appear on any observable surface inside the
sandbox — full environment dump, process cmdlines, and filesystem — beyond the existing
single-env-var check (M3/M4).
Related Issue
Closes #1852
Changes
test/e2e/test-messaging-providers.shwith 8 new assertions (M5a–M5h):envdump must not contain the real Telegram/Discord token.process cmdlinesmust not contain the real Telegram/Discord token./sandbox,/home,/etc,/tmp,/var) must not find the real token.envandprocess cmdlinesonce via SSH and reuse across all checks to avoid redundant round-trips.Type of Change
Testing
npx prek run --all-filespasses (or equivalentlymake check).npm testpasses.make docsbuilds without warnings. (for doc-only changes)Checklist
General
Code Changes
npx prek run --all-filesauto-fixes formatting (ormake formatfor targeted runs).Signed-off-by: Hung Le [email protected]
Summary by CodeRabbit