fix(pty): fix child processes not being killed #6553

Electroid · 2025-11-12T20:48:22Z

Closes #5229
Closes #4337

Summary

Users have reported that commands sometimes hang. The root cause is that the timeout handler kills only the direct child process, so grandchild processes stay alive.

Note: While this appears similar to the recently merged #5258, those changes did not address the same issue in the unified_exec codepath.

Reproduction

If you run the command below, it will hang. Instead, it should yield back to Codex.

$ echo "Run this bash command: sleep 30 & echo \"Background PID: \$!\"; exit 0" | codex --enable unified_exec

Solution

While portable_pty already calls setsid() internally (creating a process group where pgid == pid), it only kills the direct child. This PR adds process group killing:

Track the child PID when spawning
On timeout/drop, call killpg(pgid, SIGKILL) to kill the entire process group
Then call killer.kill() as backup
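
The steps above can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in this PR: `kill_group_sketch` and `spawn_and_kill_group` are hypothetical names, `process_group(0)` stands in for the `setsid()` call that portable_pty makes (both leave the child with pgid == pid), and `killpg` is declared by hand where a real implementation would more likely use the `libc` crate. The `SIGKILL` value of 9 is assumed for Linux and macOS.

```rust
use std::io;
use std::os::unix::process::{CommandExt, ExitStatusExt};
use std::process::Command;

// Declared directly so the sketch needs no external crate;
// the `libc` crate exposes the same function.
unsafe extern "C" {
    fn killpg(pgrp: i32, sig: i32) -> i32;
}

// Assumption: SIGKILL is 9 on Linux and macOS.
const SIGKILL: i32 = 9;

/// Hypothetical helper: send SIGKILL to every process in the group `pgid`.
fn kill_group_sketch(pgid: u32) -> io::Result<()> {
    let rc = unsafe { killpg(pgid as i32, SIGKILL) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

/// Hypothetical demo: run `script` under `sh` in its own process group,
/// kill the whole group, and return the signal that ended the shell.
fn spawn_and_kill_group(script: &str) -> io::Result<Option<i32>> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(script)
        .process_group(0) // child becomes group leader, so pgid == pid
        .spawn()?;

    // Track the child PID; because the child leads its group, it doubles as the pgid.
    let pgid = child.id();

    // Kill the group, which also reaches background grandchildren.
    kill_group_sketch(pgid)?;

    // Reap the direct child and report how it terminated.
    Ok(child.wait()?.signal())
}
```

Calling `spawn_and_kill_group("sleep 30 & sleep 30")` returns as soon as the group has been signalled instead of waiting for either `sleep`. Keeping a direct kill as a fallback, as the last step above does, covers the case where the group signal fails.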

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

chatgpt-codex-connector · 2025-11-12T20:52:55Z

codex-rs/utils/pty/src/lib.rs

 impl Drop for ExecCommandSession {
    fn drop(&mut self) {
+        #[cfg(unix)]
+        if let Ok(mut pid_guard) = self.pid.lock() {
+            if let Some(pid) = pid_guard.take() {
+                if let Err(e) = kill_child_process_group(pid) {
+                    trace!("Failed to kill process group for pid {}: {}", pid, e);
+                }


Guard against killing reused PIDs after child exit

The session destructor kills a process group whenever pid is still set (kill_child_process_group(pid)), but the PID is only cleared in the wait thread after child.wait() returns. Because wait() reaps the process and frees its PID before the mutex is updated (lines ~233‑247), there is a brief window where the child has fully exited and its PID may already be reused by an unrelated process while pid_guard still contains the stale value. If the session is dropped during that window, getpgid/killpg will target the new process group and may SIGKILL an unrelated process tree. Consider checking exit_status before killing or clearing pid before calling wait() so that drop never acts on a PID that might already be recycled.
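
One way to close the window described above is to make reaping and killing mutually exclusive, so the kill path only acts on a child it has confirmed is still unreaped. The sketch below is an illustration under assumptions, not the fix adopted in this PR: `GuardedChild` and `kill_if_running` are hypothetical names, it uses `std::process::Child` rather than portable_pty's child type, and `Child::kill` stands in for the process group kill. It relies on the fact that a PID cannot be reused until the process has been reaped.

```rust
use std::io;
use std::process::Child;
use std::sync::Mutex;

/// Hypothetical holder: the child handle lives behind the mutex, so
/// reaping and killing can never interleave.
struct GuardedChild {
    child: Mutex<Option<Child>>,
}

impl GuardedChild {
    fn new(child: Child) -> Self {
        GuardedChild {
            child: Mutex::new(Some(child)),
        }
    }

    /// Returns Ok(true) if a kill was sent, Ok(false) if the child had
    /// already exited or was already cleaned up by an earlier call.
    fn kill_if_running(&self) -> io::Result<bool> {
        // Recover the guard even if another thread panicked while holding it.
        let mut guard = self.child.lock().unwrap_or_else(|e| e.into_inner());
        let Some(child) = guard.as_mut() else {
            return Ok(false);
        };
        match child.try_wait()? {
            // Already reaped: its PID may be recycled, so do not signal it.
            Some(_status) => {
                *guard = None;
                Ok(false)
            }
            // Not yet reaped: the PID still belongs to this child.
            None => {
                child.kill()?;
                child.wait()?;
                *guard = None;
                Ok(true)
            }
        }
    }
}
```

Because the exit check and the kill happen under the same lock, a drop handler calling `kill_if_running` cannot observe a PID that a waiting thread has already released.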

Taking a look.

etraut-openai · 2025-11-12T23:17:35Z

Thanks for the contribution. Before I ask someone on the codex team to review your PR, could you please fix the lint issues? You can ignore the failed test. That's a flaky test that someone on the team is already working on fixing.

fix(pty): fix child processes not being killed

bce4927

chatgpt-codex-connector bot reviewed Nov 12, 2025

Electroid marked this pull request as draft November 12, 2025 23:08

etraut-openai added the needs-response Additional information is requested label Nov 12, 2025

Merge branch 'main' into fix/pty-kill-process-group

46c0096
