[LEGACY] [CI]: TestRestartWithSignal does not seem to (always?) work correctly (EL8 only?) #4068

Open
apostasie opened this issue Apr 1, 2025 · 3 comments
Labels: area/ci (e.g., CI failure), bug (Something isn't working)

Comments

@apostasie
Contributor

Description

Seen on #4056

See attached trace (note that PR #4056 has the new timeout logic).

Signal has been sent and received, but the container has not restarted (and was instead terminated by timeout).

@AkihiroSuda the good news here is that the new timeout logic is working as expected: previously, this test would have hung on EL8.

Now we just have to figure out what is happening here.

Steps to reproduce the issue

Describe the results you received and expected

=== FAIL: cmd/nerdctl/container TestRestartWithSignal (21.97s)
    container_restart_linux_test.go:163: 
    container_restart_linux_test.go:163: ======================== Pre-test cleanup ========================
    container_restart_linux_test.go:163: 
    container_restart_linux_test.go:163: ======================== Test setup ========================
    container_restart_linux_test.go:163: 
    container_restart_linux_test.go:163: ======================== Test Run ========================
    command.go:435: [2025-04-01T16:46:36Z] [command=/usr/local/bin/nerdctl --namespace=nerdctl-test run --name testrestartwithsignal-27bfb0e4 ghcr.io/stargz-containers/alpine:3.13-org sh -c #!/bin/sh
        	set -eu
        
        	sig_msg () {
        		printf "received\n"
        		[ "false" != true ] || exit 0
        	}
        
        	trap sig_msg 10
        	printf "trap ready\n"
        	while true; do
        		printf "waiting...\n"
        		sleep 0.5
        	done
        ] command cancelled
    command.go:115: expected: 137  - to be equal to: -1
    command.go:115: Expected exit code: 137
        
    command.go:115: 
        =================================
        | Command:	/usr/local/bin/nerdctl run --name testrestartwithsignal-27bfb0e4 ghcr.io/stargz-containers/alpine:3.13-org sh -c #!/bin/sh
        	set -eu
        
        	sig_msg () {
        		printf "received\n"
        		[ "false" != true ] || exit 0
        	}
        
        	trap sig_msg 10
        	printf "trap ready\n"
        	while true; do
        		printf "waiting...\n"
        		sleep 0.5
        	done
        
        | Working Dir:	/tmp/TestRestartWithSignal2256824605/001
        | Timeout:	20s
        =================================
        	SHELL=/bin/bash
        	LOGNAME=rootless
        	XDG_SESSION_TYPE=tty
        	HOME=/home/rootless
        	LANG=C.UTF-8
        	SSH_CONNECTION=::1 41538 ::1 22
        	XDG_SESSION_CLASS=user
        	IPFS_PATH=/home/rootless/.local/share/ipfs
        	USER=rootless
        	SHLVL=1
        	XDG_SESSION_ID=4
        	XDG_RUNTIME_DIR=/run/user/1001
        	SSH_CLIENT=::1 41538 22
        	DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1001/bus
        	OLDPWD=/home/rootless
        	_=/usr/local/bin/gotestsum
        	PATH=/usr/local/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
        	***
        	DOCKER_CONFIG=/tmp/TestRestartWithSignal2256824605/001
        	NERDCTL_TOML=/tmp/TestRestartWithSignal2256824605/001/nerdctl.toml
        =================================
        | Stderr:
        =================================
        
        =================================
        | Stdout:
        =================================
        trap ready
        waiting...
        waiting...
        waiting...
        received
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        waiting...
        
        =================================
        | Exit Code: -1
        | Signaled: killed
        | Err: command timed out
        =================================
    case.go:180: 
    case.go:181: ======================== Post-test cleanup ========================

What version of nerdctl are you using?

main

Are you using a variant of nerdctl? (e.g., Rancher Desktop)

None

Host information

I WANT A GREEN CI AND WILL NOT STOP BEING ENRAGED UNTIL WE GET IT 😡🤣

@apostasie apostasie added the kind/unconfirmed-bug-claim Unconfirmed bug claim label Apr 1, 2025
@apostasie
Contributor Author

Maybe EL8 is just really really really slow and we need to extend the timeout from 20 seconds to 30?

@AkihiroSuda
Member

> Maybe EL8 is just really really really slow and we need to extend the timeout from 20 seconds to 30?

Maybe yes, as the EL8 job runs under nested virtualization.

@AkihiroSuda AkihiroSuda added bug Something isn't working area/ci e.g., CI failure and removed kind/unconfirmed-bug-claim Unconfirmed bug claim labels Apr 18, 2025
@apostasie
Contributor Author

Ok, let's test the hypothesis.

We can merge it - it is rather innocuous. Worst-case scenario, it will still fail on EL8, just slower...

#4130

@apostasie apostasie changed the title [CI]: TestRestartWithSignal does not seem to (always?) work correctly (EL8 only?) [LEGACY] [CI]: TestRestartWithSignal does not seem to (always?) work correctly (EL8 only?) Apr 26, 2025