-
-
Notifications
You must be signed in to change notification settings - Fork 932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bump Vampire/setup-wsl from 4.1.1 to 5.0.0 #2008
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Cygwin failure here is due to #2004 and the same as occurs on main; it will go away with #2009. Because it is unrelated to the changes in this PR and affects only a workflow where the action this PR upgrades is not used, I think it need not block this.
The other failure is of the action being updated, and it looks to be due to Vampire/setup-wsl#72. This problem is intermittent and not due to the changes in this PR, but it is related to the changes in this PR, both in the general sense that it is a failure in the action being upgraded, and in the specific sense that the upgrade includes changes that are related to the error. But the changes do not seem to exacerbate the error; instead, they mitigate it, but the mitigation unfortunately does not apply to the way we are currently using setup-wsl
.
I recommend rerunning the workflow on this PR just in case I am missing something, and merging the PR if it passes. I am probably not missing anything, because the failure would have happened on all the Windows test
jobs instead of just one if it were due to a non-intermittent cause, and because I made a branch from this in my fork and the tests passed, including when rerun.
We can also modify the way we use WSL to prevent the error from ever occurring, which I plan to open a PR for, but I don't think that needs to be included here, nor to block this.
Relevant log output
From the run log here:
C:\Windows\system32\wsl.exe --set-default-version 2
For information on key differences with WSL 2 please visit https://aka.ms/wsl2
The operation completed successfully.
]C:\Windows\system32\wsl.exe --update
Forbidden (403).
::debug::Error: The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295%0A at ExecState._setResult (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:592:1)%0A at ExecState.CheckComplete (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:575:1)%0A at ChildProcess.<anonymous> (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:469:1)%0A at ChildProcess.emit (node:events:5[19](https://github.com/gitpython-developers/GitPython/actions/runs/13631523273/job/38100167395?pr=2008#step:4:21):28)%0A at maybeClose (node:internal/child_process:1105:16)%0A at Process.ChildProcess._handle.onexit (node:internal/child_process:305:5)%0A
Failure happened, retrying (The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295)
A red herring: encodings
The above looks different from what you see when you view the log. If you're not worried about that, you can skip this section. Otherwise, this describes the cause and why it is not related to the failure seen here.
That looks different from what appears in the linked run output, because the linked run output shows text from the output of wsl.exe
with spaces after each letter, e.g., F o r b i d d e n ( 4 0 3 ) .
This is not related to the error being reported, nor to the underlying cause of it, but is instead due to the messages being output by wsl.exe
in the UTF-16LE encoding. The resulting null bytes are displayed in different ways based on how the log is examined. Copying them, as here, makes them seem to vanish, while looking at them in the non-raw log shows them as spaces. (This is not new, but it can be misleading if not addressed, so I figured I'd cover it here.)
To verify that the spaces are just null bytes, as occur in those positions when UTF-16LE is interpreted as UTF-8, here's the same thing copied from the raw log:
2025-03-03T13:02:53.1042669Z [command]C:\Windows\system32\wsl.exe --set-default-version 2
2025-03-03T13:02:53.1239543Z F�o�r� �i�n�f�o�r�m�a�t�i�o�n� �o�n� �k�e�y� �d�i�f�f�e�r�e�n�c�e�s� �w�i�t�h� �W�S�L� �2� �p�l�e�a�s�e� �v�i�s�i�t� �h�t�t�p�s�:�/�/�a�k�a�.�m�s�/�w�s�l�2�
2025-03-03T13:02:53.1240472Z �
2025-03-03T13:02:53.1240758Z �
2025-03-03T13:02:53.1241229Z �T�h�e� �o�p�e�r�a�t�i�o�n� �c�o�m�p�l�e�t�e�d� �s�u�c�c�e�s�s�f�u�l�l�y�.�
2025-03-03T13:02:53.1241806Z �
2025-03-03T13:02:53.1242099Z �
2025-03-03T13:02:53.2195310Z �[command]C:\Windows\system32\wsl.exe --update
2025-03-03T13:02:53.3985816Z F�o�r�b�i�d�d�e�n� �(�4�0�3�)�.�
2025-03-03T13:02:53.3986360Z �
2025-03-03T13:02:53.3986573Z �
2025-03-03T13:02:53.6038224Z �::debug::Error: The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295%0A at ExecState._setResult (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:592:1)%0A at ExecState.CheckComplete (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:575:1)%0A at ChildProcess.<anonymous> (file:///D:/a/_actions/Vampire/setup-wsl/node_modules/@actions/exec/lib/toolrunner.js:469:1)%0A at ChildProcess.emit (node:events:519:28)%0A at maybeClose (node:internal/child_process:1105:16)%0A at Process.ChildProcess._handle.onexit (node:internal/child_process:305:5)%0A
2025-03-03T13:02:53.6044228Z Failure happened, retrying (The process 'C:\Windows\system32\wsl.exe' failed with exit code 4294967295)
An analogous encoding issue arises when dealing with error messages from WSL (rather than from a program running in a WSL distribution) due to running the WSL-associated bash.exe
wrapper, which our test suite handles this way:
Lines 154 to 159 in fe7533e
# When bash.exe is the WSL wrapper but the output is from WSL itself rather than | |
# code running in a distribution, the output is often in UTF-16LE, which Windows | |
# uses internally. The UTF-16LE representation of a Windows-style line ending is | |
# rarely seen otherwise, so use it to detect this situation. | |
if b"\r\0\n\0" in stdout: | |
return stdout.decode("utf-16le") |
That works well enough because it is only used in the test suite and only for a very narrow purpose. Nonetheless, there may be a way to incorporate encoding detection into the setup-wsl
action to produce more readable messages. One complicating factor is that the total output contains UTF-16LE portions as well as UTF-8 portions. I hope eventually to investigate and, if I can, propose a patch to the setup-wsl
action to help with this. But there are a number of other things I'm working on that take precedence, so I don't know when I would get around to doing that. In any case, the GitPython test suite performs detection because it needs to examine the text of the output. That doesn't seem to be needed in setup-wsl
, so this doesn't cause or contribute to any other problems.
The actual problem
As described in Vampire/setup-wsl#72 (comment), the windows-2022
image requires a WSL update to run WSL2. Downloading this update from GitHub-hosted runners sometimes fails with HTTP 403, and automated retries within the same job run are not usually enough to make it go away, even though rerunning the job or workflow manually is. This seems to happen due to behavior of the remote server, rather than something that can be fixed in the action.
Although Vampire/setup-wsl#72 is closed as not planned, it is mitigated in Vampire/setup-wsl@f4cd646 by refraining from attempting the update check on the windows-2025
(actions/runner-images#11228) where it is no longer needed (Vampire/setup-wsl#72 (comment)). That is actually the change in the version this Dependabot PR is upgrading to, v5.0.0:
Before this release a WSL update command was always issued if WSLv2 was going to be used. The windows-2025 image now has the necessary update already installed, so currently only the GitHub-hosted windows-2022 runners need the call to get WSLv2 working. But as there are still updates available for the windows-2025 runners it would potentially waste time unnecessarily.
So now the WSL update command is only done for WSLv2 on GItHub-hosted windows-2022 runners, where it is known that the update is necessary currently to make it work at all.
While windows-2025
is available, windows-latest
still aliases windows-2022
, which is why it was possible for this failure to occur in this Dependabot PR.
Fixing the problem here
This Dependabot PR mitigates the problem but does not directly help here. So we should make some further change. One possible change is to switch to windows-2025
, which we could switch to explicitly or wait for windows-latest
to point to. In that case, the update in this PR is a necessary part of fixing it.
This problem can also be fixed by switching to WSL1. When we started using setup-wsl
in #1745, it defaulted to WSL1 on the runner we were using at that time. At some point, it switched to defaulting to WSL2. (This is related to the availability of nested virtualization, and I think also to other factors. See actions/runner-images#5920 and actions/runner-images#10563. I don't remember what Windows image windows-latest
was on as of #1745, but I believe windows-2019
on GitHub-hosted runners does not support WSL2, or at least not with the setup-wsl
action. Recent versions of setup-wsl
default to WSL2 on windows-2022
and windows-2025
. I am not sure if windows-2022
on GitHub-hosted runners has always supported WSL2, or if this started later due the virtualization infrastructure supporting nested virtualization, or if other factors are in play.)
Usually WSL2 is preferable to WSL1, but I think we may actually be better off using WSL1. The test cases that pass and fail seem to do so irrespective of whether WSL1 or WSL2 is used (as well as not depending on what distribution is used in WSL, so long as it provides bash
). The check in the test suite for whether we are using WSL works both in WSL1 and WSL2. The core of that check is:
Lines 126 to 131 in fe7533e
# Output rather than forwarding the test command's exit status so that if a | |
# failure occurs before we even get to this point, we will detect it. For | |
# information on ways to check for WSL, see https://superuser.com/a/1749811. | |
script = 'test -e /proc/sys/fs/binfmt_misc/WSLInterop; echo "$?"' | |
command = ["bash.exe", "-c", script] | |
process = subprocess.run(command, capture_output=True) |
From this answer to "How can I check if the environment is WSL from a shell script?" on Super User by NotTheDr01ds:
The presence of
/proc/sys/fs/binfmt_misc/WSLInterop
is a pretty good indicator that you are on WSL. This may be the most reliable method, and it is what Ubuntu's Snapd project uses as its detection mechanism. This file exists under both WSL1 and WSL2 by default. Even when Interop is disabled via/etc/wsl.conf
, this file will still be created by WSL at startup.
The answer goes on describe rare and deliberate cases where this would misdetect whether WSL is present. For our purposes, I believe those cases are not important, and that in at least some of them it is actually better that we misdetect, since they seems to arise mainly when someone is deliberately investigating the effect of misdetection.
We only use bash
for the existing partial implementation of running hooks. In the future, it seems likely that this will become more robust and sophisticated, and gain many new tests. However, that would only happen after #1791 (or after the problem it is solving is solved in some other way, but my guess is that the conceptually related work I am doing on gitoxide will eventually result in a level of understanding of the relevant considerations such that I would then feel comfortable making any remaining needed changes to #1791 to allow it to be merged, even if its author is not available to critique them and propose further changes, though I hope that will also be able to happen).
Given that our current use of WSL is just in a few tests that end up running hooks with the WSL bash
or attempting to do so, the extra memory resources associated with using WSL2 instead of WSL1 on CI do not seem justified. (WSL2 distributions actually run as containers rather than virtual machines, but this means that they all run in the same VM, not that no VM is needed.) After installation, WSL2 tends to be significantly faster than WSL1, but I don't think the difference matters much for the way we are currently using it.
I am aware of two possible reasons to prefer WSL2 for us:
-
My assumption about speed may be wrong, because of the operations involved in installing
bash
in the Alpine Linux distribution. This is taken care of bysetup-wsl
but I believe it is achieved by runningapk
in the distribution.This first reason is the more powerful of the two. (I include the following second reason mainly to say why I think it is not compelling.)
-
I don't currently have access to nested virtualization when I run manual tests. Therefore, for some of the weirder cases that involve modifying a system in a way that I would only want to do on a virtual machine where I can restore a checkpoint afterwards to decisively undo them, local testing is only on WSL1. Having CI run WSL2 might theoretically surface something that is not found otherwise.
(I think systems where
bash.exe
is absent inSystem32
exist, including in versions that are still supported with security updates and therefore reasonable to use. But the way I produce this in manual testing is to deletebash.exe
from there. That requires either taking ownership of it or performing the rename or delete operation with a token in which theTrustedInstaller
SID is enabled. These are unusual things to do to a file inSystem32
and, while I believe I always do them in a way that has no extra unanticipated effects, I prefer not to rely on that belief.)However, this is unlikely. At least for the case of
bash.exe
being absent inSystem32
, we do not currently attempt to test that on CI anyway. It would be feasible to do so. The virtual machines are recreated for every job on GitHub-hosted runners, so doing strange things inSystem32
would be self-contained. But this is less important than more common cases and the extra time and usage needed to test it on CI may not be justified.
Since major changes are unlikely prior to further work on #1791, which would eliminate the need for WSL altogether--unless we want to test that it not used anymore even when fully available--I think our current use of WSL is limited enough that it is fine to use either WSL1 or WSL2 but likely preferable to use WSL1.
But for this PR...
Even if we remain implicitly on windows-2022
(via windows-latest
until it upgrades) and even if we switch to WSL1, the changes in this PR should cause no harm (the major version bump is only because it is potentially breaking for self-hosted runners that may start out with fewer WSL updates even on Windows Server 2025).
This avoids an occasional HTTP 403 error updating WSL for WSL2. For details on that issue and possible approaches, see: gitpython-developers#2008 (review)
@dependabot rebase |
Bumps [Vampire/setup-wsl](https://github.com/vampire/setup-wsl) from 4.1.1 to 5.0.0. - [Release notes](https://github.com/vampire/setup-wsl/releases) - [Commits](Vampire/setup-wsl@v4.1.1...v5.0.0) --- updated-dependencies: - dependency-name: Vampire/setup-wsl dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]>
2e1c5b4
to
54e1c1b
Compare
Thanks a lot for the analysis. Frankly, I am not able to grasp it (or even read it) because it feels like the topic is way above my head and am absolutely happy to just do as you say, you are the expert.
This and my paragraph above should be an indicator that you really should have more capabilities in this repository. Once again, I leave it to you to decide if you'd like to follow up on this offer, and what permissions you'd like to have. If you think more permissions can improve your convenience and/or save you time, I really think it should be done. |
This avoids an occasional HTTP 403 error updating WSL for WSL2. For details on that issue and possible approaches, see: gitpython-developers#2008 (review)
I think I did not write #2008 (review) all that well, which probably also contributed. I've written a clearer summary below. This is partly to ensure things are clear enough that I will myself be able to understand them in the future. So you are under no obligation to read it. But I suspect you might find it useful as well.
Yes, this sounds like a good idea. I suggest adding me to the gitpython-developers organization, unless there is some reason you prefer not to do that. In addition, please feel free to confer write access or whatever other capabilities of any kind that you expect would be helpful (I am not sure what is best). The main caveat I can think of is that although, as noted in the readme, GitPython is open to new maintainers, I am not currently able or willing to take on a social role of maintainership. That is, I cannot at this time take on any actual commitments (other than the responsibility to avoid damaging things and to try to fix things if I do inadvertently damage them). Based on #1654 (comment), #1769 (review), and the current context, my guess is that your offer is not contingent on this. But I want to be sure, rather than assuming. (Speculating wildly: I do not expect that I would be able or willing to take on such a full role in GitPython in the future either. What is more likely is that, at some point in the future, I may be able to take on responsibilities in specific areas, such as security advisories, CI upkeep, and helping out with GSoC or Outreachy mentorship if GitPython ever participates in those.) WSL2 uses virtualization, while WSL1 uses a translation layer that does not involve virtualization. Because GitHub-hosted runners are virtual machines, using WSL2 on them requires support for nested virtualization. That and other factors affected the availability of WSL2 on CI. We don't need WSL2, and when we started using Since then, The change in The new version improved |
Great, I have invited you! The permissions are Triage + Write, and there are no obligations or expectations. Just that having these permissions makes your life easier, doing what you do. Also, thanks for the summary, I think I get it now. |
Thanks! I have joined the gitpython-developers org, and I have also checked that the GitHub interface shows abilities conferred by the new permissions such as the ability to rerun CI checks in this repository (both in PRs and on main), to request reviews on a PR (unprivileged users can only request re-reviews from reviewers who have already reviewed), and to create new branches. |
Bumps Vampire/setup-wsl from 4.1.1 to 5.0.0.
Release notes
Sourced from Vampire/setup-wsl's releases.
Commits
f40fb59
[Gradle Release Plugin] - pre tag commit: 'v5.0.0'.f4cd646
Only do WSL update command on windows-2022 runner image hosted by GitHuba87a88e
Increase version to 5.0.0c19a8f3
[Gradle Release Plugin] - new version commit: 'v4.1.2-SNAPSHOT'.Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase
.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
@dependabot rebase
will rebase this PR@dependabot recreate
will recreate this PR, overwriting any edits that have been made to it@dependabot merge
will merge this PR after your CI passes on it@dependabot squash and merge
will squash and merge this PR after your CI passes on it@dependabot cancel merge
will cancel a previously requested merge and block automerging@dependabot reopen
will reopen this PR if it is closed@dependabot close
will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually@dependabot show <dependency name> ignore conditions
will show all of the ignore conditions of the specified dependency@dependabot ignore this major version
will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this minor version
will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this dependency
will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)