-
-
Notifications
You must be signed in to change notification settings - Fork 334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make gix_path::env:shell()
more robust and use it in gix-command
#1862
Conversation
Thanks you very much for tackling this! I didn't look at the code of this PR but want to share some thoughts on the failure. Here is the program that is run: for arg in %O %A %B %L %P %S %X %Y %F; do echo $arg >> \"%A\"; done; cat \"%O\" \"%B\" >> \"%A\"" The failing output has 10 lines, whereas a valid output has 11 lines. Probably that is related. My feeling is that one of the arguments in the for loop substitutions is empty. It would disappear then, and everything is offset. From there it should become more obvious which parameter comes out empty, and from there it should become clear what the problem really is. I hope that helps. |
Yes, in the failing run, the |
A couple things didn't mention in #1862 (comment):
It doesn't look like that specific mechanism accounts for the failure, because in the failing case a As an early step (before even doing most of the testing described above), I tried adding The broader issue for me examining this specific test is that I am familiar neither with the details of
No problem. My goal that this is related to is to be able to fix some other bugs where we do not run scripts correctly, such as bugs in the special casing of scripts with shebangs when But that also means I risk become separated from the original bug-fixing goals, which this is peripheral to, if I shift my focus too much onto this. If possible, I'd like to avoid become too embroiled in the details of the
Edit: Reworded for clarity and reordered the sections. |
It attempts to make the arguments that the merge-driver was called with observable. I think there is no quoting because each of the arguments is known not to have spaces in it. However, the lack of quoting also makes empty arguments unobservable, a definitive shortcoming of the current implementation.
Here is the docs, but the short version is that it's a script into which various parameters can be substituted into - this substitution is performed by the the caller of the merge-driver. That caller does a bare string substitution, without any care for quotes or whitespace in general.
I definitely wouldn't want you to suffer unnecessarily. The key for me to solve this certainly is to reproduce the error. For now, it doesn't make much sense to me either, unless… it's a substitution engine, what if one of the arguments that it substitutes also contains substitutable characters? Probably that's not what's happening here though, as This is a patch that should tell exactly what's executed, and from there it would become more obvious what's really happening. diff --git a/gix-merge/src/blob/platform/merge.rs b/gix-merge/src/blob/platform/merge.rs
index 058a30128..324742a00 100644
--- a/gix-merge/src/blob/platform/merge.rs
+++ b/gix-merge/src/blob/platform/merge.rs
@@ -211,7 +211,7 @@ pub(super) mod inner {
cmd.extend_from_slice(token);
}
- Ok(merge::Command {
+ Ok(dbg!(merge::Command {
cmd: gix_command::prepare(gix_path::from_bstring(cmd))
.with_context(context)
.command_may_be_shell_script()
@@ -223,7 +223,7 @@ pub(super) mod inner {
current_path: ours_path,
ancestor: base_tmp,
other: theirs_tmp,
- })
+ }))
}
/// Return the configured driver program for use with [`Self::prepare_external_driver()`], or `Err` I hope that helps. |
Thanks--you're right that the actual arguments that are passed are important information. I had glossed over this because it did not look like they differed between the passing and failing environment, but that doesn't mean they aren't needed to know what's going on. The results, with the patch you gave that reveals the
That's a bit hard to read due to the way the diff --git a/gix-merge/src/blob/platform/merge.rs b/gix-merge/src/blob/platform/merge.rs
index 058a30128..39e27e51f 100644
--- a/gix-merge/src/blob/platform/merge.rs
+++ b/gix-merge/src/blob/platform/merge.rs
@@ -399,6 +399,7 @@ impl<'parent> PlatformRef<'parent> {
match self.configured_driver() {
Ok(driver) => {
let mut cmd = self.prepare_external_driver(driver.command.clone(), labels, context.clone())?;
+ panic!("program: {:?}\nargs: {:#?}", cmd.get_program(), cmd.get_args());
let status = cmd.status().map_err(|err| Error::SpawnExternalDriver {
cmd: format!("{:?}", cmd.cmd),
source: err, Then, when run from PowerShell, the output included:
When run from Git Bash, where the test passes when allowed to proceed, the output included this instead:
This is to say that only the randomly generated suffixes after On the main branch, in PowerShell, where the test also passes if allowed to proceed, the only differences besides those temporary files' names is the shell command:
State from previous runs in this experiment is not likely to be a contributing factor; I ran On main, diff --git a/gix-path/src/env/mod.rs b/gix-path/src/env/mod.rs
index 78e2da294..cd6203df7 100644
--- a/gix-path/src/env/mod.rs
+++ b/gix-path/src/env/mod.rs
@@ -51,7 +51,7 @@ pub fn shell() -> &'static OsStr {
// more readable messages, append literally with `/` separators. The path from
// `git --exec-path` will already have all `/` separators (and no trailing `/`)
// unless it was explicitly overridden to an unusual value via `GIT_EXEC_PATH`.
- raw_path.push("/usr/bin/sh.exe");
+ raw_path.push("/bin/sh.exe");
raw_path
})
.filter(|raw_path| { But I am reluctant to apply that change without understanding what it is about the effect of the Git for Windows shim that is helping, to know that this shim really is preferable rather than the test happening to hinge on a quirk that is different. One reason I am reluctant is that, under that change, if My guess is that the relevant effect of the shim is in changing the environment, but I am not certain it refrains entirely from adjusting arguments. (I think the shim does not adjust arguments directly. But even if that is right, I'll try to modify the command to report its arguments and environment in a way that they cannot be confused with anything else, such as by writing them to a file. However, since this will necessarily modifying the command or its environment, it is not certain to be successful at revealing the cause. Edit: A possible contributing factor is that I also have some When the shim is used, two directories belonging to the MSYS2 installation that is associated with the running shell, The reason I say C:\Users\ek\scoop\apps\git\2.48.1\usr\bin\sh.exe -c 'echo "$PATH" | tr ":" "\n" | sort' > ubp.txt
C:\Users\ek\scoop\apps\git\2.48.1\bin\sh.exe -c 'echo "$PATH" | tr ":" "\n" | sort' > bp.txt
git --no-pager diff --no-index ubp.txt bp.txt Where the output of that last command is: diff --git a/ubp.txt b/bp.txt
index 7390dca90..7190bd5b3 100644
--- a/ubp.txt
+++ b/bp.txt
@@ -1,3 +1,6 @@
+/mingw64/bin
+/usr/bin
+/c/Users/ek/bin
/c/Users/ek/scoop/apps/pwsh/7.5.0
/c/Program Files (x86)/VMware/VMware Workstation/bin
/c/Windows/system32 Note that this experiment is not equivalent to checking the |
This is very strange and puzzling. And that's particularly strange as there isn't even any argument to interpret, the values are quite literally baked in. All I can think of is poking around in the script, maybe by starting to execute it directly. It's pretty clear what is invoked in the working and non-working case, so that could be generalized to invocations that are repeatable on the command-line directly. From there it can be reduced and probed to see what exactly is causing the observable difference (assuming it reproduces in the first place). If it doesn't, we'd know that some environment variable is affecting it as well. And I say this without even implying that you spend more time on this, I am just sharing thoughts. |
Actually, that is no problem. What I wanted to avoid was becoming too distracted trying to understand the semantics of external merge drivers, how But it looks like the problem is due to unexpected behavior of Of course, I still hope I manage to come to a sufficient understanding sooner rather than later, since it is itself a fragment of the larger issue of how to make
Yes. I believe even more strongly, now, that the effect of the shim is the key. The effect is probably through its customization of environment variables, though I am not certain. I continue to suspect that the presence of a separate MSYS2 installation with a bin directory in my With the shim
With the non-shim
So the If we are going to prefer an More broadly, as I understand it, the Git for Windows shim exists for two reasons. It delegates to the "real" executable. But it also modifies the environment. Shims don't have to modify the environment; for example, the This might be separate from the issue of whether to prefer one of the shims of Switching to using the shell through its shim makes sense both for
Further investigation, so farAs for why I believe the effect of the shim is the cause, I tried this modification: diff --git a/gix-merge/tests/merge/blob/platform.rs b/gix-merge/tests/merge/blob/platform.rs
index 6230377b8..2dc09ebe5 100644
--- a/gix-merge/tests/merge/blob/platform.rs
+++ b/gix-merge/tests/merge/blob/platform.rs
@@ -265,7 +265,7 @@ theirs
[gix_merge::blob::Driver {
name: "b".into(),
command:
- "for arg in %O %A %B %L %P %S %X %Y %F; do echo $arg >> \"%A\"; done; cat \"%O\" \"%B\" >> \"%A\""
+ "(printf '[%q]\\n' \"$0\" \"$@\"; set -o posix; set) >args+env; for arg in %O %A %B %L %P %S %X %Y %F; do echo $arg >> \"%A\"; done; cat \"%O\" \"%B\" >> \"%A\""
.into(),
..Default::default()
}], That makes it create a file called
I ran
In the second run, the change to diff --git a/gix-path/src/env/mod.rs b/gix-path/src/env/mod.rs
index 78e2da294..cd6203df7 100644
--- a/gix-path/src/env/mod.rs
+++ b/gix-path/src/env/mod.rs
@@ -51,7 +51,7 @@ pub fn shell() -> &'static OsStr {
// more readable messages, append literally with `/` separators. The path from
// `git --exec-path` will already have all `/` separators (and no trailing `/`)
// unless it was explicitly overridden to an unusual value via `GIT_EXEC_PATH`.
- raw_path.push("/usr/bin/sh.exe");
+ raw_path.push("/bin/sh.exe");
raw_path
})
.filter(|raw_path| { After each run, I moved the just-created diff --git a/no-shim.txt b/with-shim.txt
index 3f51928c0..11cbfa69b 100644
--- a/no-shim.txt
+++ b/with-shim.txt
@@ -4 +4 @@ APPDATA='C:\Users\ek\AppData\Roaming'
-BASH=/usr/bin/sh
+BASH=/usr/bin/bash
@@ -10 +10 @@ BASH_CMDS=()
-BASH_EXECUTION_STRING='(printf '\''[%q]\n'\'' "$0" "$@"; set -o posix; set) >args+env; for arg in C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpLx3byu C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp7p4vnw C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpnxT3lk 7 '\''b'\'' '\''ancestor label'\'' '\''current label'\'' '\''other label'\'' %F; do echo $arg >> "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp7p4vnw"; done; cat "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpLx3byu" "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpnxT3lk" >> "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp7p4vnw"'
+BASH_EXECUTION_STRING='(printf '\''[%q]\n'\'' "$0" "$@"; set -o posix; set) >args+env; for arg in C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp5rjEOo C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpaOs19F C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp62jrFQ 7 '\''b'\'' '\''ancestor label'\'' '\''current label'\'' '\''other label'\'' %F; do echo $arg >> "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpaOs19F"; done; cat "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp5rjEOo" "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmp62jrFQ" >> "C:\Users\ek\source\repos\gitoxide\gix-merge\.tmpaOs19F"'
@@ -39,0 +40 @@ EUID=197609
+EXEPATH='C:\Users\ek\scoop\apps\git\2.48.1\bin'
@@ -56,0 +58 @@ MACHTYPE=x86_64-pc-msys
+MSYSTEM=MINGW64
@@ -61 +63 @@ NEXTEST_PROFILE=default
-NEXTEST_RUN_ID=203cd7ac-794f-4c11-af4a-c96333d95745
+NEXTEST_RUN_ID=ab48dadf-b347-40f2-8770-a73e91c8eb7d
@@ -72 +74 @@ PAGER='less -F'
-PATH='/c/Users/ek/source/repos/gitoxide/target/debug/deps:/c/Users/ek/source/repos/gitoxide/target/debug:/c/Users/ek/.rustup/toolchains/stable-x86_64-pc-windows-msvc/lib/rustlib/x86_64-pc-windows-msvc/lib:/c/Users/ek/scoop/apps/pwsh/7.5.0:/c/Program Files (x86)/VMware/VMware Workstation/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0:/c/Windows/System32/OpenSSH:/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/c/Program Files/dotnet:/c/Users/ek/scoop/apps/vscodium/current/bin:/c/Users/ek/scoop/apps/nodejs-lts/current/bin:/c/Users/ek/scoop/apps/nodejs-lts/current:/c/Users/ek/scoop/apps/openjdk23/current/bin:/c/Users/ek/scoop/apps/ghostscript/current/lib:/c/Users/ek/scoop/apps/gpg/current/bin:/c/Users/ek/scoop/apps/mpv/current:/c/Users/ek/scoop/apps/maven/current/bin:/c/Users/ek/.cargo/bin:/c/Users/ek/bin:/c/Users/ek/.local/share/gem/ruby/3.0.0/bin:/c/Users/ek/scoop/shims:/c/Users/ek/AppData/Local/Packages/PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0/LocalCache/local-packages/Python312/Scripts:/c/Users/ek/AppData/Local/Microsoft/WindowsApps:/c/Users/ek/.dotnet/tools:/c/Users/ek/AppData/Local/Programs/Microsoft VS Code/bin:/c/msys64/ucrt64/bin:/c/msys64/ucrt64/sbin:/c/msys64/usr/bin:/c/Users/ek/AppData/Local/Programs/MiKTeX/miktex/bin/x64:/c/users/ek/.local/bin:/c/Users/ek/AppData/Local/JetBrains/Toolbox/scripts:/c/Program Files/IronPython 3.4:/c/Users/ek/AppData/Local/Programs/Microsoft VS Code Insiders/bin:/c/Users/ek/.dotnet/tools'
+PATH='/mingw64/bin:/usr/bin:/c/Users/ek/bin:/c/Users/ek/source/repos/gitoxide/target/debug/deps:/c/Users/ek/source/repos/gitoxide/target/debug:/c/Users/ek/.rustup/toolchains/stable-x86_64-pc-windows-msvc/lib/rustlib/x86_64-pc-windows-msvc/lib:/c/Users/ek/scoop/apps/pwsh/7.5.0:/c/Program Files (x86)/VMware/VMware Workstation/bin:/c/Windows/system32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0:/c/Windows/System32/OpenSSH:/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/c/Program Files/dotnet:/c/Users/ek/scoop/apps/vscodium/current/bin:/c/Users/ek/scoop/apps/nodejs-lts/current/bin:/c/Users/ek/scoop/apps/nodejs-lts/current:/c/Users/ek/scoop/apps/openjdk23/current/bin:/c/Users/ek/scoop/apps/ghostscript/current/lib:/c/Users/ek/scoop/apps/gpg/current/bin:/c/Users/ek/scoop/apps/mpv/current:/c/Users/ek/scoop/apps/maven/current/bin:/c/Users/ek/.cargo/bin:/c/Users/ek/bin:/c/Users/ek/.local/share/gem/ruby/3.0.0/bin:/c/Users/ek/scoop/shims:/c/Users/ek/AppData/Local/Packages/PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0/LocalCache/local-packages/Python312/Scripts:/c/Users/ek/AppData/Local/Microsoft/WindowsApps:/c/Users/ek/.dotnet/tools:/c/Users/ek/AppData/Local/Programs/Microsoft VS Code/bin:/c/msys64/ucrt64/bin:/c/msys64/ucrt64/sbin:/c/msys64/usr/bin:/c/Users/ek/AppData/Local/Programs/MiKTeX/miktex/bin/x64:/c/users/ek/.local/bin:/c/Users/ek/AppData/Local/JetBrains/Toolbox/scripts:/c/Program Files/IronPython 3.4:/c/Users/ek/AppData/Local/Programs/Microsoft VS Code Insiders/bin:/c/Users/ek/.dotnet/tools'
@@ -74,0 +77 @@ PIPESTATUS=([0]="0")
+PLINK_PROTOCOL=ssh (The The changed lines that I think are very unlikely to contribute in any way to the difference in behavior are:
In contrast, the changed lines for environment variables that I think could plausibly make a difference are:
The differences in I did another experiment where I applies this change instead, tracing the shell commands and recording standard error including from external processes such as diff --git a/gix-merge/tests/merge/blob/platform.rs b/gix-merge/tests/merge/blob/platform.rs
index 6230377b8..050e97873 100644
--- a/gix-merge/tests/merge/blob/platform.rs
+++ b/gix-merge/tests/merge/blob/platform.rs
@@ -265,7 +265,7 @@ theirs
[gix_merge::blob::Driver {
name: "b".into(),
command:
- "for arg in %O %A %B %L %P %S %X %Y %F; do echo $arg >> \"%A\"; done; cat \"%O\" \"%B\" >> \"%A\""
+ "exec 2>messages; set -x; for arg in %O %A %B %L %P %S %X %Y %F; do echo $arg >> \"%A\"; done; cat \"%O\" \"%B\" >> \"%A\""
.into(),
..Default::default()
}], Analogously to the above-described experiment, I ran the test with that change twice, once without the shim and once with the shim, moving and renaming the log file after each run. The diffs showed only the change in temporary-file names. In neither run was anything in a different order. Nor were there any error messages, such as I would expect if the shell could not open a file for redirection or if
Running modified versions of the script that write to different locations I can easily inspect within a test directory does not reveal anything. But I am not sure this captures what is relevant. I think the next step along these lines is to preserve the temporary files that the test uses and examine them. |
That's good to know.
This is incredibly valued work, and work that I can't even do as it, I wouldn't last long. I guess that's another way to thank you for your incredible work!
My initial intuition was to jump to it and be done, but the reasoning to not do that just yet is very sound. I found the "do the work of the shim and avoid calling it that way" very interesting. Understanding this better could at least lead to documentation that helps to one day implement it, even if it's not done in the initial implementation. I mostly skimmed over the details that followed, but saw the details of the merge-driver runs. It's strange that it has the wrong output despite the calls being the same, and that it seems so elusive to find out why it does that. The caller is known, the input is known, the binary is known, the script itself is known, so what could possibly be causing the different result. I wonder if it can be executed directly while reproducing the issue, allowing it to be prodded and probed with impunity until it reveals its secret. The diff-tool is very interesting by the way, I starred the repo and would hope to benefit from it one day. |
I have found that the cause of the problem is that, if...
...then, instead of appending, the effect of In the failing test, the eight bytes Simplified, repeatable demonstrationHere is a boiled-down example, with more explicit (but otherwise simpler) commands, using two PortableGit installations:
It should produce
( When the outer This is distinct from "DLL hell"This also does not arise due to executables using the wrong DLL, at least not in a straightforward way:
That they use
But it is not unprecedentedWhile this seems unlikely to be anticipated or desired, I am not certain that this is even considered a bug in MSYS2. As far as I can find, no MSYS2 documentation warns against a program in on MSYS2 installation calling a program in another MSYS2 installation. However, this is explicitly documented as unsupported in Cygwin (from which MSYS2 derives):
Though what I have observed seems to me much subtler, more surprising, and if unexpected more insidious in its effects, than some of the usual effects that are warned about:
Outer
|
65f706b
to
adc2ce1
Compare
It now prefers the `(git root)/bin/sh.exe` shim, falling back to the `(git root)/usr/bin/sh.exe` non-shim to support the Git for Windows SDK which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) Overall this makes things more robust than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
352581d
to
ac34530
Compare
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
ac34530
to
93f0804
Compare
I've added a task list (and outscoped list) to the PR description for the plan articulated in #1862 (comment), so it's clear what is done and what is left to do. I think this may be almost ready. |
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
93f0804
to
37c7b0e
Compare
Something I wanted to just have mentioned, even though this might not be the best place. Indeed, using the The idea is to have a higher chance of picking up the configuration that the user sees in terminals. In practice, that doesn't work much better, at last not for me, so I keep thinking that there should be another way of extracting the environment variables from an actual login shell instead. If |
The change in the previous commit of switching to the non-shim `bash.exe` in `(git root)/usr/bin` causes problems, because the environment may not be correct for shell commands and scripts. In particular, the `PATH` might not enable standard POSIX tools to be found, or the tools that are found may interoperate incorrectly with the shell. The latter caused failures in GitoxideLabs#1862 in an analogous choice of `sh.exe`, which were addressed by preferring the shim when available. See: - GitoxideLabs#1862 (comment) Here, 90 tests started to fail when the test suite was run locally from PowerShell (i.e. not a Git Bash environment) on a Windows 10 system that, in addition to a full Git for Windows installation, contains a separate non-GfW MSYS2 installation whose `bin` directories are in `PATH` even in non-MSYS2 environments. The failures were described, and most of them investigated, as follows: - GitoxideLabs#1864 (comment) - https://gist.github.com/EliahKagan/3c5eebd091e66d8c912fddbce0a064cd - https://gist.github.com/EliahKagan/17066ad1f7b0aa98e4fdf4642abe1d93 Most failures, including all those that were unintuitive, were directly or indirectly due to the `make_remote_repos.sh` fixture script encountering the error: fatal: bad config line 10 in file ./config This happened due to the same incorrect behavior of `>>`, when used by a shell that links to one `msys-2.0.dll` running a program that links to a different `msys-2.0.dll` of another version or build, as caused the failure encountered with the non-shim in GitoxideLabs#1862. (It may be the handful of other failures are also caused by this `>>` problem, but as of now that has not been examined.) This commit temporarily instruments that fixture script so that, when tests are run, the observations and analysis in the last gist above can be confirmed. (These changes are also shown there.)
The change in the previous commit of switching to the non-shim `bash.exe` in `(git root)/usr/bin` causes problems, because the environment may not be correct for shell commands and scripts. In particular, the `PATH` might not enable standard POSIX tools to be found, or the tools that are found may interoperate incorrectly with the shell. The latter caused failures in GitoxideLabs#1862 in an analogous choice of `sh.exe`, which were addressed by preferring the shim when available. See: - GitoxideLabs#1862 (comment) Here, 90 tests started to fail when the test suite was run locally from PowerShell (i.e. not a Git Bash environment) on a Windows 10 system that, in addition to a full Git for Windows installation, contains a separate non-GfW MSYS2 installation whose `bin` directories are in `PATH` even in non-MSYS2 environments. The failures were described, and most of them investigated, as follows: - GitoxideLabs#1864 (comment) - https://gist.github.com/EliahKagan/3c5eebd091e66d8c912fddbce0a064cd - https://gist.github.com/EliahKagan/17066ad1f7b0aa98e4fdf4642abe1d93 Most failures, including all those that were unintuitive, were directly or indirectly due to the `make_remote_repos.sh` fixture script encountering the error: fatal: bad config line 10 in file ./config This happened due to the same incorrect behavior of `>>`, when used by a shell that links to one `msys-2.0.dll` running a program that links to a different `msys-2.0.dll` of another version or build, as caused the failure encountered with the non-shim in GitoxideLabs#1862. (It may be the handful of other failures are also caused by this `>>` problem, but as of now that has not been examined.) This commit temporarily instruments that fixture script so that, when tests are run, the observations and analysis in the last gist above can be confirmed. (These changes are also shown there.)
Now that Git for Windows 2.49.0 has a stable release, this changes the upgrade step that was added to `test-fixtures-windows` in 4237e5a (GitoxideLabs#1870), so that it downloads an installer from the release marked as "latest", rather than the release that has the newest tag. The release marked "latest" is usually a stable release in projects that have any stable releases, and in particular it is a stable release in Git for Windows. This is *not* needed to switch from the release candidate to the stable release for 2.49.0. The download logic already in place currently gets the stable release automatically, because it is the newest tag. Nonetheless, there are three reasons to prefer the "latest" tag to get the stable release, now that the stable release is available. In descending order of significance, they are: - We upgrade to work around GitoxideLabs#1849, for which 2.49.0 is preferable to 2.48.1 (which the Windows runner images currently have). Continuing to take the newest tag will eventually take a pre-release for the next version. That would probably work, but it is not currently a goal. There is sometimes a delay between when a stable release of Git for Windows comes out and when the stable runner images are released with it. (Pre-release runner images exist, but they are not run on GitHub-hosted runners.) So even assuming this upgrade step is to be removed once it is no longer needed, it could easily end up remaining long enough for a new Git for Windows pre-release to come out. - An update may potentially be released for an earlier minor version (y in x.y.z), in which case the tag for it would be newer and we would downgrade instead. Now that the release marked "latest" is usable here, we can use it and avoid that. - If we decide to eventually deliberately test pre-releases, the step added in GitoxideLabs#1849 would probably not be usable in that form, because it could take either the next pre-release or a patch to an ealier release per the above points, and also for the separate reason that this CI job is not necessarily where we would want to test that. (As one example, there is currently no CI testing of the Git for Windows SDK, even though supporting it is an explicit goal discussed in GitoxideLabs#1758, GitoxideLabs#1761, GitoxideLabs#1862, and GitoxideLabs#1864. If that is added, it may be a more opportune way to test prereleases.)
Now that Git for Windows 2.49.0 has a stable release, this changes the upgrade step that was added to `test-fixtures-windows` in 4237e5a (GitoxideLabs#1870), so that it downloads an installer from the release marked as "latest", rather than the release that has the newest tag. The release marked "latest" is usually a stable release in projects that have any stable releases, and in particular it is a stable release in Git for Windows. This is *not* needed to switch from the release candidate to the stable release for 2.49.0. The download logic already in place currently gets the stable release automatically, because it is the newest tag. Nonetheless, there are three reasons to prefer the "latest" tag to get the stable release, now that the stable release is available. In descending order of significance, they are: - We upgrade to work around GitoxideLabs#1849, for which 2.49.0 is preferable to 2.48.1 (which the Windows runner images currently have). Continuing to take the newest tag will eventually take a pre-release for the next version. That would probably work, but it is not currently a goal. There is sometimes a delay between when a stable release of Git for Windows comes out and when the stable runner images are released with it. (Pre-release runner images exist, but they are not run on GitHub-hosted runners.) So even assuming this upgrade step is to be removed once it is no longer needed, it could easily end up remaining long enough for a new Git for Windows pre-release to come out. - An update may potentially be released for an earlier minor version (y in x.y.z), in which case the tag for it would be newer and we would downgrade instead. Now that the release marked "latest" is usable here, we can use it and avoid that. - If we decide to eventually deliberately test pre-releases, the step added in GitoxideLabs#1849 would probably not be usable in that form, because it could take either the next pre-release or a patch to an ealier release per the above points, and also for the separate reason that this CI job is not necessarily where we would want to test that. (As one example, there is currently no CI testing of the Git for Windows SDK, even though supporting it, in general and for running the test suite, is an explicit goal discussed in GitoxideLabs#1758, GitoxideLabs#1761, GitoxideLabs#1862, and GitoxideLabs#1864. If that is added, it may be a more opportune way to test prereleases.)
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
592cc5a
to
fc8e5f8
Compare
The change in the previous commit of switching to the non-shim `bash.exe` in `(git root)/usr/bin` causes problems, because the environment may not be correct for shell commands and scripts. In particular, the `PATH` might not enable standard POSIX tools to be found, or the tools that are found may interoperate incorrectly with the shell. The latter caused failures in GitoxideLabs#1862 in an analogous choice of `sh.exe`, which were addressed by preferring the shim when available. See: - GitoxideLabs#1862 (comment) Here, 90 tests started to fail when the test suite was run locally from PowerShell (i.e. not a Git Bash environment) on a Windows 10 system that, in addition to a full Git for Windows installation, contains a separate non-GfW MSYS2 installation whose `bin` directories are in `PATH` even in non-MSYS2 environments. The failures were described, and most of them investigated, as follows: - GitoxideLabs#1864 (comment) - https://gist.github.com/EliahKagan/3c5eebd091e66d8c912fddbce0a064cd - https://gist.github.com/EliahKagan/17066ad1f7b0aa98e4fdf4642abe1d93 Most failures, including all those that were unintuitive, were directly or indirectly due to the `make_remote_repos.sh` fixture script encountering the error: fatal: bad config line 10 in file ./config This happened due to the same incorrect behavior of `>>`, when used by a shell that links to one `msys-2.0.dll` running a program that links to a different `msys-2.0.dll` of another version or build, as caused the failure encountered with the non-shim in GitoxideLabs#1862. (It may be the handful of other failures are also caused by this `>>` problem, but as of now that has not been examined.) This commit temporarily instruments that fixture script so that, when tests are run, the observations and analysis in the last gist above can be confirmed. (These changes are also shown there.)
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
fc8e5f8
to
37d676e
Compare
The change in the previous commit of switching to the non-shim `bash.exe` in `(git root)/usr/bin` causes problems, because the environment may not be correct for shell commands and scripts. In particular, the `PATH` might not enable standard POSIX tools to be found, or the tools that are found may interoperate incorrectly with the shell. The latter caused failures in GitoxideLabs#1862 in an analogous choice of `sh.exe`, which were addressed by preferring the shim when available. See: - GitoxideLabs#1862 (comment) Here, 90 tests started to fail when the test suite was run locally from PowerShell (i.e. not a Git Bash environment) on a Windows 10 system that, in addition to a full Git for Windows installation, contains a separate non-GfW MSYS2 installation whose `bin` directories are in `PATH` even in non-MSYS2 environments. The failures were described, and most of them investigated, as follows: - GitoxideLabs#1864 (comment) - https://gist.github.com/EliahKagan/3c5eebd091e66d8c912fddbce0a064cd - https://gist.github.com/EliahKagan/17066ad1f7b0aa98e4fdf4642abe1d93 Most failures, including all those that were unintuitive, were directly or indirectly due to the `make_remote_repos.sh` fixture script encountering the error: fatal: bad config line 10 in file ./config This happened due to the same incorrect behavior of `>>`, when used by a shell that links to one `msys-2.0.dll` running a program that links to a different `msys-2.0.dll` of another version or build, as caused the failure encountered with the non-shim in GitoxideLabs#1862. (It may be the handful of other failures are also caused by this `>>` problem, but as of now that has not been examined.) This commit temporarily instruments that fixture script so that, when tests are run, the observations and analysis in the last gist above can be confirmed. (These changes are also shown there.)
`gix_command::Prepare` previously used `sh` on Windows rather than first checking for a usable `sh` implementation associated with a Git for Windows installation. This changes it to use the `gix-path` facility for finding what is likely the best 'sh' implementation for POSIX shell scripts that will operate on Git repositories. This increases consistency across how different crates find 'sh', and brings the benefit of preferring the Git for Windows `sh.exe` on Windows when it can be found reliably.
Because `gix-command` uses `gix_path::env::shell()` to find sh, and `gix-credentials` uses `gix-command`.
This makes the path returned by `gix_path::env::shell()` on Windows more usable by: 1. Adding components with `/` separators. While in principle a `\` should work, the path of the shell itself is used in shell scripts (script files and `sh -c` operands) that may not account for the presence of backslashes, and it is also harder to read paths with `\` in contexts where it appears escaped, which may include various messages from Rust code and shell scripts. The path before what we add will already use `/` and never `\`, unless `GIT_EXEC_PATH` has been set to a strange value, because it is based on `git --exec-path`, which by default gives a path with `/` separators. Thus, ensuring that the part we add uses `/` should be sufficient to get a path without `\` in all cases when it is clearly reasonable to do so. This therefore also usually increases stylistic consistency of the path, which is another factor that makes it more user-friendly in messages. This is needed to get tests to pass since changing `gix-command` to use `gix_path::env::shell()` on Windows, where a path is formatted in away that sometimes quotes `\` characters. Their expectations could be adjusted, but it seems likely that various other software, much of which may otherwise be working, has similar expectations. Using `/` instead of `\` works whether `\` is expected to be displayed quoted or not. 2. Check that the path to the shell plausibly has a shell there, only using it if it a file or a non-broken file symlink. When this is not the case, the fallback short name is used instead. 3. The fallback short name is changed from `sh` to `sh.exe`, since the `.exe` suffix is appended in other short names on Windows, such as `git.exe`, as well as being part of the filename component of the path we build for the shell when using the implementation provided as part of Git for Windows. Those changes only affect Windows. This also adds tests for (1) and (2) above, as well as for the expectation that we get an absolute path, to make sure we don't build a path that would be absolute on a Unix-like system but is relative on Windows (a path that starts with just one `/` or `\`). These tests are not Windows-specific, since all these expectations should already hold on Unix-like systems, where currently we are using the hard-coded path `/bin/sh`, which is an absolute path on those systems. (Some Unix-like systems may technically not have `/bin/sh` or it may not be the best path to use for a shell that should be POSIX-compatible, but we are already relying on this, and handling that better is outside the scope of the changes here.)
This makes a few changes to make `shell()` more robust: 1. Check the last two components of the path `git --exec-path` gave, to make sure they are `libexec/git-core`. (The check is done in such a way that the separator may be `/` or `\`, though a `\` separator here would be unexpected. We permit it because it may plausibly be present due to an overriden `GIT_EXEC_PATH` that breaks with Git's own behavior of using `/` but that is otherwise fully usable.) If the directory is not named `git-core`, or it is a top-level directory (no parent), or its parent is not named `libexec`, then it is not reasonable to guess that this is in a directory where it would be safe to use `sh.exe` in the expected relative location. (Even if safe, such a layout does not suggest that a `sh.exe` found in it would be better choice than the fallback of just doing a `PATH` search.) 2. Check the grandparent component (that `../..` would go to) of the path `git --exec-path` gave, to make sure it is recognized name of a platform-specific `usr`-like directory that has been used in MSYS2. This is to avoid traversing up out of less common directory trees that have some different and shallower structure than found in a typical Git for Windows or MSYS2 installation. 3. Instead of using only the `(git root)/usr/bin/sh.exe` non-shim, prefer the `(git root)/bin/sh.exe` shim. If that is not found, fall back to the `(git root)/usr/bin/sh.exe` non-shim, mainly to support the Git for Windows SDK, which doesn't have the shim. The reason to prefer the shim is that it sets environment variables, including prepending `bin` directories that provide tools one would expect to have when using it. Without this, common POSIX commands may be unavailable, or different and incompatible implementations of them may be found. In particular, if they are found in a different MSYS2 installation whose `msys-2.0.dll` is of a different version or otherwise a different build, then calling them directly may produce strange behavior. See: - https://cygwin.com/faq.html#faq.using.multiple-copies - GitoxideLabs#1862 (comment) This makes things more robust overall than either preferring the non-shim or just doing a path search for `sh` as was done before that. But it exacerbates GitoxideLabs#1868 (as described there), so if the Git for Windows `sh.exe` shim continues to work as it currently does, then further improvements may be called for here.
- Allowing `usr` as a `<platform>` prefix is more likely to produce desired than undesired behavior. But due to the ambiguity of the name `usr` on non-Unix systems, maybe this would lead to problems that are relevant to security. The concern is described and the `usr` prefix, which is never needed to find the shell in a Git for Windows installation, is no longer matched. Note that this only affects it as a path component that `libexec/git-core` is found initially to be in. We do still use <prefix>/libexec/git-core/../../../usr/bin/sh.exe if we don't find something we can plausibly use at: `<prefix>/libexec/git-core/../../../bin/sh.exe The helper docstring explains why a security model under which this is reasonable does necessarily entail that it is reasonable to allow a `<prefix>` of `usr`, *even though* in the path with `usr` that we form, it is `usr` in the `<prefix>` position. With that said, and even though it does not help find `sh` from Git for Windows, hopefully future research can establish that it is safe to treat `usr/libexec/git-core` the same as platform prefixes like `mingw64/libexec/git-core` and it can be enabled. - Start refactoring by extracting and renaming the recently introduced helper constants and functions. - Extract most of the `cfg!(windows)` branch in `shell()` itself, even with its helpers already extracted, to make it a helper function as well. - Document the recently introduced helper constants.
The new `auxiliary` module within `gix_path::env` is a sibling of the `git` module and is similarly an implementation detail only. So far the only "auxiliary" program this module finds is `sh`. The module is named `auxiliary` rather than `aux` because Windows has problems with files named like `aux` or `aux.rs`, due to `AUX` being a reserved device name.
This also uses `once_cell::sync::Lazy` for the inferred Git for Windows installation directory based on `git --exec-path` output, though it is only used in one function that is itself only used by a caller that itself caches, so it too is so far effectively just a refactoring.
That was just added in the previous commit. `installation_config()` and `installation_config_prefix()` should never use, as their only strategy, the technique implemented in the newly introduced `git_for_windows_root()` helper, because if the system-scope configuration file is present elsewhere, including due to `GIT_CONFIG_SYSTEM` being set, then that *should* affect the values returned by those functions (except on Apple Git and any other systems where a separate higher "unknown" scope exists and is typically nonempty). Rather, some uses of `installation_config_prefix()`, such as to find other files that are better looked for relative to installed non-configuration files or installation directory itself rather than relative to its configuration file, might in the future be suitable to look fo9r using `git_for_windows_root()`.
They are now in a form where they take the name of the command to be found and look for it in a way that would find many of the binaries shipped with and reasonable to use with Git for Windows. However, as noted in comments, further refinement as well as new tests would be called for if using it for purposes other than to find `sh`. For now it remains private with `gix_path::env` and is only used to find `sh`, as an implementation detail of `gix_path::shell()`.
It didn't mention how expanded use may require directories like `mingw64/bin` to be added to searched locations, and also was somewhat hard to read. This adds that and rephrases.
The tests are not limited to finding `sh`, but this may still not be ready for use in finding other commands, for the commented reasons. These tests are a step toward that, but they are mainly to make sure the search works as expected both when the looked-for command is and is not found in one of the searched `bin` dirs. This is to say that, in the tests, the commands that can be found but in `usr/bin` rather than the first choice of `bin` are to an extent a stand-in for `sh` when searched for in an environment that doesn't have `(git root)/bin` (like the Git for Windows SDK), and the commands that cannot bve found are to an extent a standi-in for `sh` on systems where it cannot be found (such as due to `git` not being installed or installed from MinGit or some other minimal or nonstandard Git installation, or `GIT_EXEC_PATH` being defined and having a value that cannot be used or that can be used but points to a directory structure that does not have a usable `sh`).
Since the code under test is currently compiled on all platforms. The current tests' assertions are not guaranteed to hold outside of Windows systems. (It would be unusual to call the functions they are testing outside of Windows. Those functions are themselves mainly not marked to be conditionally compiled so that their callers uses techniques like `if cfg!(windows)` that aren't technically conditional compilation. Those techniques are themselves valuable because they can sometimes be simpler or more readable than conditional compilation or easier to avoid false-positive diagnostics, and because they allow type checking to occur even when building on other platforms, while still usually being fast because "runtime" conditions that are `false` constants still facilitate removal of dead code as an optimization.) So the tests of those functions are likewise built on all targets, but marked conditonally ignored on non-Windows platforms.
37d676e
to
b70cdb1
Compare
Regarding #1864 (comment) and various other discussion of this in #1864: I've rebased this after #1864 was merged, so there will be no confusion due to their commits being interleaved. (I had originally envisioned that this would come in before that PR, whose changes are largely based on some of the changes here. But either order works fine and should be equally clear.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for all your (fantastic) work on this!
Once again I think the code is very idiomatic and made to be maintainable, hence plentiful comments capture what the code alone would not. Given the subtlety of the matter, doing so definitely is the right way to go, while tests are used where possible.
Please note that I didn't try to understand everything in detail, and I at most skimmed the long reasoning, knowing that I wouldn't be able to meaningfully contribute in an armchair review without the time to actually run this and validate it on Windows. It's a luxury I can afford when you are the author though 😁🙏.
Tasks since #1862 (comment):
sh.exe
when availablesh.exe
(git root)
if it looks like Git for Windows and looks safetest-fixtures-windows
breakage (#1849, #1870) and rebasetest-32bit
breakage (#1874) and rebaseOutscoped:
>>
problem as a bug to MSYS2 or a related projectgix-command
on Windows runs shell commands in non-POSIX mode #1868(On all the above, see "What it means for this PR" in #1862 (comment).)
The original PR description follows.
This makes
git_path::env::shell()
more likely to work correctly on Windows, then attempts to use it ingix_command::Prepare
when running a command with a shell due touse_shell
being set totrue
. Full details are in the commit messages, though the changes are also summarized here.I think this is not ready to merge yet, due to failures that happen when running the tests locally from PowerShell, but that I cannot produce otherwise. I do not know what causes these failures, and I expect I may have difficulty getting this to a point where it would be ready to merge without input. Since the failures happen in a test that was fairly recently added, I figured you might just know what is going on.
I think
shell()
can and should be used forgix-command
. But if not then the reason will be relevant to other potential uses ofshell()
. In that case, assuming they leave enough good uses forshell()
that it is kept, I hope to discover and articulate those reasons in the documentation forshell()
. But I think most likely there is a bug or wrong test expectation somewhere that accounts for the local test failure, and that either the current implementation ofshell()
with the modifications included here, or something much like it, is suitable for broad use.Making
gix_path::env::shell()
more robustThis makes
gix_path::env::shell()
more likely to work correctly on Windows by:Checking that the path to a
sh.exe
associated with Git for Windows actually exists (and is either a regular file or a symbolic link that, when fully dereferenced, gives a file), because if that is not the case, then the fallback of just looking upsh.exe
using aPATH
search is more likely to succeed.This change is needed to use
shell()
more widely without breaking things on systems wheresh.exe
is available and usable but cannot be found in the usual location where Git for Windows supplies it.Using
/
rather than\
directory separators when appendingusr
,bin
, andsh.exe
components to the Git for Windows installation directory path obtained viagit --exec-path
and going up three components. This causes the path throughsh.exe
is run to use all/
separators except in the unusual situation thatGIT_EXEC_PATH
has been set to a path with\
separator. (This is unusual because people don't usually setGIT_EXEC_PATH
. It would probably be best, when setting it on Windows, to use/
separators, but I do not actually know how often people do that.)I believe this is only a minor improvement, in view of this path not being automatically passed through to the shell, as described in #1840 (reply in thread). But this change also makes tests in
gix-command
andgix-credentials
pass that would otherwise require more extensive modification related to\
escaping, when modifyinggix-command
to useshell()
.Using
gix_path::env::shell()
ingix-command
Before the changes in this PR,
gix-command
uses the relative pathsh.exe
on Windows. This works a significant fraction of the time already. In particular, the problems with the relative pathbash.exe
that make it frequently unusable to find a non-WSLbash
on Windows, described in #1359 (comment), do not apply tosh.exe
, since there is no Microsoft-providedsh.exe
related to WSL. However:sh.exe
is not always the best choice when present.sh.exe
that can be found inPATH
. For example, if the onlysh.exe
is supplied by Git for Windows, and neither of the Git for Windows directories that contain ansh.exe
binary are inPATH
, then before the changes in this PR, a shell will not be found unlessshell_program()
is called (every time a command withuse_shell
is run).Therefore, this uses
shell()
, with the improvements described above, ingix-command
.As alluded to above, I think this is not ready yet. All tests pass on CI, as well a when run locally from Git Bash. But when running them locally in PowerShell,
gix-merge::merge blob::platform::merge::with_external
fails. The failure does not requireGIX_TEST_IGNORE_ARCHIVES
to be set, and this PR does not currently modify any test fixture scripts.Windows failures that occur when tests are run from PowerShell but not from Git Bash have usually been due to #1359 and are thus not expected since #1712, where
gix-testtools
findsbash.exe
associated with Git for Windows and uses it to run fixtures. But that fix didn't affect the waygix-command
uses a shell, so it does not provide for the kind of shell invocations occurring here.Still, it doesn't make sense to me why the failure happens, and why it happens only when I run the tests locally from PowerShell. On this system, in the
PATH
, including when accessed from PowerShell, the commandsh
finds is a shim for the same shell that, as of this PR,gix-path
finds andgix-command
uses. Furthermore, the change here makesgix-path
findsh.exe
more like the waygix-testtools
findsbash.exe
, both of which are Bash when provided by Git for Windows.Furthermore, while most of my local testing has been with Git for Windows 2.48.1, I have verified that the test fails the same way on the same system with Git for Windows 2.47.1(2), and that the experiment described below to examine the details of the failure also gives the same results with that version. This is to say that it seems like this failure is not related, even conceptually, to #1849.
The failing assertion is the first of the assertions in this fragment of the failing test:
gitoxide/gix-merge/tests/merge/blob/platform.rs
Lines 291 to 314 in 2efce72
The first three of the cleaned driver lines are supposed to contain
.tmp
, but the first one is justb
, which is expected to apper later. This somehow varies based on how the tests are run, and there are at least three environments for figuring out what's going on: CI, local in Git Bash, and local in PowerShell. It seems like they shouldn't differ, but they do.Locally, it is possible to inspect things in a debugger, but this is harder on CI because while action-tmate can be used to get a reverse shell, on Windows runners the reverse shell's environment differs from the noninteractive environment in which the tests run. Fortunately, I can break the tests in a way that makes them always fail by panicking and printing all the lines:
To be clear, while all of the following are from messages printed in failing tests, the tests fail because of that change. All of them would otherwise pass, except the local run, in Windows, from PowerShell, shown last. Also, I do not know why this is happening. I am hoping you may have some insight into it.
Based off the main branch, on CI, in Ubuntu (would pass)
In the test-fast (ubuntu-latest) run, for comparison:
Based off the main branch, on CI, in Windows (would pass)
In the test-fast (windows-latest) run:
Note how backslashes are being treated as escape characters and removed, which seems like it is not intended, but this does not in and of itself cause a problem with this test: it happens even on the main branch, and even without the changes from this PR.
Based off the main branch, locally, in Windows, from Git Bash (would pass)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Based off the main branch, locally, in Windows, from PowerShell (would pass)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Based off this feature branch, on CI, in Ubuntu (would pass)
In the test-fast (ubuntu-latest) run, for comparison:
That is the same as based off the main branch, which is very much as expected, because the changes on this feature branch do not produce a different value for
gix_path::env::shell()
on systems other than Windows.Based off this feature branch, on CI, on Windows (would pass)
In the test-fast (windows-latest) run:
This is similar to the main branch CI and local Git Bash runs shown above.
Based off this feature branch, locally, in Windows, from Git Bash (would pass)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Based off this feature branch, locally, in Windows, from PowerShell (would fail)
Via
cargo nextest run -p gix-merge --no-fail-fast
:Note how
b
comes first, and the three lines with paths to files named containing.tmp
follow it. This causes this assertion, which is run for the first three lines, to fail in the first iteration (i.e. for the first line):gitoxide/gix-merge/tests/merge/blob/platform.rs
Line 297 in 2efce72
But I do not understand why. I'm not very familiar with the details of
gix-merge
. I'm also not entirely clear on which kinds of shell transformations (from parsing%
items into fields, from shell expansions on unquoted parameter expansion, and from the implicit effect of joining on spaces when passing multiple arguments arising from an unquoted expansion toecho
) are intended to occur in the test command, and with what effects.Is this likely to be caused by different environments across shells run in different ways? Or is it somehow due to the name the shell is called with, even though that does not become its
$0
here and, furthermore, its$0
does not seem to be expanded? What could be going on?