Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

After 578a4716f Clang -MD by default produces different dependecies file. #70011

Closed
kbelochapka opened this issue Oct 24, 2023 · 17 comments · Fixed by #71697 or llvm/llvm-project-release-prs#776
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" release:backport release:merged

Comments

@kbelochapka
Copy link
Collaborator

commit 578a471
Author: Arthur Eubanks [email protected]

It appears that this commit unintentionally made -canonical-system-headers==true as default.
If anyone needs the old behavior he needs to specify -no-canonical-prefixes option.
Logically it should be visa-versa, if anyone needs the new behavior, let him use -canonical-prefixes option.

This change in the clang default behavior causes some other tools that rely on the older format of the dependencies file to work incorrectly. (for example we have Visual Studio extension software that relies on the dependencies file)

On Windows platform, if your system headers located a network drive or on drive that was created by the SUBST command
you will get a fully resolved real file path in the dependencies file

For example;
Lets assume that you have a network drive H: on which you have system headers.

before the commit a path to a include file in the dependencies file looked like:

H:\target\include\sdk_version.h

After the commit the include file path looks like that:

\\SERVERNAME\shared_folder\target\include\sdk_version.h
```
 
The situation becomes worse if a combined length of a shared folder on a network server and a include file path on a shared network drive is greater than 260 symbols (MAX_PATH limit on Windows). In this situation you will observe bogus include file paths in the  dependencies file.
(this problem probably should be a subject to another ticket)

@github-actions github-actions bot added the clang Clang issues not falling into any other category label Oct 24, 2023
@EugeneZelenko EugeneZelenko added clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed clang Clang issues not falling into any other category labels Oct 24, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2023

@llvm/issue-subscribers-clang-frontend

Author: None (kbelochapka)

commit 578a471 Author: Arthur Eubanks <[email protected]>

It appears that this commit unintentionally made -canonical-system-headers==true as default.
If anyone needs the old behavior he needs to specify -no-canonical-prefixes option.
Logically it should be visa-versa, if anyone needs the new behavior, let him use -canonical-prefixes option.

This change in the clang default behavior causes some other tools that rely on the older format of the dependencies file to work incorrectly. (for example we have Visual Studio extension software that relies on the dependencies file)

On Windows platform, if your system headers located a network drive or on drive that was created by the SUBST command
you will get a fully resolved real file path in the dependencies file

For example;
Lets assume that you have a network drive H: on which you have system headers.

before the commit a path to a include file in the dependencies file looked like:

H:\target\include\sdk_version.h

After the commit the include file path looks like that:

\\SERVERNAME\shared_folder\target\include\sdk_version.h
```
 
The situation becomes worse if a combined length of a shared folder on a network server and a include file path on a shared network drive is greater than 260 symbols (MAX_PATH limit on Windows). In this situation you will observe bogus include file paths in the  dependencies file.
(this problem probably should be a subject to another ticket)


</details>

@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2023

@llvm/issue-subscribers-clang-driver

Author: None (kbelochapka)

commit 578a471 Author: Arthur Eubanks <[email protected]>

It appears that this commit unintentionally made -canonical-system-headers==true as default.
If anyone needs the old behavior he needs to specify -no-canonical-prefixes option.
Logically it should be visa-versa, if anyone needs the new behavior, let him use -canonical-prefixes option.

This change in the clang default behavior causes some other tools that rely on the older format of the dependencies file to work incorrectly. (for example we have Visual Studio extension software that relies on the dependencies file)

On Windows platform, if your system headers located a network drive or on drive that was created by the SUBST command
you will get a fully resolved real file path in the dependencies file

For example;
Lets assume that you have a network drive H: on which you have system headers.

before the commit a path to a include file in the dependencies file looked like:

H:\target\include\sdk_version.h

After the commit the include file path looks like that:

\\SERVERNAME\shared_folder\target\include\sdk_version.h
```
 
The situation becomes worse if a combined length of a shared folder on a network server and a include file path on a shared network drive is greater than 260 symbols (MAX_PATH limit on Windows). In this situation you will observe bogus include file paths in the  dependencies file.
(this problem probably should be a subject to another ticket)


</details>

@shafik
Copy link
Collaborator

shafik commented Oct 24, 2023

CC @aeubanks

@aeubanks
Copy link
Contributor

-no-canonical-prefixes means don't read through links, otherwise by default clang does read through links. In terms of dependency files, we should respect that flag as well right? This seems like the correct behavior to me. If you don't want clang to expand links, then pass -no-canonical-prefixes.

@kbelochapka
Copy link
Collaborator Author

I had looked up for the -canonical-prefixes default value and it is ON. Then you are right, if the -canonical-prefixes==ON
then -canonical-system-headers also should be ON as well.

The problem is that, on Windows platform a path to include file in the dependencies file will be not just in the canonical form.
It will be so called real(final) path. It means that, in a situation when an include file is located on network drive or on a SUBST drive, then the path will be expanded to "real" path.

This happens because on Windows, the llvm::sys::fs::real_path() function internally calls
GetFinalPathNameByHandleW() which returns fully resolved (final) path in the UNC form.

For example, you have SUBST drive D: which points to C:\FOLDER, and you have the include file on D:\INC\abc.h.
The "real" path will be C:\FOLDER\INC\abc.h
The path is surely in the canonical form, but this is not what has to be written into a dependencies file.

As I mentioned in my previous comments, sometimes when a combined length of a path pieces goes over 260 limit, we can have a bogus path as an output.

Another problem, this creates inconsistency between any include paths in a preprocessed file and paths in a dependencies file.

@aeubanks
Copy link
Contributor

aeubanks commented Nov 1, 2023

Adding some Windows people for more opinions on canonicalized paths
@mstorsjo @zmodem @rnk (please add other people if I'm forgetting someone)

@rnk
Copy link
Collaborator

rnk commented Nov 1, 2023

In the short term, I think it would be reasonable to default system header canonicalization to false on Windows. In other words, go here:

CmdArgs.push_back("-canonical-system-headers");

And change the default to be false if the host is Windows.

Changing behavior depending on the host can be awkward, since now clang will produce very different looking .d files on different platforms, but I think this actually makes clang's output more consistent, since Windows GetFinalPathNameByHandleW seems to do very different things from Linux realpath.

As a longer term fix, symlinks do exist on Windows. I would try to replicate the issue described at ninja-build/ninja#1330 on Windows by creating a symlink using sysroot.py for use with CMAKE_SYSROOT on Windows, and see if that can be used to reproduce the dirty build issues described there.

We need a set of Windows API calls that: 1. resolve symlinks 2. resolve parent path components (..) and 3. doesn't change the drive letter from the original path, or uses the currently active drive letter for relative paths. And, I don't know how to do that, maybe someone else does.

aeubanks added a commit to aeubanks/llvm-project that referenced this issue Nov 2, 2023
Canonicalizing paths on Windows leads to unexpected things like changing
drive letters. As a short term fix, do not canonicalize system headers
on Windows by default.

Fixes llvm#70011
@benlangmuir
Copy link
Collaborator

This change also caused a performance regression when using -MD because getting a canonical path can be expensive. I'm seeing up to 15% slowdown in compiling a file that just includes ~1200 system headers. I measured on Darwin, but I suspect all platforms will hit some slowdown. I think we need to make this opt-in or find a different solution to the original problem.

@aeubanks
Copy link
Contributor

aeubanks commented Nov 8, 2023

I'll add -fno-canonical-system-headers to match gcc to toggle this behavior

@benlangmuir
Copy link
Collaborator

@aeubanks Having -f[no-]canonical-system-headers seems fine, but we should not make the default to be calling realpath on every system header when that causes a compile time regression.

@aeubanks
Copy link
Contributor

aeubanks commented Nov 8, 2023

Yes, having it off by default sounds good.

Actually I'll just revert the patch to start with and then rework it

@benlangmuir
Copy link
Collaborator

Thanks @aeubanks ! Feel free to tag me for a review when you rework it.

aeubanks added a commit to aeubanks/llvm-project that referenced this issue Nov 8, 2023
… when -canonical-prefixes"

This reverts commit 578a471.

This causes multiple issues. Compile time slowdown due to more path canonicalization, and weird behavior on Windows.

Fixes llvm#70011.
aeubanks added a commit that referenced this issue Nov 8, 2023
… when -canonical-prefixes" (#71697)

This reverts commit 578a471.

This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.

Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.

Fixes #70011.
benlangmuir pushed a commit to benlangmuir/llvm-project that referenced this issue Nov 9, 2023
… when -canonical-prefixes" (llvm#71697)

This reverts commit 578a471.

This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.

Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.

Fixes llvm#70011.

(cherry picked from commit 955dd88)

rdar://118178759
benlangmuir pushed a commit to swiftlang/llvm-project that referenced this issue Nov 9, 2023
… when -canonical-prefixes" (llvm#71697)

This reverts commit 578a471.

This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.

Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.

Fixes llvm#70011.

(cherry picked from commit 955dd88)

rdar://118178759
@aeubanks aeubanks added this to the LLVM 17.0.X Release milestone Nov 13, 2023
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Nov 13, 2023
@aeubanks
Copy link
Contributor

/cherry-pick 955dd88

@llvmbot
Copy link
Member

llvmbot commented Nov 13, 2023

Failed to cherry-pick: 955dd88

https://github.com/llvm/llvm-project/actions/runs/6854110849

Please manually backport the fix and push it to your github fork. Once this is done, please add a comment like this:

/branch <user>/<repo>/<branch>

@aeubanks
Copy link
Contributor

/branch aeubanks/llvm-project/backport

@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2023

/pull-request llvm/llvm-project-release-prs#776

aeubanks added a commit to aeubanks/llvm-project that referenced this issue Nov 14, 2023
… when -canonical-prefixes" (llvm#71697)

This reverts commit 578a471.

This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.

Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.

Fixes llvm#70011.
@aeubanks
Copy link
Contributor

/branch aeubanks/llvm-project/backport

@tru tru moved this from Needs Triage to Needs Review in LLVM Release Status Nov 20, 2023
@tru tru moved this from Needs Review to Done in LLVM Release Status Nov 20, 2023
tru pushed a commit that referenced this issue Nov 20, 2023
… when -canonical-prefixes" (#71697)

This reverts commit 578a471.

This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.

Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.

Fixes #70011.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment