Offload device1 #142696
base: master
☔ The latest upstream changes (presumably #143026) made this pull request unmergeable. Please resolve the merge conflicts.
We'll still need #143684 to properly recognize our GPU hardware and run the binary on end-user hardware, but here I'll only add codegen tests, so it should work fine for CI.
When Rust provides LLVM bitcode files to lld and the bitcode contains function summaries, as used for thin LTO, lld defaults to thin LTO. This prevents some optimizations that are only applied for fat LTO. We solve this by not creating function summaries when fat LTO is enabled; the bitcode for the module is written out directly.
An alternative solution would be to set the `ThinLTO=0` module flag to signal lld to do fat LTO.
The code in clang that sets this flag is here: https://github.com/llvm/llvm-project/blob/560149b5e3c891c64899e9912e29467a69dc3a4c/clang/lib/CodeGen/BackendUtil.cpp#L1150
The code in LLVM that queries the flag and defaults to thin LTO if it is not set is here: https://github.com/llvm/llvm-project/blob/e258bca9505f35e0a22cb213a305eea9b76d11ea/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L4441-L4446
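To make the decision described above easier to follow, here is a rough sketch of it as a toy model. This is not the rustc or lld code: `LtoMode`, `BitcodeKind`, `bitcode_kind_for`, and `lld_mode_for` are hypothetical names invented for illustration, and the model only covers the two cases the commit message talks about.

```rust
/// Hypothetical: the LTO mode requested for the build.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LtoMode {
    Thin,
    Fat,
}

/// Hypothetical: what kind of bitcode file gets handed to the linker.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BitcodeKind {
    /// Bitcode with function summaries, as used for thin LTO.
    WithSummary,
    /// Module bitcode written out directly, without summaries.
    Plain,
}

/// Producer side (assumption, modelled on the description above):
/// skip the function summaries when fat LTO is requested.
fn bitcode_kind_for(mode: LtoMode) -> BitcodeKind {
    match mode {
        LtoMode::Fat => BitcodeKind::Plain,
        LtoMode::Thin => BitcodeKind::WithSummary,
    }
}

/// Linker side (assumption, modelled on the description above):
/// summaries present means thin LTO, otherwise fat LTO.
fn lld_mode_for(kind: BitcodeKind) -> LtoMode {
    match kind {
        BitcodeKind::WithSummary => LtoMode::Thin,
        BitcodeKind::Plain => LtoMode::Fat,
    }
}
```

In this model the requested mode survives the round trip through the linker, which is the property the change is after. Before the change, a fat LTO request would still have produced `WithSummary` bitcode and lld would have picked thin LTO.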
Related (also WIP) rustc-dev-guide update: rust-lang/rustc-dev-guide#2524
r? @oli-obk
Here I'll continue the device-side code generation and park the second commit for now.
I first want to land the MVP in the other PR.