-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Add additional inline
intents
#3778
base: master
Are you sure you want to change the base?
Conversation
b2c407c
to
be7ed44
Compare
be7ed44
to
87ef134
Compare
I like this but am kind of mentally conflicted about the name "trampoline." I can't tell if it's a term that's already in use, or if it's something that was made up to match the metaphor. Ultimately would be fine with the name and am not going to block this on bikeshedding, but, kind of wanted to express my thoughts in case someone else feels more strongly and isn't sure where others stand. |
IIUC after this RFC we would have these 8 inline levels (including the
|
See https://en.wikipedia.org/wiki/Trampoline_(computing) -- it's used for lots of things. Probably the closest meaning to this one is the calling convention one, where a trampoline is a function with calling convention A that rearranges the stack/registers then calls another function with calling convention B. If there's a better name, though, I'd be happy to switch it. @kennytm There might be another if you count implicitly cross-crate-inline as different from We might be able to replace (Also, both |
it's a function with a common trivial path, but which sometimes needs to call | ||
out to a more complicated version, like how `Vec::push` is usually trivial but | ||
occasionally needs to reallocate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big thumbs up for some way of expressing this intent, usually but not always together with hint::cold_path()
. Currently the best way to express this is by putting #[cold]
or #[inline(never)]
on the uncommon, more complicated code path. But both of those options imply some incorrect/undesirable things.
In LLVM, `#[inline]` sets the [`inlinehint` function attribute](https://llvm.org/docs/LangRef.html#function-attributes), | ||
so `inline(rarely)` could skip doing that, and thus comparatively slightly | ||
discourage inlining it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even without inlinehint
LLVM's inlining heuristic can be mostly characterized as "yes", so I'm not sure if I'd describe this behavior as "inline rarely". This is not just a naming concern -- I don't know off-hand where I'd use an attribute that works this way. Not adding any #[inline]
attribute and turning on ThinLTO mostly covers the "I don't want to encourage inlining in general but it's fine if it happens" scenarios for me without the costs of emitting multiple copies of the function in different CGUs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so that suggests that something like hint::cold_path()
but without being as strong a statement as to mark the function as actually cold
? I guess hint::discourage_mir_inlining()
would be enough to do this (probably under a less implementation-focused name).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you actually aiming at rustc's MIR inliner when proposing this? Not LLVM's inliner? I guess I don't understand what effect this option is supposed to achieve, what gap it's supposed to fill. Would it make the callee's body available for inlining in principle even without LTO (by emitting it into multiple CGUs), similar to #[inline]
but without the extra encouragement for actually inlining it? Or would #[inline(rarely)]
only make the inliner less eagier in cases where an unannotated callee would be available for inlining anyway, similar to #[inline(never)]
but not as absolute?
[drawbacks]: #drawbacks | ||
|
||
These are still up to the programmer to get right, so | ||
- they might just make analysis paralysis worse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also the proposed inline(usually)
(rust-lang/rust#130679) which I think is very well-motivated but adds to this problem. It would be great if we could just make the user-facing inline(always)
work that way and keep an internal attribute like #[rustc_force_inline]
for the cases (intrinsics) that need inlining even in opt-level=0 builds, but it's not clear if that will work out in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to keep conversation up to date, #[rustc_force_inline]
has landed and the intent is to use it to implement intrinsics: rust-lang/rust#134082.
Perhaps |
I feel like the In Combination section could do with a code example - I'm finding it hard to track exactly which attribute would go on what function. Also |
I find myself thinking about the mention of analysis paralysis as a downside. I'd really like to understand 1) to what extent you expect to see these used commonly in codebases, 2) to what extent these work better than e.g. our existing heuristics for recognizing trampolines, 3) some kind of examples of these helping with codegen or compilation time or similar. Or, to put this a different way: I think I would not want to approve this without first seeing the results of some kind of compiler experiment showing what would improve if we marked a bunch of functions with this, and weighing that against the very real cost of having multiple new variations of inline for people to consider. |
The current trampoline detection is extremely simple, as it's focused only on avoiding MIR bloat. It's essentially "if this is only two blocks -- the first being a call that returns to the second -- then this is a trampoline and cannot inline anything that makes that no longer true". That means that it it's not detected even in "simple" cases like "well we The place I think where it would be useful to encourage inlining the Similarly, rust-lang/rust#136718 was experimenting with making Could we write a bunch of more complex heuristics as we identify more of these cases? Maybe. But it's be nice to just say what I mean directly in a bunch of them.
I'd be happy to take the feedback in this thread as a "tentative interest; please make an experiment and we'll see". |
Often we'll get PRs using `inline(always)` "because it's just calling something | ||
else so of course it should be inlined", for example. But because of the | ||
bottom-up nature of inlining, that's a bad thing to do because if the callee | ||
happens to get inlined, then it'll "always" inline that callee too, which might | ||
not be what was actually desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph is unclear to me; in particular, it refers to “the callee” and “that callee” and I am not sure which two (or one?) functions are being referred to. I am familiar with the idea that LLVM inlining proceeds as, if A calls B calls C, "inline C into B, and only then decide if B’s code would be good to inline into A", and I think that’s the general scenario this is referring to, but the paragraph should be comprehensible to people who haven’t met that model, and it’s not clear which of these functions we are supposing is #[inline(always)]
.
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
In most cases, plain `#[inline]` is fine, especially with PGO and LTO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usual coding advice I see is “In most cases, you don’t need any inlining annotation at all”. Could you give more context for what situations this sentence applies to? For example, is this actually “in libraries” (the situation where inlining is otherwise at the mercy of the small-function heuristic)?
This is intended for functions which quickly "bounce" the caller off to some | ||
other implementation, after doing some initial checks or transformations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could usefully be a little more explicit:
This is intended for functions which quickly "bounce" the caller off to some | |
other implementation, after doing some initial checks or transformations. | |
This is intended for functions which quickly "bounce" the caller off to some | |
other implementation which should not necessarily be inlined itself, | |
after doing some initial checks or transformations. |
not be what was actually desired. | ||
|
||
At the same time, sometimes it's useful to put `inline` on things to make the | ||
definition available to to LLVM, but where it probably shouldn't actually be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definition available to to LLVM, but where it probably shouldn't actually be | |
definition available to LLVM, but where it probably shouldn't actually be |
(total novice here) It seems confusing to be offered one possible reason to encourage inlining and no others. Why is trampoline chosen as a reason that is worth distinguishing and not others? Also, I tend to think that the "trampoline cases" should be the more easy cases for the compiler to decide to inline without hints. But if that were an inline hint blessed by the language, I might feel less inclined to trust the compiler by default. More broadly, I wonder if this feature encourages folks to add inline based on unverified intuition. Also, I kinda worry about the gray area in defining trampoline - How many lines of code leading up to the "trampoline call" is too many? Some inline hints might not neatly map to a single reason, but this feature kinda suggests that they should. Just a note that this vaguely reminds me of |
Alternative bikeshed colors for |
Another |
This proposes adding
#[inline(trampoline)]
and#[inline(rarely)]
to hint the compiler additional information about why you're marking the function for inlining, to help it hopefully make better choices without the overly-strong hammers ofalways
ornever
.Rendered