Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add additional inline intents #3778

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Feb 23, 2025

This proposes adding #[inline(trampoline)] and #[inline(rarely)] to hint the compiler additional information about why you're marking the function for inlining, to help it hopefully make better choices without the overly-strong hammers of always or never.

Rendered

@scottmcm scottmcm added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 23, 2025
@scottmcm scottmcm force-pushed the more-inline-options branch from b2c407c to be7ed44 Compare February 23, 2025 01:54
@scottmcm scottmcm force-pushed the more-inline-options branch from be7ed44 to 87ef134 Compare February 23, 2025 02:07
@clarfonthey
Copy link

I like this but am kind of mentally conflicted about the name "trampoline." I can't tell if it's a term that's already in use, or if it's something that was made up to match the metaphor.

Ultimately would be fine with the name and am not going to block this on bikeshedding, but, kind of wanted to express my thoughts in case someone else feels more strongly and isn't sure where others stand.

@kennytm
Copy link
Member

kennytm commented Feb 23, 2025

IIUC after this RFC we would have these 8 inline levels (including the rustc_* ones):

Attribute LLVM function attribute MIR inliner effect
#[rustc_force_inline = "reason"] alwaysinline compile error if cannot inline
#[inline(always)] alwaysinline always inline
#[inline] inlinehint -Zinline-mir-hint-threshold=100
- - -Zinline-mir-threshold=50
#[inline(trampoline)] -? forces caller_is_inline_forwarder?
#[inline(rarely)] -? an even lower threshold?
#[rustc_no_mir_inline] - never inline
#[inline(never)] noinline never inline

@scottmcm
Copy link
Member Author

scottmcm commented Feb 23, 2025

I like this but am kind of mentally conflicted about the name "trampoline." I can't tell if it's a term that's already in use, or if it's something that was made up to match the metaphor.

See https://en.wikipedia.org/wiki/Trampoline_(computing) -- it's used for lots of things. Probably the closest meaning to this one is the calling convention one, where a trampoline is a function with calling convention A that rearranges the stack/registers then calls another function with calling convention B.

If there's a better name, though, I'd be happy to switch it.


@kennytm There might be another if you count implicitly cross-crate-inline as different from #[inline].

We might be able to replace rustc_no_mir_inline with inline(rarely), though -- block it in mir inlining unless all the arguments are Operand::Const, say. That'd be enough for the two cases on my machine in library right now.

(Also, both rustc_no_mir_inlines are also marked #[inline], so I think they're actually LLVM inlinehint.)

Comment on lines +48 to +50
it's a function with a common trivial path, but which sometimes needs to call
out to a more complicated version, like how `Vec::push` is usually trivial but
occasionally needs to reallocate.
Copy link

@hanna-kruppe hanna-kruppe Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big thumbs up for some way of expressing this intent, usually but not always together with hint::cold_path(). Currently the best way to express this is by putting #[cold] or #[inline(never)] on the uncommon, more complicated code path. But both of those options imply some incorrect/undesirable things.

Comment on lines +87 to +89
In LLVM, `#[inline]` sets the [`inlinehint` function attribute](https://llvm.org/docs/LangRef.html#function-attributes),
so `inline(rarely)` could skip doing that, and thus comparatively slightly
discourage inlining it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even without inlinehint LLVM's inlining heuristic can be mostly characterized as "yes", so I'm not sure if I'd describe this behavior as "inline rarely". This is not just a naming concern -- I don't know off-hand where I'd use an attribute that works this way. Not adding any #[inline] attribute and turning on ThinLTO mostly covers the "I don't want to encourage inlining in general but it's fine if it happens" scenarios for me without the costs of emitting multiple copies of the function in different CGUs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so that suggests that something like hint::cold_path() but without being as strong a statement as to mark the function as actually cold? I guess hint::discourage_mir_inlining() would be enough to do this (probably under a less implementation-focused name).

Copy link

@hanna-kruppe hanna-kruppe Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you actually aiming at rustc's MIR inliner when proposing this? Not LLVM's inliner? I guess I don't understand what effect this option is supposed to achieve, what gap it's supposed to fill. Would it make the callee's body available for inlining in principle even without LTO (by emitting it into multiple CGUs), similar to #[inline] but without the extra encouragement for actually inlining it? Or would #[inline(rarely)] only make the inliner less eagier in cases where an unannotated callee would be available for inlining anyway, similar to #[inline(never)] but not as absolute?

[drawbacks]: #drawbacks

These are still up to the programmer to get right, so
- they might just make analysis paralysis worse
Copy link

@hanna-kruppe hanna-kruppe Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also the proposed inline(usually) (rust-lang/rust#130679) which I think is very well-motivated but adds to this problem. It would be great if we could just make the user-facing inline(always) work that way and keep an internal attribute like #[rustc_force_inline] for the cases (intrinsics) that need inlining even in opt-level=0 builds, but it's not clear if that will work out in the end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to keep conversation up to date, #[rustc_force_inline] has landed and the intent is to use it to implement intrinsics: rust-lang/rust#134082.

@Jules-Bertholet
Copy link
Contributor

Perhaps inline should work like the diagnostic:: namespace, such that an unrecognized intent is a warning & no-op instead of an error?

@FHTMitchell
Copy link

I feel like the In Combination section could do with a code example - I'm finding it hard to track exactly which attribute would go on what function. Also NonZero::new seems to come out of nowhere.

@joshtriplett
Copy link
Member

I find myself thinking about the mention of analysis paralysis as a downside. I'd really like to understand 1) to what extent you expect to see these used commonly in codebases, 2) to what extent these work better than e.g. our existing heuristics for recognizing trampolines, 3) some kind of examples of these helping with codegen or compilation time or similar.

Or, to put this a different way: I think I would not want to approve this without first seeing the results of some kind of compiler experiment showing what would improve if we marked a bunch of functions with this, and weighing that against the very real cost of having multiple new variations of inline for people to consider.

@scottmcm
Copy link
Member Author

scottmcm commented Feb 26, 2025

The current trampoline detection is extremely simple, as it's focused only on avoiding MIR bloat. It's essentially "if this is only two blocks -- the first being a call that returns to the second -- then this is a trampoline and cannot inline anything that makes that no longer true". That means that it it's not detected even in "simple" cases like "well we assert! something first", or where there's a dropped generic parameter, because that means more blocks which might increase the MIR size.

The place I think inline(trampoline) would be most useful is things like

https://github.com/rust-lang/rust/blob/ac91805f3179fc2225c60e8ccf5a1daa09d43f3d/library/core/src/slice/mod.rs#L3861-L3869

where it would be useful to encourage inlining the assert into the caller, even against the "but that might lead to MIR bloat" heuristic, rather than the normal behaviour of first trying (subject to other heuristics still) to inline ptr::swap_nonoverlapping into [T]::swap_with_slice even if that makes it too big to then inline into callers.

Similarly, rust-lang/rust#136718 was experimenting with making slice::Iter::next be something like if T::IS_ZST { self.next_zst() } else { self.next_not_zst() }, which doesn't really work today because both next_zst and next_not_zst are both small enough that we end up inlining them, but then the combined thing is too big to inline, making it hard to optimize out that IS_ZST check. So it'd be nice to put a inline(trampoline) on that to make the ZST check nearly-certain to inline into the caller, which can then optimize it out and make it obviously good to inline the one next implementation it actually cared about. (TBH, could even try putting inline(discouraged) on the next_zst version because iterating over slices of things certain to be ZSTs is rare enough that it's probably not worth inlining that in generic MIR ever, but it would still be good for LLVM to inline it.)

Could we write a bunch of more complex heuristics as we identify more of these cases? Maybe. But it's be nice to just say what I mean directly in a bunch of them.

I think I would not want to approve this without first seeing the results of some kind of compiler experiment

I'd be happy to take the feedback in this thread as a "tentative interest; please make an experiment and we'll see".

Comment on lines +18 to +22
Often we'll get PRs using `inline(always)` "because it's just calling something
else so of course it should be inlined", for example. But because of the
bottom-up nature of inlining, that's a bad thing to do because if the callee
happens to get inlined, then it'll "always" inline that callee too, which might
not be what was actually desired.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph is unclear to me; in particular, it refers to “the callee” and “that callee” and I am not sure which two (or one?) functions are being referred to. I am familiar with the idea that LLVM inlining proceeds as, if A calls B calls C, "inline C into B, and only then decide if B’s code would be good to inline into A", and I think that’s the general scenario this is referring to, but the paragraph should be comprehensible to people who haven’t met that model, and it’s not clear which of these functions we are supposing is #[inline(always)].

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

In most cases, plain `#[inline]` is fine, especially with PGO and LTO.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usual coding advice I see is “In most cases, you don’t need any inlining annotation at all”. Could you give more context for what situations this sentence applies to? For example, is this actually “in libraries” (the situation where inlining is otherwise at the mercy of the small-function heuristic)?

Comment on lines +43 to +44
This is intended for functions which quickly "bounce" the caller off to some
other implementation, after doing some initial checks or transformations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could usefully be a little more explicit:

Suggested change
This is intended for functions which quickly "bounce" the caller off to some
other implementation, after doing some initial checks or transformations.
This is intended for functions which quickly "bounce" the caller off to some
other implementation which should not necessarily be inlined itself,
after doing some initial checks or transformations.

not be what was actually desired.

At the same time, sometimes it's useful to put `inline` on things to make the
definition available to to LLVM, but where it probably shouldn't actually be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
definition available to to LLVM, but where it probably shouldn't actually be
definition available to LLVM, but where it probably shouldn't actually be

@camsteffen
Copy link

(total novice here)

It seems confusing to be offered one possible reason to encourage inlining and no others. Why is trampoline chosen as a reason that is worth distinguishing and not others? Also, I tend to think that the "trampoline cases" should be the more easy cases for the compiler to decide to inline without hints. But if that were an inline hint blessed by the language, I might feel less inclined to trust the compiler by default. More broadly, I wonder if this feature encourages folks to add inline based on unverified intuition.

Also, I kinda worry about the gray area in defining trampoline - How many lines of code leading up to the "trampoline call" is too many? Some inline hints might not neatly map to a single reason, but this feature kinda suggests that they should.

Just a note that this vaguely reminds me of #[allow(..), reason=".."].

@jdahlstrom
Copy link

jdahlstrom commented Mar 4, 2025

Alternative bikeshed colors for trampoline: wrapper, forwarding, delegate, delegating

@edwloef
Copy link

edwloef commented Mar 4, 2025

Alternative bikeshed colors for trampoline: wrapper, forwarding, delegate, delegating

Another trampoline bikeshed: #[inline(shallow)]?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.