| 1 | +- Feature Name: `inline_intents` |
| 2 | +- Start Date: 2025-02-22 |
| 3 | +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) |
| 4 | +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Add `#[inline(trampoline)]` and `#[inline(rarely)]` to give additional control |
| 10 | +over the inlining decisions made by the compiler. |
| 11 | + |
| 12 | +# Motivation |
| 13 | +[motivation]: #motivation |
| 14 | + |
| 15 | +Right now it's pretty common to just slap `#[inline]` on things without thinking |
| 16 | +all that hard about it, and the existing controls aren't great. |
| 17 | + |
| 18 | +Often we'll get PRs using `inline(always)` "because it's just calling something |
| 19 | +else so of course it should be inlined", for example. But because of the |
| 20 | +bottom-up nature of inlining, that's a bad thing to do: if the callee happens |
| 21 | +to get inlined into the wrapper first, then `always` also copies that callee's |
| 22 | +body into every caller of the wrapper, which might not be what was actually desired. |
| 23 | + |
| 24 | +At the same time, sometimes it's useful to put `inline` on things to make the |
| 25 | +definition available to LLVM, but where it probably shouldn't actually be |
| 26 | +inlined in general, only in particular special cases (perhaps when one of the |
| 27 | +arguments is a small constant, for example). |
| 28 | + |
| 29 | +It would thus be nice to give additional options to the user to let them |
| 30 | +describe *why* they wanted a function inlined, and hopefully be able to make |
| 31 | +better decisions in the backend as a result. |
| 32 | + |
| 33 | + |
| 34 | +# Guide-level explanation |
| 35 | +[guide-level-explanation]: #guide-level-explanation |
| 36 | + |
| 37 | +In most cases, plain `#[inline]` is fine, especially with PGO and LTO. |
| 38 | +However, if you've measured and the compiler is making a poor choice, there are some |
| 39 | +options you can use to hint the compiler towards what you want. |
| 40 | + |
| 41 | +## `inline(trampoline)` |
| 42 | + |
| 43 | +This is intended for functions which quickly "bounce" the caller off to some |
| 44 | +other implementation, after doing some initial checks or transformations. |
| 45 | + |
| 46 | +For example, this is useful in a safe function which does some safety checks, |
| 47 | +then calls an `unsafe` version of the function to do the actual work. Or maybe |
| 48 | +it's a function with a common trivial path, but which sometimes needs to call |
| 49 | +out to a more complicated version, like how `Vec::push` is usually trivial but |
| 50 | +occasionally needs to reallocate. |
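
As a rough sketch of the first case above (a safe wrapper in front of an `unsafe` worker), the shape might look like the following. The names `get_checked` and `get_unchecked_impl` are hypothetical, and plain `#[inline]` stands in for the proposed `#[inline(trampoline)]` so that the sketch compiles on current compilers.

```rust
/// Hypothetical unchecked worker: the caller must guarantee `idx < data.len()`.
unsafe fn get_unchecked_impl(data: &[u32], idx: usize) -> u32 {
    // SAFETY: the bound is upheld by the caller.
    unsafe { *data.get_unchecked(idx) }
}

// Under this RFC the attribute here would be `#[inline(trampoline)]`.
// Plain `#[inline]` is an assumption made so the example builds today.
#[inline]
pub fn get_checked(data: &[u32], idx: usize) -> Option<u32> {
    if idx < data.len() {
        // SAFETY: `idx` was bounds-checked just above.
        Some(unsafe { get_unchecked_impl(data, idx) })
    } else {
        None
    }
}
```

The intent being expressed is that the cheap bounds check is worth copying into callers, where it can often be folded away, without also asking for the worker to be copied along with it.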
| 51 | + |
| 52 | +## `inline(rarely)` |
| 53 | + |
| 54 | +This is intended for functions which normally shouldn't be inlined, but where |
| 55 | +exceptions exist so you don't want full `inline(never)`. |
| 56 | + |
| 57 | +For example, maybe this is a vectorized loop that you wouldn't want copied into |
| 58 | +every caller, but you know that using it for short arrays is common, and thus |
| 59 | +want the back-end to be able to inline and fully unroll it in those cases. |
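
A minimal sketch of that situation, assuming a hypothetical `sum_of_squares` loop and a hypothetical caller that always passes three elements. Plain `#[inline]` again stands in for the proposed `#[inline(rarely)]`; whether a given back-end actually unrolls the short case is not guaranteed.

```rust
// Under this RFC the attribute here would be `#[inline(rarely)]`.
// Plain `#[inline]` is an assumption made so the example builds today.
#[inline]
pub fn sum_of_squares(data: &[u32]) -> u32 {
    // A simple reduction loop that a back-end may choose to vectorize.
    data.iter()
        .fold(0u32, |acc, &x| acc.wrapping_add(x.wrapping_mul(x)))
}

/// Hypothetical caller with a short, fixed-size input, where inlining
/// and fully unrolling the loop could be worthwhile.
pub fn norm_sq_3(v: [u32; 3]) -> u32 {
    sum_of_squares(&v)
}
```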
| 60 | + |
| 61 | +## In combination |
| 62 | + |
| 63 | +These can work particularly well together. |
| 64 | + |
| 65 | +For example, perhaps the public function is `inline(trampoline)` so that the |
| 66 | +initial `NonZero::new` check can be inlined into the caller (where it's more |
| 67 | +easily optimized out) but that calls a private `inline(rarely)` function which |
| 68 | +takes `NonZero` to avoid needing extra checks internally. |
| 69 | + |
| 70 | +Or perhaps you write an `inline(trampoline)` function that picks a strategy |
| 71 | +based on the types and arguments, then dispatches to one of many possible |
| 72 | +`inline(rarely)` implementations. |
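
The `NonZero` pairing described above could be sketched roughly as follows. The functions `chunk_count` and `chunk_count_impl` are hypothetical, and both use plain `#[inline]` in place of the proposed attributes so that the code compiles on current compilers.

```rust
use std::num::NonZeroU32;

// Proposed spelling under this RFC: `#[inline(rarely)]`.
// The private worker takes `NonZeroU32`, so it needs no zero check.
#[inline]
fn chunk_count_impl(len: u32, chunk: NonZeroU32) -> u32 {
    let c = chunk.get();
    // Ceiling division without risking overflow in `len + c - 1`.
    len / c + u32::from(len % c != 0)
}

// Proposed spelling under this RFC: `#[inline(trampoline)]`.
// The public entry point does the cheap validation, then dispatches.
#[inline]
pub fn chunk_count(len: u32, chunk: u32) -> Option<u32> {
    let chunk = NonZeroU32::new(chunk)?;
    Some(chunk_count_impl(len, chunk))
}
```

When a caller passes a constant `chunk`, the `NonZeroU32::new` check in the wrapper is the part most likely to be optimized out once inlined, which is the reason for splitting the function this way.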
| 73 | + |
| 74 | + |
| 75 | +# Reference-level explanation |
| 76 | +[reference-level-explanation]: #reference-level-explanation |
| 77 | + |
| 78 | +Like `inline`, these "do nothing" in a spec sense. |
| 79 | + |
| 80 | +The only language change is thus in allowing the two new tokens in the attribute |
| 81 | +(in addition to `never` and `always`). |
| 82 | + |
| 83 | +## Implementation options |
| 84 | + |
| 85 | +⚠ These are not normative, just to illustrate possibilities. ⚠ |
| 86 | + |
| 87 | +In LLVM, `#[inline]` sets the [`inlinehint` function attribute](https://llvm.org/docs/LangRef.html#function-attributes), |
| 88 | +so `inline(rarely)` could skip doing that, and thus comparatively slightly |
| 89 | +discourage inlining it. |
| 90 | + |
| 91 | +In MIR inlining, we already attempt to deduce trampolines as of [#127113]. |
| 92 | +This would let people opt in to that behaviour, even in places it's not obvious. |
| 93 | +Then we could allow inlining trampolines into trampolines, but avoid inlining |
| 94 | +non-trampolines into a trampoline. And we have an internal attribute |
| 95 | +`rustc_no_mir_inline` which blocks MIR inlining, so `inline(rarely)` could also |
| 96 | +do that (or maybe just make the threshold very restrictive). |
| 97 | + |
| 98 | +[#127113]: https://github.com/rust-lang/rust/pull/127113 |
| 99 | + |
| 100 | + |
| 101 | +# Drawbacks |
| 102 | +[drawbacks]: #drawbacks |
| 103 | + |
| 104 | +These are still up to the programmer to get right, so |
| 105 | +- they might just make analysis paralysis worse |
| 106 | +- they might be worse than having something like a PGO-based system instead |
| 107 | +- they might turn out to not actually help as hoped |
| 108 | +- they might lead to more bugs in the tracker when they don't do what people thought |
| 109 | +- they might be the wrong pivots and we'll just end up needing more |
| 110 | + |
| 111 | +However, at worst we just allow them without having them do anything, so the |
| 112 | +cost of having them is very small: a compiler just parses-then-ignores them. |
| 113 | + |
| 114 | + |
| 115 | +# Rationale and alternatives |
| 116 | +[rationale-and-alternatives]: #rationale-and-alternatives |
| 117 | + |
| 118 | +The hope here is that by giving *intent* we can tune how these work in a better |
| 119 | +way than is possible with the existing knobs. |
| 120 | + |
| 121 | +The existing options are insufficient because `always` and `never` are too strong. |
| 122 | +With LLVM's bottom-up inlining logic, `always` isn't the right answer the vast |
| 123 | +majority of the time, and if LLVM really doesn't want to inline it, most of the |
| 124 | +time there's a good reason for that. Similarly if it really thinks that |
| 125 | +something would be good to inline, the majority of the time blocking that with |
| 126 | +`never` isn't really what you want. |
| 127 | + |
| 128 | +For example, the library wants to *discourage* inlining of the UB check |
| 129 | +functions, but doesn't want `never` because if all the arguments are constants |
| 130 | +inlining it to const-fold away all the checks is good. (It'd be silly to force |
| 131 | +`NonNull::new(ptr::without_provenance(2))` to have an actual call to |
| 132 | +`precondition_check(2_ptr)`.) |
| 133 | + |
| 134 | +And the blanket impl of `Into::into` that just calls `From::from` wants to |
| 135 | +"always" be inlined, but because of how inlining works it'd be wrong for it to |
| 136 | +use `always` since it doesn't necessarily want to inline the whole subtree. |
| 137 | + |
| 138 | +Thus today usually all that one can say is `#[inline]`, and hope that the |
| 139 | +compiler does the right thing. Often it does, but sometimes it also makes poor |
| 140 | +decisions that can result in binary bloat or poor performance. With these to |
| 141 | +nudge it one way or the other, hopefully that will avoid the downsides of the |
| 142 | +existing really big hammers, while also not over-promising for what will always |
| 143 | +just be a hint, not a promise. |
| 144 | + |
| 145 | +`inline(rarely)` could instead be done with something in the body to increase |
| 146 | +the cost, perhaps marking it `#[inline]` but then calling a hypothetical magic |
| 147 | +`core::hint::discourage_inlining()`. But to make it an inlining candidate at |
| 148 | +all means it still needs the attribute, so it seems nicer to let it just be |
| 149 | +mentioned in the attribute without needing to inspect the body, given that |
| 150 | +the attribute is already parameterized. |
| 151 | + |
| 152 | + |
| 153 | +# Prior art |
| 154 | +[prior-art]: #prior-art |
| 155 | + |
| 156 | +Various languages have inlining controls, but I'm unaware of any with either of |
| 157 | +these specific intents. |
| 158 | + |
| 159 | +[LLVM](https://llvm.org/docs/LangRef.html#function-attributes) |
| 160 | +has `alwaysinline` vs `inlinehint` vs `noduplicate` vs `noinline`. |
| 161 | + |
| 162 | +[GCC](https://gcc.gnu.org/onlinedocs/gcc/Inline.html) |
| 163 | +has `inline` and `__attribute__((always_inline))` and `extern inline`. |
| 164 | + |
| 165 | +[DotNet](https://learn.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.methodimploptions) |
| 166 | +has `NoInlining` vs `AggressiveInlining` hints. |
| 167 | + |
| 168 | + |
| 169 | +# Unresolved questions |
| 170 | +[unresolved-questions]: #unresolved-questions |
| 171 | + |
| 172 | +The exact behaviour of how these affect program compilation will likely continue |
| 173 | +to be tweaked even after they stabilize, assuming this were to be accepted. |
| 174 | + |
| 175 | +The goal of the RFC process is to pick tokens that are sufficiently evocative |
| 176 | +of what intent the coder is expressing by using them; the details happen later. |
| 177 | + |
| 178 | + |
| 179 | +# Future possibilities |
| 180 | +[future-possibilities]: #future-possibilities |
| 181 | + |
| 182 | +There are always more possible intents that one could imagine. For example, one |
| 183 | +for "this is small and uninteresting" could make sense, which could even have |
| 184 | +more non-semantic effects like emitting less debug info. But for now it's less |
| 185 | +obvious that that's worth distinguishing, since a function just being small (like |
| 186 | +basic accessor functions often are) is already enough to make it |
| 187 | +cross-crate inlinable even without the attribute, and things that are small |
| 188 | +and simple are reliably inlined without extra hinting anyway. |
| 189 | + |