RFC: Add an attribute for raising the alignment of various items #3806

Open
wants to merge 27 commits into
base: master

Contributor

@Jules-Bertholet Jules-Bertholet commented May 1, 2025

Port C alignas to Rust.

Rendered

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 2, 2025
@traviscross traviscross added I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. I-lang-radar Items that are on lang's radar and will need eventual work or consideration. labels May 2, 2025
Comment on lines +96 to +100
The `align` attribute is a new inert, built-in attribute that can be applied to
ADT fields, `static` items, function items, and local variable declarations. The
attribute accepts a single required parameter, which must be a power-of-2
integer literal from 1 up to 2<sup>29</sup>. (This is the same as
`#[repr(align(…))]`.)

The 2^29 limit is way too high. The consistency with #[repr(align(..))] is a good default but alignments larger than a page or two have never worked properly in local variables (rust-lang/rust#70143) and in statics (rust-lang/rust#70022, rust-lang/rust#70144). While there are some use cases for larger alignments on types (if they're heap allocated) and an error on putting such a type in a local or static is ugly, for this new attribute we could just avoid the problem from the start.

Member

For a struct field, both GCC and clang supported _Alignas(N) for N ≤ 2^28 (268435456).

Contributor Author

The bug with local variables (rust-lang/rust#70143) seems to have been fixed everywhere except Windows, and is just waiting on someone to fix it there as well in LLVM. (And even on Windows where the issue is not fixed, the only effect is to break the stack overflow protection, bringing it down to the same level as many Tier 2 targets.)

So the only remaining issue is with statics, where it looks like a target-specific max alignment might be necessary. Once implemented, that solution can be used to address align as well.

Overall, I don't think any of this is sufficient motivation to impose a stricter maximum on #[align].

Note that fixing the soundness issue for locals just means that putting a local with huge alignment in a stack frame is very likely to trigger the stack overflow check and abort the program. There is no use case for such massively over-aligned locals or statics, which is why those soundness issues have been mostly theoretical problems and why the only progress toward fixing them over many years has been side effects of unrelated improvements (inline stack checks).

The only reason why the repr(align(..)) limit is so enormous is because it’s plausibly useful for heap allocations. Adding a second, lower limit for types put in statics and locals nowadays is somewhat tricky to design and drive to consensus (e.g., there are theoretical backwards compatibility concerns) and not a priority for anyone, so who knows when it’ll happen. For #[align] we have the benefit of hindsight and could just mostly side-step the whole mess. I don’t see this as “needlessly restricting the new feature” but rather as “not pointlessly expanding upon an existing soundness issue for no reason”.

Member

@programmerjake programmerjake May 8, 2025

> There is no use case for such massively over-aligned locals or statics

one use case I can think of is having a massive array that is faster because it's properly aligned so the OS can use huge pages (on x86_64, those require alignment $2^{21}$ or $2^{30}$), reducing TLB pressure. admittedly, that would only realistically be useful for statics or heap-allocated/mmap-ed memory.
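
For the heap-allocated case, a minimal sketch in stable Rust (using `std::alloc::Layout`, not the proposed `#[align]` attribute) might look like the following. The names `HUGE_PAGE_2M` and `huge_aligned_alloc_is_aligned` are made up for illustration, and the code only requests the alignment; whether the OS then backs the region with a huge page depends on platform configuration that is not shown here.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// x86_64 "large" page size: 2 MiB = 2^21 bytes (illustrative constant).
const HUGE_PAGE_2M: usize = 1 << 21;

/// Allocates one 2 MiB block aligned to 2 MiB and reports whether the
/// returned address is a multiple of the requested alignment.
/// Returns `None` if the layout is invalid or the allocation fails.
fn huge_aligned_alloc_is_aligned() -> Option<bool> {
    let layout = Layout::from_size_align(HUGE_PAGE_2M, HUGE_PAGE_2M).ok()?;
    // SAFETY: `layout` has a non-zero size.
    let ptr = unsafe { alloc(layout) };
    if ptr.is_null() {
        return None;
    }
    let aligned = (ptr as usize) % HUGE_PAGE_2M == 0;
    // SAFETY: `ptr` was returned by `alloc` with this same `layout`.
    unsafe { dealloc(ptr, layout) };
    Some(aligned)
}
```

The allocator is required to honor the alignment in the `Layout`, so a successful allocation should always report `Some(true)`.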

@hanna-kruppe hanna-kruppe May 8, 2025

To use huge pages for static data, you'd want to align the ELF segment containing the relevant sections (or equivalent in other formats), so the right tool there is a linker script or similar platform-specific mechanism. Over-aligning individual statics is a bad alternative:

  1. It risks wasting a lot more (physical!) memory than expected if you end up with multiple statics in the program doing it and there's not enough other data to fill the padding required between them or they go in different sections.
  2. If the linker/loader ignores the requested section alignment then that leads to UB if you used Rust-level #[align(N)]/#[repr(align(N))] and the code was optimized under that assumption.
  3. While aligning statics appears easier and more portable than linker scripts, the reality is that platform/toolchain support for this is spotty anyway, so you really ought to carefully consider when and where to apply this trick.

In any case, I'm sure I'm technically wrong to claim that nobody could ever come up with a use case for massively over-aligned statics. But there's a reason why Linux and glibc have only started supporting it at all in the last few years, and other environments like musl-based Linux and Windows apparently don't support it at all (see discussion in aforementioned issues).
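
As a rough analogy for the memory-waste point, here is a sketch using type-level `#[repr(align(N))]` in stable Rust, where the padding is part of the type itself. With a per-static alignment the padding would instead be inserted by the linker between symbols, so this only illustrates the order of magnitude; `HugeAligned` and `wasted_bytes` are hypothetical names.

```rust
use std::mem::size_of;

/// One payload byte, over-aligned to 2 MiB (2^21) at the type level.
#[allow(dead_code)]
#[repr(align(2097152))]
struct HugeAligned(u8);

/// Bytes of padding carried by every value of `HugeAligned`:
/// the size is rounded up to the alignment, and only one byte is payload.
fn wasted_bytes() -> usize {
    size_of::<HugeAligned>() - size_of::<u8>()
}
```

Each such value occupies a full 2 MiB even though it stores a single byte.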

Comment on lines 88 to 91
In Rust, a type’s size is always a multiple of its alignment. However, there are
other languages that can interoperate with Rust, where this is not the case
(WGSL, for example). It’s important for Rust to be able to represent such
structures.

It's not clear to me how this would work while keeping Rust's "size is multiple of align" rule intact. I guess if it's about individual fields in a larger aggregate that maintains the rule in total? I don't know anything about WGSL so an example would be appreciated.

Contributor Author

That’s exactly it. The WGSL example was taken from this comment on Internals: https://internals.rust-lang.org/t/pre-rfc-align-attribute/21004/20

Contributor

Adding a worked example would indeed help readers of the RFC on this point.

Contributor

@kpreid kpreid May 18, 2025

Here is a concrete example of implementing Rust-WGSL compatibility using the #[align] attribute defined in this RFC. These structs have the same layout, and together demonstrate both inserting required padding (between foo and bar), and allowing a following field to be placed where a wrapper type would demand padding (baz immediately after bar):

// WGSL
struct Example {     // size = 32, alignment = 16
    foo: vec3<f32>,  // offset = 0, size = 12
    bar: vec3<f32>,  // offset = 16, size = 12
    baz: f32,        // offset = 28, size = 4
}
// Rust
#[repr(linear)] // as defined in this RFC; repr(C) in current Rust
#[derive(Debug, Copy, Clone, bytemuck::Pod, bytemuck::Zeroable)]
pub(crate) struct Example {
    #[align(16)]
    foo: [f32; 3],

    // #[align] below causes 4 bytes of padding to be inserted here to satisfy it.

    #[align(16)]
    bar: [f32; 3],

    baz: f32,      // If we used a wrapper for bar, this field would be at offset 32, wrongly
}

It is often possible to order structure fields to fill gaps so that no inter-field padding is needed — such as if the fields in this example were declared in the order {foo, baz, bar} — and this is preferable when possible to avoid wasted memory, but the advantage of using #[align] in this scenario is that when used systematically, it can imitate WGSL's layout and thus will be correct even if the field ordering is not optimal.
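
Since `#[align]` on fields is not available yet, the two alternatives it is being compared against can be checked in stable Rust (1.77 or later, for `offset_of!`). The sketch below is an illustration only: `Vec3Wrapper`, `WithWrapper`, and `ManualPadding` are hypothetical names, and the explicit `_pad` field stands in for the padding that `#[align(16)]` on `bar` would insert.

```rust
use std::mem::{offset_of, size_of};

/// Wrapper approach: the wrapper's size is rounded up from 12 to 16 bytes.
#[allow(dead_code)]
#[repr(C, align(16))]
struct Vec3Wrapper([f32; 3]);

#[allow(dead_code)]
#[repr(C)]
struct WithWrapper {
    foo: Vec3Wrapper,
    bar: Vec3Wrapper,
    baz: f32, // pushed past the wrapper's tail padding
}

/// Manual approach: explicit padding field, alignment on the whole struct.
#[allow(dead_code)]
#[repr(C, align(16))]
struct ManualPadding {
    foo: [f32; 3],
    _pad: f32, // the 4 bytes that `#[align(16)]` on `bar` would insert
    bar: [f32; 3],
    baz: f32, // lands directly after `bar`, as in WGSL
}

/// (offset of `bar`, offset of `baz`, total size) for the wrapper version.
fn wrapper_layout() -> (usize, usize, usize) {
    (
        offset_of!(WithWrapper, bar),
        offset_of!(WithWrapper, baz),
        size_of::<WithWrapper>(),
    )
}

/// (offset of `bar`, offset of `baz`, total size) for the manual version.
fn manual_layout() -> (usize, usize, usize) {
    (
        offset_of!(ManualPadding, bar),
        offset_of!(ManualPadding, baz),
        size_of::<ManualPadding>(),
    )
}
```

Here `wrapper_layout()` gives `(16, 32, 48)`, which does not match the WGSL struct, while `manual_layout()` gives `(16, 28, 32)`, which does.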

(Please feel free to use any of the above text in the RFC.)

@traviscross
Contributor

We discussed this in the lang call today. We were feeling generally favorable about this, but all need to read it more closely.

Comment on lines 378 to 395
1. What should the syntax be for applying the `align` attribute to `ref`/`ref
mut` bindings?

- Option A: the attribute goes inside the `ref`/`ref mut`.

```rust
fn foo(x: &u8) {
let ref #[align(4)] _a = *x;
}
```

- Option B: the attribute goes outside the `ref`/`ref mut`.

```rust
fn foo(x: &u8) {
let #[align(4)] ref _a = *x;
}
```
Contributor

Whatever we do, I'd expect it to be the same as for mut. So it's probably not worth deferring this question, as we need to handle it there.

As for where to put it, it seems like a bit of a coin toss. Anyone have a good argument for which way it should go?

Contributor Author

I’m comfortable deferring it because I see no use-case for it, and I don’t want to hold up the RFC on something with no use case.

Contributor

@traviscross traviscross May 17, 2025

Sure, but still, I repeat my question, since we need to answer it for mut in any case: are there good arguments for which side of mut the attribute should appear on?

Contributor Author

@Jules-Bertholet Jules-Bertholet May 17, 2025

The mut case does have actual use-cases, so I think we should handle the issue in the context of that, not this RFC.

Contributor Author

@Jules-Bertholet Jules-Bertholet May 17, 2025

Oh, wait, I think there may be a misunderstanding here. By “the same as for mut”, are you referring to combining mut with ref/ref mut?

Contributor

No. This RFC specifies this is allowed (quoting from an example in the RFC):

let (a, #[align(4)] b, #[align(2)] mut c) = (4u8, 2u8, 1u8);

My question is whether there are good arguments about whether we should prefer that, or should instead prefer:

let (a, #[align(4)] b, mut #[align(2)] c) = (4u8, 2u8, 1u8);

The RFC should discuss any reasons we might want to prefer one over the other.


Separately, and secondarily, my feeling is that if we chose

let #[align(..)] mut a = ..;

then we would also choose:

let #[align(..)] ref a = ..;

And if we instead chose

let mut #[align(..)] a = ..;

then we would choose:

let ref #[align(..)] a = ..;

So my feeling is that in settling the question of how to syntactically combine #[align(..)] and mut, we are de facto settling the question of how to combine #[align(..)] with any other binding mode token.

Contributor Author

I don’t agree that we would necessarily want to make the same choice in both cases. I actually think it depends on how mut and ref/ref mut should be combined.

If the combination looks like

let ref (mut x) = …;
let ref mut (mut x) = …;

Then we should also do

let ref (#[align()] x) = …;
let ref mut (#[align()] x) = …;

But if it looks like

let mut ref x = …;
let mut ref mut x = …;

Then we should do

let #[align()] ref x = …;
let #[align()] ref mut x = …;

Contributor

@traviscross traviscross May 17, 2025

In that event, and in your model, that would still leave us deciding between:

let ref (mut #[align(..)] x) = ..; // 1
// vs
let ref (#[align(..)] mut x) = ..; // 2

And between:

let #[align(..)] mut ref x = ..; // 3
// vs
let mut #[align(..)] ref x = ..; // 4

I would estimate that we'd comfortably favor 1, 3 over 2, 4.

There are also, of course, these possibilities:

let #[align(..)] ref (mut x) = ..; // 5
let mut ref #[align(..)] x = ..; // 6

If in this RFC we pick #[align(..)] mut x, that would rule out for me option 1 if we later did ref (mut x) (and I wouldn't pick option 2 anyway). If we pick mut #[align(..)] x, that would rule out for me option 3 if we later did mut ref x (and I wouldn't pick option 4 anyway).

That is, even in this future possibility, I'm going to want to keep all of the binding mode tokens either to the left or to the right of the attribute.

Contributor Author

I’ll elaborate in the RFC, but my preference is for 2 or 3.

Contributor Author

I’ve added a section on this to the RFC.

@traviscross
Contributor

traviscross commented May 19, 2025

In my view, this RFC is well motivated and has a lot of correct details. Thanks to @Jules-Bertholet for writing this up. I think we should do it with one adjustment.

Proposal: Accept this RFC as written, modulo that:

As far as this RFC goes, we allow #[align(..)] on bindings with all binding modes, and we specify the syntactic grammar as:

IdentifierPattern -> OuterAttribute* `ref`? `mut`? IDENTIFIER ( `@` PatternNoTopAlt )?

Semantically, for the moment, we would disallow attributes other than align.

If maybe we want to hold back on allowing align semantically with non-move binding modes, I propose we consider that at stabilization time rather than as a carve-out in this RFC.

The current text contains an argument about how this question should be tied in with a potential decision we might later make about syntax for reference bindings where the binding itself is mutable, but I don't buy it. We might or might not ever do that, and even if we did, it wouldn't change my own feeling that all the tokens that control the binding mode should appear together in the grammar. I find the grammar above just too compelling.

In terms of whether OuterAttribute* should appear before or after the binding mode tokens, I've tried some code with it both ways, and I agree it makes more sense at the start of the IdentifierPattern.

@rfcbot fcp merge

@rfcbot
Collaborator

rfcbot commented May 19, 2025

Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels May 19, 2025
@Jules-Bertholet
Contributor Author

> Two, the text specifies disallowing align with _ (non-)bindings, let #[align(..)] _ = ... This feels more like a warning to me, so I propose that we allow this and consider warning on it.

Strongly disagree. I’ve updated the RFC

@traviscross
Contributor

Thanks, yes, that's right. Adjusted the proposal.

@nikomatsakis
Contributor

@rfcbot reviewed

This makes sense to me. It seems like a capability we should have. I don't like using #[repr(align(N))] everywhere; it's too verbose and it's unclear what other "repr" attributes might make sense. We can always move that way later if needed.

I do recall us talking about #[align] in some other context where repr seemed odd -- I think on function items? I didn't notice whether they are covered by this RFC.

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Jun 4, 2025
@rfcbot
Collaborator

rfcbot commented Jun 4, 2025

🔔 This is now entering its final comment period, as per the review above. 🔔

@folkertdev

Functions are indeed included, in this section https://github.com/Jules-Bertholet/rfcs/blob/align-attr/text/3806-align-attr.md#on-function-items, and the stabilization PR I have up for alignment of functions is, for now, blocked on the decision here for whether to use #[repr(align(N))] or #[align(N)] or #[align = N]. This RFC proposes #[align(N)].

@tmandry
Member

tmandry commented Jun 10, 2025

@rfcbot reviewed

The part of this proposal that gives me the most pause is how it applies to async fn. It applies to the function returning the future, which is different from how #[inline] works, applying to the poll function. The justification given is reasonable: it should apply to the address of the named function as it does with other functions; any inconsistency here would lead to accidental UB. But it doesn't leave me completely satisfied; the inconsistency with #[inline] could just as well lead to confusion and accidental UB.

I think the RFC makes the right choice between the two options, but it leaves me wishing for a third.

> repr(C) currently has two contradictory meanings: “a simple, linear layout algorithm that works the same everywhere” and “an ABI matching that of the target’s standard C compiler”. This RFC does not aim to resolve that conflict; that is being discussed as part of #3718. Henceforth, we will use repr(C_for_real) to denote “match the system C compiler”, and repr(linear) to denote “simple, portable layout algorithm”; but those names are not normative.

This implies to me an unresolved question that needs to be resolved ahead of stabilization: What should the actual behavior be for repr(C)? Please add this to the RFC.

@Jules-Bertholet
Contributor Author

Jules-Bertholet commented Jun 10, 2025

> This implies to me an unresolved question that needs to be resolved ahead of stabilization: What should the actual behavior be for repr(C)? Please add this to the RFC.

Already there, under heading “MSVC”.

Comment on lines +534 to +536
- The `align(…)` and `repr(align(…))` attributes currently accept only integer
literals as parameters. In the future, they could support `const` expressions
as well.
Member

I... do not expect this to be a realistic possibility.

Contributor Author

C++ allows it

put differently: there is no precedent in rust for this sort of const evaluation in attributes, and it really does not fit into the current attribute architecture.

But "future possibilities" don't really need to think too much about (short-term) feasibility I think.

Comment on lines +323 to +353
Compared to the wrapper type approach, the `align` attribute adds additional
flexibility, because it does not force the insertion of padding. If we don't
adopt this feature, `bindgen` will continue to generate suboptimal bindings, and
users will continue to be forced to choose between suboptimal alignment and
additional padding.

## `#[align(…)]` vs `#[repr(align(…))]`

One potential alternative would be to use `#[repr(align(…))]` everywhere,
instead of introducing a new attribute.

Benefits of this alternative:

- No new attribute polluting the namespace.
- Requesting a certain alignment is spelled the same everywhere.
- `#[repr(…)]` on fields might accept additional options in the future, for
specifying layout and padding more precisely.
- `#[repr(…)]` on function items could also accept `instruction_set(…)` as an
argument, replacing the existing attribute of that name.

Drawbacks:

- `#[repr(align(…))]` is a longer and noisier syntax.
- `#[repr(…)]` on non-ADTs would never accept the same set of options as on
ADTs. On field definitions, it might accept additional options to precisely
control layout; on function items, it might accept `instruction_set(…)`, if we
were to overturn the precedent of that being a standalone attribute. On
statics and local variables, I doubt it would ever accept anything else at
all.
- `#[align(…)]` *only* aligns, while `#[repr(align(…))]` also pads to a multiple
of the alignment. Having different syntax makes that distinction more clear.
Member

This glosses over a truth of the compiler: the existing layout code is very complex (certainly unnecessarily so, I think... I am anticipating getting in the mood to burn it down and rewrite it soon) and it is not clear how this sort of non-padding alignment obligation will be introduced to the compiler without, frankly, introducing increased risk of miscompilations. I almost wish this "alignment" attribute was not called align at all, as the idea of non-padding alignment is fairly new to the language and the compiler. There have been occasional instances of overalignment in the compiler, but these have never deliberately attempted to not also create padding. This will be something truly different.

I believe this is a consequence of no exploratory implementation of this RFC having been attempted, or this RFC would probably be considerably less ambitious in proposing adding this to so many places.

Contributor Author

Just because this RFC proposes something, that does not imply a deadline for shipping every part of it. Also:

  • #[align] on repr(C)/repr(linear) is likely blocked on resolving that mess, so will need to wait anyway.
  • Everywhere else, #[align] not adding padding is just an optimization; an MVP could always over-pad.
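
To make the "over-pad versus no padding" distinction concrete, here is a small stable-Rust sketch. It does not use the proposed attribute: `OverPadded` uses today's wrapper type, and `Tight` emulates `#[align(8)]` on a first field by aligning the whole struct, which happens to give the same layout in this particular case. All type and function names are illustrative.

```rust
use std::mem::{offset_of, size_of};

/// Today's wrapper type: aligned to 8 and therefore also padded to 8 bytes.
#[allow(dead_code)]
#[repr(align(8))]
struct Aligned8(u8);

/// Wrapper approach (what an over-padding MVP would also produce).
#[allow(dead_code)]
#[repr(C)]
struct OverPadded {
    a: Aligned8,
    b: u8, // placed after the wrapper's 7 bytes of tail padding
}

/// Emulation of an aligned-but-unpadded first field: aligning the struct to
/// 8 aligns `a` (at offset 0) without forcing padding after it.
#[allow(dead_code)]
#[repr(C, align(8))]
struct Tight {
    a: u8,
    b: u8, // placed immediately after `a`
}

/// (offset of `b`, total size) for each variant.
fn over_padded_layout() -> (usize, usize) {
    (offset_of!(OverPadded, b), size_of::<OverPadded>())
}

fn tight_layout() -> (usize, usize) {
    (offset_of!(Tight, b), size_of::<Tight>())
}
```

`over_padded_layout()` is `(8, 16)` and `tight_layout()` is `(1, 8)`; both keep `a` 8-aligned, so the difference is purely in wasted space.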

Member

I do not disagree. But my impression was that the lang team wanted to reduce the frequency of having RFCs that ship in oddly-shaped bulges. If it is also their understanding that this will ship in oddly-shaped bulges, then I am content.

Comment on lines +73 to +74
- It changes the type of the item, which may not be allowed if it is part of the
crate's public API.
Member

Ehn. Technically true but Deref implementations exist, which allows one to often escape this breakage since naming a static typically makes it a subexpression subject to Deref coercions... I wonder if there's a case where that doesn't apply? Maybe.

Member

I must also note that I have voiced, repeatedly, that attributes do not compose very well in Rust compared to things more deeply embedded in the type, and the idea that this will compose as well as it does in C seems doubtful. I have made a separate note in one case of such. Nonetheless it is obviously the case that there are many positions where this attribute is desirable to have but there is no real way to introduce it to the type, or such is not reasonable. I wish this had been more incremental, in starting with such cases first.

Comment on lines +360 to +380
## `#[align(…)]` on function parameters

We could choose to allow this. However, this RFC specifies that it should be
rejected, because users might incorrectly think the attribute affects ABI when
it does not. C and C++ make the same choice.

To give an example of what could go wrong, consider the following function:

```rust
fn example(#[align(1024)] very_large_value: [u64; 8192]) {
// use `very_large_value` by reference
}
```

Calling this function will most likely involve first passing `very_large_value`
on the stack or by pointer, and then copying the entire array to a new place on
the stack in order to align it. This implicit extra stack copy is not present
for `#[align(…)]`ed locals. Forbidding this, and requiring users to make the
move/copy explicit, avoids the performance footgun.

We could always lift this limitation in the future.
Member

@workingjubilee workingjubilee Jun 14, 2025

This does not rate "most likely". This is a certainty for every up-alignment case, only excluding those where the ABI already specifies that minimum alignment due to internal details. The compiler will not be able to fix this except via inlining, because #[align] will be an attribute that is not part of the function type.
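
For comparison, this is roughly what "making the move/copy explicit" looks like in stable Rust today, using a wrapper type rather than the proposed attribute. `Align1024` is a hypothetical helper, and the array is kept small here so the sketch stays cheap to run.

```rust
/// Generic over-aligned wrapper (illustrative name).
#[repr(align(1024))]
struct Align1024<T>(T);

/// Receives the array with its ordinary ABI, then copies it into an
/// over-aligned stack slot in plain sight of the reader. Returns whether
/// that slot's address is a multiple of 1024.
fn example(value: [u64; 64]) -> bool {
    // The explicit move/copy into aligned storage.
    let aligned = Align1024(value);
    let addr = &aligned as *const Align1024<[u64; 64]> as usize;
    // Use the data by reference, as the RFC example does.
    let sum: u64 = aligned.0.iter().sum();
    std::hint::black_box(sum);
    addr % 1024 == 0
}
```

The copy still happens, but it is visible at the call site of `Align1024(..)` instead of being implied by an attribute on the parameter.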

Contributor Author

> This does not rate "most likely". This is a certainty for every up-alignment case

For any normal platform, yes. But not necessarily for every possible imaginable hypothetical platform that could theoretically be a compile target for an implementation of Rust. That’s why I wrote “most likely”

Member

Mhh, yes, but tagged dataflow or belt-machine architectures would "fix" this by not even having a notion of alignment for their operands, which sort of makes it worse, in a way.

Member

> Mhh, yes, but tagged dataflow or belt-machine architectures would "fix" this by not even having a notion of alignment for their operands, which sort of makes it worse, in a way.

for belt-machine architectures, and generally for more standard register machines, instruction operands generally don't have alignment; what has alignment even on belt-machines is memory accesses.

for Rust, since all values are theoretically in memory (except for inline asm internals) and having stuff in registers instead is only an optimization (it only happens when you can't tell if the value is in memory or not), everything needs to worry about alignment.

Comment on lines +15 to +47
## Bindings to C and C++

[C](https://en.cppreference.com/w/c/language/_Alignas) and
[C++](https://en.cppreference.com/w/cpp/language/alignas) provide an `alignas`
modifier to set the alignment of specific struct fields. To represent such
structures in Rust, `bindgen` is sometimes forced to add explicit padding
fields:

```c
// C code
#include <stdint.h>
struct foo {
uint8_t x;
_Alignas(128) uint8_t y;
uint8_t z;
};
```

```rust
// Rust bindings generated by `bindgen`
#[repr(C, align(128))]
pub struct foo {
pub x: u8,
pub __bindgen_padding_0: [u8; 127usize],
pub y: u8,
pub z: u8,
}
```

The `__bindgen_padding_0` field makes the generated bindings more confusing and
less ergonomic. Also, it is unsound: the padding should be using `MaybeUninit`.
And even then, there is no guarantee of ABI compatibility on all potential
targets.
Member

@workingjubilee workingjubilee Jun 14, 2025

What this RFC does not engage with is the tail padding dilemma. In C++, the compiler is allowed to take an object that might look like this:

struct VeryPadded {
    float x;
    _Alignas(16) float y;
};

struct FitsFieldsInPadding: VeryPadded {
    float z;
};

And place FitsFieldsInPadding's z field right after the y field. Then C++ will often wind up performing the upcasting at some point, dealing in references to VeryPadded. This makes it easy for Rust code to then clobber the subobjects when writing updates in terms of the supertype, breaking the C++ side.

As this proposal does not engage with that issue, indeed it specifies we will continue to treat references to VeryPadded as fully-padded, this feels like it weakens the motivation for it with regards to C++.

This is a complete non-issue for C, of course, and C alone is arguably enough motivation. I just think we have a rather large gap for the other case. A padding-filled gap, even. So if we're hoping this improves C++ interop that much, then we have not done enough.
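
The size of that gap can be measured from the Rust side. The sketch below emulates `_Alignas(16)` on `y` with an explicit padding field plus struct-level alignment (stable Rust, 1.77 or later for `offset_of!`). The Rust type and helper function are stand-ins written for this illustration, and the sketch does not establish whether any particular C++ ABI actually reuses these bytes for this exact type.

```rust
use std::mem::{offset_of, size_of};

/// Stand-in for the C++ `VeryPadded` above.
#[allow(dead_code)]
#[repr(C, align(16))]
struct VeryPadded {
    x: f32,
    _pad: [u8; 12], // emulates the padding in front of the 16-aligned `y`
    y: f32,
}

/// Trailing bytes that belong to a Rust `VeryPadded` value but hold no field
/// data; these are the bytes a derived C++ class might use for `z`.
fn tail_padding_bytes() -> usize {
    let end_of_y = offset_of!(VeryPadded, y) + size_of::<f32>();
    size_of::<VeryPadded>() - end_of_y
}
```

`y` ends at byte 20 while the type is 32 bytes, so `tail_padding_bytes()` is 12; Rust treats all 32 bytes as part of the object.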

Contributor Author

@Jules-Bertholet Jules-Bertholet Jun 14, 2025

This RFC does not aim to address C++-specific issues. This “tail-padding dilemma” is out of scope. (My first instinct is that we should have a new kind of reference that doesn’t give access to tail padding. I suspect there is overlap with partial borrows. But again, that’s not for this RFC)

Member

This RFC lists several motivations, and I do not know the mind of T-lang as to what motivations they decided based on. If this doesn't fully address related problems in interop with C++ layout because that problem requires additional considerations or possibly-breaking changes in order to have the desired gain in ergonomics, then you have put it into scope by listing it as a motivation.

Contributor Author

The RFC discusses the C/C++ alignas attribute. It does not mention any other aspect of C or C++. C++ has a gazillion features; this RFC certainly does not aim to cover interop with all of them, and I don’t understand why you would think otherwise.

Member

I do not disagree, but interop only with the subset of C++ expressible directly in C, without engaging with the nuances of how C++ layout composes differently in that language, seems like it's just interop with C.

Which is fine, and probably enough.
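
For readers following the tail-padding point, here is a minimal sketch of what Rust does today with ordinary `#[repr(C)]` structs. The `Inner` and `Outer` types are hypothetical names made up for illustration; nothing here uses the attribute proposed in the RFC. The sketch only shows that Rust counts tail padding as part of a type's size, so a following field can never be placed inside it, whereas some C++ ABIs may reuse such padding for overlapping subobjects.

```rust
use std::mem::{offset_of, size_of};

// Hypothetical type: 5 bytes of data (u32 + u8) followed by 3 bytes of
// tail padding, so that the size is a multiple of the 4-byte alignment.
#[repr(C)]
pub struct Inner {
    pub a: u32,
    pub b: u8,
}

// Hypothetical outer type: `c` has to be placed after *all* of `Inner`,
// including its tail padding, because a `&mut Inner` may write to every
// one of the bytes covered by `size_of::<Inner>()`.
#[repr(C)]
pub struct Outer {
    pub inner: Inner,
    pub c: u8,
}

// Returns (size of Inner, offset of `c` in Outer, size of Outer).
pub fn layout() -> (usize, usize, usize) {
    (
        size_of::<Inner>(),
        offset_of!(Outer, c),
        size_of::<Outer>(),
    )
}

fn main() {
    let (inner_size, c_offset, outer_size) = layout();
    // `c` starts exactly where `Inner` (padding included) ends.
    assert_eq!(c_offset, inner_size);
    println!("Inner: {inner_size} bytes, c at offset {c_offset}, Outer: {outer_size} bytes");
}
```

This sketch assumes a Rust toolchain where `std::mem::offset_of!` is stable (1.77 or later).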

Comment on lines +91 to +96
## Interoperating with systems that have types where size is not a multiple of alignment

In Rust, a type’s size is always a multiple of its alignment. However, there are
other languages that can interoperate with Rust, where this is not the case
([WGSL](https://www.w3.org/TR/WGSL/#alignment-and-size), for example). It’s
important for Rust to be able to represent such structures.

Member

Yes, but representation is not enough. Sometimes, our dynamic semantics must change, like with the C++ overlapping-subobject case.
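
As an illustration of the size-versus-alignment gap discussed in the quoted RFC text, the sketch below mirrors a WGSL `vec3<f32>` (16-byte alignment, 12-byte size in WGSL) using today's `#[repr(align(…))]`. The `Vec3` struct and `vec3_layout` function are hypothetical names chosen for this example, not part of the RFC. With the type-level attribute, Rust rounds the size up to the alignment, which is exactly the mismatch the RFC's motivation describes.

```rust
use std::mem::{align_of, size_of};

// Hypothetical Rust mirror of a WGSL `vec3<f32>`. WGSL gives this type an
// alignment of 16 and a size of 12; `repr(align(16))` can only express the
// alignment, and pads the size up to match.
#[repr(C, align(16))]
pub struct Vec3 {
    pub xyz: [f32; 3],
}

// Returns (size, alignment) of `Vec3` as Rust lays it out.
pub fn vec3_layout() -> (usize, usize) {
    (size_of::<Vec3>(), align_of::<Vec3>())
}

fn main() {
    let (size, align) = vec3_layout();
    // The payload is 12 bytes, but the size is a multiple of the alignment.
    assert_eq!(size_of::<[f32; 3]>(), 12);
    assert_eq!(size % align, 0);
    println!("size = {size}, align = {align}");
}
```

Because `Vec3` occupies 16 bytes, a field that WGSL would place at offset 12 cannot follow it directly in a Rust struct that embeds `Vec3`.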

Comment on lines +233 to +243
The effect of the attribute is to force the `static` to be stored with at least
the specified alignment. The attribute does not force padding bytes to be added
after the `static`. For `static`s inside `unsafe extern` blocks, if the `static`
does not meet the specified alignment, the behavior is undefined. (This UB is
analogous to the UB that can result if the static item is not a valid value of
its type. The question of whether the UB can occur even if the item is unused,
has the same answer for both cases.)

The `align` attribute may also be applied to thread-local `static`s created with
the `thread_local!` macro; the attribute affects the alignment of the underlying
value, not that of the outer `std::thread::LocalKey`.

Member

An interesting detail: the most likely way we represent this to code generators may, in some cases, also be interpreted by some of them (such as LLVM) as a constraint that prevents over-aligning the static. This seems fine, just "fun facts about LLVM with Jubilee": https://llvm.org/docs/LangRef.html#global-variables

> An explicit alignment may be specified for a global, which must be a power of 2. If not present, or if the alignment is set to zero, the alignment of the global is set by the target to whatever it feels convenient. If an explicit alignment is specified, the global is forced to have exactly that alignment. Targets and optimizers are not allowed to over-align the global if the global has an assigned section.

One consequence of the above is that in C, statics can be aligned to less than what their type requires. That is not and won't be possible in Rust, because it would break references to values stored in statics (which assume the target is sufficiently aligned for its type).

Anyway, the alignment on statics is its own thing. I have a prototype of that, but it'll definitely need T-lang's separate approval. (From my perspective this RFC is mostly about having the attribute at all: the concrete locations where it can be applied will still need separate design and ultimately a green light from T-lang.)
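
For comparison, this is roughly how an over-aligned `static` is written on stable Rust today, using a wrapper type rather than the attribute under discussion. It is only a sketch: `CacheAligned`, `BUFFER`, and `buffer_addr` are hypothetical names, and the 64-byte figure is an arbitrary example value. Note that the wrapper also pads the value out to 64 bytes, which the quoted RFC text says the proposed attribute on a `static` would not do.

```rust
use std::mem::{align_of, size_of};

// Hypothetical wrapper: the only stable way to raise a static's alignment
// today is to raise the alignment of its type.
#[repr(align(64))]
pub struct CacheAligned(pub [u8; 4]);

// Hypothetical static with a 4-byte payload and 64-byte alignment.
pub static BUFFER: CacheAligned = CacheAligned([1, 2, 3, 4]);

// Returns the address of the static as an integer.
pub fn buffer_addr() -> usize {
    &BUFFER as *const CacheAligned as usize
}

fn main() {
    // References assume the target is aligned for its type, so the static
    // must be stored at an address that is a multiple of 64.
    assert_eq!(buffer_addr() % align_of::<CacheAligned>(), 0);
    // The type-level attribute also rounds the size up to the alignment.
    println!(
        "payload = 4 bytes, size = {}, align = {}",
        size_of::<CacheAligned>(),
        align_of::<CacheAligned>()
    );
}
```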

Member

Yes, as long as everyone shares this understanding, I'm okay with that, but that's why it seems slightly surprising that "future possibilities" is "things that are almost certainly not possible in the compiler" instead of like half of the RFC's contents.

@workingjubilee
Member

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Jun 14, 2025
@rfcbot
Collaborator

rfcbot commented Jun 14, 2025

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

tgross35 added a commit to tgross35/rust that referenced this pull request Jun 19, 2025
…e, r=jdonszelmann

use `#[align]` attribute for `fn_align`

Tracking issue: rust-lang#82232

rust-lang/rfcs#3806 decides to add the `#[align]` attribute for alignment of various items. Right now it's used for functions with `fn_align`; in the future it will get more uses (statics, struct fields, etc.).

(the RFC finishes FCP today)

r? `@ghost`
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. I-lang-radar Items that are on lang's radar and will need eventual work or consideration. P-lang-drag-2 Lang team prioritization drag level 2. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce