Contributor

@The-King-of-Toasters The-King-of-Toasters commented Oct 19, 2025

Maintaining the POSIX stat bits for Zig is a pain. Not only is struct stat incompatible between architectures, but maddeningly so: timestamps are specified as machine longs or fixed-width ints, and members can be signed or unsigned. The libcs deal with this by introducing their own version of struct stat and copying the kernel structure members to it. In the case of glibc, they did it twice thanks to the largefile transition!

In practice, the project needs to maintain three versions of struct stat:

  • What the kernel defines.
  • What musl wants for struct stat.
  • What glibc wants for struct stat64. Make sure to use fstatat64!

And it's not as simple as running zig translate-c. In #21440 I had to create test programs in C and use pahole to dump the structure of stat for each arch, and I was constantly running into issues regarding padding and signed/unsigned ints. The fact that so many target checks exist in the linux and posix tests is most likely due to writing to padding bits and failing later.

The solution to this madness is statx(2):

  • It takes a single structure that is the same for all arches AND libcs.
  • It uses a custom timestamp format, but it is 64-bit ready.
  • It gives the same info as fstatat(2) and more!
  • Unlike fstatat(2), you can request only the subset of the info you need by passing a mask.

It's so good that modern Linux arches (e.g. riscv) don't even implement stat, with the libcs using a generic struct stat and copying from struct statx.

Therefore, this PR rips out all the stat bits from std.os.linux and std.c. std.posix.Stat is now void, and calling std.posix.*stat is a compile-time error. A wrapper around statx has been added to std.os.linux, and callers have been upgraded to use it. Tests have also been updated to use statx where possible.

While I was here, I converted the mask and file attributes to be packed struct bitfields. A nice side effect is checking that you actually received the members you asked for via Statx.mask, which I have used by adding asserts at specific callsites.

Progress towards #21738

Also removes the blank lines between members, and a comptime sizeOf
check.
@The-King-of-Toasters
Contributor Author

Forgot to compile the compiler and added a call to MappedFile. Perhaps fs.File.Stat should add these members? I imagine it would be useful to know.

Contributor

@rootbeer rootbeer left a comment

Looks reasonable to me. I've got some questions, and ideas, but feel free to ignore them.

.{ .major = 2, .minor = 28, .patch = 0 });
const sys = if (use_c) std.c else std.os.linux;

var stx = std.mem.zeroes(Statx);
Contributor

I'm a little surprised to see this wrapper stack-allocating and zeroing a statx buffer, instead of taking a pointer to a statx buffer and letting the caller decide if/how/when to allocate and zero it. AFAICT, none of the other syscall wrappers in here do anything quite like that.

Contributor Author

I did this because

  • Most call sites used posix.*stat.
  • posix.*stat did the same thing.

My understanding is that the wrappers act like posix in that they translate errors and allocate structures for you. Happy to change if not.

test "linkat with different directories" {
if ((builtin.cpu.arch == .riscv32 or builtin.cpu.arch.isLoongArch()) and builtin.os.tag == .linux and !builtin.link_libc) return error.SkipZigTest; // No `fstatat()`.
if (builtin.cpu.arch.isMIPS64()) return error.SkipZigTest; // `nstat.nlink` assertion is failing with LLVM 20+ for unclear reasons.
fn getLinkInfo(fd: posix.fd_t) !struct { posix.ino_t, posix.nlink_t } {
Contributor

Nice!

@The-King-of-Toasters
Contributor Author

The-King-of-Toasters commented Oct 20, 2025

@rootbeer I'll take your comments and update the PR later today (Sydney time).
@alexrp How would you handle the failure in glibc_runtime_check.zig? Perhaps adding the stat function definitions but making the struct stat part a byte buffer instead.

@alexrp
Member

alexrp commented Oct 20, 2025

@alexrp How would you handle the failure in glibc_runtime_check.zig? Perhaps adding the stat function definitions but making the struct stat part a byte buffer instead.

That seems reasonable for this test.

@The-King-of-Toasters The-King-of-Toasters force-pushed the statx branch 2 times, most recently from a13f0bf to 09ec5b7 Compare October 20, 2025 06:53
Maintaining the POSIX `stat` bits for Zig is a pain. Not only is `struct
stat` incompatible between architectures, but maddeningly so:
timestamps are specified as machine longs or fixed-width ints, and
members can be signed or unsigned. The libcs deal with this by
introducing their own version of `struct stat` and copying the kernel
structure members to it. In the case of glibc, they did it twice thanks
to the largefile transition!

In practice, the project needs to maintain three versions of `struct
stat`:
- What the kernel defines.
- What musl wants for `struct stat`.
- What glibc wants for `struct stat64`. Make sure to use `fstatat64`!

And it's not as simple as running `zig translate-c`. In ziglang#21440 I had to
create test programs in C and use `pahole` to dump the structure of
`stat` for each arch, and I was constantly running into issues regarding
padding and signed/unsigned ints. The fact that so many target checks in
the `linux` and `posix` tests exist is most likely due to writing to
padding bits and failing later.

The solution to this madness is `statx(2)`:
- It takes a single structure that is the same for all arches AND libcs.
- It uses a custom timestamp format, but it is 64-bit ready.
- It gives the same info as `fstatat(2)` and more!
- Unlike `fstatat(2)`, you can request only the subset of the info you
  need by passing a mask.

It's so good that modern Linux arches (e.g. riscv) don't even implement
`stat`, with the libcs using a generic `struct stat` and copying from
`struct statx`.

Therefore, this commit rips out all the `stat` bits from `std.os.linux`
and `std.c`. `std.posix.Stat` is now `void`, and calling
`std.posix.*stat` is a compile-time error. A wrapper around `statx` has
been added to `std.os.linux`, and callers have been upgraded to use it.
Tests have also been updated to use `statx` where possible.

While I was here, I converted the mask and file attributes to be packed
struct bitfields. A nice side effect is checking that you actually
received the members you asked for via `Statx.mask`, which I have used
by adding `assert`s at specific callsites.

In the future I expect types like `mode_t`/`dev_t` to be audited and
removed, as they aren't being used to define members of `struct stat`.
Commit #fc7a5f2 moved many of the `_t` types up a level, but didn't
remove them from arch_bits. Since `Stat` is gone, all but `time_t` can
be removed.
@alexrp
Member

alexrp commented Oct 20, 2025

Would you mind opening a mirror PR on Codeberg for LoongArch and RISC-V CI?

@alexrp
Member

alexrp commented Oct 20, 2025

Also, it doesn't have to be done in this PR, but it would be nice to decouple std.c.dev_t (& friends) completely from the corresponding types in std.os.linux since, for some of them, they're not the same size or even representation.

@The-King-of-Toasters
Contributor Author

Would you mind opening a mirror PR on Codeberg for LoongArch and RISC-V CI?

Done.

@alexrp
Member

alexrp commented Oct 22, 2025

The std.fs changes here will cause some merge conflicts with #25592. They look relatively minor to me, but probably best for @andrewrk to confirm whether we should hold off on merging this until #25592 lands.

@bernardassan

@The-King-of-Toasters could you adapt the structure I used in #25394 for the typed flags changes in IoUring.zig for Statx.Attr and Statx.Mask

pub const Mask = packed struct(u32) {
and
pub const Attr = packed struct(u64) {
? In my opinion, that has better namespacing, and the doc comment allows users familiar with the STATX_ATTR_* and STATX_MASK_* constants to easily find it when searching for them on https://ziglang.org/documentation/master/std/ which I think is the main motivation for most of the identifiers being UPPER_CASE in linux.zig. We can do better in Zig 😄 than C, and this would also help to reduce conflicts when #25394 is merged.
