Introduce a new Buffer trait. #1290

Merged
sunfishcode merged 1 commit into main from sunfishcode/new-read
Feb 26, 2025

sunfishcode
Member

I'm experimenting with a Buffer trait similar to #908, however I've run into a few problems. See the questions in examples/new_read.rs for details.

@notgull @SUPERCILEX

Contributor

@SUPERCILEX left a comment

If we go with this approach, I feel like there's a pretty good chance compiler folks will improve the error messages for us if we file bug reports. The story around passing in a mutable array but needing a slice instead has annoyed me for quite some time (I usually just as_mut_slice everything). It was pretty confusing to figure out initially, so I'm sure better error messages would be welcome.

@sunfishcode mentioned this pull request Jan 27, 2025
21 tasks
@sunfishcode added the "semver bump" label (issues that will require a semver-incompatible fix) Jan 30, 2025
@sunfishcode force-pushed the sunfishcode/new-read branch 4 times, most recently from f2eaa94 to 2278976, February 4, 2025 12:35
@SUPERCILEX
Contributor

Looking at our APIs, I think we'd want to consider adding uninit methods to roughly these methods (the filtering is not great, sorry):

@sunfishcode
Member Author

sunfishcode commented Feb 25, 2025

Ok, I've now updated everything in rustix to use the new Buffer trait, except:

  • readlink, readlinkat, ptsname, ttyname - These return CString, which is a little different than a buffer of u8.
  • The ancillary data API for sendmsg/recvmsg. These are tricky, and not needed thanks to "Add uninit buffer ancillary APIs" (#1108).
  • IoSliceMut, because we reuse std's IoSliceMut type.
  • IoSliceRaw, because it's potentially tricky.

I'm not opposed to having them use Buffer, I've just run out of steam for doing them myself. If anyone would like to see them migrated, PRs would be welcome.

impl<T> Buffer<T> for &mut [MaybeUninit<T>] {}
impl<T, const N: usize> Buffer<T> for &mut [MaybeUninit<T>; N] {}
#[cfg(feature = "alloc")]
impl<T> Buffer<T> for &mut Vec<MaybeUninit<T>> {}
Contributor

Are we sure we want this one? I feel like that's almost always wrong and you meant to use spare capacity. Maybe there's a use case I'm missing?
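
For comparison, here is a minimal sketch of the spare-capacity pattern mentioned above, using only the standard library. The names `fill` and `append_into_spare_capacity` are made up for illustration and are not rustix APIs; `fill` stands in for any call that initializes a prefix of an uninitialized buffer and reports how many elements it wrote.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a syscall wrapper: copies as many bytes from
/// `src` as fit into `buf` and returns how many elements it initialized.
fn fill(buf: &mut [MaybeUninit<u8>], src: &[u8]) -> usize {
    let n = buf.len().min(src.len());
    // `zip` stops at the shorter of the two, so exactly `n` slots are written.
    for (slot, byte) in buf.iter_mut().zip(src) {
        slot.write(*byte);
    }
    n
}

/// Appends into the unused capacity of `vec` without zero-initializing it.
fn append_into_spare_capacity(vec: &mut Vec<u8>, src: &[u8]) -> usize {
    let old_len = vec.len();
    let n = fill(vec.spare_capacity_mut(), src);
    // SAFETY: `fill` initialized the first `n` elements of the spare
    // capacity, and `n` never exceeds the spare capacity's length.
    unsafe { vec.set_len(old_len + n) };
    n
}
```

A `Vec<MaybeUninit<T>>`, by contrast, treats its *length* as the region to fill, which is the case the comment above questions.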

// auto-derefed in a `impl Buffer<u8>`, so we add this `impl` so that our users
// don't have to add an extra `*` in these situations.
#[cfg(feature = "alloc")]
impl<T> private::Sealed<T> for &mut Vec<T> {
Contributor

This seems very footgunny... is the thinking that users might create a big buffer initialized to zero and then pass that in? I can see that making sense, but it still scares me.

Member Author

I added this because rustix's current APIs already support using Vec like this, because Vec has a DerefMut to &mut [T]. This just ensures that any existing users doing this don't need to add an extra * to keep working.
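
A small sketch of the auto-deref point, under stated assumptions: `ByteBuffer`, `old_style`, and `new_style` are hypothetical names, and the trait is a simplified stand-in rather than rustix's actual `Buffer` definition. A parameter typed `&mut [u8]` accepts `&mut Vec<u8>` through deref coercion, but a generic `impl Trait` parameter does not coerce, so the `Vec` case needs its own `impl`.

```rust
/// Hypothetical, simplified buffer trait (not rustix's real `Buffer`).
trait ByteBuffer {
    fn bytes_mut(&mut self) -> &mut [u8];
}

impl ByteBuffer for &mut [u8] {
    fn bytes_mut(&mut self) -> &mut [u8] {
        &mut **self
    }
}

// Without this impl, `new_style(&mut vec)` would not compile, because a
// generic parameter does not trigger the `Vec<u8>` -> `[u8]` deref coercion.
impl ByteBuffer for &mut Vec<u8> {
    fn bytes_mut(&mut self) -> &mut [u8] {
        self.as_mut_slice()
    }
}

/// Old-style signature: `&mut Vec<u8>` coerces to `&mut [u8]` automatically.
fn old_style(buf: &mut [u8]) -> usize {
    buf.fill(1);
    buf.len()
}

/// New-style signature: the argument type must implement the trait itself.
fn new_style(mut buf: impl ByteBuffer) -> usize {
    let bytes = buf.bytes_mut();
    bytes.fill(1);
    bytes.len()
}
```

With both impls present, `new_style(&mut vec)` and `new_style(&mut *vec)` are both accepted; only the second would work if the `Vec` impl were removed.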

Contributor

Fair enough

@SUPERCILEX
Contributor

Thank you so much for polishing this idea! We'll see if the pitchforks come out, but I'm very excited that MaybeUninit is no longer a second-class citizen.

@SUPERCILEX
Contributor

My thoughts on the APIs that weren't updated.

> readlink, readlinkat, ptsname, ttyname - These return CString, which is a little different than a buffer of u8.

The "problem" here is that all of these methods grow the buffer that's passed in. So I think these methods are fundamentally different and thus there's no need to try and shoehorn them into using the buffer trait.

> The ancillary data API for sendmsg/recvmsg

This seems fine because these APIs are like inotify::Reader and RawDir: they produce well typed values and are using the buffer as scratch space rather than returning it to you for your consumption. You gain nothing by passing in non-MaybeUninit buffers and simply shouldn't be allowed to do that, so not supporting the buffer trait is fine.

> IoSlice

I have no opinion on these right now. I agree that more thought will be needed, so punting and creating separate methods in 1.x and then fully replacing in a distant v2 seems fine.

Using the idea from #908 and #1110, introduce a new `Buffer` trait for
reading into possibly uninitialized buffers, and use it in `read`,
`recv`, `epoll::wait`, `kqueue`, `getxattr`, and several other
functions.

This eliminates the need for separate `*_uninit` functions.
@sunfishcode changed the title from "Prototype a new Buffer trait." to "Introduce a new Buffer trait." Feb 26, 2025
@sunfishcode
Member Author

I found another confusing error message when testing with this patch:

error[E0521]: borrowed data escapes outside of closure
   --> src/sys/linux_macos.rs:124:35
    |
118 |     let listxattr_func = if deref {
    |         -------------- `listxattr_func` declared here, outside of the closure body
...
124 |     let vec = allocate_loop(|buf| listxattr_func(&*path, buf))?;
    |                              ---  ^^^^^^^^^^^^^^^^^^^^^^^^^^^ `buf` escapes the closure body here
    |                              |
    |                              `buf` is a reference that is only valid in the closure body

Offhand, I'm not sure what it means.

The code is in https://github.com/sunfishcode/xattr (which is a fork of https://github.com/Stebalien/xattr with fixes for rustix 1.0.0).

@sunfishcode
Member Author

I found a way to work around that error in that particular case, so I'm going to merge this and proceed with the next prerelease. If that error or other confusing errors pop up in more places, we can reevaluate.

@sunfishcode merged commit 45ee8d2 into main Feb 26, 2025
45 checks passed
@sunfishcode deleted the sunfishcode/new-read branch February 26, 2025 13:54
@sunfishcode
Member Author

This is now prereleased in rustix 1.0.0-prerelease.1.

@SUPERCILEX
Contributor

What was the fix to that error? My guess is that again you'd need to reborrow.

@kevinmehall
Contributor

The MaybeUninit impl doesn't play well with retry_on_intr because the borrow checker can't see that the closure won't be called again after it has returned Ok with the borrow from the buffer.

let mut event_buf = [MaybeUninit::<epoll::Event>::uninit(); 4];
let (events, _) = retry_on_intr(|| epoll::wait(epoll_fd, &mut event_buf, None)).unwrap();

error: captured variable cannot escape `FnMut` closure body
  --> src/platform/linux_usbfs/events.rs:93:40
   |
92 |     let mut event_buf = [MaybeUninit::<epoll::Event>::uninit(); 4];
   |         ------------- variable defined here
93 |     let (events, _) = retry_on_intr(|| epoll::wait(epoll_fd, &mut event_buf, None)).unwrap();
   |                                      - ^^^^^^^^^^^^^^^^^^^^^^^^^^^---------^^^^^^^
   |                                      | |                          |
   |                                      | |                          variable captured here
   |                                      | returns a reference to a captured variable which escapes the closure body
   |                                      inferred to be a `FnMut` closure
   |
   = note: `FnMut` closures only have access to their captured variables while they are executing...
   = note: ...therefore, they cannot allow references to captured variables to escape

I think you pretty much always want to ignore EINTR on epoll::wait.

This works, and isn't too bad, though:

let events = match epoll::wait(epoll_fd, &mut event_buf, None) {
    Ok((events, _)) => events,
    Err(Errno::INTR) => &mut [],
    Err(e) => panic!("epoll::wait failed: {e}"),
};

@SUPERCILEX
Contributor

Yup, it's the reborrow bug again. You need to use event_buf.as_mut_slice() instead.

@SUPERCILEX
Contributor

@sunfishcode
Member Author

> What was the fix to that error? My guess is that again you'd need to reborrow.

Moving the `let listxattr_func = ...` into the inner closure (Stebalien/xattr@2fa7d57) fixes the error.

@sunfishcode
Member Author

* https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=cc44131bdcd2981de98bf3e2dbb2ebd3

This example gets the same error message as the error in xattr, however the code is a little different. This example is fixed by adding &mut *, while the error in xattr is not.
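
A sketch of the `&mut *` reborrow fix in a retry closure, under stated assumptions: `ByteBuffer`, `MockErrno`, `mock_read`, `retry`, and `read_with_retry` are all hypothetical stand-ins written for this illustration, not rustix items, and this is not the playground code linked above. The point it tries to show is that a `&mut [u8]` passed to a generic parameter is moved rather than implicitly reborrowed, so a closure that must be callable more than once needs an explicit reborrow.

```rust
#[derive(Debug, PartialEq)]
enum MockErrno {
    Intr,
}

/// Hypothetical, simplified buffer trait (not rustix's real `Buffer`).
trait ByteBuffer {
    fn bytes_mut(&mut self) -> &mut [u8];
}

impl ByteBuffer for &mut [u8] {
    fn bytes_mut(&mut self) -> &mut [u8] {
        &mut **self
    }
}

/// Mock read: reports `Intr` while `interrupts_left > 0`, then fills `buf`.
fn mock_read(mut buf: impl ByteBuffer, interrupts_left: &mut u32) -> Result<usize, MockErrno> {
    if *interrupts_left > 0 {
        *interrupts_left -= 1;
        return Err(MockErrno::Intr);
    }
    let bytes = buf.bytes_mut();
    bytes.fill(7);
    Ok(bytes.len())
}

/// Mock retry helper: calls `f` again for as long as it reports `Intr`.
fn retry<T>(mut f: impl FnMut() -> Result<T, MockErrno>) -> Result<T, MockErrno> {
    loop {
        match f() {
            Err(MockErrno::Intr) => continue,
            other => return other,
        }
    }
}

fn read_with_retry(mut interrupts: u32) -> Result<usize, MockErrno> {
    let mut storage = [0u8; 8];
    let buf: &mut [u8] = &mut storage;
    // Passing `buf` itself would move the reference into the first call, so
    // the closure could not be `FnMut`. `&mut *buf` creates a fresh, shorter
    // borrow on each call instead.
    retry(|| mock_read(&mut *buf, &mut interrupts))
}
```

This works here because the closure's return type (`usize`) does not borrow from the buffer; the reborrow only has to live for one call.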

@sunfishcode
Member Author

I filed #1356 to add an example with standalone testcases of all the various interesting error messages we've seen with Buffer, along with ways to fix them.

@kevinmehall
Contributor

The snippet I posted uses &mut [MaybeUninit<T>] where epoll::wait returns (&mut [T], &mut [MaybeUninit<T>]), while the example in #1356 demonstrates retry_on_intr with a &mut [T] buffer where the Output would be usize if the example function used it. I don't think re-borrowing helps with the MaybeUninit case, because it needs the full lifetime for the return value.

As I said, it's not necessarily bad to have to use match instead of retry_on_intr here, but it negates some of the value of retry_on_intr.

sunfishcode added a commit that referenced this pull request Mar 4, 2025
Add the error message described [here], and a workaround.

[here]: #1290 (comment)
@sunfishcode
Member Author

@kevinmehall Ah, thanks for pointing that out. I've now filed #1375 to add an example that uses MaybeUninit with retry_on_intr. I unfortunately was also unable to find a better workaround, so the advice I went with is, use an explicit loop instead of using retry_on_intr.
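
A minimal sketch of the explicit-loop advice, assuming a mock in place of the real call: `MockErrno`, `mock_wait`, and `wait_with_loop` are hypothetical names, and `mock_wait` only imitates the shape of a function that returns a borrow of the initialized part of a `MaybeUninit` buffer. Because each loop iteration takes a fresh borrow and the successful one leaves the loop via `break`, the returned slice can keep borrowing the buffer.

```rust
use std::mem::MaybeUninit;

#[derive(Debug, PartialEq)]
enum MockErrno {
    Intr,
    Other,
}

/// Mock wait: reports `Intr` while `interrupts_left > 0`, then initializes
/// every element of `buf` and returns it as an initialized slice.
fn mock_wait<'a>(
    buf: &'a mut [MaybeUninit<u32>],
    interrupts_left: &mut u32,
) -> Result<&'a mut [u32], MockErrno> {
    if buf.is_empty() {
        return Err(MockErrno::Other);
    }
    if *interrupts_left > 0 {
        *interrupts_left -= 1;
        return Err(MockErrno::Intr);
    }
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write(i as u32);
    }
    // SAFETY: every element of `buf` was initialized by the loop above, and
    // `MaybeUninit<u32>` has the same layout as `u32`.
    Ok(unsafe { std::slice::from_raw_parts_mut(buf.as_mut_ptr().cast::<u32>(), buf.len()) })
}

/// Retries on `Intr` with a plain loop instead of a closure-based helper.
fn wait_with_loop(mut interrupts: u32) -> Result<Vec<u32>, MockErrno> {
    let mut event_buf = [MaybeUninit::<u32>::uninit(); 4];
    let events = loop {
        match mock_wait(&mut event_buf, &mut interrupts) {
            Ok(events) => break events,
            Err(MockErrno::Intr) => continue,
            Err(e) => return Err(e),
        }
    };
    // `events` still borrows `event_buf`; copy it out to return owned data.
    Ok(events.to_vec())
}
```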

sunfishcode added a commit that referenced this pull request Mar 4, 2025
Add the error message described [here], and a workaround.

[here]: #1290 (comment)
@SUPERCILEX
Contributor

@kevinmehall Ah, I didn't realize this was specific to the MaybeUninit variants. Indeed, this time the compilation error is correct in that it prevents a potential soundness issue if retry_on_intr did something like this:

fn uh_oh<T>(mut f: impl FnMut() -> T) -> (T, T) { (f(), f()) }

let mut event_buf = [MaybeUninit::<u8>::uninit(); 4];
let (mut_ref_one, mut_ref_two) = uh_oh(|| &mut event_buf);

Honestly I wonder if we should just get rid of retry_on_intr—it doesn't handle EAGAIN and I usually prefer the loop style since it lets you break for different reasons.

@sunfishcode
Member Author

Agreed that retry_on_intr is often not useful, however it is occasionally handy. It doesn't check for EAGAIN because that usually means the code should return to its event loop rather than spinning until data arrives.

@sunfishcode
Member Author

I've now written a blog post about the Buffer trait: https://blog.sunfishcode.online/writingintouninitializedbuffersinrust/

@SUPERCILEX
Contributor

Great blog post, thanks for sharing!
