Skip to content

made readdir stateful #1738

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

made readdir stateful #1738

wants to merge 2 commits into from

Conversation

jounathaen
Copy link
Member

@jounathaen jounathaen commented Jun 3, 2025

This fixes #1589 by saving the read state in the backends of the sys_getdents64 syscall.

Currently, implemented only for RAMFS and FUSE - mostly because I couldn't find an example on how to use the ROMFS.

In general, I'm not a huge fan of this, as we basically reimplement the old POSIX interface instead of designing a better interface ourselves, but this avoids complications and breaking changes in the standard libraries (Rust and Newlib).

Notable Quirks/Issues:

  • We don't have Inode numbers for RAMFS. All entries have Inode nr. 1. This is unlikely to be a problem because I can't imagine an application utilizing Inodes being seriously executed in an unikernel.
  • As the RAMFS is currently based on paths, we can't tell the application which file type a direntry has. This is also unlikely to cause a problem, as Linux has introduced this field "recently" (21 years ago) and states that "All applications must properly handle a return of DT_UNKNOWN." (https://linux.die.net/man/2/getdents64)
  • This would be a great use-case for a unit test, but it is quite tricky to get file system related things working in a unit test. This is because the file system relies on async code, but the async executor also tries to execute other maintenance tasks which are not present in a unit-test environment.

@jounathaen jounathaen changed the title Readdir made readdir stateful Jun 3, 2025
@jounathaen jounathaen force-pushed the readdir branch 2 times, most recently from 0d0a2d1 to 691484e Compare June 3, 2025 17:15
@jounathaen jounathaen marked this pull request as ready for review June 3, 2025 17:16
@mkroening mkroening self-requested a review June 3, 2025 17:17
@mkroening mkroening self-assigned this Jun 3, 2025
Copy link
Member

@mkroening mkroening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I have a few comments. :D


#[hermit_macro::system]
#[unsafe(no_mangle)]
pub unsafe extern "C" fn sys_getdents64(
fd: FileDescriptor,
dirp: *mut Dirent64,
// rename: bytes/size/length
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue for keeping the name consistent with Linux:

ssize_t getdents64(int fd, void dirp[.count], size_t count);

/// Filename (null-terminated)
pub d_name: PhantomData<c_char>,
}
impl core::fmt::Debug for Dirent64 {
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
let cstr = unsafe { CStr::from_ptr((&raw const self.d_name).cast::<c_char>()) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sound, since the API allows creating the struct without a filename. This would then read out of bounds.

Comment on lines +552 to +582
#[repr(u8)]
#[derive(Debug, Clone, Copy)]
// Same numbers as a normal Linux (no guarantee)
pub enum FileType {
DtUnknown = 0,
DtFifo = 1,
DtChr = 2,
DtDir = 4,
DtBlk = 6,
DtReg = 8,
DtLnk = 10,
DtSock = 12,
DtWht = 14,
}
impl TryFrom<u8> for FileType {
type Error = &'static str;
fn try_from(value: u8) -> Result<Self, Self::Error> {
match value {
0 => Ok(Self::DtUnknown),
1 => Ok(Self::DtFifo),
2 => Ok(Self::DtChr),
4 => Ok(Self::DtDir),
6 => Ok(Self::DtBlk),
8 => Ok(Self::DtReg),
10 => Ok(Self::DtLnk),
12 => Ok(Self::DtSock),
14 => Ok(Self::DtWht),
_ => Err("inv. Discriminant"),
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have these constants in fs::FileType, which is also what we use in sys/dirent.h.

@@ -324,6 +323,7 @@ pub unsafe extern "C" fn sys_opendir(name: *const c_char) -> FileDescriptor {
if let Ok(name) = unsafe { CStr::from_ptr(name) }.to_str() {
crate::fs::opendir(name).unwrap_or_else(|e| -num::ToPrimitive::to_i32(&e).unwrap())
} else {
error!("sys opendir invalid name");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if kernel logging is useful here, since this would be an application error, not a kernel error and the application should log and report errors.

Regardless, this is probably a separate commit or even PR, right?

pub struct Dirent64 {
/// 64-bit inode number
pub d_ino: u64,
/// 64-bit offset to next structure
/// Field without meaning. Kept for BW compatibility.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d_off is a filesystem-specific value with no specific meaning to user space, though on older filesystems it used to be the distance from the start of the directory to the start of the next linux_dirent

(getdents(2))

I am fine with making this field meaningless, I just wanted to be conscious about this. Could there be a benefit of using absolute offsets in user space instead of iterating incrementally using d_reclen?

if dirp.is_null() || count == 0 {
error!("dirp is null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this makes sense as kernel error log. It is an application error, not a kernel error.

Comment on lines +133 to +134
let _ = self.0.clone(); // Dummy instruction to avoid warning for the moment
unimplemented!("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let _ = self.0.clone(); // Dummy instruction to avoid warning for the moment
unimplemented!("")
let _ = &self.0; // Dummy statement to avoid warning for the moment
unimplemented!()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stlankes, I think we can also remove DirectoryReader completely and panic on sys_opendir, which is deprecated, right?

nameptr,
namelen,
);
nameptr.add(namelen).write_bytes(0, 1); // zero termination
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
nameptr.add(namelen).write_bytes(0, 1); // zero termination
nameptr.add(namelen).write(0); // zero termination

Comment on lines +450 to +452
/// Any other offset than 0 is not supported. (Mostly because it doesn't make any sense, as
/// userspace applications have no way of knowing valid offsets)
async fn lseek(&self, offset: isize, _whence: SeekWhence) -> io::Result<isize> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User space could know offsets through seeking without moving the head, which returns the current position. Not sure if it makes sense to expose that now, though. I don't really understand newlib's implementation of telldir, seekdir and rewinddir yet.

Can we check that whence is SeekWhence::Set?

nameptr,
dirent.namelen as usize,
);
nameptr.add(dirent.namelen as usize).write_bytes(0, 1); // zero termination
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
nameptr.add(dirent.namelen as usize).write_bytes(0, 1); // zero termination
nameptr.add(dirent.namelen as usize).write(0); // zero termination

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

getdents64 should be stateful
2 participants