Skip to content

feat: Make Rust installation persistent#14

Open
vadorovsky wants to merge 2 commits intomainfrom
persistent-rustup
Open

feat: Make Rust installation persistent#14
vadorovsky wants to merge 2 commits intomainfrom
persistent-rustup

Conversation

@vadorovsky
Copy link
Member

@vadorovsky vadorovsky commented Apr 14, 2025

Remove Rust toolchain from the container image. Instead, create bind mounts for ~/.cargo and ~/.rustup, bootstrap the toolchain in a separate step, so it stays persistent.

Fixes: #10


This change is Reviewable

@vadorovsky vadorovsky force-pushed the protobuf branch 3 times, most recently from e857ffb to e3894b1 Compare April 26, 2025 19:25
@vadorovsky vadorovsky force-pushed the protobuf branch 2 times, most recently from 585aafe to efa5d62 Compare April 27, 2025 13:57
@vadorovsky vadorovsky force-pushed the persistent-rustup branch 2 times, most recently from 625f7ab to 0be5faa Compare April 27, 2025 15:53
@vadorovsky vadorovsky force-pushed the persistent-rustup branch 2 times, most recently from b1dd8de to e57df09 Compare April 28, 2025 02:42
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 3872594 to 1da323c Compare May 5, 2025 21:00
@vadorovsky vadorovsky force-pushed the protobuf branch 2 times, most recently from 86a3ab8 to 1dff455 Compare May 6, 2025 22:55
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 1da323c to 3fc17e0 Compare May 6, 2025 23:43
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 3fc17e0 to 2a00c3b Compare May 12, 2025 10:09
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 2a00c3b to 599a0b4 Compare May 12, 2025 14:45
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 599a0b4 to 8dec3a0 Compare May 13, 2025 18:36
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 8dec3a0 to 2b0f4c7 Compare May 14, 2025 13:19
@vadorovsky vadorovsky changed the title Make Rust installation persistent feat: Make Rust installation persistent May 14, 2025
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 2b0f4c7 to 36cdf6d Compare May 14, 2025 13:23
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 36cdf6d to b691776 Compare May 27, 2025 13:07
@vadorovsky vadorovsky requested a review from tamird May 27, 2025 13:08
@vadorovsky vadorovsky changed the base branch from protobuf to main May 27, 2025 13:08
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from b691776 to a6e18a2 Compare May 27, 2025 13:16
@tamird
Copy link
Collaborator

tamird commented May 27, 2025

src/main.rs line 246 at r2 (raw file):

    interactive: bool,
    /// Indicates whether rustup has to be bootstraped.
    bootstrap_rust: bool,

nit: rustup

Copy link
Collaborator

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 2 of 4 files at r2.
Reviewable status: 2 of 4 files reviewed, 11 unresolved discussions (waiting on @vadorovsky)


src/main.rs line 246 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: rustup

oops, didn't mean to publish this.

i think this can be a method rather than a field.


src/main.rs line 294 at r2 (raw file):
I think you want to use try_exists

Warning: this method may be error-prone, consider using try_exists() instead! It also has a risk of introducing time-of-check to time-of-use (TOCTOU) bugs.

https://doc.rust-lang.org/stable/std/path/struct.Path.html#method.exists


-- commits line 2 at r1:
rationale should be in commit message


src/main.rs line 485 at r2 (raw file):

    triple: &Triple,
    override_cc_with_cross: bool,
) -> Vec<(Cow<'static, OsStr>, Cow<'static, OsStr>)> {

why? strictly worse, the caller can always collect.


src/main.rs line 593 at r2 (raw file):

    pb.set_style(pb_style.clone());

    let mut target_file = tokio::fs::File::create(target_file).await?;

do this before you start showing progress?


src/main.rs line 622 at r2 (raw file):

        "https://github.com/cargo-bins/cargo-binstall/releases/latest/download/cargo-binstall-x86_64-unknown-linux-musl.full.tgz",
        &cargo_binstall_file
    ).await?;

pointless ? and Ok below


src/main.rs line 826 at r2 (raw file):

        }

        if ctx.bootstrap_rust {

help me understand why this logic goes together - i.e. why always both, what if cargo is present but rustup is not? also, why do you need to install cargo manually? doesn't rustup do that?


src/main.rs line 900 at r2 (raw file):

        fs::create_dir_all(&ctx.cargo_dir)?;
        fs::create_dir_all(&ctx.rustup_dir)?;
    }

can be unconditional

Code quote:

    if ctx.bootstrap_rust {
        fs::create_dir_all(&ctx.cargo_dir)?;
        fs::create_dir_all(&ctx.rustup_dir)?;
    }

src/main.rs line 1085 at r2 (raw file):

    proc_mount(&ctx.rootfs_dir)?;
    dev_mount(&ctx.rootfs_dir)?;
    bind_mount(&ctx.rootfs_dir, env::current_dir()?, "/src")

just noticing this is always passed ctx.rootfs_dir as the first argument, seems you could reduce some repetition with a closure or something.


src/main.rs line 1122 at r2 (raw file):

    cmd.args(args)
        .stdout(Stdio::inherit())
        .stderr(Stdio::inherit())

aren't these the default?

Code quote:

        .stdout(Stdio::inherit())
        .stderr(Stdio::inherit())

src/main.rs line 1136 at r2 (raw file):

    //
    // In order to allow callers to pass slices, we accept iterators yielding
    // `&'a (K, V)` and call `Command::env` in a loop.

isn't this trivial to solve via iterator::map?

Code quote:

    // That constraint is quite inconvenient for us, because it would require
    // every caller of `run_bootstrap_command` to pass an owned container type.
    //
    // In order to allow callers to pass slices, we accept iterators yielding
    // `&'a (K, V)` and call `Command::env` in a loop.

Copy link
Collaborator

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 4 files at r2, all commit messages.
Reviewable status: 3 of 4 files reviewed, 12 unresolved discussions (waiting on @vadorovsky)


tests/tests.rs line 148 at r3 (raw file):

    icedragon_cmd(&current_dir, "run", None, &[])
        .args(["btf", "--help"])
        .assert_success();

it's already not great that the tests depend on network, but this is also putting it in the hands of a third party to break you. just write a trivial main and cargo install that?

Code quote:

    let current_dir = env::current_dir().unwrap();
    icedragon_cmd(&current_dir, "cargo", None, &[])
        .args(["install", "btfdump"])
        .assert_success();
    icedragon_cmd(&current_dir, "run", None, &[])
        .args(["btf", "--help"])
        .assert_success();

@vadorovsky vadorovsky force-pushed the persistent-rustup branch from a6e18a2 to 409714b Compare May 28, 2025 06:20
Copy link
Member Author

@vadorovsky vadorovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 12 unresolved discussions (waiting on @tamird)


-- commits line 2 at r1:

Previously, tamird (Tamir Duberstein) wrote…

rationale should be in commit message

Done.


src/main.rs line 246 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

oops, didn't mean to publish this.

i think this can be a method rather than a field.

If I make it a method, then the stat syscall (or whatever Rust std ends up doing when calling exists/try_exists), which touches the filesystem, is going to be called every time when we call such method, instead of being called just one time.


src/main.rs line 294 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I think you want to use try_exists

Warning: this method may be error-prone, consider using try_exists() instead! It also has a risk of introducing time-of-check to time-of-use (TOCTOU) bugs.

https://doc.rust-lang.org/stable/std/path/struct.Path.html#method.exists

Done


src/main.rs line 485 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why? strictly worse, the caller can always collect.

Done, although in this context it doesn't gain us anything (there is just one caller, which has to collect, because in this PR we end up having multiple commands using the same env).


src/main.rs line 593 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

do this before you start showing progress?

Done.


src/main.rs line 622 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

pointless ? and Ok below

Can't do that here - download_file returns Result<tokio::fs::file, _> - the returned file is not needed here, but other callers of download_file need it. I need to explicitly ignore the returned object and handle the error.


src/main.rs line 826 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

help me understand why this logic goes together - i.e. why always both, what if cargo is present but rustup is not? also, why do you need to install cargo manually? doesn't rustup do that?

Rustup installs cargo, but not cargo-binstall. I'm including cargo-binstall to be able to install cargo-hack and other projects, that publish binaries, without having to build them from source every time. I can add a comment about it.

These steps go together, because I'm installing Rustup toolchain and cargo-binstall into bind mounted volumes (~/.cargo and ~/.rustup). Properly initialized ~/.cargo should contain binaries installed by rustup (cargo, rustc etc.) and cargo-binstall which I'm installing manually. A fresh volume doesn't contain either.

In the last push, I changed the destination to which I'm moving the rustup-init.sh script - it makes more sense to keep it in the volume.


src/main.rs line 900 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can be unconditional

Then it will make bunch of unnecessary filesystem checks / statx calls, when we know exactly that we don't have to.


src/main.rs line 1085 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

just noticing this is always passed ctx.rootfs_dir as the first argument, seems you could reduce some repetition with a closure or something.

Good point, but separate PR?


src/main.rs line 1122 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

aren't these the default?

Done.


src/main.rs line 1136 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

isn't this trivial to solve via iterator::map?

Good point, I could make it work with

cmd.envs(envs.iter().map(|e| e.to_owned()));

tests/tests.rs line 148 at r3 (raw file):

it's already not great that the tests depend on network

Yeah, I know, but my goal is to make sure that we don't break the basic goal of this project (being able to cross-compile LLVM and Rust crates). Perhaps we could make it better by including the tested projects as submodules. Happy to try that in a separate PR.

just write a trivial main and cargo install that?

Good idea, done.

@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 409714b to 1c55dca Compare May 28, 2025 06:23
Currently, Rust toolchain is the part of the container image, which
doesn't guarantee that any changes made to it are persistent.

For example, if someone does:

```
icedragon run -- rustup component add llvm-tools-preview
```

Or:

```
icedragon cargo -- install cargo-hack
```

The installed artifacts are wiped out as soon as a new container image
is published on ghcr.io, because icedragon downloads it and replaces
the whole sysroot with the new image contents.

To make artifacts installed with `cargo` or `rustup` persistent,
remove Rust toolchain from the container image and instead, create bind
mounts for `~/.cargo` and `~/.rustup` and bootstrap the toolchain in
a separate step. That makes the lifecycle of the Rust toolchain
independent from the lifecycle of the main image.

Fixes: #10
@vadorovsky vadorovsky force-pushed the persistent-rustup branch from 1c55dca to f47df11 Compare May 28, 2025 12:41
Copy link
Collaborator

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r4, 5 of 6 files at r5, all commit messages.
Reviewable status: 5 of 6 files reviewed, 8 unresolved discussions (waiting on @vadorovsky)


src/main.rs line 246 at r2 (raw file):

Previously, vadorovsky (Michal Rostecki) wrote…

If I make it a method, then the stat syscall (or whatever Rust std ends up doing when calling exists/try_exists), which touches the filesystem, is going to be called every time when we call such method, instead of being called just one time.

ok, so make that a oncecell? the point is that you shouldn't allow this to float freely from the other values.


src/main.rs line 622 at r2 (raw file):

Previously, vadorovsky (Michal Rostecki) wrote…

Can't do that here - download_file returns Result<tokio::fs::file, _> - the returned file is not needed here, but other callers of download_file need it. I need to explicitly ignore the returned object and handle the error.

see comment below, it is very weird that we're just throwing this file away.


src/main.rs line 826 at r2 (raw file):

Previously, vadorovsky (Michal Rostecki) wrote…

Rustup installs cargo, but not cargo-binstall. I'm including cargo-binstall to be able to install cargo-hack and other projects, that publish binaries, without having to build them from source every time. I can add a comment about it.

These steps go together, because I'm installing Rustup toolchain and cargo-binstall into bind mounted volumes (~/.cargo and ~/.rustup). Properly initialized ~/.cargo should contain binaries installed by rustup (cargo, rustc etc.) and cargo-binstall which I'm installing manually. A fresh volume doesn't contain either.

In the last push, I changed the destination to which I'm moving the rustup-init.sh script - it makes more sense to keep it in the volume.

OK, but why are you checking for the presence of /.cargo, and not the presence of cargo-binstall?


src/main.rs line 900 at r2 (raw file):

Previously, vadorovsky (Michal Rostecki) wrote…

Then it will make bunch of unnecessary filesystem checks / statx calls, when we know exactly that we don't have to.

eh, ok. it seems smelly to me that this logic is here. the point is that you want to ensure rustup is present, so you should check and create directories at that time, and same for cargo-binstall.


src/main.rs line 1085 at r2 (raw file):

Previously, vadorovsky (Michal Rostecki) wrote…

Good point, but separate PR?

surely you can just prepend a commit?


src/main.rs line 1136 at r2 (raw file):

Previously, vadorovsky (Michal Rostecki) wrote…

Good point, I could make it work with

cmd.envs(envs.iter().map(|e| e.to_owned()));

wait hold up, aren't you just trying to go from &(K, V) to (&K, &V)? why would you need to use ToOwned? This comment you've added is only mildly comforting, but something like |(K, V)| { (K, V) } would be much more convincing. Or perhaps use .copied().


src/main.rs line 852 at r5 (raw file):

            // [0] https://github.com/cargo-bins/cargo-binstall
            s.spawn(|| {
                let cargo_binstall_tarball = download_dir.join("cargo-binstall.tgz");

now realizing this path is duplicated here and in download_cargo_binstall. can we instead have that function always return the file handle, and internally decide whether or not to download the tarball?


src/main.rs line 1196 at r5 (raw file):

    let mut cmd = ctx.cmd.command()?;
    cmd.stdout(Stdio::inherit()).stderr(Stdio::inherit());

also defaults...?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make the Rust toolchains and binaries installed with Cargo persistent

2 participants