Conversation

@cgwalters commented Oct 17, 2025

This mirrors `to-disk`, except instead of having the bootc container install itself, we're adding osbuild/bib in the middle.

I think this is pretty close to working, but right now it dies deep in the osbuild stack failing to fetch the image.

But this all works with `to-disk`, where bootc-in-container fetches the image itself.

Assisted-by: Claude Code
Closes: #9
Signed-off-by: Colin Walters <[email protected]>

@cgwalters
> I think this is pretty close to working, but right now it dies deep in the osbuild stack failing to fetch the image.

There's a truly astounding set of layers of indirection here; basically I think it's:

host ➡️ podman container (the image to be installed) ➡️ qemu (running the kernel from that image, mounting host container storage via virtiofs) ➡️ bootc-image-builder container (fetched from host storage, and itself passed the host container storage again) ➡️ osbuild + bubblewrap + the osbuild container machinery

I think what's going wrong is in the last steps, but I (and the AI tooling) got lost in those layers and couldn't easily debug it.

@achilleas-k
> There's a truly astounding set of layers of indirection here

Yeah, that sounds impossible to reason about. It might be nicer to untangle the manifest generation from the osbuild call instead of doing a one-shot call to bootc-image-builder. Personally, the number of layers involved in just calling bootc-image-builder on its own makes me uncomfortable and ties my mind in knots when I'm debugging it. Most times I just run it outside a container when troubleshooting, but that has its own caveats because of some of the things we had to do to make it specifically compatible with running in podman.

Using image-builder-cli instead of bootc-image-builder would also be simpler.

Ideally the end result should be:

  • host has podman container to be installed
  • host container storage gets mounted into qemu VM
  • image-builder CLI gets called to generate a manifest by reading the host container storage
  • call osbuild directly on the manifest to build the image

That way we should only have one layer of indirection when generating the manifest and calling osbuild (and osbuild can do its own internal indirections as needed).

@cgwalters
There's an implementation detail in all of this that is important (actually a ~soft requirement IMO): The generated disk image has filesystems written using the target kernel. This is the rationale behind a lot of the project design today, and it's why the ephemeral VM uses the target container image.

One possibility here though is to arrange things so that we use the bootc-image-builder userspace with the target kernel. I think as long as we copied over /usr/lib/modules from the target image too, it'd likely work.

That would drop out one level of indirection (running b-i-b as a container in the ephemeral VM); however it would come at the large cost of being a new way to run b-i-b that is really pretty different mechanically.
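
(Purely as an illustration, a rough sketch of that modules copy using plain podman commands - the image name and paths are placeholders, and this is not how bcvk wires things up today:)

# Hypothetical: extract /usr/lib/modules from the target image so the
# b-i-b userspace can run against the target kernel.
target=quay.io/example/my-bootc-image:latest
ctr=$(podman create "$target")
mkdir -p /tmp/target-modules
podman cp "$ctr":/usr/lib/modules /tmp/target-modules/
podman rm "$ctr"
# ...then bind the extracted tree over the builder environment's module path,
# e.g. when launching the b-i-b container:
#   podman run ... -v /tmp/target-modules/modules:/usr/lib/modules:ro ...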


> Using image-builder-cli instead of bootc-image-builder would also be simpler.

OK sure that's easy enough to try out, but why would it be simpler?

Do you mean pulling it as a container or would you mean we install it as an RPM layered on top of the target image?

@achilleas-k
> OK sure that's easy enough to try out, but why would it be simpler?

Fewer layers. Using the tools directly instead of launching containers in containers in VMs.

> Do you mean pulling it as a container or would you mean we install it as an RPM layered on top of the target image?

Ideally without its container, to limit the indirection layers. Either install the RPM, copy in the binary from the host, whatever works. It's all already running in a VM; I don't see why it needs to keep going deeper into more layers of indirection.

@cgwalters
> Fewer layers. Using the tools directly instead of launching containers in containers in VMs.

Oh you mean running image-builder-cli directly on the host with sudo as the docs suggest? But avoiding requiring sudo (and also, in the fully general case, having the target kernel write the filesystems, so building Fedora+btrfs systems works from a RHEL host, etc.) is the rationale for this project.

Also involving sudo directly on the host has the problem that it will require a dance to fetch content from the user's container storage - I suspect this hasn't come up simply because b-i-b and i-b-cli both require root.

And finally just to repeat, IMO anyone building operating systems wants to be able to test them - I mean it's right there in the bib docs https://github.com/osbuild/bootc-image-builder/ which go "sudo podman" and then in the next paragraph mention running with qemu.
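
(For reference, the flow those docs describe is roughly the following - paraphrased from memory, so the exact flags may differ:)

# Build a disk image from a bootc container...
sudo podman run --rm -it --privileged \
    --security-opt label=type:unconfined_t \
    -v ./output:/output \
    -v /var/lib/containers/storage:/var/lib/containers/storage \
    quay.io/centos-bootc/bootc-image-builder:latest \
    --type qcow2 \
    quay.io/centos-bootc/centos-bootc:stream9

# ...and the very next step in the docs is booting the result with qemu:
qemu-system-x86_64 -m 4096 -cpu host -accel kvm -snapshot output/qcow2/disk.qcow2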

I mean I get why i-b-cli exists, but for all of the reasons there I'd prefer to have bcvk be the entrypoint for at least the bootc world that helps streamline (virt) provisioning.

@achilleas-k
> Oh you mean running image-builder-cli directly on the host with sudo as the docs suggest?

Sorry, I misspoke. I meant running ib-cli inside the VM, but without its own container. So basically running something like:

$ image-builder manifest ... > manifest.json
$ osbuild --export image --output-directory ./output manifest.json

but all inside the qemu VM. So we only have the one layer of indirection (the VM).

@cgwalters
> but all inside the qemu VM.

But which userspace would we use to get those tools? Remember there can be skew between the target (container) image and the image-builder tooling today - unless we do something to avoid that skew.

@achilleas-k
So right now bootc-image-builder gets pulled by podman inside the VM, right? Or is it used from the host container store?

If we want to run things without podman in the VM, perhaps bcvk could copy/bind things in from the host. If ib-cli (and osbuild) become bcvk dependencies, so it can assume they're available when you run the tool, it could also control version compatibility between itself and those tools, and copy them into the VM to run them.

The major downside is that distributing bcvk becomes harder.

@cgwalters
> So right now bootc-image-builder gets pulled by podman inside the VM, right? Or is it used from the host container store?

Well, both - we mount the host container store as an AIS (additional image store), but the layers aren't actually copied into the VM's storage.
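
(Roughly how the AIS mechanism works - the path is illustrative, not the exact bcvk wiring:)

# The host store is shared into the VM via virtiofs, and the VM's
# storage.conf points podman at it as an additional image store:
cat >> /etc/containers/storage.conf <<'EOF'
[storage.options]
additionalimagestores = [ "/run/virtiofs-mnt-hoststorage" ]
EOF
# podman in the VM now resolves images from the host store without copying
# layers; writes (new layers, containers) go to the VM's own graph root.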

> perhaps bcvk could copy/bind things in from the host.

Yes, we already do that for the host virtualization stack.

> If ib-cli (and osbuild) become bcvk dependencies, so it can assume they're available when you run the tool, it could also control version compatibility between itself and those tools, and copy them into the VM to run them.

I'd have no problem making them soft dependencies. If the preferred way to run i-b-cli is as a host tool and not a container, then yeah we can totally make that work.

But it does have the downside that we're not testing b-i-b.

Also in general, since I want to also streamline Anaconda (which is most naturally an ISO/PXE environment today), that would be a different case... but yeah, doable.


(There's a whole side thread here of course about whether it'd make sense to have i-b-cli soft-depend on bcvk as a default mode of operation when targeting a bootc image and virt is available. Of course, in theory we could also do the "use target kernel" thing even for package-based installs by fetching the kernel RPM first and bootstrapping from it.)

@cgwalters
OK I've explained some more of the architecture here to a running AI tool, and (after fixing some of its bad ideas) it (we) came up with this patch for osbuild/images:

diff --git i/pkg/container/client.go w/pkg/container/client.go
index 9ea4de6..e7b17a0 100644
--- i/pkg/container/client.go
+++ w/pkg/container/client.go
@@ -353,7 +353,9 @@ func (cl *Client) getImageRef(id string, local bool) (types.ImageReference, erro
                if id != "" {
                        imageName = id
                }
-               options := fmt.Sprintf("containers-storage:[overlay@%s+/run/containers/storage]%s", cl.store, imageName)
+               // Use simple reference format - let containers/storage use its configured options
+               // This allows STORAGE_OPTS and storage.conf to work correctly
+               options := fmt.Sprintf("containers-storage:%s", imageName)
                return alltransports.ParseImageName(options)
        }
 
@@ -375,7 +377,14 @@ func (cl *Client) resolveContainerImageArch(ctx context.Context, ref types.Image
 }
 
 func (cl *Client) getLocalImageIDFromDigest(instance digest.Digest) (string, error) {
-       store, err := storage.GetStore(storage.StoreOptions{GraphRoot: cl.store})
+       // Use DefaultStoreOptions to respect STORAGE_OPTS and storage.conf
+       // This allows additional image stores to work correctly
+       opts, err := storage.DefaultStoreOptions()
+       if err != nil {
+               return "", err
+       }
+       opts.GraphRoot = cl.store
+       store, err := storage.GetStore(opts)
        if err != nil {
                return "", err
        }

which gets us farther. (I would love to know why we need to vendor the container-libs stack here; is it really buying us much over e.g. forking off skopeo?)
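
(For comparison, forking off skopeo for the same job would look roughly like this - the image name is a placeholder:)

# Inspect and export an image straight out of local container storage,
# handing downstream tooling an OCI layout instead of vendoring
# containers/{image,storage}:
skopeo inspect containers-storage:quay.io/example/my-bootc-image:latest
skopeo copy containers-storage:quay.io/example/my-bootc-image:latest \
    oci:/var/tmp/osbuild-input:target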

But the next problem here is I think how osbuild wants to copy the input images to the osbuild "store" (?) - that was always unnecessary for bib as we don't expose any caching by default, but it fails hard in this environment as I'm not provisioning ephemeral FS storage. Though hummm...I guess I could change ephemeral so that e.g. /var/tmp is actually the same as the pod's /var/tmp....

@cgwalters commented Oct 17, 2025
> perhaps bcvk could copy/bind things in from the host.

Yes though it's actually more complicated than this because again the default stack for bcvk ephemeral is:

host ➡️ podman ➡️ nested bwrap container of host ➡️ qemu

The "nested bwrap container of host" is where qemu runs - and qemu runs just fine unprivileged, we just give it /dev/kvm as needed.

Whereas if we had image-builder-cli on the host, we'd have to not just run the host userspace as a container, we'd have to run it as a VM - like supermin encourages.

I guess nothing stops us here from adding something like `bcvk host-ephemeral` or so - it doesn't obviously relate directly to bootc or containers though. Besides supermin there are also projects like https://github.com/amluto/virtme etc. which do this. I mean really it's not too much more than setting up qemu with the host kernel and virtiofsd pointing at the host root...although actually there are a ton of details around whether one picks up just the host /usr or everything else. A virtme/host-ephemeral style flow clearly can work more nicely when the host is itself image-based, because then we can always start from a known pristine state.
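
(Very rough sketch of that kind of boot - paths, tag names and sizes are illustrative, and the initramfs would need virtiofs support:)

# Export the host root over virtiofs (the guest mounts it read-only via "ro" below)...
/usr/libexec/virtiofsd --socket-path /tmp/hostroot.sock --shared-dir / &

# ...and boot the host's own kernel against it.
qemu-system-x86_64 -m 4096 -cpu host -accel kvm \
    -kernel /boot/vmlinuz-$(uname -r) \
    -initrd /boot/initramfs-$(uname -r).img \
    -append "rootfstype=virtiofs root=hostroot ro console=ttyS0" \
    -object memory-backend-memfd,id=mem,size=4096M,share=on \
    -numa node,memdev=mem \
    -chardev socket,id=hostroot,path=/tmp/hostroot.sock \
    -device vhost-user-fs-pci,chardev=hostroot,tag=hostroot \
    -nographic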

@achilleas-k
> But it does have the downside that we're not testing b-i-b.

Most of b-i-b is being rolled into ib-cli now anyway. b-i-b will remain as a project released as a container, but the plan for it is to be a thin shell around ib-cli functionality. The code paths for producing a (bootc) manifest with ib-cli and b-i-b mostly live in osbuild/images anyway. What we will be testing in ib-cli and b-i-b is how well they work in the containerised environments we expect them to run in.

> If the preferred way to run i-b-cli is as a host tool and not a container, then yeah we can totally make that work.

Perhaps it's not the preferred way, but in my mind it's the default way. My take on the whole thing is that there's a convenience to distributing the tool as (and running it in) a container. But here, we're getting the same convenience and isolation by running it in a VM. Layering too deeply, especially when you want to start reading resources from the host, ends up making life harder (as evidenced by the problems we're seeing here).

@achilleas-k
> (I would love to know why we need to vendor the container-libs stack here; is it really buying us much over e.g. forking off skopeo?)

It's really not. And it actually makes our life a bit harder with storage drivers that aren't available in EL (needing to disable the btrfs driver for example). I'd like to actually do this at some point, when I have the time.

> But the next problem here is I think how osbuild wants to copy the input images to the osbuild "store" (?) - that was always unnecessary for bib as we don't expose any caching by default

I have good news for you: osbuild/osbuild#2222

> Yes though it's actually more complicated than this because again the default stack for bcvk ephemeral is:
>
> host ➡️ podman ➡️ nested bwrap container of host ➡️ qemu
>
> The "nested bwrap container of host" is where qemu runs - and qemu runs just fine unprivileged, we just give it /dev/kvm as needed.

Oh, that's a bit more complicated than I originally thought. I'll have to look at the actual code a bit closer.
