Skip to content

Conversation

@aaronlehmann
Copy link

@aaronlehmann aaronlehmann commented Apr 1, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Without this change, specifying Compression: imagebuildah.Zstd in imagebuildah's `BuildOptions fails, so it is not possible to push cache to a registry with zstd compression.

Note this is only applicable to OCI manifests.

How to verify it

Specify Compression: imagebuildah.Zstd when building an image using imagebuildah.BuildDockerfiles,.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

n/a

Does this PR introduce a user-facing change?

None

@openshift-ci openshift-ci bot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 1, 2024
@aaronlehmann
Copy link
Author

aaronlehmann commented Apr 1, 2024

If a test is required for this change, I'd appreciate suggestions about how to best add test coverage. It doesn't seem possible to test via the CLI because the Compression parameter isn't wired up as a CLI flag.

	if c.Flag("compress").Changed {
		logrus.Debugf("--compress option specified but is ignored")
	}

	compression := define.Gzip
	if iopts.DisableCompression {
		compression = define.Uncompressed
	}

@rhatdan
Copy link
Member

rhatdan commented Apr 2, 2024

@giuseppe @mtrmac @nalind PTAL

@rhatdan
Copy link
Member

rhatdan commented Apr 2, 2024

@flouthoc PTAL

image.go Outdated
// how to decompress them, we can't try to compress layers with zstd.
return "", "", errors.New("media type for zstd-compressed layers is not defined")
omediaType = v1.MediaTypeImageLayerZstd
dmediaType = "application/vnd.docker.image.rootfs.diff.tar.zstd"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by comment: I'm uncomfortable with hardcoded magic strings. Shouldn't this be defined as a constant in c/image?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if this should happen, it would be in c/image/v5/manifest.

(Buildkit does seem to use this constant, but it is not a part of the public API.)

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must refuse to create v2s2 images with zstd compression.

AFAICS that means that the whole of NewImageSource needs to be restructured around respecting i.preferredManifestType, instead of always building data in both formats.

There may be various ways to do it:

  • Add some kind of “manifest type” abstraction, and call it everywhere the code currently computes both values
  • Continue building both formats all the time, but somehow represent “this format failed”, and check that at the very end (might be less invasive, but also rather more convoluted)
  • Maybe some other design.

image.go Outdated
return "", "", errors.New("media type for xz-compressed layers is not defined")
case archive.Zstd:
// Until the image specs define a media type for zstd-compressed layers, even if we know
// how to decompress them, we can't try to compress layers with zstd.
Copy link
Contributor

@mtrmac mtrmac Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is accurate: Docker images can’t represent Zstd images layers.

We can’t very well just silently ignore that problem and create something that doesn’t work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to defer to you and refuse to create schema2 images with zstd compression. But I'm curious to understand why this is a problem - BuildKit seems happy to create them using this media type, and they seem to work fine with Docker and kaniko (but not buildah...).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One place is our own https://github.com/containers/image/blob/1c648b405e0763472db0bcc189136ddb6a5b61c6/manifest/docker_schema2.go#L173 . In retrospect, being that strict might have been a mistake, but either way that code refusing unexpected MIME types has been there for many years.

(IIRC, Docker is not checking MIME types at all when pulling; that’s the other extreme option.)

So we would need to think about a migration mechanism, and giving users enough tools to manage it. For a format feature which is, AFAICS, undocumented. Meanwhile, “if you want to use Zstd, migrate to OCI” is a simple story that works. for all implementations.

@aaronlehmann
Copy link
Author

We must refuse to create v2s2 images with zstd compression.

AFAICS that means that the whole of NewImageSource needs to be restructured around respecting i.preferredManifestType, instead of always building data in both formats.

To keep the change relatively contained, what would you think about returning an empty dmediatype from computeLayerMIMEType, and making NewImageSource return an error if the media type corresponding to manifestType is empty. Something like:

		// Figure out if we need to change the media type, in case we've changed the compression.
		omediaType, dmediaType, err = computeLayerMIMEType(what, i.compression)
		if err != nil {
			return nil, err
		}
		// The layer media type must be defined for the type of manifest we are producing.
		if (omediaType == "" && manifestType == v1.MediaTypeImageManifest) ||
			(dmediaType == "" && manifestType == manifest.DockerV2Schema2MediaType) {
			return nil, fmt.Errorf("manifest type %q does not support selected compression format", manifestType)
		}

Refactoring with a manifest type abstraction seems like a worthy idea as well. I can't see a reason why this function is constructing both in parallel (not sure if I'm missing something). I'm not sure I'm the best person to do this refactor though, as I'm brand new to the codebase.

@mtrmac
Copy link
Contributor

mtrmac commented Apr 2, 2024

I’ll let actual Buildah maintainers to decide how this should be organized.

@aaronlehmann
Copy link
Author

@rhatdan: How would you like to proceed here?

@aaronlehmann
Copy link
Author

Hoping for some direction on where to go with this PR. I'm happy to make the changes if the maintainers have thoughts on what makes most sense.

@mtrmac
Copy link
Contributor

mtrmac commented Apr 11, 2024

My preference would be

Add some kind of “manifest type” abstraction, and call it everywhere the code currently computes both values

but that’s the option that requires more work, and I’m not authoritative to commit Buildah to that direction, nor can I spend very much time helping with that work.

@aaronlehmann
Copy link
Author

@rhatdan: Still looking for guidance on this one...

@rhatdan
Copy link
Member

rhatdan commented May 3, 2024

@giuseppe @nalind PTAL

@aaronlehmann
Copy link
Author

@giuseppe @nalind: Any thoughts on this one? I'm happy to restructure it as necessary, but I'm looking for direction on this.

@aaronlehmann
Copy link
Author

@giuseppe @nalind @rhatdan: I'd really like to get this upstreamed; can you please let me know how you'd like me to approach the change?

@giuseppe giuseppe added the No New Tests Allow PR to proceed without adding regression tests label Jun 21, 2024
@giuseppe
Copy link
Member

could you please define "application/vnd.docker.image.rootfs.diff.tar.zstd" in containers/image/v5/manifest as @mtrmac pointed out?

@mtrmac
Copy link
Contributor

mtrmac commented Jun 21, 2024

could you please define "application/vnd.docker.image.rootfs.diff.tar.zstd" in containers/image/v5/manifest as @mtrmac pointed out?

@giuseppe To this day we are explicitly rejecting unknown MIME types for v2s2 layers. (I could see an argument that that has been a mistake, but here we are anyway.) Any v2s2 image with zstd are not going to be accepted by any currently existing version of Podman.

Why should we start creating such images now, instead of using the OCI format exclusively?

@giuseppe
Copy link
Member

@mtrmac ok thanks, that makes sense.

@aaronlehmann can you use the OCI format?

@aaronlehmann
Copy link
Author

@giuseppe: Using the OCI format is completely fine for me; I don't care about the Docker format. However, the issue is that NewImageSource constructs OCI and Docker manifests simultaneously, and it's not clear what to do for the Docker format if we're using zstd compression. Do I need to refactor the code to make it possible to produce either type of manifest separately?

@aaronlehmann
Copy link
Author

@giuseppe: Any thoughts?

@giuseppe
Copy link
Member

@mtrmac do you have any suggestions for @aaronlehmann ?

@mtrmac
Copy link
Contributor

mtrmac commented Jun 28, 2024

… so we have come full circle :)

My preference is #5452 (comment) , to split the manifest creation code into per-format implementations, and to only use one instead of both.

@rhatdan
Copy link
Member

rhatdan commented Sep 13, 2024

@aaronlehmann still working on this?

@aaronlehmann
Copy link
Author

@rhatdan: Happy to come back to this if I can get some clarity over the direction to go for the solution. Would something like #5452 (comment) be merged, or do we need to completely refactor with a "manifest type" abstraction as described in #5452 (comment)?

@rhatdan
Copy link
Member

rhatdan commented Sep 16, 2024

If miloslav prefers 5452, I would go along with him. He is far smarter then me.

aaronlehmann pushed a commit to aaronlehmann/buildah that referenced this pull request Sep 17, 2024
Currently, NewImageSource creates a Docker schema2 manifest and an OCI
manifest at the same time. This precludes functionality that isn't
supported by both manifest types, for example zstd compression.
Refactoring this to create only the desired manifest type solves this
and also cleans up the code by separating manifest-type-specific code
into distinct implementations of a "manifest builder".

See discussion in containers#5452.
aaronlehmann pushed a commit to aaronlehmann/buildah that referenced this pull request Sep 17, 2024
Currently, NewImageSource creates a Docker schema2 manifest and an OCI
manifest at the same time. This precludes functionality that isn't
supported by both manifest types, for example zstd compression.
Refactoring this to create only the desired manifest type solves this
and also cleans up the code by separating manifest-type-specific code
into distinct implementations of a "manifest builder".

See discussion in containers#5452.
@aaronlehmann
Copy link
Author

Now that #5743 is merged, I've updated this to only support zstd in the OCI manifest path.

@aaronlehmann
Copy link
Author

@mtrmac: Does this look good now that it's specific to OCI manifests?

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this LGTM.

Buildah maintainers, do you want tests added? The PR has a “no new tests” label but I can’t tell why.

@flouthoc
Copy link
Collaborator

@aaronlehmann do you think a test can be added for this, do you need help with tests ?

@aaronlehmann
Copy link
Author

@flouthoc: Yes, some help would be great! Please see my comment here: #5452 (comment)

@aaronlehmann
Copy link
Author

@flouthoc: Any ideas about what a test should look like?

@flouthoc
Copy link
Collaborator

I am thinking if we can wire up --compress then it will be easier to test in bud.bats, I can't see any specific reason to leave --compress as no-op. @nalind @rhatdan @mtrmac WDYT ?

@mtrmac
Copy link
Contributor

mtrmac commented May 28, 2025

072ee88 suggests that --compress is supposed to apply to inputs to a build, not to outputs.

I didn’t investigate further, is that incorrect?

@aaronlehmann
Copy link
Author

@flouthoc: Any thoughts on @mtrmac's question about --compress semantics?

@flouthoc
Copy link
Collaborator

There is no intuitive way to test this without exposing a user exposed feature or hidden CLI flag, since this does not affects default behaviour of buildah, maybe we can merge this without test. @nalind any thoughts ?

@aaronlehmann
Copy link
Author

@nalind WDYT?

@aaronlehmann
Copy link
Author

@nalind @flouthoc: Sorry to keep nagging... anything we can do to move this forward?

@nalind
Copy link
Member

nalind commented Jun 11, 2025

The change itself LGTM, but without a test of some kind, I fear we wouldn't notice if it broke. There's one on the tip of https://github.com/nalind/buildah/tree/aaronlehmann-support-zstd-commit, if you'd like to cherry-pick it.

Aaron Lehmann and others added 2 commits June 11, 2025 22:42
Without this change, specifying `Compression: imagebuildah.Zstd` in
`imagebuildah`'s `BuildOptions fails, so it is not possible to push
cache to a registry with zstd compression.

Note this is only applicable to OCI manifests.

Signed-off-by: Aaron Lehmann <[email protected]>
Add a unit test that commits OCI layouts with various types of
compression specified, and verifies that the layers end up written with
the desired compression and media type descriptors.

Signed-off-by: Nalin Dahyabhai <[email protected]>
@aaronlehmann aaronlehmann force-pushed the support-zstd-commit branch from 4b62612 to 873e545 Compare June 11, 2025 22:43
@aaronlehmann
Copy link
Author

Thanks so much! I went ahead and cherry-picked this test (and also rebased the PR).

@packit-as-a-service
Copy link

Ephemeral COPR build failed. @containers/packit-build please check.

@aaronlehmann aaronlehmann force-pushed the support-zstd-commit branch from 58995dd to 2eb666c Compare June 11, 2025 22:53
@aaronlehmann
Copy link
Author

@flouthoc: I think this is ready to go now. Please let me know if you'd like any other changes.

Copy link
Collaborator

@flouthoc flouthoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
/lgtm
/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 12, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aaronlehmann, flouthoc, mtrmac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 1b39f59 into containers:main Jun 12, 2025
37 checks passed
@aaronlehmann
Copy link
Author

🎉 Thank you!

@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Sep 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

approved kind/feature Categorizes issue or PR as related to a new feature. lgtm locked - please file new issue/PR No New Tests Allow PR to proceed without adding regression tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants