create-legacy-oscontainer: use runvm to build legacy oscontainer #3111

Merged
jlebon merged 2 commits into coreos:main from the upload-oscontainer branch
Oct 20, 2022

Conversation

jmarrero
Member

@jmarrero jmarrero commented Sep 22, 2022

Initial effort to fix: openshift/os#1009

@openshift-ci

openshift-ci bot commented Sep 22, 2022

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@jmarrero
Member Author

jmarrero commented Sep 22, 2022

Trying to get this running locally first, right now hitting python module issues:

  File "/usr/lib/coreos-assembler/oscontainer-deprecated-legacy-format.py", line 22, in <module>
    from cosalib import cmdlib
  File "/usr/lib/coreos-assembler/cosalib/cmdlib.py", line 15, in <module>
    import yaml
ModuleNotFoundError: No module named 'yaml'
Traceback (most recent call last):
  File "/usr/lib/coreos-assembler/cmd-upload-oscontainer", line 120, in <module>
    subprocess.check_call(cosa_argv +
  File "/usr/lib64/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)

maybe it's time to just farm out the buildah/podman call to the vm.

@cgwalters
Member

Short term I'm OK with just taking the hit of dragging python into supermin.

(Hmm... I wonder if we could actually just mount the host container's /usr into the supermin VM over 9p and use that...)

@jmarrero
Member Author

I see, I'll try that next. I was trying to modify oscontainer.py so that, instead of calling buildah directly, it calls something like:

buildah_base_argv = ['. /usr/lib/coreos-assembler/cmdlib.sh; prepare_build && . /usr/lib/coreos-assembler/cmdlib.sh && runvm -- buildah']

But I keep getting:

  File "/usr/lib64/python3.10/subprocess.py", line 1845, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '. /usr/lib/coreos-assembler/cmdlib.sh; prepare_build && . /usr/lib/coreos-assembler/cmdlib.sh && runvm -- buildah'
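
The FileNotFoundError above is consistent with how Python's subprocess module treats a list argument: without shell=True, the first element is taken as the name of the executable, so a whole shell snippet placed in one list element is looked up as a file name. A minimal sketch of the difference, using a harmless stand-in snippet rather than the real cmdlib.sh call (the function names are illustrative only):

```python
import subprocess

# Stand-in for ". cmdlib.sh; prepare_build && runvm -- buildah" (illustrative only).
snippet = 'greeting=hello; echo "$greeting from sh"'


def run_as_single_element(cmd: str) -> str:
    """Pass the whole snippet as argv[0]; subprocess looks for a file with that name."""
    try:
        subprocess.run([cmd], check=True, capture_output=True, text=True)
    except FileNotFoundError as err:
        return f"FileNotFoundError: {err.filename!r}"
    return "ran"


def run_via_shell(cmd: str) -> str:
    """Hand the snippet to /bin/sh -c so the shell parses ';' and '&&'."""
    result = subprocess.run(
        ["/bin/sh", "-c", cmd], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


print(run_as_single_element(snippet))
print(run_via_shell(snippet))
```

The second form matches the `'/bin/sh', '-c', '...'` shape that appears in the later comment of this thread.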

@jmarrero
Member Author

jmarrero commented Sep 23, 2022

I think I am hitting some kind of argument limit when doing this via string because I get:

oscontainer-deprecated-legacy-format.py: error: unrecognized arguments: CoreOS kernel-rt-core ostree rpm-ostree ignition systemd runc cri-o /srv/tmp/repo ece6f5adbc58edd21135a922e9adb0c224cd0409fe284cdb6c2fe59483ba4b8a rhcos:412.86.202209021413-0

when I add the {arguments} to the call, which then looks like:

['sudo', '--preserve-env=container,DISABLE_TLS_VERIFICATION,SSL_CERT_DIR,SSL_CERT_FILE,REGISTRY_AUTH_FILE,OSCONTAINER_CERT_DIR', '/bin/sh', '-c', '. /usr/lib/coreos-assembler/cmdlib.sh; prepare_build && . /usr/lib/coreos-assembler/cmdlib.sh && runvm -- /usr/lib/coreos-assembler/oscontainer-deprecated-legacy-format.py --workdir=./tmp build --from=registry.access.redhat.com/ubi8/ubi:latest  --display-name="Red Hat Enterprise Linux CoreOS" --labeled-packages="kernel kernel-rt-core ostree rpm-ostree ignition systemd runc cri-o" --digestfile=tmp/oscontainer-digest --push /srv/tmp/repo ece6f5adbc58edd21135a922e9adb0c224cd0409fe284cdb6c2fe59483ba4b8a rhcos:412.86.202209021413-0']

Without the arguments that have spaces, it starts the build in the VM. Still debugging what is going on.
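
One plausible reading of the "unrecognized arguments" error (an assumption, not something confirmed in this thread) is that the quoting around values containing spaces is lost when the command string is parsed a second time on its way into the VM, rather than an argument limit being hit. The sketch below shows the general effect with Python's shlex module: the grouping survives one round of shell-style parsing but not a second one, unless every argument is re-quoted each time a string is built.

```python
import shlex

argv = ["build", "--display-name=Red Hat Enterprise Linux CoreOS", "--push"]

# Naive approach: wrap the value in double quotes inside one long string.
naive = 'build --display-name="Red Hat Enterprise Linux CoreOS" --push'

once = shlex.split(naive)            # first parse: grouping is preserved
twice = shlex.split(" ".join(once))  # re-joined unquoted, parsed again: grouping is lost

# Safer approach: quote every argument whenever a command string is built.
safe = shlex.join(argv)
roundtrip = shlex.split(shlex.join(shlex.split(safe)))

print(once)
print(twice)
print(roundtrip)
```

Here `twice` splits the display name into separate words, which is the same kind of stray positional arguments that argparse reports as unrecognized.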

@jmarrero jmarrero force-pushed the upload-oscontainer branch 7 times, most recently from b8785f3 to 3f14ee4 Compare September 30, 2022 21:31
@jmarrero
Member Author

Quite close; I just need to handle the actual upload. The image is now being built in the VM.

error pushing image "jmarrerotest:412.86.202209302130-0" to "docker://jmarrerotest:412.86.202209302130-0": trying to reuse blob sha256:b38cb92596778e2c18c2bde15f229772fe794af39345dd456c3bf6702cc11eef at destination: checking whether a blob sha256:b38cb92596778e2c18c2bde15f229772fe794af39345dd456c3bf6702cc11eef exists in docker.io/library/jmarrerotest: errors:
denied: requested access to the resource is denied
error parsing HTTP 401 response body: unexpected end of JSON input: ""

Just working on moving the upload outside the vm as discussed with Colin.

@dustymabe
Member

dustymabe commented Sep 30, 2022

Just working on moving the upload outside the vm as discussed with Colin.

If the image is in the local builds dir and also in the meta.json then you should be able to use cosa push-container-manifest for this. See https://github.com/coreos/fedora-coreos-pipeline/blob/7aea3fe533f19acc7ce76cac372d8d0f65f1d4ea/jobs/release.Jenkinsfile#L184-L187

@jmarrero jmarrero force-pushed the upload-oscontainer branch 3 times, most recently from 6faf985 to d453535 Compare October 3, 2022 19:34
@jmarrero jmarrero marked this pull request as ready for review October 3, 2022 19:37
@jmarrero
Member Author

jmarrero commented Oct 3, 2022

Tested this in a pod and locally; it creates an oci-archive with the name I pass to the command via --name.
For example:

cosa upload-oscontainer --name rhctest

Now generates:

skopeo inspect --config oci-archive:rhctest | grep version 
            "io.buildah.version": "1.27.0",
            "io.openshift.build.version-display-names": "machine-os=Red Hat Enterprise Linux CoreOS",
            "io.openshift.build.versions": "machine-os=412.86.202209021413-0",
            "version": "412.86.202209021413-0"

The meta.json shows the entry too:

  "name": "rhcos",
  "oscontainer": {
    "digest": "sha256:0498f3bf675eb3700bd8c789da2707e71a73a1da26e67d4a250ed74827c76011",
    "image": "rhctest"
  },
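
For scripted checks, the same entry can be read from meta.json with a few lines of Python. This is a minimal sketch that assumes only the keys shown in the excerpt above; the helper name `oscontainer_entry` is hypothetical, and the JSON is inlined so the example runs without a build directory.

```python
import json

# Fragment copied from the meta.json excerpt above; a real file has many more keys.
META_JSON = """
{
  "name": "rhcos",
  "oscontainer": {
    "digest": "sha256:0498f3bf675eb3700bd8c789da2707e71a73a1da26e67d4a250ed74827c76011",
    "image": "rhctest"
  }
}
"""


def oscontainer_entry(meta_text: str) -> tuple[str, str]:
    """Return (image, digest) from the 'oscontainer' key; raises KeyError if absent."""
    meta = json.loads(meta_text)
    entry = meta["oscontainer"]
    return entry["image"], entry["digest"]


image, digest = oscontainer_entry(META_JSON)
print(f"{image}@{digest}")
```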

@jmarrero jmarrero force-pushed the upload-oscontainer branch 2 times, most recently from e538b77 to bf241cd Compare October 4, 2022 16:21
@jmarrero
Member Author

jmarrero commented Oct 4, 2022

@dustymabe you mean to use it in the pipeline right? If so then this PR now creates the oci-archive. So in the pipeline I need to call cosa upload-oscontainer and then cosa push-container-manifest. If that is the case I will raise a PR with that next.

@dustymabe
Member

@dustymabe you mean to use it in the pipeline right?

Yes

If so then this PR now creates the oci-archive. So in the pipeline I need to call cosa upload-oscontainer and then cosa push-container-manifest. If that is the case I will raise a PR with that next.

We might want to rename cosa upload-oscontainer since it's not doing that any longer. cosa create-legacy-oscontainer maybe?

We also should consider how we want the containers in the registry to look. For example the resulting container from our FCOS pipeline is manifest listed with all architectures referenced. We run this in the release job (after all arches have finished building) rather than in each build job for each architecture.

Just something to think about.

@jmarrero
Member Author

jmarrero commented Oct 4, 2022

Renamed, I think we can influence how it looks with the --name we pass to cosa create-legacy-oscontainer and when we push it. We can discuss it on the PR doing the push I think.

@jmarrero jmarrero force-pushed the upload-oscontainer branch from 7b17c9c to 860acbc Compare October 4, 2022 21:03
cgwalters
cgwalters previously approved these changes Oct 5, 2022
Member

@cgwalters cgwalters left a comment

This LGTM!

@cgwalters
Member

So the hard rename i.e. requiring the pipeline change seems to run straight into https://issues.redhat.com/browse/MCO-392

IOW, if we merge this change as is, won't it break the current (legacy) RHCOS pipeline for 4.12? Or I guess the answer is: we can just hold this until we do a hard cutover to the new pipeline?

@jmarrero
Member Author

#3119 failed in the same way. So I am now confident this PR is not the culprit. With the memory updates it passes more tests after the flake retry but still fails 2.

@cgwalters
Member

OK so now we're down to a few ext tests failing: ext.config.root-reprovision.filesystem-only and ext.config.root-reprovision.luks etc. That really can't be related to this PR at all. I'm OK to merge over red if it helps unblock the pipeline migration.

@jlebon
Member

jlebon commented Oct 12, 2022

Re. CI issues, coreos/coreos-ci-lib#118 and #3121 should help.

@cgwalters
Member

@dustymabe question on this PR, I think I'm just reiterating things here but my understanding is basically we can't merge this until the new pipeline is ready.

Alternatively, perhaps we could try merging it, and adjust the old pipeline to invoke it? I think in theory the old pipeline should be able to run through this flow too.

Or yet another alternative, create a coreos-assembler build from this PR, and test out using it in the new pipeline? (And I guess still merge it only when the new pipeline is ready?)

@dustymabe
Member

One thing that came out of deeper review of this PR yesterday is #3122

@dustymabe
Member

dustymabe commented Oct 12, 2022

@dustymabe question on this PR, I think I'm just reiterating things here but my understanding is basically we can't merge this until the new pipeline is ready.

Alternatively, perhaps we could try merging it, and adjust the old pipeline to invoke it? I think in theory the old pipeline should be able to run through this flow too.

Or yet another alternative, create a coreos-assembler build from this PR, and test out using it in the new pipeline? (And I guess still merge it only when the new pipeline is ready?)

I think it's not a question of if the new pipeline is ready (i.e. we have the new pipeline running today in the dev namespace), but rather a question of if the old consumers (that expect the old behavior here) still exist (i.e. have we fully switched to the new pipeline yet).

So we could adapt the old pipeline code if we wanted.. or just somehow gate the piece in the new pipeline that performs this action on some feature in COSA existing (i.e. a custom build of COSA would trigger the build/upload, but by default main COSA wouldn't until this got merged).
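
The gating idea could be approximated by probing for the new command before invoking it. The sketch below is in Python for illustration only (the pipeline itself is a Jenkinsfile). It assumes commands are installed as `cmd-<name>` files under /usr/lib/coreos-assembler, as the traceback paths earlier in this thread suggest, and the helper name `cosa_has_command` is hypothetical.

```python
import os

# Assumed install directory, taken from the traceback paths earlier in the thread.
COSA_LIBDIR = "/usr/lib/coreos-assembler"


def cosa_has_command(name: str, libdir: str = COSA_LIBDIR) -> bool:
    """Return True if an executable cmd-<name> file exists in libdir."""
    path = os.path.join(libdir, f"cmd-{name}")
    return os.path.isfile(path) and os.access(path, os.X_OK)


if cosa_has_command("create-legacy-oscontainer"):
    print("building legacy oscontainer")
else:
    print("skipping: this COSA build has no create-legacy-oscontainer")
```

With a check like this, a custom COSA build that ships the command would trigger the step, while a stock build without it would skip the step.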

@jlebon
Member

jlebon commented Oct 17, 2022

I'm working on sorting out the container bits for the new pipeline and would like to get this in. I can take care of the pipeline modifications for it.

I don't want to be touching the old pipeline code though. So my strawman is to do something similar to what we did with the buildprep -> buildfetch migration:

  1. keep src/cmd-upload-oscontainer-deprecated-legacy-format as is (and the cmd-upload-oscontainer symlink)
  2. create new src/cmd-create-legacy-oscontainer
  3. keep src/oscontainer-deprecated-legacy-format.py as is (and the oscontainer.py symlink)
  4. create new src/create-legacy-oscontainer.py
  5. open draft PR that nukes the bits from 1. and 3.

Thoughts?

@dustymabe
Member

I've no objections.

@cgwalters
Member

Yes, SGTM. Every single time we've tried to do an "API break" like renaming a command in a one shot with coordinated commits to cosa and the pipeline I think it's been more pain than gain. Better to take the short term tech debt and use reminders to ourselves to remove the deprecated bits safely later.

@jmarrero jmarrero marked this pull request as draft October 18, 2022 14:36
@jmarrero jmarrero changed the title from "oscontainer-deprecated-legacy-format: use runvm not nested containers" to "create-legacy-oscontainer: use runvm to build legacy oscontainer" Oct 18, 2022
@jmarrero jmarrero marked this pull request as ready for review October 18, 2022 15:48
@jmarrero
Member Author

/retest-required

jlebon
jlebon previously approved these changes Oct 18, 2022
Member

@jlebon jlebon left a comment

Some minor comments, but overall LGTM.

Now that it follows the classic "build an archive and put in images" pattern, it might be nice to rename the command to cosa buildextend-legacy-oscontainer instead. But anyway, we've had a lot of naming discussions already 🖌️ in this thread, so let's leave that for a separate PR!

arch = "x86_64"
}
return arch
return coreosarch.CurrentRpmArch()
Member

Minor: this could be a separate commit with its own rationale.

Member Author

This was done by running go vendor after the schema update. I can split that into its own commit if it helps, but this was not something I changed manually.

@@ -0,0 +1,7 @@
#!/usr/bin/env bash
Member

For consistency, can we name this script src/create-legacy-oscontainer.sh?

Member Author

@jmarrero jmarrero Oct 20, 2022

sure thing

docs/cosa.md Outdated
@@ -72,3 +72,4 @@ Those less commonly used commands are listed here:
| [tag](https://github.com/coreos/coreos-assembler/blob/main/src/cmd-tag) | Operate on the tags in `builds.json`
| [test-coreos-installer](https://github.com/coreos/coreos-assembler/blob/main/src/cmd-test-coreos-installer) | Automate an end-to-end run of coreos-installer with the metal image
| [upload-oscontainer](https://github.com/coreos/coreos-assembler/blob/main/src/cmd-upload-oscontainer) | Upload an oscontainer (historical wrapper for `cosa oscontainer`)
| [create-legacy-oscontainer](https://github.com/coreos/coreos-assembler/blob/main/src/cmd-create-legacy-oscontainer) | Create an oscontainer oci-archive (historical wrapper for `cosa oscontainer`)
Member

Maybe

Suggested change
| [create-legacy-oscontainer](https://github.com/coreos/coreos-assembler/blob/main/src/cmd-create-legacy-oscontainer) | Create an oscontainer oci-archive (historical wrapper for `cosa oscontainer`)
| [create-legacy-oscontainer](https://github.com/coreos/coreos-assembler/blob/main/src/cmd-create-legacy-oscontainer) | Create an oscontainer in legacy format (i.e. not OSTree-native)

?

@jlebon
Member

jlebon commented Oct 18, 2022

Were you able to test this locally in a flow using FORCE_UNPRIVILEGED=1? You can also add it to CI in .cci.jenkinsfile.

This introduces a new command to create an oci-archive of the
legacy oscontainer that will be pushed with
`cosa push-container-manifest` by the pipeline.

These changes come from calling:
`make schema && cd mantle && go vendor`

@jmarrero
Member Author

Were you able to test this locally in a flow using FORCE_UNPRIVILEGED=1? You can also add it to CI in .cci.jenkinsfile.

I have tested it on a pod in our cluster and locally without setting that flag.

In the pod I built rhcos and then ran the command in it successfully.

@jmarrero
Member Author

Were you able to test this locally in a flow using FORCE_UNPRIVILEGED=1? You can also add it to CI in .cci.jenkinsfile.

I have tested it on a pod in our cluster and locally without setting that flag.

In the pod I built rhcos and then ran the command in it successfully.

Did a quick test, and it works locally with FORCE_UNPRIVILEGED=1.

@jlebon jlebon enabled auto-merge (rebase) October 20, 2022 20:20
@jlebon jlebon merged commit daff388 into coreos:main Oct 20, 2022
@jlebon
Member

jlebon commented Oct 24, 2022

Note this command is renamed in #3133 to cosa buildextend-legacy-oscontainer.

Successfully merging this pull request may close these issues.

4.12: legacy upload-oscontainer not tested in unprivileged context
5 participants