Syncing with upstream v3.13.0 #300

Draft
wants to merge 2,908 commits into
base: msft-main


sprt
Collaborator

@sprt sprt commented Jan 27, 2025

Merge Checklist
  • Followed patch format from upstream recommendation: https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md#patch-format
    • Included a single commit in a given PR - at least unless there are related commits and each makes sense as a change on its own.
  • Aware that the PR is to be merged using "create a merge commit" rather than "squash and merge" (or similar)
  • The upstream/missing label (or upstream/not-needed) has been set on the PR.
Summary
Test Methodology

GabyCT and others added 30 commits November 6, 2024 20:07
This PR adds the install kata tools step as part of the k8s stability workflow.
This avoids failures reporting that certain kata components are not installed.

Signed-off-by: Gabriela Cervantes <[email protected]>
gha: Add install kata tools as part of the stability workflow
This PR makes the root dir absolute after resolving the
default root dir symlink. 

Fixes: kata-containers#10499

Signed-off-by: Silenio Quarti <[email protected]>
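
A minimal Python sketch of the idea in this commit, not the runtime's actual code: resolve the symlink first, then make sure the result is absolute. The function name `resolve_root_dir` is hypothetical.

```python
import os

def resolve_root_dir(path):
    """Resolve symlinks in path and return an absolute, normalized path."""
    # os.path.realpath follows symlinks and always returns an absolute path,
    # so a relative link target cannot leak into the result.
    return os.path.realpath(path)
```

Doing both steps in one call avoids the case where a symlink pointing at a relative target leaves the root dir relative after resolution.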
runtime: Files are not synced between host and guest VMs
As discussed on the AC call, we are lacking maintainers for the
metrics tests. As a starting point for potentially phasing them
out, we discussed starting with removing the test for stratovirt
as a non-core hypervisor and a job that is problematic in leaving
behind resources that need cleaning up.

Signed-off-by: stevenhorsman <[email protected]>
As discussed in the AC meeting, we don't have a maintainer
(or users?) of runk, and the CI is unstable. Given that we can't
support it, we shouldn't waste CI cycles on it.

Signed-off-by: stevenhorsman <[email protected]>
This PR adds the get artifacts which are needed when installing kata
tools in stability workflow to avoid failures saying that artifacts
are missing.

Signed-off-by: Gabriela Cervantes <[email protected]>
We are generating a simple CDI spec with device and
global containerEdits to test the CDI crate.

Signed-off-by: Zvonko Kaiser <[email protected]>
Fedora F40 removed python3 from the base container. To avoid such issues,
let's rely on the latest and greatest official python container.

Fixes: kata-containers#10497

Signed-off-by: Lukáš Doktor <[email protected]>
…ratovirt-metrics-tests

metrics: Skip metrics on stratovirt
…eate-container-timeout-log

tests: k8s: Update image pull timeout error
Trustee's deployment must set the correct https_proxy as an env var on the
container that will talk to the ITA / ITTS server. Otherwise the kbs
service won't be able to start, which then causes issues in our CI.

Signed-off-by: Fabiano Fidêncio <[email protected]>
Signed-off-by: Krzysztof Sandowicz <[email protected]>
ci.ocp: Use the official python:3 container for sanity
GHA is migrating ubuntu-latest to Ubuntu 24 so
let's hardcode the current 22.04 LTS.

https://github.blog/changelog/2024-11-05-notice-of-breaking-changes-for-github-actions/

Signed-off-by: Aurélien Bombo <[email protected]>
This test was meant to show support for pulling images with v1 manifest schema versions.

The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it:

$ docker pull ymqytw/nginxhttps:1.5
Error response from daemon: missing signature key

We may remove this test since schema version 1 manifests are deprecated per
https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 :
"These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more
current images". This schema version was used by old docker versions. Furthermore, the OCI spec
https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2.

Signed-off-by: Saul Paredes <[email protected]>
This reverts commit f15e16b, as we
don't have to do this since we're relying on the
`static_sandbox_resource_mgmt` feature, which gives us the correct
amount of memory and CPUs to be allocated.

Signed-off-by: Fabiano Fidêncio <[email protected]>
…emove_manifest_v1_test

tests: remove manifest v1 test
…t-proxy-nightmare-for-tdx

ci: tdx: kbs: Ensure https_proxy is taken in consideration
…untu-latest-fix

gha: Hardcode ubuntu-22.04 instead of latest
The Clear Linux rootfs is not being tested anywhere, and it seems Intel
doesn't have the capacity to review the PRs related to this (combined
with the lack of interest from the rest of the community in reviewing
PRs that are specific to this untested rootfs).

With this in mind, I'm suggesting we drop Clear Linux support and focus
on what we can actually maintain.

Signed-off-by: Fabiano Fidêncio <[email protected]>
Remove second declaration of GO_HOME in rootfs-build ubuntu script.

Signed-off-by: Nikos Ch. Papadopoulos <[email protected]>
gha: Get artifacts when installing kata tools in stability workflow
Use the regorus engine's add_data method to add state to the policy.
This data can later be accessed inside rego context through the data namespace.

Support state modifications (json-patches) that may be returned as a result from policy evaluation.

Also initialize a policy engine data slice "pstate" dedicated for storing state.

Fixes kata-containers#10087

Signed-off-by: Saul Paredes <[email protected]>
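
To illustrate how a JSON Patch returned by policy evaluation could update state kept under "pstate", here is a rough Python sketch. It is not the agent's implementation: the `apply_patch` helper, the `sandbox_name` key, and the sample value are hypothetical, and only the add, replace, and remove operations on objects are handled.

```python
import copy

def apply_patch(data, patch):
    """Apply add/replace/remove JSON Patch operations to a copy of data.

    Only object (dict) members are handled; arrays and the empty
    root path are out of scope for this sketch.
    """
    result = copy.deepcopy(data)
    for op in patch:
        # Split the JSON Pointer and undo its escaping (~1 is "/", ~0 is "~").
        keys = [k.replace("~1", "/").replace("~0", "~")
                for k in op["path"].split("/")[1:]]
        parent = result
        for key in keys[:-1]:
            parent = parent[key]
        last = keys[-1]
        if op["op"] == "add":
            parent[last] = op["value"]
        elif op["op"] == "replace":
            if last not in parent:
                raise KeyError(last)
            parent[last] = op["value"]
        elif op["op"] == "remove":
            del parent[last]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return result

# Hypothetical state update: remember the sandbox name of the first container.
data = {"pstate": {}}
patch = [{"op": "add", "path": "/pstate/sandbox_name", "value": "sandbox-a"}]
data = apply_patch(data, patch)
```

Working on a deep copy means a patch that fails halfway does not leave the stored state partially modified.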
Make sure all container sandbox names match the sandbox name of the first container.

Signed-off-by: Saul Paredes <[email protected]>
…striction-for-qemu-tdx

Reapply "runtime: confidential: Do not set the max_vcpu to cpu"
As discussed in the CI working group,
we are temporarily skipping the SNP CI
to unblock the remaining workflow.
Will revert after fixing the SNP runner.

Signed-Off-By: Adithya Krishnan Kannan <[email protected]>
fidencio and others added 30 commits January 8, 2025 14:07
Right now we've been only building releases from virtiofsd, but we'll
need to pin a specific commit till v1.14.0 is out, thus let's add the
needed machinery to do so.

Signed-off-by: Fabiano Fidêncio <[email protected]>
Together with the bump, let's also bump the rust version needed to build
the package, with the caveat that virtiofsd doesn't actually use a
pinned version as part of their CI, so we're bumping to whatever is the
version on `alpine:rust` (which is used in their CI).

It's important to note that we're using a version which brings in one
extra patch apart from the release, as the next virtiofsd release will
happen at the end of February, 2025.

Signed-off-by: Fabiano Fidêncio <[email protected]>
This reverts commit 9aea745.

Signed-off-by: Fabiano Fidêncio <[email protected]>
The bump to kernel 6.12 seems to have reduced the latency in
the metrics test, so increase the ranges for the minimal value,
to account for this.

Signed-off-by: stevenhorsman <[email protected]>
…logbench-latency-minimal-range-increase

metrics: Increase latency minimum range
…e-oom-test-for-mariner

tests: Re-enable oom tests for mariner
…irtiofsd

virtiofsd: Update to its v1.13.0 ( + one patch) release :-)
The az client restricts the name to fewer than 64 characters. In
some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name
exceeds the limit. This change makes the function shorten the name:

* A SHA1 is computed from the metadata and used to compose the cluster's name
* The metadata is passed as plain text via --tags

Fixes: kata-containers#9850
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
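
The shortening approach can be sketched in Python as follows. This is only an illustration of hashing metadata into a short name; the helper names, the `kata` prefix, the 12-character digest length, and the metadata keys are all assumptions, and the real change lives in a shell helper.

```python
import hashlib

NAME_LIMIT = 64  # the name must be shorter than this many characters

def metadata_tags(metadata):
    """Render metadata as plain-text key=value tags, in a stable order."""
    return [f"{key}={value}" for key, value in sorted(metadata.items())]

def short_cluster_name(metadata, prefix="kata"):
    """Build a short, deterministic cluster name from the metadata."""
    joined = "-".join(metadata_tags(metadata))
    digest = hashlib.sha1(joined.encode("utf-8")).hexdigest()
    name = f"{prefix}-{digest[:12]}"
    if len(name) >= NAME_LIMIT:
        raise ValueError("prefix is too long for the name limit")
    return name
```

Because the hash replaces the metadata in the name, the readable values are only recoverable from the tags, which is why both are produced from the same input.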
After the kernel version bump, in the latest nightly run
https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400
the sequential read throughput result was 79.7% of the expected value (so it failed)
and the sequential write was 84% of the expected value, which was fairly close.
Increase their minimum ranges to make them more robust.

Signed-off-by: stevenhorsman <[email protected]>
We hit a failure with:
```
time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]"
```
The range is very big, but in the last 3 test runs I reviewed we got a minimum value of 0.015s
and a maximum value of 0.052s, so a ~350% difference is possible.
I think we need a wide range to make this stable.

Signed-off-by: stevenhorsman <[email protected]>
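
The numbers in this message can be checked with a few lines of Python. This sketch assumes the warning reads as "allowed minimum > observed value"; the `passes_min_check` helper is hypothetical and is not the metrics checker's code.

```python
def passes_min_check(observed, allowed_min):
    """Return True when the observed value is not below the allowed minimum."""
    return observed >= allowed_min

# Values from the warning: allowed minimum 0.0176, observed value 0.015.
failed = not passes_min_check(0.015, 0.0176)

# Spread between the smallest and largest values seen in the reviewed runs.
spread = 0.052 / 0.015  # about 3.47, which the message rounds to ~350%
```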
…er_name

tests/gha-run-k8s-common: shorten AKS cluster name
Since
qemu/qemu@be93fd5,
which is included in QEMU since version 9.2.0, the options for the
`device_add` QMP command need to be typed correctly.

This makes it so that instead of `"on"`, the value is set to `true`,
matching QEMU's expectations.

This has been tested on QEMU 9.2.0 and QEMU 9.1.2, so before and after
the change.

The compatibility with incorrectly typed options for the `device_add`
command is deprecated since version 6.2.0 [^1].

[^1]:  https://qemu-project.gitlab.io/qemu/about/deprecated.html#incorrectly-typed-device-add-arguments-since-6-2

Signed-off-by: Moritz Sanft <[email protected]>
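
As an illustration of the typing change, the Python sketch below builds a `device_add` command whose boolean option is a JSON `true` rather than the string `"on"`. The helper names are hypothetical, and the driver, id, and `share-rw` option are sample values, not the list of options this commit touches.

```python
import json

def typed_device_args(args):
    """Replace legacy "on"/"off" string values with real booleans.

    A blanket conversion is a simplification: real code should only
    convert properties that are known to be boolean.
    """
    legacy = {"on": True, "off": False}
    return {key: legacy.get(value, value) if isinstance(value, str) else value
            for key, value in args.items()}

def device_add(args):
    """Serialize a QMP device_add command with typed arguments."""
    return json.dumps({"execute": "device_add",
                       "arguments": typed_device_args(args)})

cmd = device_add({"driver": "virtio-blk-pci", "id": "disk0", "share-rw": "on"})
```

The serialized command carries `"share-rw": true`, which both older and newer QEMU versions accept, while string-valued properties such as the driver name are left untouched.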
…ics-latency-minimum-range-fixes

metrics: Increase latency test range
…ix-boolean-opts

runtime: use actual booleans for QMP `device_add` boolean options
Use Mariner 3.0 (a.k.a., Azure Linux 3.0) as the Guest CI image.

Signed-off-by: Dan Mihai <[email protected]>
The earlier implementation relied on using a specific mount-path prefix - `/sealed`
to determine that the referenced secret is a sealed secret.
However that was restrictive for certain use cases as it forced
the user to always use a specific mountpath naming convention.

This commit introduces an alternative implementation to relax the
restriction. A sealed secret can be mounted in any mount-path.
However it comes with a potential performance penalty. The
implementation loops through all volume mounts and reads the file
to determine if it's a sealed secret or not.

Fixes: kata-containers#10398

Signed-off-by: Pradipta Banerjee <[email protected]>
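
The scan described above can be sketched in Python. This is not the agent's code: it assumes a sealed secret can be recognized by a literal `sealed.` prefix in the file content, which the commit message does not state, and the helper names are hypothetical.

```python
import os

SEALED_PREFIX = b"sealed."  # assumed marker, not confirmed by the commit

def is_sealed_secret(path):
    """Return True if the file starts with the assumed sealed marker."""
    try:
        with open(path, "rb") as handle:
            return handle.read(len(SEALED_PREFIX)) == SEALED_PREFIX
    except OSError:
        return False

def find_sealed_secrets(mount_paths):
    """Scan every file under every mount path, whatever the path is named."""
    found = []
    for mount in mount_paths:
        for root, _dirs, files in os.walk(mount):
            for name in files:
                path = os.path.join(root, name)
                if is_sealed_secret(path):
                    found.append(path)
    return sorted(found)
```

The cost is visible in the loop: every file in every volume mount is opened once, which is the performance trade-off for dropping the `/sealed` naming convention.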
This update addresses an issue with token verification for SE and SNP
introduced in the last update by kata-containers#10541.
Bumping the project to the latest commit resolves the issue.

Signed-off-by: Hyounggyu Choi <[email protected]>
The existing encoding was base64 and it fails due to
confidential-containers/guest-components@8749486

Signed-off-by: Pradipta Banerjee <[email protected]>
Use "set -x" only when the user specified DEBUG=1.

Signed-off-by: Dan Mihai <[email protected]>
agent: alternative implementation for sealed_secret as volume
…et-rootfs-build

rootfs: reduced console output by default
…iner3-guest

image: bump mariner guest version to 3.0
On s390x, some tests for trusted storage occasionally failed due to:

```bash
etcdserver: request timed out
```

or

```bash
Internal error occurred: resource quota evaluation timed out
```

These timeouts were not observed previously on k3s but occur
sporadically on kubeadm. Importantly, they appear to be temporary
and transient, which means they can be ignored in most cases.

To address this, we introduced a new wrapper function, `retry_kubectl_apply()`,
for `kubectl create`. This function retries applying a given manifest up to 5
times if it fails due to a timeout. However, it will still catch and handle
any other errors during pod creation.

Fixes: kata-containers#10651

Signed-off-by: Hyounggyu Choi <[email protected]>
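
The retry behavior can be sketched in Python as below. The real helper is the shell function `retry_kubectl_apply()`; this version is only an illustration, with a hypothetical `retry_apply` name and an injected `apply` callable standing in for the kubectl call.

```python
import time

# Error fragments taken from the two timeout messages quoted above.
TRANSIENT_ERRORS = (
    "etcdserver: request timed out",
    "resource quota evaluation timed out",
)

def retry_apply(apply, manifest, max_tries=5, delay=0.0):
    """Call apply(manifest), retrying only on transient timeout errors.

    apply must return a (succeeded, output) tuple. Any failure that is
    not a known timeout is raised immediately instead of being retried.
    """
    for attempt in range(1, max_tries + 1):
        succeeded, output = apply(manifest)
        if succeeded:
            return output
        if not any(message in output for message in TRANSIENT_ERRORS):
            raise RuntimeError(f"apply failed: {output}")
        if attempt < max_tries:
            time.sleep(delay)
    raise RuntimeError(f"apply still timing out after {max_tries} tries")
```

Matching on specific messages keeps genuine pod creation errors visible, so a broken manifest fails on the first attempt rather than after five.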
Skip logging empty lines of text from the Guest console output, if
there are any such lines.

Without this change, the Guest console log from CLH + /dev/pts/0 has
twice as many lines of text. Half of these lines are empty.

Fixes: kata-containers#10737

Signed-off-by: Dan Mihai <[email protected]>
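
A small Python sketch of the filtering idea, assuming console output arrives as a sequence of lines; the `non_empty_lines` name is hypothetical and the runtime's own implementation may differ.

```python
def non_empty_lines(lines):
    """Yield console lines without line endings, skipping empty ones."""
    for line in lines:
        text = line.rstrip("\r\n")
        if text:
            yield text
```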
…retry-trusted-storage

tests: Introduce retry_kubectl_apply() for trusted storage
Bump VERSION and helm-chart versions

Signed-off-by: Zvonko Kaiser <[email protected]>
…ty-pty-lines

runtime: skip empty Guest console output lines