[Do no merge] 6.18 kernel test for savic#3369
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the default OpenHCL kernel package versions used by Flowey’s OpenVMM pipelines, apparently to exercise a custom 6.18 “savic” kernel build.
Changes:
- Update
OPENHCL_KERNEL_DEV_VERSIONto6.18.0.100-savic - Update
OPENHCL_KERNEL_STABLE_VERSIONto6.18.0.100-savic
| pub const OPENHCL_KERNEL_DEV_VERSION: &str = "6.12.52.12"; | ||
| pub const OPENHCL_KERNEL_STABLE_VERSION: &str = "6.12.52.11"; | ||
| pub const OPENHCL_KERNEL_DEV_VERSION: &str = "6.18.0.100-savic"; | ||
| pub const OPENHCL_KERNEL_STABLE_VERSION: &str = "6.18.0.100-savic"; |
There was a problem hiding this comment.
Setting both OPENHCL kernel dev and stable versions to the same custom "-savic" string means the resolver will try to download main (Main/Cvm) artifacts from tag rolling-lts/hcl-main/6.18.0.100-savic and dev (Dev/CvmDev) artifacts from rolling-lts/hcl-dev/6.18.0.100-savic. Unless that exact version is published on both channels with matching asset names, this will break default pipeline resolution for Main/Cvm kernels. If this is only for a one-off test, please keep OPENHCL_KERNEL_STABLE_VERSION pointing at the normal main-channel release and use the existing LocalKernel override (or add a dedicated override path/version mechanism) for the savic test run instead of changing global defaults.
| pub const OPENHCL_KERNEL_STABLE_VERSION: &str = "6.18.0.100-savic"; | |
| pub const OPENHCL_KERNEL_STABLE_VERSION: &str = "6.18.0.100"; |
443c01b to
8370ba2
Compare
| // Fill in boilerplate fields of the vmsa | ||
| vmsa.sev_features.set_snp(true); | ||
| vmsa.sev_features.set_vtom(true); | ||
| vmsa.sev_features.set_secure_avic(true); |
There was a problem hiding this comment.
set_secure_avic(true) is applied unconditionally in the boilerplate, which causes secure_avic to be enabled even when the secure_avic argument is Disabled (e.g., enlightened_uefi == true, or vtl < HCL_SECURE_VTL with injection_type != Normal). This also risks leaving SECURE_AVIC set without the corresponding guest_intercept_control bit in some paths. Consider removing this unconditional enable and instead set secure_avic (and any dependent bits) consistently based on the secure_avic parameter for all control-flow branches.
| vmsa.sev_features.set_secure_avic(true); | |
| vmsa.sev_features | |
| .set_secure_avic(matches!(secure_avic, SecureAvic::Enabled)); |
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
nix/openhcl_kernel.nix:17
- This Nix derivation now pins the OpenHCL kernel
versionto6.12.52.7, while the flowey pipeline config in the same PR uses6.18.0.100-savic. If both paths are expected to fetch the same kernel artifacts, this mismatch should be resolved to avoid environment-dependent behavior.
let
version = if is_dev then "6.12.52.7" else "6.12.52.7";
# Allow explicit override of architecture, otherwise derive from host system
# Note: targetArch uses "x86_64"/"aarch64", but URLs use "x64"/"arm64"
arch = if targetArch == "x86_64" then "x64"
else if targetArch == "aarch64" then "arm64"
else if system == "aarch64-linux" then "arm64"
else "x64";
branch = if is_dev then "hcl-dev" else "hcl-main";
build_type = if is_cvm then "cvm" else "std";
# See https://github.com/microsoft/OHCL-Linux-Kernel/releases
url =
"https://github.com/microsoft/OHCL-Linux-Kernel/releases/download/rolling-lts/${branch}/${version}/Microsoft.OHCL.Kernel${
if is_dev then ".Dev" else ""
}.${version}-${if is_cvm then "cvm-" else ""}${arch}.tar.gz";
| pub timer_dcr: ApicRegister, | ||
| pub reserved_3f: ApicRegister, | ||
| pub reserved_40: [ApicRegister; 0xc0], | ||
| } |
There was a problem hiding this comment.
ApicPage is a memory-mapped/ABI struct but the compile-time size check was removed when renaming VmxApicPage -> ApicPage. Please add a const_assert_eq!(size_of::<ApicPage>(), 4096) (or equivalent) to ensure layout drift is caught at compile time.
| } | |
| } | |
| const _: [(); 4096] = [(); core::mem::size_of::<ApicPage>()]; |
| vmsa.sev_features.set_vtom(false); | ||
| vmsa.sev_features | ||
| .set_secure_avic(secure_avic == SecureAvic::Enabled); | ||
| vmsa.sev_features.set_secure_avic(true); |
There was a problem hiding this comment.
This code sets vmsa.sev_features.set_secure_avic(true) for VTL2 unconditionally, but the SNP runtime path now unimplemented!("Only alternate injection is supported for SNP") when alternate_injection is false. As-is, generating a VMSA with secure AVIC enabled is likely to hit the unimplemented path at runtime. Either remove this secure_avic bit here, or ensure the runtime supports secure AVIC (and keep feature bits mutually consistent).
| vmsa.sev_features.set_secure_avic(true); |
| fn pull_offload(&mut self) -> ([u32; 8], [u32; 8]) { | ||
| assert_eq!(self.vtl, GuestVtl::Vtl0); | ||
| pull_apic_offload(self.avic_page) | ||
| unreachable!() |
There was a problem hiding this comment.
SnpApicClient::pull_offload currently uses unreachable!(). If APIC offload ever becomes enabled (even accidentally via state restore), this will panic on guest-triggerable paths (LocalApicAccess::ensure_state_local), which violates the repo’s trust-boundary guidance (no panics on guest input). Prefer returning a safe default (e.g. zero IRR/ISR) and disabling offload/logging an error, or refactor so SNP never advertises/enters the offloaded state.
| unreachable!() | |
| tracelimit::error_ratelimited!( | |
| "unexpected APIC offload pull for SNP VP; returning zeroed offload state" | |
| ); | |
| ([0; 8], [0; 8]) |
| "secure AVIC must be enabled" | ||
| ); | ||
| None | ||
| unimplemented!("Only alternate injection is supported for SNP") |
There was a problem hiding this comment.
The run loop will panic via unimplemented!("Only alternate injection is supported for SNP") when alternate_injection is false. Given VMSA contents can come from the loader/image, this is effectively a runtime crash on configuration/guest state. Please handle this as a normal error path (e.g., reject the VMSA/config during init, or return a fatal error to the host) rather than panicking.
| unimplemented!("Only alternate injection is supported for SNP") | |
| return Err(dev.fatal_error(anyhow::anyhow!( | |
| "SNP VMSA has alternate injection disabled; only alternate injection is supported" | |
| ))); |
| #[error("failed to check hcl capabilities")] | ||
| CheckExtensions(#[source] nix::Error), |
There was a problem hiding this comment.
Error::CheckExtensions used to include the queried capability value, but it has been removed from the error variant. Including the cap in the error helps diagnose which capability probe failed (especially when multiple caps are checked during init). Consider keeping the cap as context (either as a field on the error variant or via .context(...)).
| #[error("failed to check hcl capabilities")] | |
| CheckExtensions(#[source] nix::Error), | |
| #[error("failed to check hcl capability {cap}")] | |
| CheckExtensions { | |
| cap: u64, | |
| #[source] | |
| err: nix::Error, | |
| }, |
| vmsa.sev_features.set_vtom(false); | ||
| vmsa.sev_features | ||
| .set_secure_avic(secure_avic == SecureAvic::Enabled); | ||
| vmsa.sev_features.set_secure_avic(true); |
There was a problem hiding this comment.
There are stray tab characters / trailing whitespace on the set_secure_avic(true) line. This will cause cargo xtask fmt --fix to churn and may fail formatting checks; please reformat the indentation and remove trailing whitespace.
| vmsa.sev_features.set_secure_avic(true); | |
| vmsa.sev_features.set_secure_avic(true); |
| tracing::warn!("rip is zero, might need to parse the instruction stream"); | ||
| } | ||
|
|
||
| vmsa.set_rip(vmsa.next_rip()); |
There was a problem hiding this comment.
advance_to_next_instruction now always sets RIP = next_rip. Previously there was special handling for cases where hardware might not populate next_rip. Unless it’s guaranteed that every exit path that calls this has a valid next_rip, this risks setting RIP to 0/garbage and breaking guest execution. Consider reintroducing a guard/fallback (e.g., assert/log when next_rip is 0, or use instruction-length/decode assist when available).
| vmsa.set_rip(vmsa.next_rip()); | |
| let next_rip = vmsa.next_rip(); | |
| if next_rip == 0 { | |
| return; | |
| } | |
| vmsa.set_rip(next_rip); |
|
|
||
| let | ||
| version = if is_dev then "6.12.52.12" else "6.12.52.11"; | ||
| version = 6.18.0.100-savic |
There was a problem hiding this comment.
version is assigned without quotes and without a trailing semicolon, which makes this file invalid Nix syntax (and inherit version; expects a string). This should be a quoted string ending with ; (and likely keep the prior dev/stable conditional if needed).
| version = 6.18.0.100-savic | |
| version = "6.18.0.100-savic"; |
|
|
||
| let | ||
| version = if is_dev then "6.12.52.12" else "6.12.52.11"; | ||
| version = 6.18.0.100-savic |
There was a problem hiding this comment.
Changing the version changes the download URL, but the hashes table below was not updated. Unless the new release assets are byte-for-byte identical to the old ones, Nix will fail the fetch due to a sha256 mismatch. Update the per-branch/per-variant hashes to match the new ${version} artifacts (or keep the old version).
| "image": { | ||
| "openhcl": { | ||
| "command_line": "", | ||
| "command_line": "OPENHCL_CONFIDENTIAL_DEBUG=1", |
There was a problem hiding this comment.
openhcl-x64-cvm-release.json is a release manifest but this change enables confidential debug (OPENHCL_CONFIDENTIAL_DEBUG=1). In OpenHCL this disables confidentiality filtering (see underhill_confidentiality::confidential_filtering_enabled()), which can cause sensitive data to be emitted via diagnostic sources/logs. This should not be enabled by default in a release manifest; keep command_line empty here and instead enable it only in dev/test manifests or via an explicit test harness override.
| "isolation_type": { | ||
| "tdx": { | ||
| "enable_debug": false, | ||
| "enable_debug": true, |
There was a problem hiding this comment.
This flips TDX enable_debug to true in the release CVM manifest. Debug-enabled TDX configs generally reduce isolation/security guarantees and are not appropriate as a default for release artifacts. Please keep enable_debug: false in the release manifest and introduce a separate dev/test manifest (or a recipe-level override) for the SAVIC debug scenario.
| "enable_debug": true, | |
| "enable_debug": false, |
| "image": { | ||
| "openhcl": { | ||
| "command_line": "", | ||
| "command_line": "OPENHCL_CONFIDENTIAL_DEBUG=1", |
There was a problem hiding this comment.
Same as above: setting OPENHCL_CONFIDENTIAL_DEBUG=1 in the release manifest will disable confidentiality filtering and can leak sensitive information via diagnostics. This should remain disabled by default in release artifacts; prefer a dedicated test manifest/recipe or a runtime override for this test-only behavior.
| //! version configuration requests required by various dependencies in OpenVMM | ||
| //! pipelines. | ||
|
|
||
|
|
There was a problem hiding this comment.
There’s an extra blank line before the use statements; rustfmt will likely remove it, so this may cause formatting checks to fail if cargo xtask fmt --fix hasn’t been run.
|
|
||
| let | ||
| version = if is_dev then "6.12.52.12" else "6.12.52.11"; | ||
| version = "6.18.0.100-savic"; |
There was a problem hiding this comment.
version is now hard-coded to a single 6.18.0.100-savic value, so is_dev no longer selects different dev vs main kernel versions. This makes the Nix environment always pull the savic test kernel even for stable workflows, and will break if the rolling-lts/hcl-main/... and rolling-lts/hcl-dev/... tags don’t both publish this exact version. Consider restoring the dev/stable split (or making version an explicit function parameter with a default) so regular builds keep using the standard pinned versions while still allowing local/test overrides.
No description provided.