

@jayvdb jayvdb commented Nov 20, 2025

Motivation

#1770

Solution

Use https://github.com/bourumir-wyngs/serde-saphyr in kube-client as the first phase of solving this.

The other uses of serde_yaml are in dev-dependencies.

Signed-off-by: John Vandenberg <[email protected]>

codecov bot commented Nov 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.6%. Comparing base (d4bb23e) to head (c0ebc7e).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1848     +/-   ##
=======================================
- Coverage   74.6%   74.6%   -0.0%     
=======================================
  Files         84      84             
  Lines       7910    7905      -5     
=======================================
- Hits        5900    5895      -5     
  Misses      2010    2010             
Files with missing lines Coverage Δ
kube-client/src/client/auth/mod.rs 50.0% <100.0%> (ø)
kube-client/src/config/file_config.rs 77.1% <100.0%> (-0.4%) ⬇️
kube-client/src/config/mod.rs 54.7% <ø> (ø)

... and 3 files with indirect coverage changes


Member

@clux clux left a comment


Hey, thanks a lot for doing this. I like the progress on saphyr and am positive on merging this myself. Maybe @nightkr has opinions.

Btw, there is still a lot of usage of serde-yaml in the examples. It might be worth converting at least one of the non-trivial ones there (like the kubectl example) as well, if you have time.

- documents.push(kubeconfig);
- }
- Ok(documents)
+ serde_saphyr::from_multiple(text).map_err(KubeconfigError::Parse)
Member


This seems sensible. We are already changing the error type in a breaking way here anyway, so if the old error falls away, that's probably OK.
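For readers following the error-type point: the sketch below shows, in standard-library Rust only, why wrapping a parser's error in a public enum variant ties the public error type to the parser. `ParserError` and `parse_documents` are hypothetical stand-ins (not serde-saphyr or kube-client code), and `KubeconfigError` is reduced to a single variant for illustration.

```rust
use std::fmt;

// Hypothetical stand-in for the error type of whichever YAML parser is in use.
#[derive(Debug, PartialEq)]
pub struct ParserError {
    pub msg: String,
}

// Simplified stand-in for a kubeconfig error enum; only one variant is shown.
// Because the variant carries the parser's error, swapping parsers changes
// this public type, which is a breaking change for downstream matches.
#[derive(Debug, PartialEq)]
pub enum KubeconfigError {
    Parse(ParserError),
}

impl fmt::Display for KubeconfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            KubeconfigError::Parse(e) => write!(f, "failed to parse kubeconfig: {}", e.msg),
        }
    }
}

// Hypothetical parser: splits the text on "---" lines and rejects lines that
// start with a tab. A real YAML parser does far more; this only models the
// shape "text in, Vec of documents or one parser error out".
fn parse_documents(text: &str) -> Result<Vec<String>, ParserError> {
    let mut documents = Vec::new();
    let mut current = String::new();
    for line in text.lines() {
        if line.starts_with('\t') {
            return Err(ParserError {
                msg: "tab used for indentation".to_string(),
            });
        }
        if line.trim_end() == "---" {
            if !current.trim().is_empty() {
                documents.push(std::mem::take(&mut current));
            }
            current.clear();
        } else {
            current.push_str(line);
            current.push('\n');
        }
    }
    if !current.trim().is_empty() {
        documents.push(current);
    }
    Ok(documents)
}

// One call plus `map_err` replaces a hand-written per-document loop.
pub fn load_documents(text: &str) -> Result<Vec<String>, KubeconfigError> {
    parse_documents(text).map_err(KubeconfigError::Parse)
}
```

Passing the tuple-variant constructor `KubeconfigError::Parse` directly to `map_err` works because a tuple variant is itself a function from the wrapped type to the enum.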

@clux clux added the changelog-change changelog change category for prs label Nov 20, 2025
@clux clux added this to the 3.0.0 milestone Nov 20, 2025
Author

jayvdb commented Nov 20, 2025

A global replace of serde_yaml::to_string with serde_saphyr::to_string in the examples, plus adding the crate to examples/Cargo.toml, does build. However, because saphyr-parser is still not fully YAML 1.2 compliant, each example would need to be manually tested to ensure that it still works as expected. I am guessing that CI doesn't fully cover the examples.

As a result, doing the examples is best done by someone with more familiarity with them.

For example, here is a test failure in kube-runtime when switching the test to using serde-saphyr:

thread 'wait::conditions::tests::pod_running_unschedulable' (2252470) panicked at kube-runtime/src/wait.rs:546:47:
called Result::unwrap() on an Err value: Message { msg: "invalid indentation in quoted scalar", location: Location { row: 17, column: 32 } }
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

All the other tests in kube-runtime pass.

Author

jayvdb commented Nov 20, 2025

IMO if this is merged, it should be put through its paces a bit by devs before releasing. I suspect there would be a subset of kubeconfig files which may no longer parse, due to weird indentation for example.

If we wanted perfect compatibility with serde_yaml, I have found https://crates.io/crates/yaml-spanned to behave identically to serde_yaml on a few work projects I have switched across to using it.

Member

nightkr commented Nov 20, 2025

Pure Rust would, of course, be nice, but I'm still not massively sold on taking on the new dependency from authors that I'm not really familiar with. In general, I'd lean towards serde-norway instead (from what I have seen of cafkafk, and because sticking to basically-the-same-serde-yaml-as-before seems less likely to break things), but I also haven't really been keeping track of which way the overall tide is turning. Us sticking to a serde-yaml fork doesn't do much for the supply chain if the rest of the ecosystem ends up pulling in serde-saphyr anyway.

That said, going through the README I do have a few questions:

  • General tag deserialization (à la thing: !!function definitelyNotMaliciousISwear) has been a pretty widespread source of exploits for YAML stuff, but isn't mentioned at all in the README
  • Externally tagged enums by default which aligns with K8s convention anyway, serde-yaml already burned us before by moving away from this at some point, good to get the default back
  • Schema-based coercion seems like a compatibility trap, since other impls (incl. the Go K8s tooling) don't do this
    • I'd rather disable this, both because it can start approaching unintentional EEE, and to avoid locking us into serde-saphyr going forward
  • Budgets seem like a sensible idea, but we'd need to have a think about defaults (and how to override them?) if we end up using them
  • Anchors could also be a compat hazard? Honestly not sure about Go's support here, but we should probably follow their lead. At least this one has a bit more spec backing
  • Feature flags that affect parsing behaviour, not a fan of this at all (spooky action at a distance)

I also haven't reviewed the code itself or looked too deep into saphyr-rs.
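To make the coercion concern concrete: under the YAML 1.2 core schema, an unquoted scalar such as true or 100 resolves to a boolean or an integer, so a parser that does not consult the target type will not hand it to a string field. The following is a minimal sketch in standard-library Rust that covers only null, booleans and decimal integers (floats, octal and hex are left out). `Resolved`, `resolve_plain_scalar` and `needs_quoting_for_string` are hypothetical names for illustration, not part of serde-saphyr or serde_yaml.

```rust
// What an unquoted (plain) scalar resolves to under a subset of the
// YAML 1.2 core schema. Floats, octal and hex are deliberately omitted.
#[derive(Debug, PartialEq)]
pub enum Resolved {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
}

// Hypothetical helper: resolve a plain scalar without knowing the target type.
pub fn resolve_plain_scalar(s: &str) -> Resolved {
    match s {
        "" | "~" | "null" | "Null" | "NULL" => Resolved::Null,
        "true" | "True" | "TRUE" => Resolved::Bool(true),
        "false" | "False" | "FALSE" => Resolved::Bool(false),
        _ => {
            // Optional sign followed by at least one ASCII digit.
            let digits = s
                .strip_prefix(|c: char| c == '-' || c == '+')
                .unwrap_or(s);
            if !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()) {
                // Values that overflow i64 fall back to a string in this sketch.
                match s.parse::<i64>() {
                    Ok(n) => Resolved::Int(n),
                    Err(_) => Resolved::Str(s.to_string()),
                }
            } else {
                Resolved::Str(s.to_string())
            }
        }
    }
}

// A value meant for a string field must be quoted whenever the plain form
// would resolve to something other than a string.
pub fn needs_quoting_for_string(s: &str) -> bool {
    !matches!(resolve_plain_scalar(s), Resolved::Str(_))
}
```

A document that relies on a type-driven parser accepting `name: true` for a string field would, by this reasoning, be read differently by tooling that resolves scalars from the schema alone, which is the lock-in risk raised above.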

Author

jayvdb commented Nov 20, 2025

Note that serde_norway is maintained, but its unsafe code has the same problems that caused the maintainer of serde_yaml to deprecate it. serde_yaml has a pending RUSTSEC PR, and serde_norway will also have one after the first has settled down.

https://github.com/romnn/yaml-spanned doesn't have the same problem, because it uses https://github.com/simonask/libyaml-safer, which doesn't have unsafe code.

Author

jayvdb commented Nov 20, 2025

ping @bourumir-wyngs so they can consider the points raised above.


bourumir-wyngs commented Nov 21, 2025

  • The idea of executing arbitrary Rust code from YAML seems extremely difficult—perhaps even intentionally so. With roughly 7.5 K SLoC, serde-saphyr is small enough to be fully reviewed for any “dark magic,” as you put it.
  • All budget limits are fully configurable via Options.
  • If you need compatibility with a YAML parser that requires quoting for strings that might otherwise be interpreted differently ("100", "true", etc.), you can disable schema-based parsing by setting no_schema to true in Options. Ambiguous strings will then be rejected with the error “quoting required.” This preserves backward compatibility with parsers that depend on such quoting.
  • serde-saphyr takes care to avoid crashing with panic on any input. serde-yaml contains multiple panic, unwrap, and similar constructs that can be triggered; see here for example.
  • Even if robotic extensions are enabled accidentally, they remain inactive unless the relevant option flag is explicitly set. I understand that these features may appear intimidating to users outside the robotics ecosystem.
  • There are fuzz tests that reliably hang unsafe-libyaml, and these are included in the standard fuzzing suite for serde-saphyr (and serde-yaml-bw). When prioritizing safety, this should be taken into account. serde-yaml-bw uses a saphyr-parser–based “firewall” that performs pre-parsing with budget checks—something I consider a major security improvement. It protects against such attacks at the cost of some performance.
  • If you look at the c2rust project page, you’ll see that the tool was never intended to generate final, production-ready code. The expectation has always been that a human will read and refine the translated output.
  • Please note that the saphyr-rs repository is a multi-crate workspace, of which only one crate—saphyr-parser—is used by serde-saphyr. The rest of saphyr-rs has a completely different data model and does not implement Serde at all.
  • I am always happy to receive bug reports that include a unit test or any RFE related to API limitations. If the issue lies within serde-saphyr itself, I typically fix it within a few days. However, since serde-saphyr does not parse raw YAML directly and relies on saphyr-parser, some bugs may need to be reported there instead.
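As an illustration of the "pre-parsing with budget checks" idea mentioned in the list above, here is a minimal standard-library Rust sketch that rejects input before any real parsing happens. The `Budget` struct, its fields, `BudgetError` and `check_budget` are hypothetical and do not reflect serde-saphyr's actual Options or budget API. Byte count, document count and leading-space indentation are crude proxies; a real budget would more likely count parser events, nesting depth and alias expansions.

```rust
// Hypothetical limits; the field names are assumptions for this sketch only.
#[derive(Debug, Clone, Copy)]
pub struct Budget {
    pub max_bytes: usize,
    pub max_documents: usize,
    pub max_indent: usize,
}

#[derive(Debug, PartialEq)]
pub enum BudgetError {
    TooManyBytes(usize),
    TooManyDocuments(usize),
    TooDeep { line: usize, indent: usize },
}

// Cheap pre-check that runs before the text is handed to a YAML parser.
pub fn check_budget(text: &str, budget: &Budget) -> Result<(), BudgetError> {
    if text.len() > budget.max_bytes {
        return Err(BudgetError::TooManyBytes(text.len()));
    }
    let mut documents = 1usize;
    for (idx, line) in text.lines().enumerate() {
        // A "---" line after the first line starts another document.
        if idx > 0 && line.trim_end() == "---" {
            documents += 1;
            if documents > budget.max_documents {
                return Err(BudgetError::TooManyDocuments(documents));
            }
        }
        // Leading spaces stand in for nesting depth in block-style YAML.
        let indent = line.len() - line.trim_start_matches(' ').len();
        if indent > budget.max_indent {
            return Err(BudgetError::TooDeep { line: idx + 1, indent });
        }
    }
    Ok(())
}
```

The question of defaults raised earlier in the thread maps onto how the values in such a struct are chosen and whether callers can override them.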


Labels

changelog-change changelog change category for prs


4 participants