Use cargo SBOM precursor files, if available #213

Merged
merged 3 commits into from
Jun 25, 2025
24 changes: 12 additions & 12 deletions Cargo.lock

2 changes: 1 addition & 1 deletion README.md
@@ -104,4 +104,4 @@ Do not rely on SBOMs when dealing with supply chain attacks!

### What is blocking uplifting this into Cargo?

The [RFC for this functionality in Cargo itself](https://github.com/rust-lang/rfcs/pull/2801) has been [postponed](https://github.com/rust-lang/rfcs/pull/2801#issuecomment-2122880841) by the Cargo team until the [more foundational SBOM RFC](https://github.com/rust-lang/rfcs/pull/3553) is implemented.
The [RFC for this functionality in Cargo itself](https://github.com/rust-lang/rfcs/pull/2801) has been [postponed](https://github.com/rust-lang/rfcs/pull/2801#issuecomment-2122880841) by the Cargo team until the [more foundational SBOM RFC](https://github.com/rust-lang/rfcs/pull/3553) is implemented. That RFC has now been implemented and is available via an [unstable feature](https://doc.rust-lang.org/cargo/reference/unstable.html#sbom). cargo-auditable integrates with it: if you enable that feature and build with cargo auditable, e.g. with `CARGO_BUILD_SBOM=true cargo auditable -Z sbom build` and a nightly Rust toolchain, then cargo auditable will use the SBOM precursor files generated by Cargo.
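For callers that drive the build from Rust rather than a shell, the invocation quoted above can be assembled with `std::process::Command`. This is only a sketch: the function name `sbom_build_command` is hypothetical, the command is built but not run, and it assumes a nightly toolchain is already the active one.

```rust
use std::process::Command;

/// Builds (but does not run) the invocation quoted in the README:
/// `CARGO_BUILD_SBOM=true cargo auditable -Z sbom build`.
/// `sbom_build_command` is a hypothetical helper name, not part of cargo-auditable.
pub fn sbom_build_command() -> Command {
    let mut cmd = Command::new("cargo");
    // The environment variable asks Cargo to emit SBOM precursor files.
    cmd.env("CARGO_BUILD_SBOM", "true")
        // `-Z sbom` enables the unstable feature; it requires a nightly toolchain.
        .args(["auditable", "-Z", "sbom", "build"]);
    cmd
}
```

Calling `.status()` on the returned command would run the build; that step is left out here so the sketch has no side effects.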
23 changes: 20 additions & 3 deletions cargo-auditable/src/collect_audit_data.rs
@@ -4,13 +4,30 @@ use std::str::from_utf8;

use crate::{
auditable_from_metadata::encode_audit_data, cargo_arguments::CargoArgs,
rustc_arguments::RustcArgs,
rustc_arguments::RustcArgs, sbom_precursor,
};

/// Obtains the dependency tree (from cargo's SBOM precursor files if available,
/// otherwise by calling `cargo metadata`), serializes it to JSON and compresses it
pub fn compressed_dependency_list(rustc_args: &RustcArgs, target_triple: &str) -> Vec<u8> {
let metadata = get_metadata(rustc_args, target_triple);
let version_info = encode_audit_data(&metadata).unwrap();
let sbom_path = std::env::var_os("CARGO_SBOM_PATH");

// If cargo has created precursor SBOM files, use them instead of `cargo metadata`.
let version_info = if sbom_path.as_ref().map(|p| !p.is_empty()).unwrap_or(false) {
// Cargo creates an SBOM file for each output file (rlib, bin, cdylib, etc),
// but the SBOM file is identical for each output file in a given rustc crate compilation,
// so we can just use the first SBOM we find.
let sbom_path = std::env::split_paths(&sbom_path.unwrap()).next().unwrap();
let sbom_data: Vec<u8> = std::fs::read(&sbom_path)
.unwrap_or_else(|_| panic!("Failed to read SBOM file at {}", sbom_path.display()));
let sbom_precursor: sbom_precursor::SbomPrecursor = serde_json::from_slice(&sbom_data)
.unwrap_or_else(|_| panic!("Failed to parse SBOM file at {}", sbom_path.display()));
sbom_precursor.into()
} else {
// If no SBOM files are available, fall back to `cargo metadata`
let metadata = get_metadata(rustc_args, target_triple);
encode_audit_data(&metadata).unwrap()
};

let json = serde_json::to_string(&version_info).unwrap();
// compression level 7 makes this complete in a few milliseconds, so no need to drop to a lower level in debug mode
let compressed_json = compress_to_vec_zlib(json.as_bytes(), 7);
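The "is `CARGO_SBOM_PATH` set and non-empty, and if so take the first entry" decision in this hunk can be isolated into a small pure function. The sketch below uses only the standard library; the name `first_sbom_path` is hypothetical, and unlike the code in the diff it returns `None` instead of unwrapping.

```rust
use std::ffi::OsStr;
use std::path::PathBuf;

/// Returns the first path listed in a `CARGO_SBOM_PATH`-style value,
/// or `None` when the variable is unset or empty.
/// `first_sbom_path` is a hypothetical helper, not a function from this PR.
pub fn first_sbom_path(value: Option<&OsStr>) -> Option<PathBuf> {
    // Treat an empty value the same as an unset variable.
    let value = value.filter(|v| !v.is_empty())?;
    // The value is a platform path list; only the first entry is needed
    // because the SBOM is identical for every output of one compilation.
    std::env::split_paths(value).next()
}
```

A caller would pass `std::env::var_os("CARGO_SBOM_PATH").as_deref()` and fall back to `cargo metadata` when the result is `None`.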
1 change: 1 addition & 0 deletions cargo-auditable/src/main.rs
@@ -9,6 +9,7 @@ mod object_file;
mod platform_detection;
mod rustc_arguments;
mod rustc_wrapper;
mod sbom_precursor;
mod target_info;

use std::process::exit;
199 changes: 199 additions & 0 deletions cargo-auditable/src/sbom_precursor.rs
@@ -0,0 +1,199 @@
use std::collections::HashMap;

use auditable_serde::{Package, Source, VersionInfo};
use cargo_metadata::{
semver::{self, Version},
DependencyKind,
};
use serde::{Deserialize, Serialize};

/// Cargo SBOM precursor format.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SbomPrecursor {
/// Schema version
pub version: u32,
/// Index into the crates array for the root crate
pub root: usize,
/// Array of all crates
pub crates: Vec<Crate>,
/// Information about rustc used to perform the compilation
pub rustc: RustcInfo,
}

impl From<SbomPrecursor> for VersionInfo {
fn from(sbom: SbomPrecursor) -> Self {
// cargo sbom data format has more nodes than the auditable info format - if a crate is both a build
// and runtime dependency it will appear twice in the `crates` array.
// The `VersionInfo` format lists each package only once, with a single `kind` field
// (Runtime having precedence over other kinds).

// First, we deduplicate the crates by package ID and create a mapping from the
// original indices in the cargo sbom array to the new index in the auditable info package array.
let (_, mut packages, indices) = sbom.crates.iter().enumerate().fold(
(HashMap::new(), Vec::new(), Vec::new()),
|(mut id_to_index_map, mut packages, mut indices), (index, crate_)| {
match id_to_index_map.entry(crate_.id.clone()) {
std::collections::hash_map::Entry::Occupied(entry) => {
// Just store the new index in the indices array
indices.push(*entry.get());
}
std::collections::hash_map::Entry::Vacant(entry) => {
let (name, version, source) = parse_fully_qualified_package_id(&crate_.id);
// If the entry does not exist, we create it
packages.push(Package {
name,
version,
source,
// Assume build, if we determine this is a runtime dependency we'll update later
kind: auditable_serde::DependencyKind::Build,
// We will fill this in later
dependencies: Vec::new(),
root: index == sbom.root,
});
entry.insert(packages.len() - 1);
indices.push(packages.len() - 1);
}
}
(id_to_index_map, packages, indices)
},
);

// Traverse the graph as given by the sbom to fill in the dependencies with the new indices.
//
// Keep track of whether the dependency is a runtime dependency.
// If we ever encounter a non-runtime dependency, all deps in the remaining subtree
// are not runtime dependencies, i.e. a runtime dep of a build dep is not recognized as a runtime dep.
let mut stack = Vec::new();
stack.push((sbom.root, true));
while let Some((old_index, is_runtime)) = stack.pop() {
let crate_ = &sbom.crates[old_index];
for dep in &crate_.dependencies {
stack.push((dep.index, dep.kind == DependencyKind::Normal && is_runtime));
}

let package = &mut packages[indices[old_index]];
if is_runtime {
package.kind = auditable_serde::DependencyKind::Runtime
};

for dep in &crate_.dependencies {
let new_dep_index = indices[dep.index];
if package.dependencies.contains(&new_dep_index) {
continue; // Already added this dependency
} else if new_dep_index == indices[old_index] {
// If the dependency is the same as the package itself, skip it
continue;
} else {
package.dependencies.push(new_dep_index);
}
}
}

VersionInfo { packages }
}
}
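The two passes in this conversion (deduplicating by ID while recording an old-to-new index table, then propagating the "runtime" flag down normal-dependency edges) can be tried out without serde or cargo types. The following is a minimal sketch with a simplified, hypothetical `Node` type; `dedup_indices` and `runtime_flags` are illustrative names. Like the traversal above it keeps no visited set, so it assumes the dependency graph is acyclic.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an SBOM crate entry (hypothetical type):
/// an ID plus `(dependency index, is_normal_dependency)` edges.
pub struct Node {
    pub id: &'static str,
    pub deps: Vec<(usize, bool)>,
}

/// Deduplicates nodes by ID.
/// Returns the unique IDs in first-seen order and a table that maps
/// every original index to its index in the deduplicated list.
pub fn dedup_indices(nodes: &[Node]) -> (Vec<&'static str>, Vec<usize>) {
    let mut seen: HashMap<&str, usize> = HashMap::new();
    let mut unique = Vec::new();
    let mut indices = Vec::with_capacity(nodes.len());
    for node in nodes {
        let new_index = *seen.entry(node.id).or_insert_with(|| {
            unique.push(node.id);
            unique.len() - 1
        });
        indices.push(new_index);
    }
    (unique, indices)
}

/// Marks which deduplicated packages are reachable from `root`
/// through normal (non-build, non-dev) edges only.
pub fn runtime_flags(
    nodes: &[Node],
    indices: &[usize],
    root: usize,
    unique_len: usize,
) -> Vec<bool> {
    let mut runtime = vec![false; unique_len];
    let mut stack = vec![(root, true)];
    while let Some((old_index, is_runtime)) = stack.pop() {
        if is_runtime {
            // Runtime takes precedence: the flag is only ever set, never cleared.
            runtime[indices[old_index]] = true;
        }
        for &(dep, is_normal) in &nodes[old_index].deps {
            // Once a non-normal edge is crossed, the whole subtree is non-runtime.
            stack.push((dep, is_normal && is_runtime));
        }
    }
    runtime
}
```

A package that appears once under a build dependency and once as a normal dependency ends up as a single entry flagged as runtime, which matches the precedence rule described in the comments above.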

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Crate {
/// Package ID specification
pub id: String,
/// List of target kinds
pub kind: Vec<String>,
/// Enabled feature flags
pub features: Vec<String>,
/// Dependencies for this crate
pub dependencies: Vec<Dependency>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Dependency {
/// Index into the crates array
pub index: usize,
/// Dependency kind: "normal", "build", or "dev"
pub kind: DependencyKind,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RustcInfo {
/// Compiler version
pub version: String,
/// Compiler wrapper
pub wrapper: Option<String>,
/// Compiler workspace wrapper
pub workspace_wrapper: Option<String>,
/// Commit hash for rustc
pub commit_hash: String,
/// Host target triple
pub host: String,
/// Verbose version string: `rustc -vV`
pub verbose_version: String,
}

const CRATES_IO_INDEX: &str = "https://github.com/rust-lang/crates.io-index";

/// Parses a fully qualified package ID spec string into a tuple of (name, version, source).
/// The package ID spec format is defined at https://doc.rust-lang.org/cargo/reference/pkgid-spec.html#package-id-specifications-1
///
/// The fully qualified form of a package ID spec is mentioned in the Cargo documentation;
/// figuring it out is left as an exercise to the reader.
///
/// Adapting the grammar in the cargo doc, the format appears to be:
/// ```norust
/// fully_qualified_spec := kind "+" proto "://" hostname-and-path [ "?" query] "#" [ name "@" ] semver
/// query := ( "branch" | "tag" | "rev" ) "=" ref
/// semver := digits "." digits "." digits [ "-" prerelease ] [ "+" build ]
/// kind := "registry" | "git" | "path"
/// proto := "http" | "git" | "file" | ...
/// ```
/// where:
/// - the name is always present except when the kind is `path` and the last segment of the path doesn't match the name
/// - the query string is only present for git dependencies (which we can ignore since we don't record git information)
fn parse_fully_qualified_package_id(id: &str) -> (String, Version, Source) {
let (kind, rest) = id.split_once('+').expect("Package ID to have a kind");
let (url, rest) = rest
.split_once('#')
.expect("Package ID to have version information");
let source = match (kind, url) {
("registry", CRATES_IO_INDEX) => Source::CratesIo,
("registry", _) => Source::Registry,
("git", _) => Source::Git,
("path", _) => Source::Local,
_ => Source::Other(kind.to_string()),
};

if source == Source::Local {
// For local packages, the name might be in the suffix after '#' if it has
// a different name than the last segment of the path.
if let Some((name, version)) = rest.split_once('@') {
(
name.to_string(),
semver::Version::parse(version).expect("Version to be valid SemVer"),
source,
)
} else {
// If no name is specified, use the last segment of the path as the name
let name = url
.split('/')
.next_back()
.unwrap()
.split('\\')
.next_back()
.unwrap();
(
name.to_string(),
semver::Version::parse(rest).expect("Version to be valid SemVer"),
source,
)
}
} else {
// For other sources, the name and version are after the '#', separated by '@'
let (name, version) = rest
.split_once('@')
.expect("Package ID to have a name and version");
(
name.to_string(),
semver::Version::parse(version).expect("Version to be valid SemVer"),
source,
)
}
}
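To see how the grammar above plays out on concrete IDs, here is a standard-library-only sketch of the same splitting logic. It is not the PR's function: `split_package_id` and `SourceKind` are hypothetical names, the version is kept as a plain string rather than parsed as SemVer, malformed input yields `None` instead of a panic, and the example paths in the usage note are made up.

```rust
/// Hypothetical stand-in for `auditable_serde::Source`.
#[derive(Debug, PartialEq)]
pub enum SourceKind {
    CratesIo,
    Registry,
    Git,
    Local,
    Other(String),
}

const CRATES_IO_INDEX: &str = "https://github.com/rust-lang/crates.io-index";

/// Splits a fully qualified package ID into (name, version string, source kind).
/// Returns `None` when a required separator is missing.
pub fn split_package_id(id: &str) -> Option<(String, String, SourceKind)> {
    // `kind "+" url "#" fragment`
    let (kind, rest) = id.split_once('+')?;
    let (url, fragment) = rest.split_once('#')?;
    let source = match (kind, url) {
        ("registry", CRATES_IO_INDEX) => SourceKind::CratesIo,
        ("registry", _) => SourceKind::Registry,
        ("git", _) => SourceKind::Git,
        ("path", _) => SourceKind::Local,
        _ => SourceKind::Other(kind.to_string()),
    };
    let (name, version) = match fragment.split_once('@') {
        // `name "@" version` is present for every kind except some path IDs.
        Some((name, version)) => (name, version),
        // Path IDs may omit the name; fall back to the last path segment.
        None if source == SourceKind::Local => {
            let name = url.rsplit(|c| c == '/' || c == '\\').next()?;
            (name, fragment)
        }
        None => return None,
    };
    Some((name.to_string(), version.to_string(), source))
}
```

With this sketch, an ID such as `path+file:///work/qux#baz@0.1.0` yields the name `baz`, while `path+file:///work/bar#0.1.0` falls back to the directory name `bar`. That is the distinction the `path_not_equal_name` fixture below is built around.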
1 change: 1 addition & 0 deletions cargo-auditable/tests/.gitignore
@@ -0,0 +1 @@
Cargo.lock
@@ -0,0 +1 @@
fn main() {}
@@ -0,0 +1 @@
fn main() {}
@@ -0,0 +1,8 @@
[package]
name = "bar"
version = "0.1.0"
edition = "2021"

[dependencies]

[workspace]
14 changes: 14 additions & 0 deletions cargo-auditable/tests/fixtures/path_not_equal_name/bar/src/lib.rs
@@ -0,0 +1,14 @@
pub fn add(left: u64, right: u64) -> u64 {
left + right
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn it_works() {
let result = add(2, 2);
assert_eq!(result, 4);
}
}
10 changes: 10 additions & 0 deletions cargo-auditable/tests/fixtures/path_not_equal_name/foo/Cargo.toml
@@ -0,0 +1,10 @@
[package]
name = "foo"
version = "0.1.0"
edition = "2021"

[dependencies]
bar = { version = "0.1.0", path = "../bar" }
baz = { version = "0.1.0", path = "../qux" }

[workspace]
@@ -0,0 +1,3 @@
fn main() {
println!("Hello, world!");
}
@@ -0,0 +1,8 @@
[package]
name = "baz"
version = "0.1.0"
edition = "2021"

[dependencies]

[workspace]
14 changes: 14 additions & 0 deletions cargo-auditable/tests/fixtures/path_not_equal_name/qux/src/lib.rs
@@ -0,0 +1,14 @@
pub fn add(left: u64, right: u64) -> u64 {
left + right
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn it_works() {
let result = add(2, 2);
assert_eq!(result, 4);
}
}
@@ -0,0 +1 @@
fn main() {}