Add `packageURL` field to product in `affected` array. #409
The `affected` array is an array containing `product` objects, which must at minimum include an "identifier" (which may be a composite identifier composed of multiple fields) along with a set of version bounds or a default status. Products may also specify an assortment of additional fields which further constrain the applicability of the CVE to its intended target hardware or software.

Previously, the set of identifiers available was:

- A `vendor` and `product`
- A `collectionURL` and `packageName`

This commit adds support for a new identifier, called `packageURL`, which uses the purl (Package URL) specification. The contents of the commit add this as a new field on the `product` type, with a description and examples, and also update the data constraints on the `product` type, both to make `packageURL` an option to fulfill the identifier requirement already in place on the type, and to ensure that the new `packageURL` field is not mixed with the existing `collectionURL` or `packageName` fields, as they are redundant with `packageURL` and including both increases the possibility of data inconsistency within a single CVE record.

This inclusion of a new `packageURL` type which can be used instead of the existing pair of `collectionURL` and `packageName` would require consumers of CVE records to update their logic both to accept the new field, and to use it in places where they may today use the pair of `collectionURL` and `packageName`.

This commit does not include a regular expression to parse Package URLs specifically. Rather, it reuses the existing `uriType` schema. So we can be sure after validating CVE records against this updated record format that the `packageURL` field is a URL, but not that it is a valid Package URL per the Package URL specification. It would be the responsibility of CVE Services to further validate the field to ensure values match the Package URL specification. We do not perform this validation in-schema due to the complexity of expressing the validation in the form of a regular expression.

This work is submitted as an alternative formulation of the design proposed in the draft RFD on software identifiers [1], and as an alternative to the existing proposals for making the `cpeApplicability` structure generic [2] (instead of it being CPE-specific) and enhancing this new generic applicability structure with support for Package URLs [3].

If this change is accepted, then [2] and [3] should not be accepted.

[1]: CVEProject#407
[2]: CVEProject#391
[3]: CVEProject#397

Signed-off-by: Andrew Lilley Brinker <[email protected]>
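To make the identifier rules in the commit message concrete, here is a minimal Python sketch of the two constraints it describes: at least one identifier form must be present, and `packageURL` must not be combined with `collectionURL` or `packageName`. The field names come from the commit message; the function name `check_identifier`, its messages, and the sample purl values are hypothetical and not part of the CVE Record Format or of this PR's schema changes.

```python
# Illustrative sketch only: the real constraints live in the JSON schema.
# `check_identifier` and its problem strings are hypothetical names.

def check_identifier(product: dict) -> list:
    """Return a list of problems found in a product's identifier fields."""
    problems = []
    has_vendor_product = "vendor" in product and "product" in product
    has_collection = "collectionURL" in product and "packageName" in product
    has_purl = "packageURL" in product

    # At least one of the three identifier forms must be present.
    if not (has_vendor_product or has_collection or has_purl):
        problems.append("no identifier present")

    # packageURL is redundant with collectionURL/packageName, so mixing
    # them is rejected to avoid inconsistent data within one record.
    if has_purl and ("collectionURL" in product or "packageName" in product):
        problems.append("packageURL mixed with collectionURL/packageName")

    return problems


# Hypothetical product entries, for illustration only.
purl_only = {"packageURL": "pkg:pypi/example-package"}
mixed = {
    "packageURL": "pkg:pypi/example-package",
    "collectionURL": "https://pypi.org",
    "packageName": "example-package",
}
print(check_identifier(purl_only))
print(check_identifier(mixed))
```

Note that this checks only the presence of fields. As the commit message says, whether the value is a valid Package URL would still have to be checked separately.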
Open questions:
|
Summary: Should versions be permitted within a
|
Points in favor of Option 2:

- The Option 1 Pro is unimportant because it is rare for a vulnerability to affect only one version. A group at MITRE researched this across OSV data in approximately 2024 and found that, with a few exclusions such as malicious-packages, less than 10% of vulnerabilities affected only one version. Also, there is probably no realistic category of CNA where, across all of their CVE Records, vulnerabilities affect only one version. Thus, every CNA may need a process for creating a range.
- Large parts of the open source community are familiar with a constraint where Purl must omit a version, e.g., "The purl field is a string following the Package URL specification that identifies the package, without the @version component" at https://ossf.github.io/osv-schema/
- Placing a version number inside the Purl will, in some cases, increase the complexity of the secondary parser. For example, a Debian package version may have a ':' character (see the https://www.debian.org/doc/debian-policy/ch-controlfields.html#version page) that must be encoded as %3A within a Purl. Depending on my usage of Purl data in CVE, placing all versioning outside of the Purl may mean that my secondary parser doesn't need to know about URL encoding. (Admittedly, most people would still choose a secondary parser that handles 100% of the Purl complexities.)
- With ADPs, there can be different reporting practices that cause different data providers to publish different version information for the same vulnerability. To simplify a consumer's process of comparing the differences, it could be helpful if the data were expressed in the same way. For example, one provider might be expressing only 8.0.0 but another is expressing "version":"6.0.0","lessThanOrEqual":"8.0.0"

|
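The percent-encoding point can be illustrated with a short Python sketch using only the standard library. The Debian version string and the package name `example` are made up for illustration; the sketch simply shows that a consumer reading a version out of a purl has to decode it, whereas a version stored in a separate field can be read as-is.

```python
from urllib.parse import quote, unquote

# Hypothetical Debian version with an epoch, which contains ':'.
debian_version = "1:2.3.4-1"

# Inside a purl, the ':' has to be percent-encoded as %3A.
encoded = quote(debian_version, safe="")
purl_with_version = f"pkg:deb/debian/example@{encoded}"
print(purl_with_version)  # pkg:deb/debian/example@1%3A2.3.4-1

# A consumer extracting the version from the purl must decode it again.
raw = purl_with_version.rpartition("@")[2]
print(unquote(raw))  # 1:2.3.4-1
```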
Allowing versioning in both locations, but only one at a time, seems like it would complicate validation for CVE Services as well. We'd need to parse the Purl to determine if version information is present. If it is, then we'd have to ensure no version information is provided in the existing elements; if it is not, then we'd need to ensure it is. Basically, we'd be making sure it appears once and only once. I'd prefer requiring version information in the existing fields and recommending it not be in the Purl, but we don't have to enforce that. We could say that in cases of disagreement the non-Purl version information always wins and CNAs SHOULD NOT include version info in the Purl (as opposed to MUST NOT). But even if we make it MUST NOT, it seems easier to check the Purl and reject the record if a version is included, and ensure the other fields are filled in. And it makes things more consistent downstream, for example on the website. |
(Masters) I support the 2nd option - not permitting version information in the PURL string. |
@MrMegaZone I've amended the list of options to reflect a third one based on your comment. If you disagree with my summary or think I've missed any pros or cons, let me know. |
I've drafted example CVE records representing different possibilities for embedding version information in a Having thought about this over the weekend, I think option 2: only permitting version information in the It's the simplest option for CVE data consumers, who only need to look in one place for version data. It doesn't permit data inconsistency (an example of which you can see in the Gist I linked above). Also, given that the decision we'd reach here should be applied in the same way to any future identifier formats added, not permitting versions embedded in an identifier avoids a future where CVE consumers would need to parse many different identifier formats to get all possible version information. The current design of the This would mean that CNAs need to put version information in the So, in short, only permitting version information in the
It would require CVE Services to parse purls, but there is an official JavaScript package for doing just that, which CVE Services could use easily: https://github.com/package-url/packageurl-js |
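As a rough idea of what that check involves, here is a standard-library Python sketch that detects whether a purl carries a `@version` component. It is a simplification under stated assumptions, not a full purl parser and not the packageurl-js API: it assumes the input is otherwise well-formed and that any literal '@' in the namespace or name is percent-encoded, as in the scoped npm example below. The function name `purl_version` is hypothetical.

```python
from typing import Optional


def purl_version(purl: str) -> Optional[str]:
    """Return the raw (still percent-encoded) version of a purl, or None."""
    # Drop the optional subpath ('#...') and qualifiers ('?...') first.
    remainder = purl.split("#", 1)[0].split("?", 1)[0]

    scheme, sep, rest = remainder.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError("not a purl: missing 'pkg:' scheme")

    # An unencoded '@' separates the name from the version, because a
    # literal '@' in the namespace or name is written as %40.
    _, at, version = rest.rpartition("@")
    return version if at else None


print(purl_version("pkg:npm/%40angular/animation@12.3.1"))  # 12.3.1
print(purl_version("pkg:npm/%40angular/animation"))         # None
```

Under Option 2, a validator along these lines would reject any record where the result is not `None`. A production check would more likely rely on a maintained purl library, as suggested above.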
+1 for option 2. Decoupling the version information would allow for version types to be added/vetted/validated asynchronously from package identifiers and would allow for a compact representation of a single software with multiple version ranges. |
One more point in favor of disallowing versions in purls, OSV disallows them today: https://ossf.github.io/osv-schema/#affectedpackage-field. It would be ideal to match the behavior of other ecosystems where possible to increase interoperability. |
This amends the specification for Package URLs to no longer permit versions in them, updating the description and examples for the `packageURL` field of the `product` object. The actual enforcement of this requirement will need to be done within CVE Services. Signed-off-by: Andrew Lilley Brinker <[email protected]>
The PR has been updated to no longer permit versions in Package URLs, going with Option 2 outlined above. |
One new issue has been raised by @ElectricNroff (Matt Power) concerning forward compatibility. In short: the proposal today represents a forward-compatibility hazard, because of how it modifies the required field constraints on the

The current proposal turns this...

    "anyOf": [
      {"required": ["vendor", "product"]},
      {"required": ["collectionURL", "packageName"]}
    ]

Into this...

    "anyOf": [
      {"required": ["vendor", "product"]},
      {"required": ["collectionURL", "packageName"]},
      {"required": ["packageURL"]}
    ]

The concern is that by adding a new option to the

In this case, that additional parsing logic would, if the consumer wants to make use of the Package URL, also likely involve a need to parse the

On parsing, it's worth noting that many languages have libraries which offer a facility to generate parsers based on a JSON schema, so in these languages the challenge of updating to a new parser version is generally limited.

This is a distinct compatibility issue from the backwards-compatibility concerns reflected in SchemaVer; because of this, Matt also raised a view that SchemaVer may not be the right versioning scheme to use for the CVE Record Format. For that reason, I've also removed mention of it from the "RFD to introduce an RFD process" (#405), so the issue of what versioning scheme to use can be fully addressed in a more focused future RFD.

Given the forward-compatibility hazard this proposal represents (and the same hazard exists for the OmniBOR PR [#410]), there's a decision to make, with trade-offs:
The following is a breakdown of pros and cons for the two options:
There is a broader question here of whether the CVE Record Format should try to preserve forward compatibility between versions. SchemaVer, which the QWG has previously had at least a loose consensus on adopting, does not consider forward compatibility for versioning, so even ADDITION-level changes are permitted to break forward compatibility by doing things like adding to the set of fields which fulfill a closed set of requirements, or adding new variants to an So we have both the immediate question of how to resolve this |
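The forward-compatibility hazard with the `anyOf` change can be sketched in a few lines of Python. The two `anyOf` lists are the ones quoted earlier in this thread; the helper `satisfies_any_of` is a hypothetical stand-in for a consumer's validator that handles only `required` clauses, and the sample product is made up.

```python
# The identifier constraint before and after this proposal.
OLD_ANY_OF = [
    {"required": ["vendor", "product"]},
    {"required": ["collectionURL", "packageName"]},
]
NEW_ANY_OF = OLD_ANY_OF + [{"required": ["packageURL"]}]


def satisfies_any_of(product: dict, any_of: list) -> bool:
    """True if the product has every field of at least one option."""
    return any(
        all(field in product for field in option["required"])
        for option in any_of
    )


# A hypothetical product that uses only the new identifier.
product = {"packageURL": "pkg:pypi/example-package", "defaultStatus": "affected"}

print(satisfies_any_of(product, NEW_ANY_OF))  # True
print(satisfies_any_of(product, OLD_ANY_OF))  # False
```

A consumer still validating against the old constraint rejects a record that is valid under the new one, which is the sense in which an addition to a closed set of requirements breaks forward compatibility.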
I'm not sure if we've answered this permanently, but do we expect CVE Services (and the pile-of-JSON-files cache in GitHub) to only accept and return one version of the schema? Or can different records have different schema versions (and Services/file cache would accept/return different schema versions)? The reason I ask:
If there is one schema (version) at a time, and it gets updated, then I don't think use of the prior schema is a concern. Those potential "users of the prior schema" would need to use the updated schema, in which case they would accept |
@zmanion I’m not certain if records embed a reference to the schema they’re written against, but I think CVE Services enforces one schema version for input, and downstream consumers would still need to make changes to adapt to new schema versions, otherwise they would be unable to parse newer records when a forward compatibility breaking change is made. |
In today's QWG meeting folks raised a desire for examples of what both OmniBOR (see #410) and purl would look like in a Here are examples! https://gist.github.com/alilleybrinker/de8f56ba599609f7867bc5589c73505b |
One additional issue, raised by @ccoffin (see: https://gist.github.com/alilleybrinker/de8f56ba599609f7867bc5589c73505b?permalink_comment_id=5633750#gistcomment-5633750) is whether there should be a constraint to disallow the creation of CVE Records where the only identifier used within the Personally, I do not think this constraint is necessary, for two reasons:
As I see it, at the very least this is a constraint which could be added later if we observe CVE Records being issued with insufficient identifier information. Meta note: I'm tracking this issue here, although it may more properly be tracked in #410, because this PR is already used for tracking two other issues and I want to reduce splintering of the conversation. |
Signed-off-by: Andrew Lilley Brinker <[email protected]>
Signed-off-by: Andrew Lilley Brinker <[email protected]>
(This really applies to the RFD #407, but I am pasting it here as well for completeness)

Note: Final Comment Period

A Final Comment Period (FCP) has been called for this proposal. This is a final opportunity to raise new concerns with the proposal. The FCP will close at 2pm PDT / 5pm EDT July 3rd, at the end of the Quality Working Group Meeting. |
Signed-off-by: Andrew Lilley Brinker <[email protected]>