-
Notifications
You must be signed in to change notification settings - Fork 32
TC-2564 : Add versions column to sbom to allow version filtering #1700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@jcrossley3 , is that test error expected ? |
Yes. Per our chat, many of our unit tests invoke code branches that may result in error messages logged to stdout. It's the failures, not the errors, you want to focus on. |
|
/scale-test |
|
🛠️ Scale test has started! Follow the progress here: Workflow Run |
|
To my understanding, this will join and expand searching for SBOMs with all packages. Which can be huge! Just for the case when the user might want to search for this information. And not even all information, just for the packages describing the SBOM, right? I think we should only add a join like this in cases where the user actually wants to search for this information. |
Goose ReportGoose Attack ReportPlan Overview
Request Metrics
Response Time Metrics
Status Code Metrics
Transaction Metrics
Scenario Metrics
📄 Full Report (Go to "Artifacts" and download report) |
|
To me it looks like a hit on the sbom search performance. We should investigate this. |
|
@ctron, to be able to search on the version column we have to join with sbom_package, I can't see another way. Regarding the performance hit, a view seems to be a good alternative. BTW although views aren't managed directly in migrations they can be managed using raw SQL during migrations with |
True, but those tests don't search for that information. So having that, not using it, seems to have an impact anyway. Maybe we require a PoC of a views then. Proving that they do increase performance and seeing the complexity they add. |
|
@ctron, shall we create a separate issue for the tests to be investigated ? +1 regarding the views POC. |
|
I don't think a view PoC is a good use of resources, so I'm -1 on that. I question the current structure of the response. What does I suggest we add a |
|
SPDX SBOMs can have I also don't believe that a view will do us any good. But following the idea of "don't guess, measure", maybe it's worth finding out and learning something from it. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1700 +/- ##
==========================================
+ Coverage 65.09% 65.29% +0.19%
==========================================
Files 355 356 +1
Lines 14733 14721 -12
==========================================
+ Hits 9591 9612 +21
+ Misses 5142 5109 -33 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@ctron . @jcrossley3, the tests are all [GREEN] now. I initially used an Inner Join instead of a Left Join, the latter preserves all the records of the left table (sbom) even when there are no matches with the right table (sbom_package). Also I had to change the test for |
|
/scale-test |
|
🛠️ Scale test has started! Follow the progress here: Workflow Run |
Goose ReportGoose Attack ReportPlan Overview
Request Metrics
Response Time Metrics
Status Code Metrics
Transaction Metrics
Scenario Metrics
Error Metrics
📄 Full Report (Go to "Artifacts" and download report) |
|
@ctron. @jcrossley3, following-up some of your earlier comments. Effectively a more efficient approach would be a dedicated search only when columns of related tables are involved. For me this goes back to the discussion about removing the search "anything" feature in the GUI and offer the search/filter feature for explicit fields where we can have search capacity adapted for them. BTW the scale test seems to be [GREEN] |
I see a massive decrease in SBOM search performance. |
I initially thought that would leaves us with either :
But as @jcrossley3, who's been through this, suggested in related Jira, the best way would be to add versions` column to the SBOM table and populate it during ingestion. This seems to be the best way because it's already done for other fields so this will be quickly implemented and pretty impact less and also will keep the performance unchanged. Let me add a patch using that latter approach... |
Please ensure that there's a migration as part of the patch filling the entry for existing entries. |
|
@jcrossley3, with latest commit the version search works only for a full text (the entire version string) but it doesn't work for a sub-string of the version (the "like" case). |
I'm surprised any of it works! But I'm curious how you know. Can you add a test to this PR that demonstrates what works and what doesn't so that I can replicate locally? Thanks! |
|
/scale-test |
|
🛠️ Scale test has started! Follow the progress here: Workflow Run |
Goose ReportGoose Attack ReportPlan Overview
Request Metrics
Response Time Metrics
Status Code Metrics
Transaction Metrics
Scenario Metrics
Error Metrics
📄 Full Report (Go to "Artifacts" and download report) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'd appreciate a test for covering one more case. But not a blocker.
| .alias("sbom_node", "r0") | ||
| .translator(|f, op, v| match f.split_once(':') { | ||
| Some(("label", key)) => Some(format!("labels:{key}{op}{v}")), | ||
| None => match f { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be possible adding a test for this as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ctron, I've changed to avoid an unnecessary case, please let me know what you think.
| /// Extract versions for a SPDX SBOM by collecting versions of describing packages | ||
| fn versions(sbom: &SPDX) -> Vec<String> { | ||
| // packages describing the SBOM | ||
| let describing = describing_packages(sbom); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I'd prefer to have packages (or their versions) returned right away. Taking a look at the implementation of describing_packages that's however not really possible, as one source of this information only holds the string, not the full package information.
I think the compromise (searching through everything once again) it ok, but we should document this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ctron, I understand what you're saying but I'm not sure what comment you're expecting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're having a two step process: 1) find describing packages 2) iterate through all the packages once again and check with contains. Why is that necessary? And that "why" should be in the source code as a comment.
| if !describing.contains(p.package_spdx_identifier.as_str()) { | ||
| continue; | ||
| } | ||
| if let Some(version) = &p.package_version { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok as it is. There's another way this should work, just that you know, and maybe you already do:
result.extend(p.package_version.clone());Option<T> is an iterable (zero or one item), and extend can accept any iterable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DB migration isn't executed and it requires a further enhancement to really fix the issue and let the user search by version for the already ingested SBOMs.
| .add_column( | ||
| ColumnDef::new(Sbom::Versions) | ||
| .array(ColumnType::Text) | ||
| .not_null(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this migration would be executed (which is not at the moment based on my above comment), it would fail because all the already ingested SBOMs won't have a value for the versions column.
As @ctron reported in his previous comment, there must be "a migration as part of the patch filling the entry for existing entries"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrizzi, thanks, I missed the lib.rs reference which is why the migration wasn't working and why I kept it in the init-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gildub sorry but this issue hasn't be addressed. The already ingested SBOMs will have no value for the versions column so Search it with "78775b9de88el4fdb43cc47b2db6ff3747b46df323e" will still return nothing after this PR will be merged and released because the SBOM expected to be in the result list won't have any value in the versions column to match the query.
There must be a filling of data for the new versions column for the existing entries in the sbom table to really solve the issue, that's what the second part of my comment above (and @ctron as well) was referring to.
|
@ctron, @jcrossley3 and @mrizzi, I've got all issues addressed, please revisit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add one more tiny test, please.
53f38d6 to
0a10f7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's still a need for a data filling for already ingested rows into the sbom table.
| .add_column( | ||
| ColumnDef::new(Sbom::Versions) | ||
| .array(ColumnType::Text) | ||
| .not_null(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gildub sorry but this issue hasn't be addressed. The already ingested SBOMs will have no value for the versions column so Search it with "78775b9de88el4fdb43cc47b2db6ff3747b46df323e" will still return nothing after this PR will be merged and released because the SBOM expected to be in the result list won't have any value in the versions column to match the query.
There must be a filling of data for the new versions column for the existing entries in the sbom table to really solve the issue, that's what the second part of my comment above (and @ctron as well) was referring to.
|
@mrizzi, my understanding of the the missing part is a default value (empty array) for the versions column, right ? |
As mentioned in the previous comment, this PR, as reported in the description, is meant to provide an SBOM in the response for the query Does this better clarify the need for a data filling during the migration? |
|
@mrizzi, I understand the need to migrate existing data so the versions column is populated. Is there any precedent code handling similar scenarios ? Sounds like re-ingesting SBOMs. |
Not that I'm aware of but please double check the existing migrations. |
|
@sourcery-ai summary |
|
@sourcery-ai review |
Reviewer's GuideIntroduce a dedicated ‘versions’ column in the SBOM table, populate it during ingest for SPDX, CycloneDX, and ClearlyDefined formats, extend query translators and GraphQL API to support exact and fuzzy version filtering, and validate the new functionality with integration tests. Class diagram for updated GraphQL SbomQuery and VulnerabilityAdvisorySummaryclassDiagram
class SbomQuery {
+sbom: SbomInformation
+sboms: [SbomInformation]
}
class VulnerabilityAdvisorySummary {
+sbom: SbomInformation
...
}
SbomQuery --> SbomInformation
VulnerabilityAdvisorySummary --> SbomInformation
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
@ctron, could you please provide an update or the link regarding the migration work issue ? |
|
@mrizzi, a fully supported migration with re-processing of documents is going to take time if accepted. In the meantime this PR and more importantly the bug behind is not progressing. Until we have such re-processing feature in place then we must accept there will be a breaking version at some point. Could we please consider that ? |
Along with #1749, this PR is partially fixing https://issues.redhat.com/browse/TC-2564 which is about using full text search to find information about version.
To be able to search versions data requires to have a
versionscolumn added to sbom table.That's the main purpose of this PR.
This will also allow to specifically search
versionswhich is different than full text search across the sbom.Summary by Sourcery
Add a new "versions" field to SBOM entities and enable version-specific filtering
New Features:
Enhancements:
Build:
Tests: