
Add initial support for the VulnerableCode agent #1776

Open · wants to merge 2 commits into base: main
Conversation

ziadhany (Collaborator)

The VulnerableCode agent currently focuses on one main task: extracting the correct version range from the vulnerability summary.

[Two screenshots from 2025-02-10 attached]

Signed-off-by: ziad hany <ziadhany2016@gmail.com>
@ziadhany (Collaborator Author)

@pombredanne, this is an initial base for the AI summary improver:

Right now, we have two prompts: one to extract the purl and another to get the affected_versions and fixed_versions. Neither prompt uses RAG yet.
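A minimal sketch of what that two-prompt flow could look like, assuming an OpenAI-compatible client; the model name and prompt wording below are placeholders, not the actual prompts in this PR:

```python
from openai import OpenAI

client = OpenAI()  # placeholder client; any chat-completion API would do

PURL_PROMPT = (
    "Extract the Package URL (purl) of the affected package from this "
    "vulnerability summary. Reply with only the purl.\n\nSummary: {summary}"
)
VERSIONS_PROMPT = (
    "From this vulnerability summary, return a JSON object with "
    '"affected_versions" and "fixed_versions" lists.\n\nSummary: {summary}'
)


def ask(prompt_template: str, summary: str) -> str:
    """Send one prompt to the model and return its raw text reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "user", "content": prompt_template.format(summary=summary)},
        ],
    )
    return response.choices[0].message.content
```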

I think I should also feed agent/purl_db/PURL.rst to the model so it can generate more accurate results; I have already implemented the basics of this step.
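For illustration, that grounding step could be as simple as reading the spec and prepending it to the purl prompt; a sketch reusing PURL_PROMPT from above, where everything except the file path is a placeholder:

```python
from pathlib import Path

# Prepend the PURL specification text so the model grounds its purl
# syntax on the actual spec instead of guessing.
purl_spec = Path("agent/purl_db/PURL.rst").read_text(encoding="utf-8")

grounded_purl_prompt = (
    "Use the following Package URL specification as a reference:\n\n"
    + purl_spec
    + "\n\n"
    + PURL_PROMPT
)
```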

However, I ran into a small issue with testing and evaluating our improver: the model is non-deterministic and can return a different output for the same input.

How should we approach testing it?
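For illustration, one option (a sketch, not what this PR implements) is to pin temperature to 0 to reduce run-to-run variation and to assert on the parsed structure rather than the raw model text:

```python
import json


def extract_versions(summary: str) -> dict:
    """Ask for the version ranges and parse the JSON reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=0,  # reduces, but does not eliminate, nondeterminism
        messages=[
            {"role": "user", "content": VERSIONS_PROMPT.format(summary=summary)},
        ],
    )
    return json.loads(response.choices[0].message.content)


def test_apr_util_summary():
    # APR_UTIL_SUMMARY would hold the fixture text shown below.
    result = extract_versions(APR_UTIL_SUMMARY)
    assert result["affected_versions"] == ["< 1.3.5"]
    assert result["fixed_versions"] == [">= 1.3.5"]
```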

There’s just a little work left, and I think this improver will be ready soon.

Input summary:

Off-by-one error in the apr_brigade_vprintf function in Apache APR-util before 1.3.5 on big-endian platforms allows remote attackers to obtain sensitive information or cause a denial of service (application crash) via crafted input.
Output:

purl: pkg:apache/apr-util@<1.3.5
{
    "affected_versions": ["< 1.3.5"],
    "fixed_versions": [">= 1.3.5"]
}
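As a sanity check on output like the above, the purl and ranges could be validated with packageurl-python and univers, which VulnerableCode already depends on. A sketch assuming univers's generic scheme; note the model's purl above embeds the range in the version field, so that would need to be split out before parsing:

```python
from packageurl import PackageURL
from univers.version_range import VersionRange

# Parse the purl (range stripped from the version field) and convert the
# affected range to the vers notation that univers understands.
purl = PackageURL.from_string("pkg:apache/apr-util")
affected = VersionRange.from_string("vers:generic/<1.3.5")
print(purl.to_string(), affected)
```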

@ziadhany (Collaborator Author) commented Mar 4, 2025

@pombredanne This is a short document on the budget you requested. I used sources such as https://llm-stats.com/, and I think the best option is to use a hosted API instead of running the model ourselves, either locally or in the cloud.

Please let me know if you have any comments on this.
https://docs.google.com/document/d/1JZ49FqjessEyMhdKlp1HmfheITr3qKA8xMbNZIZW7UA/edit?usp=sharing
