Add cookbook: cross-database CVE verification with TensorFeed (hosted MCP) #2683

Open: RipperMercs wants to merge 5 commits into openai:main from RipperMercs:ripper/add-tensorfeed-mcp-cookbook

Conversation


RipperMercs commented May 9, 2026

Summary

Adds an MCP-pattern cookbook demonstrating cross-database CVE triage using the OpenAI Responses API's native MCP tool integration. It composes three independent vulnerability databases (MITRE CVE List, CISA Known Exploited Vulnerabilities, FIRST.org EPSS) through TensorFeed.ai's hosted MCP server, in a single Responses call.

Disclosure

I maintain TensorFeed. This notebook uses only its free tier: no account, no API key, no payment, and the three tools it calls are free. In response to @jluocsa's review I removed the premium-endpoint section so the notebook reads as a neutral free-tier demo rather than a product pitch.

What it teaches

A common production failure mode for security agents is not hallucination. It is acting on a single source. A triage agent that judges a CVE off one database can be wrong without fabricating anything. Corroborating across independent sources narrows that failure class; it does not eliminate hallucination, and the notebook says so. The notebook shows the pattern with client.responses.create(...) and tools=[{type: "mcp", ...}]:

  1. Connect a hosted MCP server via tools[].type = "mcp" (no manual JSON-RPC loop)
  2. Two demos: single-CVE verification, then parallel triage of three CVEs
  3. Print the actual mcp_call trail so the corroboration is auditable
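The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the server label and URL are taken from the tool config quoted in the review, while the three tool names in `HYPOTHETICAL_TOOLS`, the prompt wording, and the helper names are assumptions made for the example.

```python
from typing import Any

# Assumption: placeholder tool names. The PR does not list the three
# tool names the notebook actually allows.
HYPOTHETICAL_TOOLS = ["cve_lookup", "kev_lookup", "epss_lookup"]


def build_mcp_tool(allowed_tools: list[str]) -> dict[str, Any]:
    """Build the hosted-MCP entry that goes into `tools=[...]`."""
    return {
        "type": "mcp",
        "server_label": "tensorfeed",
        "server_url": "https://tensorfeed.ai/api/mcp",
        "allowed_tools": list(allowed_tools),  # scope the server to this task
        "require_approval": "never",  # only reasonable for read-only public data
    }


def build_request(cve_ids: list[str], model: str = "gpt-5.1") -> dict[str, Any]:
    """Assemble keyword arguments for a single Responses call."""
    prompt = (
        "Check each of these CVEs against every available source and "
        "rank them by urgency: " + ", ".join(cve_ids)
    )
    return {
        "model": model,
        "tools": [build_mcp_tool(HYPOTHETICAL_TOOLS)],
        "input": prompt,
    }


def run_triage(cve_ids: list[str]):
    """Send the request. Needs the `openai` package and an API key."""
    from openai import OpenAI  # imported lazily so the builders work offline

    client = OpenAI()
    return client.responses.create(**build_request(cve_ids))
```

Keeping the request construction separate from the network call makes the tool scoping easy to inspect and test without an API key.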

Why this fits examples/mcp

examples/mcp/mcp_tool_guide.ipynb introduces the Responses API MCP tool with a single-server demo. This adds a real-world pattern: multi-tool composition for cross-source corroboration inside one Responses call, with the audit trail surfaced. That is a common need for security, compliance, and finance agents that must not act on a single source.

Data licensing (per source)

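The three databases carry different terms, so they are listed individually rather than bundled:

  • MITRE CVE List: CVE Program Terms of Use
  • CISA Known Exploited Vulnerabilities: CC0 1.0 (per the KEV license file)
  • FIRST.org EPSS: free for public and commercial use, attribution requested but not required; not CC0 or public domain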

TensorFeed passes these records through unmodified and preserves source attribution on every response. The notebook itself is MIT under this repo's license.

Changes since review (thanks @jluocsa)

All six items are addressed in commits 7eda76f (notebook) and 07d917c (authors.yaml):

  1. Affiliation disclosed above; premium section cut to one neutral line plus an in-notebook maintainer note
  2. Security note added directly above the MCP tool cell: why require_approval: "never" is safe for a read-only public-data server, and to use "always" plus a tight allowed_tools allowlist for untrusted or write-capable servers
  3. New cell prints the actual mcp_call audit trail (server, tool, args, per-source tally) after the triage demo
  4. License claim replaced with the per-source breakdown above, each linked; EPSS is no longer described as CC0 or public domain
  5. authors.yaml ripper entry fixed: it was missing the schema-required avatar field. It now has name, website (GitHub profile, matching every other entry), and avatar
  6. Opening framing softened from absolute to "a common failure mode," with an explicit note that corroboration reduces single-source error rather than eliminating hallucination

Files

  • examples/mcp/tensorfeed_cve_verification.ipynb
  • registry.yaml entry under examples/mcp, tags [mcp, agents, security]
  • authors.yaml entry for ripper

Note on the commit method

This branch is authored via the GitHub Contents API rather than a local clone because examples/data/hotel_invoices/extracted_invoice_json / contains a trailing space in its directory name, which Windows refuses to check out. All three changed files are in supported paths, so the commits are clean and reviewable normally. Happy to reconstruct locally if you would prefer.


jluocsa left a comment

Nice demo of the Responses API's native MCP tool integration — the cross-source corroboration pattern (confirmed_by list, ranked triage across MITRE/KEV/EPSS) is a genuine production idiom for security agents and a good fit for examples/mcp/. A few items I'd flag before merge:

1. Author affiliation should be disclosed in the PR description.

registry.yaml lists ripper as the author; the source-code link at the bottom (https://github.com/RipperMercs/tensorfeed) and the PR opener (@RipperMercs) make it clear the author is also the maintainer of TensorFeed itself. That's fine — cookbook accepts vendor demos — but two adjustments would land it cleanly:

  • Add a sentence in the PR description ("I maintain TensorFeed; this notebook uses its free tier") so reviewers don't have to triangulate.
  • Move the "Going further: TensorFeed's premium one-call composition" section either out of the notebook entirely, or scope it down. The current section reads as a sales page — pricing ($0.02/credit), payment rails (x402 V2 on Base mainnet), token-savings claim ("~6,000 saved tokens per call"), direct link to /developers/agent-payments. The cookbook generally avoids actively recommending paid third-party endpoints, especially when the author maintains them. Keeping the free-tier demo and dropping the premium pitch (or moving it to a single neutral line: "TensorFeed also offers a paid one-call composition endpoint — see tensorfeed.ai for details") would match how other vendor-adjacent cookbooks handle this.

2. Add a security caveat next to require_approval: "never".

TF_MCP_TOOL = {
    "type": "mcp",
    "server_label": "tensorfeed",
    "server_url": "https://tensorfeed.ai/api/mcp",
    ...
    "require_approval": "never",
}

This is fine for a vetted public-data MCP server (which TF appears to be), but cookbook patterns get copy-pasted. A reader pointing this at an untrusted MCP server with require_approval: "never" and broad allowed_tools is a real foot-gun. A two-line markdown note above this cell would prevent that:

require_approval: "never" is safe here because TensorFeed is a read-only public-data server. For untrusted or write-capable MCP servers, use "require_approval": "always" and an explicit allowed_tools allowlist.

The allowed_tools constraint to three tools is already great — explain why it's there too.
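One way to make that advice hard to skip is a small guard applied to any tool config before it is used. This is only a sketch: `harden_mcp_tool` and its `trusted_read_only` flag are hypothetical names, and the policy (refuse a missing allowlist, force "always" for anything not explicitly trusted) is an assumption about what a cautious default might look like.

```python
def harden_mcp_tool(tool: dict, trusted_read_only: bool) -> dict:
    """Return a copy of an MCP tool config with conservative settings.

    - An explicit, non-empty allowed_tools allowlist is mandatory.
    - require_approval is forced to "always" unless the caller vouches
      that the server is trusted and read-only.
    """
    if not tool.get("allowed_tools"):
        raise ValueError("set an explicit allowed_tools allowlist")
    hardened = dict(tool)  # shallow copy; the caller's dict is untouched
    if not trusted_read_only:
        hardened["require_approval"] = "always"
    return hardened


# Usage: an untrusted server gets approval prompts even if the
# copy-pasted config said "never".
untrusted = harden_mcp_tool(
    {
        "type": "mcp",
        "server_label": "example",  # hypothetical server
        "server_url": "https://example.com/mcp",  # hypothetical URL
        "allowed_tools": ["search"],
        "require_approval": "never",
    },
    trusted_read_only=False,
)
print(untrusted["require_approval"])
```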

3. Show the corroboration trail, not just the final text.

The "What just happened" markdown says:

For the triage task, the model fanned out across three CVEs (~9 tool calls total)…

…but print(triage.output_text) only shows the model's prose answer. The actual tool calls — the auditable part that justifies the confirmed_by pattern the notebook is pitching — are hidden. A small additional cell would close this:

for item in triage.output:
    if item.type == "mcp_call":
        print(f"{item.server_label}.{item.name}({item.arguments})")

That's the difference between "the model said it consulted three sources" and "we can see it consulted three sources." For a security notebook arguing against single-source reasoning, showing the audit trail is the most important cell.
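The loop above could be extended with a per-tool tally so a reader can see at a glance whether every source was consulted. A rough sketch follows. The stand-in items and tool names are made up for illustration, and deriving a status from the item's `error` field is an assumption about how one might report failures.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_mcp_calls(output_items):
    """Return (lines, tally) for the mcp_call items in a response's output."""
    lines, tally = [], Counter()
    for item in output_items:
        if getattr(item, "type", None) != "mcp_call":
            continue  # skip messages, tool listings, and other item types
        status = "error" if getattr(item, "error", None) else "ok"
        lines.append(
            f"{item.server_label}.{item.name}({item.arguments}) [{status}]"
        )
        tally[item.name] += 1
    return lines, tally


# Stand-in items for illustration only; in the notebook the input would
# be triage.output. Tool names and argument keys here are hypothetical.
fake_output = [
    SimpleNamespace(type="mcp_call", server_label="tensorfeed",
                    name="cve_lookup", arguments='{"id": "CVE-0000-0001"}',
                    error=None),
    SimpleNamespace(type="message"),
    SimpleNamespace(type="mcp_call", server_label="tensorfeed",
                    name="kev_lookup", arguments='{"id": "CVE-0000-0001"}',
                    error=None),
]

lines, tally = summarize_mcp_calls(fake_output)
for line in lines:
    print(line)
print(dict(tally))
```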

4. License claim is asserted but not sourced.

License: most underlying data is US Government public domain or CC0; commercial redistribution permitted; attribution preserved on every response.

KEV and parts of MITRE CVE are US Gov public domain; EPSS from FIRST.org has more nuanced terms (FIRST data is publicly available but redistribution terms aren't strictly CC0). For a security cookbook, an explicit link per source — MITRE CVE License, CISA KEV terms, FIRST.org EPSS license — would be more defensible than a bundled assertion. A reader who's going to depend on this in production will check.

5. authors.yaml entry is minimal compared to recent additions.

ripper:
  name: Ripper
  website: https://tensorfeed.ai

Compare to the entry directly above (PASFIELD-OAI has name, website, and avatar). Recommend matching: add a GitHub link as website (e.g., https://github.com/RipperMercs) and an avatar URL, with tensorfeed.ai as a secondary field if the schema supports it. Otherwise the cookbook author page for this notebook will be sparse.

6. Anti-hallucination framing is a bit oversold.

The actual production failure mode for security agents is not hallucination. It's acting on a single source.

It's a punchy opening but it's not quite right — a model can still hallucinate when corroborating three sources if it misreads the JSON, conflates CVE IDs, or pattern-matches the wrong record. The corroboration pattern reduces the most common class of single-source-error failures, it doesn't eliminate hallucination. Softening to something like "A common production failure mode for security agents isn't hallucination — it's acting on a single source." would be more accurate and less inviting to a "but actually models can still…" reply.

Nice touches I noticed:

  • The allowed_tools constraint (3 of TF's 17 tools) is exactly the right pattern for scoping an MCP server to a specific task — most readers will not know to do this and the notebook surfaces it naturally.
  • gpt-5.1 with # any current chat-completions/responses-capable model comment is the right way to handle model name drift in a long-lived cookbook entry.
  • The Demo 2 prompt is well-constructed — asking for "brief reasoning per CVE plus an ordered list at the end" gives the model permission to use the audit trail without forcing tool-call dumping into the output.

The corroboration-as-anti-hallucination concept is a worthwhile cookbook addition; the main thing IMO is trimming the premium-product framing and adding the security note on require_approval: "never".

- add a security note above the MCP tool cell explaining why require_approval:"never" is safe here and when to use "always" plus an allowed_tools allowlist
- add a cell that prints the actual mcp_call audit trail after triage
- list MITRE / CISA KEV / FIRST.org EPSS licensing per source with links (EPSS is not CC0/public-domain)
- soften the anti-hallucination framing to corroboration-as-risk-reduction
- trim the premium section to one neutral line and add a maintainer disclosure note
authors_schema.json requires name+website+avatar with additionalProperties:false; add avatar and use the GitHub profile as website to match every other entry.

@RipperMercs (Author)

Thanks, this is a genuinely useful review and I agree with all of it. Pushed a revision: notebook in 7eda76f, authors.yaml in 07d917c.

1. Affiliation + premium framing. Disclosed in the PR description now (I maintain TensorFeed; the notebook uses only the free tier). You're right that the "Going further" section read as a sales page. I removed the pricing, the payment-rail detail, the token-savings number, and the deep link, and replaced the whole block with one neutral line plus an explicit maintainer note in the notebook itself. The demo is free-tier only.

2. Security note on require_approval: "never". Added as a blockquote directly above the tool cell. It states that "never" is acceptable here only because the TF server is read-only public data, and that untrusted or write-capable servers should use "always" with a tight allowed_tools allowlist. The allowed_tools rationale is now framed as least privilege and blast-radius, not just focus, with matching inline comments in the code cell.

3. Show the corroboration trail. Best catch in the review, and you're right that it's the most important cell for this notebook. Added a cell after the triage that pulls mcp_call items from triage.output and prints server_label.name, status, args, and a per-source tally. "The model says it checked three databases" is now "here are the calls it made." I also stopped asserting a hard "~9 tool calls" and "no hallucinations" in the prose and point at the printed trail instead.

4. License sourcing. You're right that the bundled assertion was too loose, especially for EPSS. The notebook now lists the three sources separately with a link each: CVE Program Terms of Use, the CISA KEV CC0 1.0 license file, and FIRST.org EPSS. EPSS is explicitly described as free for public and commercial use with attribution requested but not required, and explicitly not CC0 or public domain. Same breakdown in the PR description.

5. authors.yaml. Good prompt to check this. Verifying against .github/authors_schema.json, the entry was actually schema-invalid, not just sparse: name, website, and avatar are all required and additionalProperties is false, so a separate tensorfeed.ai field isn't possible. Fixed to name + website (GitHub profile, matching every other entry) + avatar.
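The two schema rules described here (three required fields, no additional properties) can be checked with a few lines of plain Python. This is a sketch of the rule as stated in this thread, not a substitute for validating against .github/authors_schema.json itself; the function name and the avatar URL in the usage example are hypothetical.

```python
REQUIRED_FIELDS = {"name", "website", "avatar"}


def author_entry_errors(entry: dict) -> list[str]:
    """List the ways an authors.yaml entry breaks the stated schema rules."""
    keys = set(entry)
    errors = []
    for field in sorted(REQUIRED_FIELDS - keys):
        errors.append(f"missing required field: {field}")
    for field in sorted(keys - REQUIRED_FIELDS):
        # additionalProperties is false, so unknown keys are rejected
        errors.append(f"unexpected field: {field}")
    return errors


# The entry as originally submitted fails on the missing avatar.
original = {"name": "Ripper", "website": "https://tensorfeed.ai"}
print(author_entry_errors(original))

# A corrected entry; the avatar URL is a placeholder.
fixed = {
    "name": "Ripper",
    "website": "https://github.com/RipperMercs",
    "avatar": "https://example.com/avatar.png",
}
print(author_entry_errors(fixed))
```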

6. Anti-hallucination framing. Softened to "A common production failure mode for security agents isn't hallucination. It's acting on a single source," with an added sentence making explicit that corroboration narrows the single-source-error class but does not make the model immune to misreading or conflating a record. Agreed it's both more accurate and less of an invitation to the "but models still hallucinate" reply.
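Since misreading or conflating records remains possible, one cheap guard is to check the call trail for coverage: was every requested CVE actually looked up with every required tool? The sketch below assumes each tool receives the CVE ID somewhere in its JSON arguments string, which may not hold for every server; the tool names and the `coverage_gaps` helper are hypothetical.

```python
import re
from types import SimpleNamespace

CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")


def coverage_gaps(output_items, requested_cves, required_tools):
    """Return sorted (cve, tool) pairs that have no successful mcp_call."""
    seen = set()
    for item in output_items:
        if getattr(item, "type", None) != "mcp_call":
            continue
        if getattr(item, "error", None):
            continue  # a failed call does not count as corroboration
        # Assumption: the CVE ID appears verbatim in the arguments string.
        for cve in CVE_ID.findall(item.arguments or ""):
            seen.add((cve, item.name))
    return sorted(
        (cve, tool)
        for cve in requested_cves
        for tool in required_tools
        if (cve, tool) not in seen
    )


# Illustrative trail: the KEV lookup was never made for this CVE.
trail = [
    SimpleNamespace(type="mcp_call", name="cve_lookup",
                    arguments='{"id": "CVE-2024-0001"}', error=None),
    SimpleNamespace(type="mcp_call", name="epss_lookup",
                    arguments='{"id": "CVE-2024-0001"}', error=None),
]
gaps = coverage_gaps(
    trail, ["CVE-2024-0001"], ["cve_lookup", "kev_lookup", "epss_lookup"]
)
print(gaps)
```

A non-empty result does not prove the final answer is wrong, only that the claimed corroboration is incomplete for that CVE.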

Thanks also for the notes on the allowed_tools scoping, the model-drift comment, and the Demo 2 prompt construction; kept all three as is. Ready for another look, and happy to make further changes.
