-
Notifications
You must be signed in to change notification settings - Fork 212
feat: Add support for MCP Bundles (MCPB) in registry #295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add support for MCP Bundles (MCPB) in registry #295
Conversation
- Add MCPB as a new registry_name option - Add file_hashes field to Package model for integrity verification - Implement URL validation restricting MCPB packages to GitHub/GitLab hosts - Require SHA-256 hash for all MCPB packages - Add example MCPB package in documentation - Update schemas and OpenAPI specification Implements Option 1 from issue modelcontextprotocol#260 - MCPB as an additional package type pointing to hosted .mcpb files with cryptographic verification.
} | ||
|
||
// Allowlist of trusted hosts for MCPB packages | ||
allowedHosts := []string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to verify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
internal/service/registry_service.go
Outdated
allowedHosts := []string{ | ||
"github.com", | ||
"www.github.com", | ||
"raw.githubusercontent.com", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possibly this one (raw.githubusercontent.com) is not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little uneasy about shoehorning registry_name
and name
to this purpose, while we still have the opportunity to make pre-release breaking changes. Thoughts on the proposed clearer modeling?
docs/server-json/examples.md
Outdated
"packages": [ | ||
{ | ||
"registry_name": "mcpb", | ||
"name": "https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is right with our current schema. But a thought for potentially renaming some details here: name
doesn't really make sense for a URL like this, and registry_name
doesn't for mcpb
either (it's more of a package type
).
I think I'd be in favor of something more like this here:
{
"location": {
"type": "mcpb",
"url": "https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb"
},
"runtime_hint": ...
...
}
And that's it - drop version
, because the only reason we have version
is to effectively "construct" the url
in the registry_name
case.
Basically, my proposal is to add a nested PackageLocation
object inside here, where we currently just store package_name, name, and version in the top level of the Package
object. It would be an "OR" situation where either the original fields (registry_name, name, and version) or the new fields (type, url) are required. mcpb
is the only valid entry for type
.
Then the original case for other examples, like the one just before this, becomes:
"packages": [
{
"location": {
"registry_name": "npm",
"name": "@example/hybrid-mcp-server",
"version": "1.5.0"
},
"runtime_hint": "npx",
"package_arguments": [
{
"type": "named",
"name": "--mode",
"description": "Operation mode",
"default": "local",
"choices": ["local", "cached", "proxy"]
}
]
}
]
It's a breaking change but I think at this point, pre-release, is a good time to make this rather than shoehorn values into registry_name
and name
. What do you all think? cc @domdomegg @joan-anthropic @toby
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tadasant thanks for the quick turnaround. Agree that registry_name
and name
feel a bit overloaded in the current proposal - I like the distinction.
It'd be nice to keep version as a separate field - to better enable future auto-update workflows for DXTs. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow the use case, could you explain auto-update workflows for DXTs? Is there a reason you wouldn't be able to use the https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb
value for that (which has the version in it)?
We've had some feedback from @toby that all these version
fields throughout the various server.json levels are causing confusion, so my thinking is that this is a good opportunity to potentially eliminate one of them as it's not actually necessary (but maybe I'm missing something).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Maybe package_type and package_identifier seems good? From question on the docker runtime type #249 (reply in thread). I think also as a meta thing I'd prefer them to stay in the package object rather than a nested location object, because I think unnecessary nesting is painful for many ecosystems (e.g. parsing nested structs like this in Java creates a lot of boilerplate).
So together I think I might propose something like:
"packages": [
{
"package_type": "mcpb",
"package_identifier": "https://github.com/modelcontextprotocol/cool-mcp/releases/download/v1.2.3/cool-mcp.mcpb",
// there's something that aesthetically feels a little worse about this not being an object, but I think it will make it easier on consumers. and very unlikely to really need other hash functions unless sha256 is broken
"package_hash_sha256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce"
},
{
"package_type": "docker",
// docker implicitly has registry name in package_identifier, I think we should just have package_identifier because of this
"package_identifier": "ghcr.io/domdomegg/cool-mcp:1.2.3@sha256:4a7f307d128da53e99ea7f45064df440c742aa3f648a09a09b51dfd1cd55378b"
},
{
"package_type": "commonjs",
// i think for now only allowlist npmjs. but subregistries could use e.g. https://npm.pkg.github.com or https://artifactory.corp.internal/npm
"registry_name": "https://registry.npmjs.org/",
"package_identifier": "@domdomegg/[email protected]",
"runtime_hint": "npx",
"package_arguments": [
{
"type": "named",
"name": "--mode",
"description": "Operation mode",
"default": "local",
"choices": ["local", "cached", "proxy"]
}
]
}
]
- I think I agree with @tadasant that the version is not necessary in the
packages
array for DXT. There will still be a version at the top level in server.json. For auto-update I think we'd use this top-level version, and maybeis_latest
.
docs/server-json/examples.md
Outdated
"file_hashes": { | ||
"sha-256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider lifting this inside the PackageLocation
object I am proposing above, because it is only a concern relevant to non-package-registry entries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is the possibility we add some registry in the future that does not do its own checks. So I'm fine keeping it here as an optional field.
I assume we need to rely on the MCP Registry to generate this file hash at publish-time, right? Or is the idea it will get generated by the CLI tool en route to creating a server.json, and the registry just validates it's correct and not tampered with in the publish flow?
I think we should map out and document this feature in a little more detail, maybe a dedicated doc in docs/
explaining the flows, who's responsible for what. I worry a bit about how file_hashes
is something we are expecting in the immutable server.json file, but also need to verify they're actually valid hashes. Seems doable but would like to see a step by step explanation somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not opinionated here and will defer to you all re: what makes sense for the workflow the registry committee would prefer to support; happy add docs to reflect.
- Add hash generation during publish (can be disabled with --no-hash) - Add verify command to check existing hashes - Add hash-gen command to generate/update hashes - Support NPM and direct URL (MCPB) packages - Document implementation in docs/file-hashes.md The CLI tool now generates SHA-256 hashes at publish time. Registry validates these hashes to ensure package integrity.
@joan-anthropic another thought just came to me (sorry for the thrashing): what if instead of For example, this old format: {
"registry_name": "npm",
"name": "@example/hybrid-mcp-server",
"version": "1.5.0",
"runtime_hint": "npx",
"package_arguments": [
{
"type": "named",
"name": "--mode",
"description": "Operation mode",
"default": "local",
"choices": ["local", "cached", "proxy"]
}
]
} Becomes: {
"url": "https://www.npmjs.com/package/@example/hybrid-mcp-server/v/1.5.0",
"type": "javascript",
"runtime_hint": "npx",
"package_arguments": [
{
"type": "named",
"name": "--mode",
"description": "Operation mode",
"default": "local",
"choices": ["local", "cached", "proxy"]
}
]
} So then we have just one way of doing things. We can still properly sanitize and logic branch on Running out of time here today on my end to fully think through this, but this seems like a potentially cleaner approach here and drawbacks aren't coming to top of mind. I also like how it helps us remove the schema-level dependency on enumerating all these branded centralized registry names Curious for @domdomegg's take here |
Yep I like this! I ended up commenting something similar before I scrolled far enough down to see this. I think I prefer the direction of this approach. I don't mind too much between a URL and slightly more structure (e.g. package type, registry name, identifier). I think I have a weak preference towards the slightly more structure just because I think clients might have an easier time parsing it to programmatically get packages etc. But fine to go with URL, especially if we provide guidance on the formats of URLs. Perhaps also some small SDKs in common languages for parsing and running server.jsons, as I imagine clients will want something like that anyway. |
Resolved conflicts to integrate PackageLocation changes with new server wrapper format
- Add LegacyPackage type to migrate-seed tool to handle old package format - Fix conversion from legacy registry_name/name to new PackageLocation format - Fix test data in import_test.go to include packages in ServerResponse - Fix field name typo: RuntimeHint -> RunTimeHint - Remove unused server detail handlers in test
@domdomegg I took @tadasant 's suggestion here but happy to replace with |
- Replace fmt.Print with log for console output - Make 'docker' a constant (packageTypeDocker) - Fix file permissions from 0644 to 0600 - Convert if-else chain to switch statement - Fix indent-error-flow by removing unnecessary else - Add nolint comment for Docker tag format (not a port)
- Fixed indentation issues in repository and version_detail blocks - Corrected array/object closing brackets for packages and remotes - Ensured consistent 2-space indentation throughout all examples - All 9 examples now pass JSON validation
- Removed file hash generation from publisher tool - Removed MCPB hash validation requirement from registry service - Removed file-hashes.md implementation guide - Kept file_hashes as optional field in schema for external tools - Publisher tool now only handles schema changes (PackageLocation)
Sorry - I'm missing a few things here; please hold off on review! |
…entifier fields - Replace overloaded registry_name and name fields with semantic package_type, registry, and identifier fields - Update all JSON schemas and OpenAPI specifications to reflect new structure - Convert all 396 seed.json entries to new format - Update publisher tool to generate new structure - Update all examples and documentation - Fix all tests to use new field names - Update validation logic for new structure
Preserve the original \u003c and \u003e Unicode escapes for < and > characters
- Update model to use RegistryName instead of Registry - Update all code references to use the new field name - Update JSON schemas and OpenAPI spec - Update all examples and documentation - Update seed.json to use registry_name
Resolved conflicts by: - Combining imports in registry_service.go (both errors and fmt/url/strings) - Keeping package validation while adding new versioning logic - Preserving nolint directives for ireturn in auth providers - Using main branch's URLs in examples.md - Updating package structure to use new fields
Remove placeholder repository IDs that were added in this PR
Fix inconsistent indentation for version and nested fields in package objects
- Fix indentation for version fields on lines 243, 296, 443, 517, 561 - Fix indentation for registry_name field on line 294 - Remove extra closing braces and fix array/object structure - Ensure all JSON examples are valid and properly formatted
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
- Fix indentation for entire environment_variables blocks - Fix indentation for package_arguments blocks - Fix closing brace indentation on line 406 - Ensure all JSON examples have consistent 2-space indentation
- Restore original structure in import_test.go while keeping schema changes - Keep migrate-seed test files with only schema field updates - Remove unnecessary nolint directives from auth providers - Maintain only essential changes for PackageType/RegistryName/Identifier schema
- Add OldPackage type to represent legacy package format - Convert from old schema (registry_name/name) to new (package_type/registry_name/identifier) - Use zero timestamps for seed data to distinguish from published data - Update test data files to match expected schema formats
@domdomegg would welcome a review here - I'm not totally sure how these changes ought to fit into the migration script and leaning heavily on Claude there. Re: schema changes, I try to reflect both goals of you and Tadas - providing some structure to ease downstream parsing, but preferring a flattened structure that can be somewhat registry-agnostic.
Changes:
|
Also, I remove the file hash generation / validation workflow from this PR and would love to treat that as a follow-up |
…e_RegistryURL The test was failing because the ServerResponse objects in the paginated endpoint were missing Package data that the test assertions expected. Added the appropriate package definitions to match the test expectations. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
Resolved merge conflicts: - internal/service/registry_service.go: Combined package validation with remote URL validation - tools/publisher/main.go: Added Schema field to ServerJSON struct 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
Don't worry about migrate-seed! It's just a temporary script we made for ourselves for doing a bit of work with the seed file previously, can (should?) probably be binned. Regarding field naming I have thought a bit more and maybe have a slight minor preference to registry_type. This is because clients will want to download the packages from somewhere, so need to know how (registry_type) and where (registry_name / registry_identifier). I think generally registry_type (e.g. npm) implies package_type (e.g. javascript), so we don't need the latter. This is (sadly) needed because we could actually have multiple registry_names per programming language. E.g. (a confusing point is that often the dominant registry API type mirrors the name of the official registry in that world, e.g. the NPM spec vs the NPM registry, and same for PyPi, etc.) And finally for registry name to avoid ambiguity maybe it'd be best to specify the base url, especially as this makes it easier to swap out for custom ones etc. in subregistries. I'm happy to implement the changes above either on this branch, or another branch leading into this one, if that would make it clearer? |
Thoughts about alts to registry_name, as actually this is a bit weird after staring at the examples when updating them on a branch locally:
I think I am most keen on registry_base_url. That would be:
|
…e_url - Remove package_type field (redundant with registry_type) - Rename package_type → registry_type (indicates HOW to download) - Rename registry_name → registry_base_url (uses actual URLs for clarity) - Remove migrate-seed script (temporary tool no longer needed) - Update all schemas, examples, tests, and API code This provides clearer semantics: registry_type tells clients how to download packages while registry_base_url provides the actual endpoint URL.
have raised joan-anthropic#1 |
…refactor refactor: update package schema to use registry_type and registry_base_url
Motivation and Context
See issue #260
How Has This Been Tested?
Breaking Changes
N/A, registry has not yet been launched
Types of changes
Checklist
Additional context