Skip to content

Conversation

joan-anthropic
Copy link
Contributor

@joan-anthropic joan-anthropic commented Aug 21, 2025

Motivation and Context

  • Add MCPB as a new registry_name option
  • Add file_hashes field to package model for integrity verification
  • Implement URL validation restricting MCPB packages to GitHub/GitLab hosts
  • Require SHA-256 hash for all MCPB packages
  • Add example MCPB package in documentation
  • Update schemas and OpenAPI specification

See issue #260

How Has This Been Tested?

Breaking Changes

N/A, registry has not yet been launched

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

- Add MCPB as a new registry_name option
- Add file_hashes field to Package model for integrity verification
- Implement URL validation restricting MCPB packages to GitHub/GitLab hosts
- Require SHA-256 hash for all MCPB packages
- Add example MCPB package in documentation
- Update schemas and OpenAPI specification

Implements Option 1 from issue modelcontextprotocol#260 - MCPB as an additional package type
pointing to hosted .mcpb files with cryptographic verification.
}

// Allowlist of trusted hosts for MCPB packages
allowedHosts := []string{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to verify

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@joan-anthropic joan-anthropic marked this pull request as draft August 21, 2025 18:53
allowedHosts := []string{
"github.com",
"www.github.com",
"raw.githubusercontent.com",
Copy link
Member

@domdomegg domdomegg Aug 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly this one (raw.githubusercontent.com) is not necessary

domdomegg
domdomegg previously approved these changes Aug 21, 2025
Copy link
Member

@domdomegg domdomegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Would like a review from one of @tadasant or @toby to verify they are happy with the direction and server.json spec implementation too :)

Copy link
Member

@tadasant tadasant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little uneasy about shoehorning registry_name and name to this purpose, while we still have the opportunity to make pre-release breaking changes. Thoughts on the proposed clearer modeling?

"packages": [
{
"registry_name": "mcpb",
"name": "https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb",
Copy link
Member

@tadasant tadasant Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is right with our current schema. But a thought for potentially renaming some details here: name doesn't really make sense for a URL like this, and registry_name doesn't for mcpb either (it's more of a package type).

I think I'd be in favor of something more like this here:

{
      "location": {
        "type": "mcpb",
        "url": "https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb"
      },
      "runtime_hint": ...
      ...
}

And that's it - drop version, because the only reason we have version is to effectively "construct" the url in the registry_name case.

Basically, my proposal is to add a nested PackageLocation object inside here, where we currently just store package_name, name, and version in the top level of the Package object. It would be an "OR" situation where either the original fields (registry_name, name, and version) or the new fields (type, url) are required. mcpb is the only valid entry for type.

Then the original case for other examples, like the one just before this, becomes:

  "packages": [
    {
      "location": {
        "registry_name": "npm",
        "name": "@example/hybrid-mcp-server",
        "version": "1.5.0"
      },
      "runtime_hint": "npx",
      "package_arguments": [
        {
          "type": "named",
          "name": "--mode",
          "description": "Operation mode",
          "default": "local",
          "choices": ["local", "cached", "proxy"]
        }
      ]
    }
  ]

It's a breaking change but I think at this point, pre-release, is a good time to make this rather than shoehorn values into registry_name and name. What do you all think? cc @domdomegg @joan-anthropic @toby

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tadasant thanks for the quick turnaround. Agree that registry_name and name feel a bit overloaded in the current proposal - I like the distinction.

It'd be nice to keep version as a separate field - to better enable future auto-update workflows for DXTs. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow the use case, could you explain auto-update workflows for DXTs? Is there a reason you wouldn't be able to use the https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb value for that (which has the version in it)?

We've had some feedback from @toby that all these version fields throughout the various server.json levels are causing confusion, so my thinking is that this is a good opportunity to potentially eliminate one of them as it's not actually necessary (but maybe I'm missing something).

Copy link
Member

@domdomegg domdomegg Aug 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Maybe package_type and package_identifier seems good? From question on the docker runtime type #249 (reply in thread). I think also as a meta thing I'd prefer them to stay in the package object rather than a nested location object, because I think unnecessary nesting is painful for many ecosystems (e.g. parsing nested structs like this in Java creates a lot of boilerplate).

So together I think I might propose something like:

"packages": [
    {
      "package_type": "mcpb",
      "package_identifier": "https://github.com/modelcontextprotocol/cool-mcp/releases/download/v1.2.3/cool-mcp.mcpb",
      // there's something that aesthetically feels a little worse about this not being an object, but I think it will make it easier on consumers. and very unlikely to really need other hash functions unless sha256 is broken
      "package_hash_sha256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce"
    },
    {
      "package_type": "docker",
      // docker implicitly has registry name in package_identifier, I think we should just have package_identifier because of this
      "package_identifier": "ghcr.io/domdomegg/cool-mcp:1.2.3@sha256:4a7f307d128da53e99ea7f45064df440c742aa3f648a09a09b51dfd1cd55378b"
    },
    {
      "package_type": "commonjs",
      // i think for now only allowlist npmjs. but subregistries could use e.g. https://npm.pkg.github.com or https://artifactory.corp.internal/npm
      "registry_name": "https://registry.npmjs.org/",
      "package_identifier": "@domdomegg/[email protected]",
      "runtime_hint": "npx",
      "package_arguments": [
        {
          "type": "named",
          "name": "--mode",
          "description": "Operation mode",
          "default": "local",
          "choices": ["local", "cached", "proxy"]
        }
      ]
    }
  ]
  • I think I agree with @tadasant that the version is not necessary in the packages array for DXT. There will still be a version at the top level in server.json. For auto-update I think we'd use this top-level version, and maybe is_latest.

Comment on lines 411 to 413
"file_hashes": {
"sha-256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider lifting this inside the PackageLocation object I am proposing above, because it is only a concern relevant to non-package-registry entries?

Copy link
Member

@tadasant tadasant Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is the possibility we add some registry in the future that does not do its own checks. So I'm fine keeping it here as an optional field.

I assume we need to rely on the MCP Registry to generate this file hash at publish-time, right? Or is the idea it will get generated by the CLI tool en route to creating a server.json, and the registry just validates it's correct and not tampered with in the publish flow?

I think we should map out and document this feature in a little more detail, maybe a dedicated doc in docs/ explaining the flows, who's responsible for what. I worry a bit about how file_hashes is something we are expecting in the immutable server.json file, but also need to verify they're actually valid hashes. Seems doable but would like to see a step by step explanation somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opinionated here and will defer to you all re: what makes sense for the workflow the registry committee would prefer to support; happy add docs to reflect.

- Add hash generation during publish (can be disabled with --no-hash)
- Add verify command to check existing hashes
- Add hash-gen command to generate/update hashes
- Support NPM and direct URL (MCPB) packages
- Document implementation in docs/file-hashes.md

The CLI tool now generates SHA-256 hashes at publish time.
Registry validates these hashes to ensure package integrity.
@tadasant
Copy link
Member

tadasant commented Aug 22, 2025

@joan-anthropic another thought just came to me (sorry for the thrashing): what if instead of PackageLocation we just unified the registry-based location and the arbitrary file-upload based one, but rather than shoehorning into registry_name and name, we refactored those to be agnostic of the registries?

For example, this old format:

    {
      "registry_name": "npm",
      "name": "@example/hybrid-mcp-server",
      "version": "1.5.0",
      "runtime_hint": "npx",
      "package_arguments": [
        {
          "type": "named",
          "name": "--mode",
          "description": "Operation mode",
          "default": "local",
          "choices": ["local", "cached", "proxy"]
        }
      ]
    }

Becomes:

    {
      "url": "https://www.npmjs.com/package/@example/hybrid-mcp-server/v/1.5.0",
      "type": "javascript",
      "runtime_hint": "npx",
      "package_arguments": [
        {
          "type": "named",
          "name": "--mode",
          "description": "Operation mode",
          "default": "local",
          "choices": ["local", "cached", "proxy"]
        }
      ]
    }

So then we have just one way of doing things.

We can still properly sanitize and logic branch on npmjs.com as if registry_name=npm in everything we care about downstream.

Running out of time here today on my end to fully think through this, but this seems like a potentially cleaner approach here and drawbacks aren't coming to top of mind.

I also like how it helps us remove the schema-level dependency on enumerating all these branded centralized registry names npm, pypi; that just becomes a in-code validation-level concern.

Curious for @domdomegg's take here

@domdomegg
Copy link
Member

what if instead of PackageLocation we just unified the registry-based location and the arbitrary file-upload based one, but rather than shoehorning into registry_name and name, we refactored those to be agnostic of the registries?

Yep I like this! I ended up commenting something similar before I scrolled far enough down to see this.

I think I prefer the direction of this approach. I don't mind too much between a URL and slightly more structure (e.g. package type, registry name, identifier). I think I have a weak preference towards the slightly more structure just because I think clients might have an easier time parsing it to programmatically get packages etc. But fine to go with URL, especially if we provide guidance on the formats of URLs. Perhaps also some small SDKs in common languages for parsing and running server.jsons, as I imagine clients will want something like that anyway.

@joan-anthropic joan-anthropic marked this pull request as ready for review August 26, 2025 05:05
Resolved conflicts to integrate PackageLocation changes with new server wrapper format
- Add LegacyPackage type to migrate-seed tool to handle old package format
- Fix conversion from legacy registry_name/name to new PackageLocation format
- Fix test data in import_test.go to include packages in ServerResponse
- Fix field name typo: RuntimeHint -> RunTimeHint
- Remove unused server detail handlers in test
@joan-anthropic
Copy link
Contributor Author

@domdomegg I took @tadasant 's suggestion here but happy to replace with package_type, registry_name etc if preferred - still trying to figure out test/lint failures but just want to make sure the broad structure looks good to you

- Replace fmt.Print with log for console output
- Make 'docker' a constant (packageTypeDocker)
- Fix file permissions from 0644 to 0600
- Convert if-else chain to switch statement
- Fix indent-error-flow by removing unnecessary else
- Add nolint comment for Docker tag format (not a port)
- Fixed indentation issues in repository and version_detail blocks
- Corrected array/object closing brackets for packages and remotes
- Ensured consistent 2-space indentation throughout all examples
- All 9 examples now pass JSON validation
- Removed file hash generation from publisher tool
- Removed MCPB hash validation requirement from registry service
- Removed file-hashes.md implementation guide
- Kept file_hashes as optional field in schema for external tools
- Publisher tool now only handles schema changes (PackageLocation)
@joan-anthropic
Copy link
Contributor Author

Sorry - I'm missing a few things here; please hold off on review!

…entifier fields

- Replace overloaded registry_name and name fields with semantic package_type, registry, and identifier fields
- Update all JSON schemas and OpenAPI specifications to reflect new structure
- Convert all 396 seed.json entries to new format
- Update publisher tool to generate new structure
- Update all examples and documentation
- Fix all tests to use new field names
- Update validation logic for new structure
Preserve the original \u003c and \u003e Unicode escapes for < and > characters
- Update model to use RegistryName instead of Registry
- Update all code references to use the new field name
- Update JSON schemas and OpenAPI spec
- Update all examples and documentation
- Update seed.json to use registry_name
Resolved conflicts by:
- Combining imports in registry_service.go (both errors and fmt/url/strings)
- Keeping package validation while adding new versioning logic
- Preserving nolint directives for ireturn in auth providers
- Using main branch's URLs in examples.md
- Updating package structure to use new fields
Remove placeholder repository IDs that were added in this PR
Fix inconsistent indentation for version and nested fields in package objects
- Fix indentation for version fields on lines 243, 296, 443, 517, 561
- Fix indentation for registry_name field on line 294
- Remove extra closing braces and fix array/object structure
- Ensure all JSON examples are valid and properly formatted
@codecov-commenter
Copy link

codecov-commenter commented Aug 26, 2025

Codecov Report

❌ Patch coverage is 2.32558% with 42 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/service/registry_service.go 0.00% 42 Missing ⚠️

📢 Thoughts on this report? Let us know!

- Fix indentation for entire environment_variables blocks
- Fix indentation for package_arguments blocks
- Fix closing brace indentation on line 406
- Ensure all JSON examples have consistent 2-space indentation
- Restore original structure in import_test.go while keeping schema changes
- Keep migrate-seed test files with only schema field updates
- Remove unnecessary nolint directives from auth providers
- Maintain only essential changes for PackageType/RegistryName/Identifier schema
- Add OldPackage type to represent legacy package format
- Convert from old schema (registry_name/name) to new (package_type/registry_name/identifier)
- Use zero timestamps for seed data to distinguish from published data
- Update test data files to match expected schema formats
@joan-anthropic
Copy link
Contributor Author

@domdomegg would welcome a review here - I'm not totally sure how these changes ought to fit into the migration script and leaning heavily on Claude there.

Re: schema changes, I try to reflect both goals of you and Tadas - providing some structure to ease downstream parsing, but preferring a flattened structure that can be somewhat registry-agnostic.


  Before:
  {
    "registry_name": "npm",
    "name": "@21st-dev/magic"
  }

  After:
  {
    "package_type": "javascript",
    "registry_name": "npm",
    "identifier": "@21st-dev/magic"
  }

  {
    "package_type": "mcpb",
    "registry_name": "github-releases",
    "identifier": "https://github.com/example/mcp-bundle/releases/download/v1.0.0/bundle.mcpb"
  }


Changes:

  1. Added package_type field - Explicitly identifies the package ecosystem (javascript, python, docker, dotnet, mcpb, binary)
  2. Renamed name to identifier - More clearly indicates this is the package identifier within its registry
  3. Kept registry_name - Specifies which registry hosts the package (npm, pypi, docker-hub, nuget, github-releases, etc.) - would love a sense-check here whether eg github-releases is reasonable package ID for mcpb, else can remove

@joan-anthropic
Copy link
Contributor Author

Also, I remove the file hash generation / validation workflow from this PR and would love to treat that as a follow-up

joan-anthropic and others added 2 commits August 26, 2025 12:44
…e_RegistryURL

The test was failing because the ServerResponse objects in the paginated endpoint
were missing Package data that the test assertions expected. Added the appropriate
package definitions to match the test expectations.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Resolved merge conflicts:
- internal/service/registry_service.go: Combined package validation with remote URL validation
- tools/publisher/main.go: Added Schema field to ServerJSON struct

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@domdomegg
Copy link
Member

Don't worry about migrate-seed! It's just a temporary script we made for ourselves for doing a bit of work with the seed file previously, can (should?) probably be binned.

Regarding field naming I have thought a bit more and maybe have a slight minor preference to registry_type. This is because clients will want to download the packages from somewhere, so need to know how (registry_type) and where (registry_name / registry_identifier). I think generally registry_type (e.g. npm) implies package_type (e.g. javascript), so we don't need the latter.

This is (sadly) needed because we could actually have multiple registry_names per programming language. E.g. npm or jsr for javascript, or pypi vs conda in python land.

(a confusing point is that often the dominant registry API type mirrors the name of the official registry in that world, e.g. the NPM spec vs the NPM registry, and same for PyPi, etc.)

And finally for registry name to avoid ambiguity maybe it'd be best to specify the base url, especially as this makes it easier to swap out for custom ones etc. in subregistries.


I'm happy to implement the changes above either on this branch, or another branch leading into this one, if that would make it clearer?

@domdomegg
Copy link
Member

Thoughts about alts to registry_name, as actually this is a bit weird after staring at the examples when updating them on a branch locally:

  1. registry_name - This does sound like it should be a name rather than a URL, so I can see why they're concerned
  2. registry_identifier - This could be confusing as it might sound like it's identifying the package on the registry rather than the registry itself
  3. Other options could be:
    - registry_url - Most explicit about what it contains
    - registry_base_url - Very clear but a bit verbose
    - registry_endpoint - Technical but clear
    - registry_host - Simple and clear

I think I am most keen on registry_base_url. That would be:

…e_url

- Remove package_type field (redundant with registry_type)
- Rename package_type → registry_type (indicates HOW to download)
- Rename registry_name → registry_base_url (uses actual URLs for clarity)
- Remove migrate-seed script (temporary tool no longer needed)
- Update all schemas, examples, tests, and API code

This provides clearer semantics: registry_type tells clients how to download
packages while registry_base_url provides the actual endpoint URL.
@domdomegg
Copy link
Member

have raised joan-anthropic#1

…refactor

refactor: update package schema to use registry_type and registry_base_url
@domdomegg domdomegg merged commit 57efb58 into modelcontextprotocol:main Aug 27, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants