Feature: add download_artifact tool; revert get_artifact to metadata-only #265

@nuozhoux

Summary

The existing artifact tooling has a gap that breaks agent workflows requiring actual artifact content:

  • list_artifacts_for_job / list_artifacts_for_build return artifact metadata including both url (metadata endpoint) and download_url (download endpoint) ✓
  • get_artifact currently downloads the full file content via DownloadArtifactByURL and returns it base64-encoded. Loading file content into the MCP/LLM context this way is impractical for large files
  • There is no tool that downloads an artifact to a local file path using the server's own authenticated session

Agents fall back to shelling out:

curl -sL -H "Authorization: Bearer $BUILDKITE_API_TOKEN" "<download_url>" -o /tmp/artifact

This requires users to separately provision $BUILDKITE_API_TOKEN with read_artifacts scope — defeating the point of the MCP server, which already holds authenticated Buildkite credentials.


REST API background

There are two distinct endpoints for artifacts (REST API docs):

1. Get artifact metadata

GET /v2/organizations/{org.slug}/pipelines/{pipeline.slug}/builds/{build.number}/jobs/{job.id}/artifacts/{artifact.id}

Returns: id, filename, path, state, sha1sum, file_size, mime_type, url, download_url

2. Download artifact

GET /v2/organizations/{org.slug}/pipelines/{pipeline.slug}/builds/{build.number}/jobs/{job.id}/artifacts/{artifact.id}/download

Returns: HTTP 302 redirect to the actual file (e.g. a presigned S3 URL). Requires following the redirect to get file content.
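
To make the redirect behavior concrete, here is a minimal sketch that issues a GET against a download endpoint without following the redirect, so the 302 and its Location header can be inspected. It uses plain net/http rather than the server's Buildkite client. The function names, the token value, and the storage.example.com target are hypothetical stand-ins, and a local httptest server plays the role of the API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// redirectTarget sends a GET to a download endpoint without following the
// redirect and returns the status code plus the Location it points to.
func redirectTarget(downloadURL, token string) (int, string, error) {
	client := &http.Client{
		// Returning ErrUseLastResponse makes the client hand back the
		// 302 response itself instead of following it.
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	req, err := http.NewRequest(http.MethodGet, downloadURL, nil)
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	loc, err := resp.Location()
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, loc.String(), nil
}

// demoRedirect runs redirectTarget against a local stand-in for the
// download endpoint (assumption: the real API answers with a 302 as
// described above).
func demoRedirect() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "https://storage.example.com/coverage.json", http.StatusFound)
	}))
	defer srv.Close()
	return redirectTarget(srv.URL+"/download", "hypothetical-token")
}

func main() {
	code, loc, err := demoRedirect()
	fmt.Println(code, loc, err)
}
```

A client that leaves CheckRedirect unset follows the redirect automatically, which is what a download tool would normally want.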

list_artifacts_for_job already returns both url (metadata endpoint) and download_url (download endpoint) for each artifact, so agents have both available without an extra round-trip.


Proposed changes

1. Revert get_artifact to true metadata-only behavior

Align the implementation with the description ("Get detailed information about a specific artifact including its metadata, file size, SHA-1 hash, and download URL") and with the REST API's metadata endpoint.

Revised args:

type GetArtifactArgs struct {
    URL string `json:"url"` // the artifact's `url` field from list_artifacts_for_job (metadata endpoint, NOT download_url)
}

Behavior: HTTP GET to the metadata endpoint (the url field from artifact list responses), return structured metadata — not file content.

Returns:

{
  "id": "abc123",
  "filename": "coverage.json",
  "path": "coverage.json",
  "state": "finished",
  "sha1sum": "622b2946b93f40736b3ac767dafc13ded982e978",
  "file_size": 4012,
  "mime_type": "application/json",
  "url": "https://api.buildkite.com/v2/organizations/{org}/pipelines/{pipeline}/builds/{build}/jobs/{job}/artifacts/{id}",
  "download_url": "https://api.buildkite.com/v2/organizations/{org}/pipelines/{pipeline}/builds/{build}/jobs/{job}/artifacts/{id}/download"
}
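
As a rough illustration of the metadata-only behavior, the sketch below fetches the metadata endpoint and decodes the response into a struct without touching file content. It is not the server's actual implementation: ArtifactMetadata, fetchArtifactMetadata, and the token value are hypothetical names, plain net/http stands in for the Buildkite client, and a local httptest server stands in for the API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ArtifactMetadata mirrors the fields listed for the metadata endpoint.
// The struct name and layout are assumptions for this sketch.
type ArtifactMetadata struct {
	ID          string `json:"id"`
	Filename    string `json:"filename"`
	Path        string `json:"path"`
	State       string `json:"state"`
	SHA1Sum     string `json:"sha1sum"`
	FileSize    int64  `json:"file_size"`
	MimeType    string `json:"mime_type"`
	URL         string `json:"url"`
	DownloadURL string `json:"download_url"`
}

// fetchArtifactMetadata GETs the metadata URL and decodes the JSON body.
// No file content is requested or returned.
func fetchArtifactMetadata(client *http.Client, metadataURL, token string) (*ArtifactMetadata, error) {
	req, err := http.NewRequest(http.MethodGet, metadataURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var meta ArtifactMetadata
	if err := json.NewDecoder(resp.Body).Decode(&meta); err != nil {
		return nil, err
	}
	return &meta, nil
}

// demoMetadata serves a trimmed copy of the example response from a local
// test server and fetches it.
func demoMetadata() (*ArtifactMetadata, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"id":"abc123","filename":"coverage.json","state":"finished","file_size":4012,"mime_type":"application/json"}`)
	}))
	defer srv.Close()
	return fetchArtifactMetadata(srv.Client(), srv.URL, "hypothetical-token")
}

func main() {
	meta, err := demoMetadata()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(meta.Filename, meta.FileSize)
}
```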

2. Add a new download_artifact tool

Uses the download endpoint (the download_url field from artifact list responses), follows the redirect to the actual file, and writes content to a local path. File content never enters the MCP/LLM context.

Args:

type DownloadArtifactArgs struct {
    URL        string `json:"url"`         // the artifact's `download_url` field from list_artifacts_for_job (download endpoint)
    OutputPath string `json:"output_path"` // local filesystem path to write the file to
}

Behavior: HTTP GET to the download endpoint (follows 302 redirect to S3/storage), stream response body to output_path using the server's own authenticated Buildkite client. Reuses the existing DownloadArtifactByURL method, writing to an os.File instead of a bytes.Buffer.

Returns:

{
  "output_path": "/tmp/artifacts/coverage.json",
  "bytes_written": 4012,
  "status": "200 OK",
  "statusCode": 200
}
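
The streaming behavior could look roughly like the sketch below, which copies the response body straight into a file and reports the fields shown above. It deliberately uses plain net/http instead of DownloadArtifactByURL, whose signature is not shown in this issue. DownloadResult, downloadToFile, the token value, and the local test server with its /download and /file routes are all hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// DownloadResult mirrors the proposed return payload (hypothetical type).
type DownloadResult struct {
	OutputPath   string `json:"output_path"`
	BytesWritten int64  `json:"bytes_written"`
	Status       string `json:"status"`
	StatusCode   int    `json:"statusCode"`
}

// downloadToFile GETs downloadURL, lets the client follow the redirect,
// and streams the body to outputPath without buffering it in memory.
func downloadToFile(client *http.Client, downloadURL, token, outputPath string) (*DownloadResult, error) {
	req, err := http.NewRequest(http.MethodGet, downloadURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := client.Do(req) // redirects are followed by default
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}

	f, err := os.Create(outputPath)
	if err != nil {
		return nil, err
	}
	n, copyErr := io.Copy(f, resp.Body)
	closeErr := f.Close()
	if copyErr != nil {
		return nil, copyErr
	}
	if closeErr != nil {
		return nil, closeErr
	}
	return &DownloadResult{
		OutputPath:   outputPath,
		BytesWritten: n,
		Status:       resp.Status,
		StatusCode:   resp.StatusCode,
	}, nil
}

// demoDownload serves a 302 from /download to /file on a local test
// server, downloads into a temp directory, and returns the result plus
// the bytes that landed on disk.
func demoDownload() (*DownloadResult, string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/download", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/file", http.StatusFound)
	})
	mux.HandleFunc("/file", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"covered": 42}`)
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	dir, err := os.MkdirTemp("", "artifact")
	if err != nil {
		return nil, "", err
	}
	defer os.RemoveAll(dir)

	out := filepath.Join(dir, "coverage.json")
	res, err := downloadToFile(srv.Client(), srv.URL+"/download", "hypothetical-token", out)
	if err != nil {
		return nil, "", err
	}
	data, err := os.ReadFile(out)
	return res, string(data), err
}

func main() {
	res, content, err := demoDownload()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(res.BytesWritten, res.Status, content)
}
```

Writing through io.Copy keeps memory use flat regardless of artifact size, which is the main reason to prefer a file target over a bytes.Buffer here.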

Tool surface after this change

| Tool | REST API endpoint used | Input url field | Returns |
|---|---|---|---|
| list_artifacts_for_job | GET /v2/.../jobs/{job.id}/artifacts | | List of artifact objects (each has url + download_url) |
| get_artifact | GET /v2/.../artifacts/{artifact.id} | url field from artifact object | Metadata only (id, filename, sha1sum, file_size, mime_type, download_url) |
| download_artifact (new) | GET /v2/.../artifacts/{artifact.id}/download → 302 | download_url field from artifact object | {output_path, bytes_written, status} (file written to disk) |

Real-world impact

A common agentic CI-debugging pattern is:

  1. Call list_artifacts_for_job → get artifact list with url and download_url per file
  2. Download artifact content to read/parse it (e.g. JSON reports, log files, compressed archives)

Because no MCP tool can download to a file through the server's own auth, agents that need to inspect artifact content are forced out of the MCP boundary entirely. With download_artifact, the full workflow stays within the server's authenticated session:

list_artifacts_for_job → download_artifact(download_url, "/tmp/report.json") → read file selectively

No extra token setup required.

