Local MCP server for analyzing images using the OpenAI vision API. Designed for OpenCode agents to inspect screenshots, diagrams, UI renders, and other visual assets during development.
The server exposes an MCP tool so agents can submit an image file path along with acceptance criteria and get back a detailed vision-based analysis — all locally, no external services beyond the OpenAI API.
Image file on disk → base64 encode → OpenAI Chat Completions (vision model)
↓
Agent query + acceptance criteria → structured prompt → PASS/FAIL verdict per criterion
The image is read from disk, base64-encoded, and sent to OpenAI's vision model (default gpt-4o). The prompt includes acceptance criteria, an optional expected outcome, and optional known tolerances. The model returns a detailed analysis with PASS/FAIL for each criterion.
| Tool | What it does |
|---|---|
analyze_image |
Analyze an image against acceptance criteria using a vision model |
| Parameter | Required | Description |
|---|---|---|
path |
Yes | Absolute path to the image file on disk |
acceptance_criteria |
Yes | Description of what to look for and validate |
expected |
No | Expected outcome or reference description |
known_tolerances |
No | Known tolerances or acceptable deviations |
OPENAI_API_KEY=sk-... uv run --directory tools/vision-mcp python -m vision_mcp.serverAdd to your opencode.json:
Built with curiosity, Python and a lot of AI.
{ "$schema": "https://opencode.ai/config.json", "mcp": { "vision": { "type": "local", "command": ["uv", "run", "--directory", "tools/vision-mcp", "python", "-m", "vision_mcp.server"], "enabled": true, "environment": { "OPENAI_API_KEY": "${OPENAI_API_KEY}", "VISION_MODEL": "gpt-4o" } } } }