feat(afdocs): make docs.atomicmemory.ai agent-ready (llms.txt, mirror, skill.md, MCP) by ethanj · Pull Request #3 · atomicmemory/atomicmemory-docs

ethanj · 2026-05-01T08:59:28Z

Make docs.atomicmemory.ai agent-ready (AFDocs)

Overview

The docs site at https://docs.atomicmemory.ai currently scores 59/100 (grade F) against the AFDocs agent-readiness checks (https://afdocs.dev); only 7 of 29 checks pass. This PR fixes seven of the eight failing checks while staying on GitHub Pages. The remaining failure (Accept: text/markdown content negotiation) cannot be fixed on GitHub Pages, which serves static files and cannot vary the response by request header, so it is split out as follow-up #1. A second follow-up (#2) covers hosting the MCP server at a live HTTP endpoint, in case AFDocs interprets "MCP Server Discoverable" strictly.

Target post-merge score: ≥85/100 when AFDocs runs against https://docs.atomicmemory.ai.

Key Features

🤖 Per-page llms.txt directive

Remark plugin (src/remark/llms-directive.mjs) injects a blockquote linking to /llms.txt, /llms-full.txt, /skill.md immediately after the first H1. Handles both markdown headings and the OpenAPI plugin's <Heading as="h1"> JSX. Idempotent across HMR/multi-pass builds.
Absolute URLs sidestep Docusaurus's onBrokenLinks: 'throw' (which validates against route metadata, not postBuild artifacts). siteUrl passed in as an explicit option since Docusaurus doesn't auto-inject siteConfig into remark plugins. URLs constructed via new URL() to prevent trailing-slash double-slashes.
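A minimal sketch of how such a directive transformer could look, assuming a plain mdast tree and only top-level headings. The `data.llmsDirective` marker used for idempotency, the "For agents:" wording, and the function names are illustrative assumptions, not the contents of src/remark/llms-directive.mjs.

```javascript
// Sketch only: marker field, wording, and names are assumptions.
function llmsDirective(options = {}) {
  const { siteUrl } = options;
  if (!siteUrl) {
    // Fail loudly instead of falling back to a guessed host.
    throw new Error("llmsDirective: the siteUrl option is required");
  }

  // new URL() normalizes slashes, so a trailing "/" on siteUrl never yields "//".
  const links = ["/llms.txt", "/llms-full.txt", "/skill.md"].map((path) => ({
    label: path,
    url: new URL(path, siteUrl).href,
  }));

  // Matches a markdown "# Title" or a JSX <Heading as="h1"> element.
  const isH1 = (node) =>
    (node.type === "heading" && node.depth === 1) ||
    (node.type === "mdxJsxFlowElement" &&
      node.name === "Heading" &&
      (node.attributes || []).some((a) => a.name === "as" && a.value === "h1"));

  const isDirective = (node) =>
    Boolean(node) && node.type === "blockquote" && node.data?.llmsDirective === true;

  return function transformer(tree) {
    const index = tree.children.findIndex(isH1);
    if (index === -1) return; // No H1: leave the page untouched.
    if (isDirective(tree.children[index + 1])) return; // Already injected.

    const children = [{ type: "text", value: "For agents: " }];
    links.forEach((link, i) => {
      if (i > 0) children.push({ type: "text", value: ", " });
      children.push({
        type: "link",
        url: link.url,
        children: [{ type: "text", value: link.label }],
      });
    });

    tree.children.splice(index + 1, 0, {
      type: "blockquote",
      data: { llmsDirective: true },
      children: [{ type: "paragraph", children }],
    });
  };
}
```

Running the transformer twice on the same tree leaves a single blockquote, which is the property the idempotency note above relies on.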

📑 llms.txt + llms-full.txt

build/llms.txt (≈12 KB) — spec-compliant per llmstxt.org. H1 + blockquote summary + H2 sections (Get started / Platform / SDK / Integrations / API Reference) + bulleted markdown links with one-sentence descriptions.
build/llms-full.txt (≈243 KB, ~6.2k lines) — full corpus, one entry per route. Driven by the canonical-mirror map so the <route>.md / <route>/index.md mirror pair never produces duplicate corpus entries.
Sections derived from a hand-maintained URL-prefix → label map (small, stable, doesn't depend on internals).
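The grouping described above can be sketched as follows. The section labels come from this PR; the URL prefixes, the page shape (`route`, `title`, `description`), and the function names are hypothetical and only illustrate the prefix → label approach.

```javascript
// Hypothetical prefix → label map. Labels are from the PR text; prefixes are guesses.
const SECTIONS = [
  ["/quickstart", "Get started"],
  ["/platform", "Platform"],
  ["/sdk", "SDK"],
  ["/integrations", "Integrations"],
  ["/api-reference", "API Reference"],
];

function sectionFor(route) {
  const hit = SECTIONS.find(
    ([prefix]) => route === prefix || route.startsWith(prefix + "/")
  );
  // No fallback bucket: an unmapped route should fail the build.
  if (!hit) throw new Error(`No llms.txt section for route ${route}`);
  return hit[1];
}

function buildLlmsTxt({ title, summary, siteUrl, pages }) {
  // H1, then blockquote summary, then one H2 per non-empty section.
  const lines = [`# ${title}`, "", `> ${summary}`, ""];
  for (const [, label] of SECTIONS) {
    const inSection = pages.filter((p) => sectionFor(p.route) === label);
    if (inSection.length === 0) continue;
    lines.push(`## ${label}`, "");
    for (const page of inSection) {
      // Every bullet links to the markdown mirror, not the HTML page.
      const url = new URL(`${page.route}.md`, siteUrl).href;
      lines.push(`- [${page.title}](${url}): ${page.description}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```

Keeping the map as an ordered array also fixes the section order in the output, so the index does not depend on sidebar internals.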

📝 Markdown URL mirror

scripts/mirror-markdown.mjs writes both build/<path>.md and build/<path>/index.md for every route. Hand-authored pages copy the MDX source (front-matter / import / MDX-comment lines stripped). API reference pages render from vendor/atomicmemory-core-openapi.yaml directly (joined to slugs via kebabCase(operationId)) so the mirror carries real parameters, request body, and response shape rather than stripped JSX.
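As a rough illustration of the two URL shapes and the line stripping described above, under the assumption that stripping is purely line-based. The function names are hypothetical, the root route is left out of this sketch, and the OpenAPI rendering path is not covered.

```javascript
// Returns the two mirror targets for a route, e.g. /platform/stores.
function mirrorTargets(route) {
  const clean = route.replace(/^\/+|\/+$/g, "");
  if (clean === "") {
    throw new Error("mirrorTargets: the root route is out of scope for this sketch");
  }
  return [`build/${clean}.md`, `build/${clean}/index.md`];
}

// Line-based cleanup of a hand-authored MDX source.
// Limitation: it would also drop matching lines inside fenced code blocks.
function stripMdx(source) {
  const lines = source.split("\n");
  let start = 0;
  // Drop a leading YAML front-matter block delimited by '---' lines.
  if (lines[0] === "---") {
    const end = lines.indexOf("---", 1);
    if (end !== -1) start = end + 1;
  }
  return lines
    .slice(start)
    // ES import statements such as: import Tabs from '@theme/Tabs';
    .filter((line) => !/^import\s.+from\s+['"].+['"];?\s*$/.test(line))
    // Single-line MDX comments such as: {/* authoring note */}
    .filter((line) => !/^\s*\{\/\*.*\*\/\}\s*$/.test(line))
    .join("\n")
    .replace(/^\n+/, "");
}
```

With 68 routes, two targets per route gives the 136 mirror files reported under Testing.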

🛠️ Agent skill + MCP descriptor

static/skill.md — Mintlify-style operating guide for agents reading the docs (when to read, how to navigate via /llms.txt / /llms-full.txt / .md URL convention, citation guidance).
static/.well-known/mcp.json + static/mcp.json: MCP descriptor stub for the local-install @atomicmemory/mcp-server. transport.type: "stdio", status: "local-install-only", hostedEndpoint: null pending follow-up #2.
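For orientation, a descriptor with the three fields named above might be checked like this. Only `transport.type`, `status`, and `hostedEndpoint` come from this PR; the `name` field and the consistency rule in `checkDescriptor` are illustrative assumptions, not part of any MCP specification.

```javascript
// Only transport.type, status, and hostedEndpoint are taken from the PR text.
// The "name" field is a hypothetical placeholder.
const descriptor = {
  name: "@atomicmemory/mcp-server",
  transport: { type: "stdio" },
  status: "local-install-only",
  hostedEndpoint: null,
};

// Illustrative guard: parses the JSON and checks one consistency rule.
function checkDescriptor(json) {
  const parsed = JSON.parse(json); // Throws on invalid JSON.
  if (typeof parsed.transport?.type !== "string") {
    throw new Error("mcp.json: transport.type must be a string");
  }
  if (parsed.status === "local-install-only" && parsed.hostedEndpoint !== null) {
    throw new Error("mcp.json: a local-install-only server must not list a hosted endpoint");
  }
  return parsed;
}

const serialized = JSON.stringify(descriptor, null, 2);
const roundTripped = checkDescriptor(serialized);
```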

✏️ Content Start Position fixes

Moved > Disambiguation blockquotes on docs/platform/providers.md and docs/sdk/concepts/provider-model.md to a ## Naming section near the bottom of each page.
Removed lede-blocking MDX authoring comments ({/* … */}) from introduction.md, platform/architecture.md, platform/composition.md, platform/observability.md, platform/scope.md, platform/stores.md.

Build pipeline integration

prebuild (regen:api)
  → docusaurus build
       ├── beforeDefaultRemarkPlugins:[llms-directive]   (per-page directive)
       ├── allContentLoaded({ allContent })             (custom plugin captures docs map)
       ├── HTML emission
       └── postBuild plugin:
              1. mirror-markdown.mjs    (build/<route>.md + build/<route>/index.md)
              2. build-llms-txt.mjs     (build/llms.txt + build/llms-full.txt)
              3. build-skill-md.mjs     (verify static/skill.md landed in build/)
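The capture-then-dispatch flow in the diagram could be sketched as a plugin shell like the one below. The `allContent` shape used here (docs plugin content under the `default` instance id, with `loadedVersions[].docs[]` carrying `permalink` and `source`) is an assumption about Docusaurus internals, and passing the generators in as an option is a choice made for this sketch so it can run without the real scripts.

```javascript
// Sketch of a plugin shell: capture docs in allContentLoaded, run generators in postBuild.
function llmsAndMirrorPlugin(context, options) {
  const { generators } = options;
  if (!Array.isArray(generators) || generators.length === 0) {
    throw new Error("llms-and-mirror-plugin: the generators option is required");
  }
  let docs = null; // Filled in allContentLoaded, consumed in postBuild.

  return {
    name: "llms-and-mirror-plugin",

    async allContentLoaded({ allContent }) {
      // Assumed shape: content keyed by plugin name, then plugin instance id.
      const docsContent = allContent["docusaurus-plugin-content-docs"]?.default;
      if (!docsContent) throw new Error("docs plugin content not found");
      docs = docsContent.loadedVersions
        .flatMap((version) => version.docs)
        .map((doc) => ({ route: doc.permalink, source: doc.source }));
    },

    async postBuild({ outDir }) {
      if (docs === null) throw new Error("postBuild ran before allContentLoaded");
      // Sequential on purpose: llms-full.txt is built from the mirror map.
      for (const generate of generators) {
        await generate({ outDir, docs, siteUrl: context.siteConfig.url });
      }
    },
  };
}
```

Awaiting each generator in order, with no try/catch, lets any failure abort the build instead of producing a partial set of artifacts.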

GH Actions workflow needs no changes.

Implementation Details

New Files

src/remark/llms-directive.mjs — directive injection remark plugin
src/plugins/llms-and-mirror-plugin.mjs — Docusaurus plugin shell that captures docs content via allContentLoaded and dispatches generators in postBuild
scripts/mirror-markdown.mjs — markdown URL mirror generator
scripts/build-llms-txt.mjs — llms.txt + llms-full.txt generator
scripts/build-skill-md.mjs — skill.md guard
static/skill.md, static/.well-known/mcp.json, static/mcp.json

Modified Files

docusaurus.config.ts — register the custom plugin + remark plugin; centralize SITE_URL / BASE_URL constants
package.json + package-lock.json — add js-yaml + @types/js-yaml (devDeps)
8 docs pages — Content Start Position editorial fixes

Code Quality

Metrics

Files Changed: 19
Insertions: +876 lines
Deletions: -84 lines
All scripts < 400 LOC, all functions < 40 LOC (per workspace rules)

Workspace rules observed

No timing-based solutions
No fallback values (build fails loud if a required option is missing)
No silent error catching (mirror operations propagate failures)
Markdown / config files exempt from the 400-line code limit

Testing

npm run build succeeds with onBrokenLinks: 'throw' intact
npm run typecheck clean
7/7 platform pages contain the directive
31/31 API reference pages contain the directive
build/llms.txt is spec-compliant (H1 + blockquote + H2 sections)
build/llms-full.txt is 243 KB, 6167 lines, one entry per route
build/.well-known/mcp.json parses as valid JSON
136 .md mirror files (68 routes × 2 URL shapes)
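The llms.txt shape check listed above could be automated with something like the following. It tests only the three properties named in this list (H1, blockquote summary, H2 sections), not the full llmstxt.org spec, and the function name is hypothetical.

```javascript
// Returns a list of shape problems; an empty list means the shape looks right.
function checkLlmsTxtShape(text) {
  // Blank lines carry no structure here, so drop them before checking order.
  const lines = text.split("\n").filter((line) => line.trim() !== "");
  const problems = [];
  if (!/^# \S/.test(lines[0] || "")) {
    problems.push("first non-blank line must be an H1");
  }
  if (!/^> \S/.test(lines[1] || "")) {
    problems.push("the H1 must be followed by a blockquote summary");
  }
  const h2Count = lines.filter((line) => /^## \S/.test(line)).length;
  if (h2Count === 0) {
    problems.push("expected at least one H2 section");
  }
  return problems;
}
```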

Verification expectations

Local build verifies artifact correctness only: presence of files, directive in HTML, mirror routes, valid JSON, shape of llms.txt. The directive emits production absolute URLs (https://docs.atomicmemory.ai/llms.txt) regardless of where the build is served, so AFDocs scoring on a local serve is not authoritative — AFDocs would compare directive host with served host.

Authoritative AFDocs scoring runs against the live site after merge:

npx @afdocs/cli check https://docs.atomicmemory.ai

Expected residual failures:

Content Negotiation (deferred to follow-up #1; needs Cloudflare Pages / Vercel / Netlify with edge logic)
Possibly MCP Server Discoverable if AFDocs requires a live HTTP endpoint (deferred to follow-up #2; needs hosted @atomicmemory/mcp-server)

Follow-ups (separate scope, not this PR)

F1 — Hosting migration: move docs.atomicmemory.ai to Cloudflare Pages or Vercel; add edge function for Accept: text/markdown rewriting /foo → /foo.md. Closes Content Negotiation.
F2 — Hosted MCP: deploy @atomicmemory/mcp-server over Streamable HTTP at https://mcp.atomicmemory.ai; update mcp.json transport to http with the live URL. Closes any strict reading of MCP Server Discoverable.
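The rewrite planned in F1 can be sketched as a pure function, independent of any particular edge platform's API. The function name is hypothetical, and the sketch ignores q-values in the Accept header, which a production handler would need to consider.

```javascript
// Returns the mirror URL to serve for a request that accepts markdown, else null.
function markdownRewriteTarget(requestUrl, acceptHeader) {
  // Compare media types only; parameters such as ";q=0.9" are ignored here.
  const wantsMarkdown = (acceptHeader || "")
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .includes("text/markdown");
  if (!wantsMarkdown) return null;

  const url = new URL(requestUrl);
  if (url.pathname.endsWith(".md")) return null; // Already a mirror URL.

  // /foo and /foo/ both map to /foo.md; the site root maps to /index.md.
  const path = url.pathname.replace(/\/+$/, "");
  url.pathname = path === "" ? "/index.md" : `${path}.md`;
  return url.href;
}
```

An edge handler would call this per request and, when the result is not null, fetch that URL and return it with a text/markdown content type.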

Reviewer note

The plan went through five rounds of Codex review before approval; the resulting decisions are documented in inline code comments where they're load-bearing (broken-link-checker workaround, source vs sourceFilePath field disambiguation, siteUrl injection contract, new URL() vs string-concat for slash normalization, slug ↔ operationId join key).

🤖 Generated with Claude Code

…iptor

Make docs.atomicmemory.ai agent-ready against the AFDocs checks (https://afdocs.dev). The site previously scored 59/100 (grade F); this PR lands seven of the eight failing checks. The remaining failure (Accept: text/markdown content negotiation) requires moving off GitHub Pages and is tracked as a follow-up.

What landed:

- **Per-page llms.txt directive** via remark plugin (src/remark/llms-directive.mjs). Inserts a blockquote with absolute URLs to /llms.txt, /llms-full.txt, /skill.md right after the first H1. Handles both markdown headings and the OpenAPI plugin's `<Heading as="h1">` JSX. Idempotent. Absolute URLs sidestep Docusaurus's onBrokenLinks: 'throw' (which validates against route metadata, not postBuild artifacts). siteUrl is passed in as an explicit option since Docusaurus does not auto-inject siteConfig into remark plugins; URLs constructed via new URL() to avoid trailing-slash double-slashes.
- **Custom Docusaurus plugin** (src/plugins/llms-and-mirror-plugin.mjs) that captures docs-plugin content via allContentLoaded and dispatches three generators in postBuild.
- **Markdown URL mirror** (scripts/mirror-markdown.mjs): for each route writes both build/<path>.md and build/<path>/index.md. Hand-authored pages copy from the MDX source (front-matter / import / MDX comment lines stripped). API reference pages render directly from vendor/atomicmemory-core-openapi.yaml (joined to slugs via kebabCase(operationId)) so the mirrored markdown carries real parameters, request body, and response shape rather than stripped JSX.
- **llms.txt and llms-full.txt** (scripts/build-llms-txt.mjs): index is grouped by a hand-maintained URL-prefix → H2-label map, and llms-full is driven from the canonical-mirror map (one entry per route; never walks the filesystem, so the <route>.md / <route>/index.md mirror pair never produces duplicate corpus entries).
- **skill.md guard** (scripts/build-skill-md.mjs): verifies the hand-authored static/skill.md landed in build/.
- **Static artifacts**: static/skill.md (agent operating guide), static/.well-known/mcp.json + static/mcp.json (MCP descriptor stub for the local-install @atomicmemory/mcp-server; transport.type: stdio, hostedEndpoint: null pending follow-up infra work).
- **Editorial fixes** for content-start-position offenders: moved Disambiguation blockquotes on docs/platform/providers.md and docs/sdk/concepts/provider-model.md to a `## Naming` section near the bottom; removed lede-blocking MDX authoring comments from introduction.md, platform/architecture.md, platform/composition.md, platform/observability.md, platform/scope.md, and platform/stores.md.

Build pipeline order: prebuild (regen:api) → docusaurus build → beforeDefaultRemarkPlugins [llms-directive] → allContentLoaded (capture docs map) → HTML → postBuild {mirror-markdown, build-llms-txt, build-skill-md}.

Out of scope (deferred):

- Accept: text/markdown content negotiation: needs Cloudflare Pages / Vercel / Netlify with edge logic.
- Hosted HTTP MCP endpoint: needs deploying @atomicmemory/mcp-server over Streamable HTTP. mcp.json declares status: local-install-only for now.

Verification:

- npm run build succeeds with onBrokenLinks: 'throw' intact
- npm run typecheck clean
- 7/7 platform pages and 31/31 API reference pages have the directive
- build/llms.txt is spec-compliant (H1 + blockquote + H2 sections)
- build/llms-full.txt = 243KB, 6167 lines, one entry per route
- build/.well-known/mcp.json parses
- 136 .md mirror files (68 routes × 2 URL shapes)

… at /index.md

Two follow-ups from Codex review on PR #3:

1. `atomicmemory-http-api.md` (the rolled-up API overview page) was falling through to readSourceMdx() which leaves JSX intact. The source `.info.mdx` is almost entirely <span> / <Heading> / <div> JSX, producing a useless mirror. Detect `*.info.mdx` source files and render directly from `spec.info` in the vendored OpenAPI YAML (title, version, description, license, optional contact).
2. The Introduction entry in llms.txt linked to the bare site URL (https://docs.atomicmemory.ai/) instead of the markdown mirror. Since content negotiation is deferred, that entry would still require HTML scraping. Point it at /index.md so every llms.txt bullet is consistently a `.md` link.
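Rendering the overview page from `spec.info` might look roughly like this, assuming the standard OpenAPI Info Object fields (title, version, description, license, contact). The function names and the output layout are illustrative, not the layout the mirror script actually emits.

```javascript
// True for sources like docs/api-reference/http/atomicmemory-http-api.info.mdx.
function isInfoSource(sourcePath) {
  return /\.info\.mdx$/.test(sourcePath);
}

// Renders an OpenAPI Info Object as plain markdown. Layout is an assumption.
function renderInfoMarkdown(info) {
  if (!info || !info.title || !info.version) {
    // title and version are required by OpenAPI, so their absence is an error.
    throw new Error("spec.info must carry title and version");
  }
  const lines = [`# ${info.title}`, "", `Version: ${info.version}`];
  if (info.description) lines.push("", info.description.trim());
  if (info.license?.name) {
    const license = info.license.url
      ? `[${info.license.name}](${info.license.url})`
      : info.license.name;
    lines.push("", `License: ${license}`);
  }
  if (info.contact) {
    // contact is optional; include whichever of its fields are present.
    const parts = [info.contact.name, info.contact.email, info.contact.url].filter(Boolean);
    if (parts.length > 0) lines.push("", `Contact: ${parts.join(", ")}`);
  }
  return lines.join("\n") + "\n";
}
```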

Codex flagged that `npm run build` was dirtying the 29 committed generated `*.api.mdx` files because `prebuild` ran `regen:api` on every build. The diff was only the compressed `api:` frontmatter blob (non-deterministic re-encoding), but it broke build reproducibility: running the documented build path against a clean tree always left the worktree modified.

The committed `.api.mdx` files are intended artifacts, refreshed explicitly via `npm run vendor:spec` + `npm run regen:api` whenever the upstream OpenAPI spec changes (documented in scripts/vendor-core-spec.mjs:14, "Next: run 'npm run regen:api' and commit the refreshed .mdx artifacts"). Drop the auto-regen hooks so `npm run build` and `npm run start` run only Docusaurus, leaving committed sources alone.

Spec-refresh workflow stays the same:

1. npm install @atomicmemory/atomicmemory-core@<version>
2. npm run vendor:spec
3. npm run regen:api
4. Commit the refreshed `vendor/atomicmemory-core-openapi.yaml` + `docs/api-reference/http/*.api.mdx` together.

Verification: a second `npm run build` against a clean tree leaves the worktree clean (only this `package.json` change dirty against HEAD). The directive still lands on all 31 API reference pages.

ethanj added 3 commits May 1, 2026 01:58

ethanj wants to merge 3 commits into main from docs/agent-ready-afdocs
