Merged
Changes from 6 commits
2 changes: 1 addition & 1 deletion .github/workflows/lint-docs.yml
@@ -4,7 +4,7 @@ on:
push:
branches:
- main
-      - develop
+      - dev
paths:
- '**.md'
- '.markdownlint.json'
29 changes: 29 additions & 0 deletions ROADMAP.md
@@ -0,0 +1,29 @@
# Roadmap

The APS roadmap lives in [`plans/index.aps.md`](plans/index.aps.md) — we use APS
to plan APS.

## Quick Overview

| Horizon | Focus | Status |
|---------|-------|--------|
| **v0.2 Usability** | Scaffold, templates, docs, validation | Done |
| **v0.3 Distribution** | Install overhaul, multi-harness agents | Current |
| **Future** | GitHub Action, VS Code extension, formal spec | Planned |

See [plans/index.aps.md](plans/index.aps.md) for the full breakdown with modules,
status, and work items.

## Non-Goals

These are explicitly out of scope:

- **Execution engines** — APS describes intent; it doesn't run code
- **Vendor plugins** — No Jira/Linear/Notion plugins (specs are portable markdown)
- **AI training** — Not a dataset for model fine-tuning
- **Hosted services** — No cloud component; everything runs locally

## Contributing

Have ideas for the roadmap? [Open an issue](https://github.com/EddaCraft/anvil-plan-spec/issues)
to discuss, or submit a PR updating [plans/index.aps.md](plans/index.aps.md).
21 changes: 15 additions & 6 deletions bin/aps
@@ -3,8 +3,9 @@
# APS CLI - Anvil Plan Spec tooling
#
# Usage:
-# aps init [dir] Create APS structure in a new project
-# aps update [dir] Update templates, skill, and commands
+# aps init [dir] Create APS structure in a new project (v2 layout)
+# aps update [dir] Update templates, skill, and tool files
+# aps migrate [dir] Convert v1 layout to v2
# aps lint [file|dir] Validate APS documents (default: plans/)
# aps lint --json Output as JSON
# aps --help Show this help
@@ -38,8 +39,9 @@ show_help() {
aps - Anvil Plan Spec CLI

Usage:
-aps init [dir] Create APS structure in a new project
-aps update [dir] Update templates, skill, and commands
+aps init [dir] Create APS structure in a new project (v2 layout)
+aps update [dir] Update templates, skill, and tool files
+aps migrate [dir] Convert v1 layout to v2 (.aps/ consolidation)
aps lint [file|dir] Validate APS documents
aps lint --json Output results as JSON
aps --help Show this help
@@ -52,8 +54,11 @@ Environment:
APS_VERSION Git ref to download from (default: main)

Examples:
-aps init # Init in current directory
-aps update # Update templates and skill
+aps init # Interactive wizard (v2 layout)
+aps init --profile solo --scope small --tools claude-code
+aps update # Update templates and tool files
+aps migrate # Convert v1 -> v2
+aps migrate --dry-run # Preview migration
aps lint # Lint plans/ directory
aps lint plans/index.aps.md # Lint specific file
aps lint . --json # Lint current dir, JSON output
@@ -72,6 +77,10 @@ main() {
shift
cmd_update "$@"
;;
+migrate)
+shift
+cmd_migrate "$@"
+;;
lint)
shift
cmd_lint "$@"
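The `migrate` branch added to `main()` uses the same shift-then-delegate pattern as `update` and `lint`. As a rough illustration only, the sketch below shows how such a handler might parse `--dry-run` and an optional directory. The body of `cmd_migrate` here is a hypothetical stub (the real implementation is not part of this hunk), and the messages it prints are invented for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: cmd_migrate is a stub, not the real aps implementation.

cmd_migrate() {
  dry_run=0
  target="."
  for arg in "$@"; do
    case "$arg" in
      --dry-run) dry_run=1 ;;
      -*) echo "unknown option: $arg" >&2; return 2 ;;
      *) target="$arg" ;;
    esac
  done
  if [ "$dry_run" -eq 1 ]; then
    # Preview mode: report what would happen without touching files.
    echo "would migrate $target (dry run, no files changed)"
  else
    echo "migrating $target"
  fi
}

main() {
  cmd="${1:-}"
  case "$cmd" in
    migrate)
      shift                # drop the subcommand name
      cmd_migrate "$@"     # pass the remaining arguments through
      ;;
    *)
      echo "unknown command: $cmd" >&2
      return 1
      ;;
  esac
}
```

Calling `main migrate --dry-run` in this sketch routes to the stub with only `--dry-run` left in the argument list, which is why each handler can parse its own flags independently.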
167 changes: 167 additions & 0 deletions docs/agent-testing.md
@@ -0,0 +1,167 @@
# Agent Cross-Harness Test Plan

Test plan for AGENT-006: verifying APS agents work correctly in each tool's
environment.

## Test Matrix

| Tool | Agent Format | Test Method | Status |
|------|-------------|-------------|--------|
| Claude Code | `.claude/agents/*.md` | Task dispatch in live project | Validated |
| Codex | `.codex/agents/*.toml` + config | `/agent spawn` | Manual (needs Codex) |
| Copilot | `.github/agents/*.md` | Agent discovery | Manual (needs Copilot) |
| OpenCode | `.opencode/agents/*.md` | `@mention` invocation | Manual (needs OpenCode) |
| Gemini | `.gemini/skills/*/SKILL.md` | `gemini skills link` | Manual (needs Gemini) |

## Automated Validation (Complete)

### Build Script

- [x] `build.sh` runs without errors
- [x] `build.sh` is idempotent (running twice produces identical output)
- [x] All 13 output files generated and verified (2 core + 2 Claude Code + 2
Copilot + 2 OpenCode + 3 Codex + 2 Gemini)
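
One way to check the idempotency item above is to run the build twice and compare checksums of the output tree. This is a minimal sketch, not the project's own test: `snapshot` and `is_idempotent` are hypothetical helper names, and the build command and output directory are passed in rather than assumed.

```bash
# snapshot DIR: print one checksum line per file, in a stable order.
snapshot() {
  (cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
    cksum "$f"
  done)
}

# is_idempotent BUILD_CMD OUT_DIR: run the build twice and compare snapshots.
# Returns 0 if both runs leave identical output, 1 if they differ,
# 2 if the build command itself fails.
is_idempotent() {
  build_cmd="$1"
  out_dir="$2"
  sh -c "$build_cmd" >/dev/null || return 2
  first="$(snapshot "$out_dir")"
  sh -c "$build_cmd" >/dev/null || return 2
  second="$(snapshot "$out_dir")"
  [ "$first" = "$second" ]
}
```

A call might look like `is_idempotent "./build.sh" scaffold/agents`, where both the script path and the output directory are assumptions about the repository layout.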

### Format Validation

**Claude Code:**

- [x] YAML frontmatter: `name`, `description`, `model`, `tools`
- [x] Model values use valid shorthand (`opus`, `sonnet`)
- [x] Tools list matches expectations (planner includes `Task`; librarian omits it)

**Copilot:**

- [x] YAML frontmatter: `name`, `description` only
- [x] No unsupported fields (model, tools)
- [x] Body identical to Claude Code variant

**OpenCode:**

- [x] YAML frontmatter: `description`, `mode`, `model`, `steps`, `tools`,
`permission`
- [x] `mode: subagent` (not primary)
- [x] Model uses `provider/model-id` format (`anthropic/claude-opus-4-6`,
`anthropic/claude-sonnet-4-6`)
- [x] No `name` field (filename-derived)
- [x] Permission maps set dangerous tools to `"ask"`

**Codex:**

- [x] TOML format with `sandbox_mode` and `developer_instructions`
- [x] Config snippet has correct `[agents.*]` blocks
- [x] `o4-mini` model (OpenAI — commented for clarity)
- [x] Developer instructions contain full core prompt

**Gemini:**

- [x] Pure markdown (no YAML frontmatter)
- [x] Self-contained (condensed, not a core prompt copy)
- [x] Covers key responsibilities in skill-appropriate format
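
The frontmatter checks above can be approximated without a YAML parser by reading only the top-level keys between the opening and closing `---` lines. The sketch below is illustrative: `frontmatter_keys` and `has_exact_keys` are hypothetical helpers, and they assume flat, unindented `key:` lines, so nested maps such as `permission` are reported by their top-level key only.

```bash
# frontmatter_keys FILE: print the top-level keys of a leading YAML
# frontmatter block, one per line. Prints nothing if there is no frontmatter.
frontmatter_keys() {
  awk '
    NR == 1 && $0 != "---" { exit }      # file does not start with frontmatter
    NR == 1 { next }                     # skip the opening delimiter
    $0 == "---" { exit }                 # stop at the closing delimiter
    /^[A-Za-z_][A-Za-z0-9_-]*:/ {        # unindented "key:" lines only
      sub(/:.*/, "")
      print
    }
  ' "$1"
}

# has_exact_keys FILE "key1 key2 ...": succeed only if the frontmatter
# contains exactly the listed keys (order does not matter).
has_exact_keys() {
  actual="$(frontmatter_keys "$1" | LC_ALL=C sort | tr '\n' ' ')"
  expected="$(printf '%s\n' $2 | LC_ALL=C sort | tr '\n' ' ')"
  [ "$actual" = "$expected" ]
}
```

For example, `has_exact_keys aps-planner.md "name description"` would express the Copilot rule, and `[ -z "$(frontmatter_keys SKILL.md)" ]` would express the Gemini "no frontmatter" rule.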

### Content Validation

**Planner (all variants):**

- [x] Project init
- [x] Index/module/work-item creation
- [x] Status tracking
- [x] Work item execution
- [x] Wave-based parallel coordination
- [x] Action plan support
- [x] References `plans/` paths (D-017 compliance)
- [x] Does not duplicate SKILL.md content

**Librarian (all variants):**

- [x] Archiving completed modules
- [x] Orphan detection
- [x] Cross-reference maintenance
- [x] Stale doc flagging
- [x] References `plans/` paths (D-017 compliance)
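
The `plans/` path checks in both lists lend themselves to a simple text search. As a sketch under the assumption that every generated agent file should mention the path at least once, the hypothetical helper below lists the files that do not.

```bash
# files_missing_ref DIR PATTERN: print every file under DIR that never
# mentions PATTERN. Empty output means all files reference it.
files_missing_ref() {
  grep -rL -- "$2" "$1"
}
```

Running `files_missing_ref scaffold/agents 'plans/'` and expecting empty output would be one way to spot a variant that lost its path references; the directory name is taken from the manual procedures in this document and may not match the build output exactly.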

## Manual Test Procedures

### Claude Code

```bash
# 1. Copy agents to test project (create the target directory first)
mkdir -p /tmp/test-project/.claude/agents
cp scaffold/agents/claude-code/aps-planner.md /tmp/test-project/.claude/agents/
cp scaffold/agents/claude-code/aps-librarian.md /tmp/test-project/.claude/agents/

# 2. Dispatch planner via Task tool
# Ask: "What's the plan status?"
# Expect: Agent reads plans/, reports module statuses

# 3. Dispatch librarian
# Ask: "Audit the repo for orphaned files"
# Expect: Agent scans plans/, reports findings
```

### Codex

```bash
# 1. Place agent files (create the target directory first)
mkdir -p /tmp/test-project/.codex/agents
cp scaffold/agents/codex/aps-planner.toml /tmp/test-project/.codex/agents/
cp scaffold/agents/codex/aps-librarian.toml /tmp/test-project/.codex/agents/
# Merge codex-config-snippet.toml into .codex/config.toml

# 2. Spawn planner
# /agent spawn aps-planner
# Ask: "What's the plan status?"

# 3. Spawn librarian
# /agent spawn aps-librarian
# Ask: "Audit the repo"
```

### Copilot

```bash
# 1. Place agent files (create the target directory first)
mkdir -p /tmp/test-project/.github/agents
cp scaffold/agents/copilot/aps-planner.md /tmp/test-project/.github/agents/
cp scaffold/agents/copilot/aps-librarian.md /tmp/test-project/.github/agents/

# 2. In Copilot Chat, agents should appear as available
# 3. Invoke @aps-planner and @aps-librarian
```

### OpenCode

```bash
# 1. Place agent files (create the target directory first)
mkdir -p /tmp/test-project/.opencode/agents
cp scaffold/agents/opencode/aps-planner.md /tmp/test-project/.opencode/agents/
cp scaffold/agents/opencode/aps-librarian.md /tmp/test-project/.opencode/agents/

# 2. Switch to subagent via Tab or @aps-planner
# 3. Ask for plan status
```

### Gemini

```bash
# 1. Place skill files (create the target directory first)
mkdir -p /tmp/test-project/.gemini/skills
cp -r scaffold/agents/gemini/aps-planner /tmp/test-project/.gemini/skills/
cp -r scaffold/agents/gemini/aps-librarian /tmp/test-project/.gemini/skills/

# 2. Link skills
# gemini skills link . --scope workspace

# 3. Activate skill in conversation
```

## Issues Found and Fixed

1. **Stale OpenCode model IDs** — Updated from `claude-opus-4-20250514` /
`claude-sonnet-4-20250514` to `claude-opus-4-6` / `claude-sonnet-4-6`
2. **Missing vendor comment** — Added inline comment to Codex config snippet
clarifying `o4-mini` is an OpenAI model
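
A regression check for the first issue could search generated files for date-stamped model IDs. The sketch below is a guess at a useful pattern, not an existing project check: it assumes stale IDs look like the two listed above (an eight-digit date suffix), and `find_dated_model_ids` is a hypothetical name.

```bash
# find_dated_model_ids DIR: print lines (with file and line number) that
# contain a Claude model ID ending in an eight-digit date stamp.
# Exits 0 if any are found, non-zero otherwise.
find_dated_model_ids() {
  grep -rnE 'claude-(opus|sonnet)-[0-9]+-[0-9]{8}' "$1"
}
```

Because the function succeeds when a match exists, a guard would invert it, for instance `if find_dated_model_ids scaffold/agents; then echo "stale model IDs found" >&2; fi`.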

## Notes

- Full end-to-end testing of non-Claude-Code tools requires those tools
installed. The automated validation covers everything that can be checked
without the tools: file format, content correctness, build reproducibility.
- Claude Code agents were validated live (format + content + dispatch readiness).
- The Gemini planner skill intentionally omits wave-based execution detail —
this is appropriate condensation for the skill format.