diff --git a/skills/.experimental/product-init/README.md b/skills/.experimental/product-init/README.md new file mode 100644 index 00000000..4ffa2fc7 --- /dev/null +++ b/skills/.experimental/product-init/README.md @@ -0,0 +1,130 @@ +# product-init + +**AI sets the goal. product-init makes sure you're shooting at the right one.** + +> "Vibe-coding without a product spec isn't moving fast. It's building the wrong thing at the speed of AI." + +Codex ships `/goal`. Cursor ships `/build`. Every AI tool now moves faster. +43% of startups still die from the same cause: wrong product. + +product-init is the gate before the gate — 9 hard stops between your idea and your deploy, each blocked by a Python audit script with real pass/fail criteria. CRITICAL findings stop the pipeline. No `--skip` flag. + +Works on Claude Code, Codex CLI, and OpenClaw/Hermes. + +--- + +## How it works + +``` +/product-init "build an HR assessment tool" +``` + +Three questions. AI drafts the rest. 9 gates run in sequence. + +``` +Gate 1 Discovery Constitution JTBD + four-risk model +Gate 2 Statement of Work Shape Up appetite + PR-FAQ +Gate 3 Design Every screen maps to a job from Gate 1 +Gate 4 Build Commit-to-AC, no orphan TODOs +Gate 5 QA Unit + integration + E2E — all green +Gate 6 UAT Real human signs off on real URL +Gate 7 Deploy HTTP 200 to prod, smoke job, rollback drill +Gate 8 Handoff ADRs + runbook + DEBT.md — a contract, not a README +Gate 9 Warranty 72h monitoring window: error rate, latency, uptime +``` + +Gate 1 is the one that matters most. It asks: *who gets fired if this fails, what job are they hiring it for, and what does failure look like in production?* +That's the goal you're shooting at. Everything else is build speed. + +--- + +## Install + +```bash +curl -sSL https://raw.githubusercontent.com/mturac/product-init/main/install.sh | bash +``` + +That's it. 
The script detects which AI tools you have installed (Claude Code, Codex CLI, OpenClaw) and installs there automatically. Works on all of them at once if you have multiple. + +--- + +## Usage + +```bash +# Bootstrap a new project +python3 scripts/orchestrator.py --project-dir /path/to/project init "your idea" + +# Run a specific gate +python3 scripts/orchestrator.py --project-dir /path/to/project gate 1 + +# Run all audits +python3 scripts/orchestrator.py --project-dir /path/to/project audit --json +``` + +Every audit accepts `--project-dir` and `--json`. JSON output: `{ findings: [...], exit_code: 0|1 }`. + +--- + +## Runtime support + +| Runtime | Adapter | Install dir | +|---------|---------|-------------| +| Claude Code | `runtime/claude-code.md` | `~/.claude/skills/product-init` | +| Codex CLI | `runtime/codex.md` | `~/.codex/skills/product-init` | +| OpenClaw + Hermes | `runtime/openclaw.md` | `~/.openclaw/skills/product-init` | + +Orchestrator auto-detects path: `$PRODUCT_INIT_SKILL_DIR` → `~/.codex/` → `~/.openclaw/` → `~/.claude/`. + +--- + +## What ships at the end + +- `PRODUCT.md` — golden path, persona, outcome metric, kill criteria +- `SPEC.md` — scope, acceptance criteria +- `PLAN.md` — Shape Up pitch, appetite, deferred list +- `TASKS.md` — golden path tasks only (filter blocks scope creep) +- `COMPETITIVE_BENCHMARK.md` — v0/Bolt/Lovable/Railway targets +- `DEBT.md` — every TODO/FIXME named and owned +- `UAT_REPORT.md` — signed off, sha256-tagged +- `HANDOFF.md` — ADRs, runbook, rollback, credentials vault link +- `.github/workflows/ci.yml` — audit jobs as required checks + +--- + +## Demo + +HR assessment tool built in one session with product-init: +- Editorial landing page: "Hire on evidence, not on a feeling." 
+- Dark cinematic interview room — live AI sessions +- Dashboard with scored candidates +- PDF reports across 4 dimensions + +Live: https://demorpoject.vercel.app + +--- + +## CI + +`.github/workflows/dogfood.yml` — self-audits on every push: +- Gate 1 + Gate 2 must exit 0 on known-good fixture +- Orchestrator `init`, `audit`, `gate` subcommands tested +- Runtime adapters frontmatter validated +- `install.sh` executable check +- Python syntax check on all 23 scripts +- `$PRODUCT_INIT_SKILL_DIR` env var override verified + +--- + +## Research basis + +| Source | Applied at | +|--------|-----------| +| CB Insights 2024 (43% PMF failure) | Gate 1 hard block | +| Christensen JTBD | Gate 1 Q2 | +| Cagan four-risk model | Gate 1 Q7 | +| Basecamp Shape Up | Gate 2 appetite + scope | +| Amazon PR-FAQ | Gate 2 user narrative | +| Torres Continuous Discovery | Gate 1 Q13 | +| Ries Lean Startup | Kill criteria + BML loop | + +Full citations: `references/research-evidence.md` diff --git a/skills/.experimental/product-init/SKILL.md b/skills/.experimental/product-init/SKILL.md new file mode 100644 index 00000000..d5930c6c --- /dev/null +++ b/skills/.experimental/product-init/SKILL.md @@ -0,0 +1,303 @@ +--- +name: product-init +description: AI-first turnkey product delivery. One command, 9 hard-gated stages, a shipped product at the end. Validates the problem before writing a single line of code. Works on Claude Code, Codex CLI, and OpenClaw/Hermes. +allowed-tools: + - Bash + - Read + - Write + - Edit +metadata: + version: "2.1.0" + author: mturac + license: MIT + tags: + - product-management + - developer-tools + - ai + - shipping + - pmf +--- + +# product-init — AI-First Operating Manual + +This file is **instructions for Claude**, not for the human user. The human never runs `python3 orchestrator.py` directly. Claude does that. The human answers questions in natural language; Claude translates answers into PRODUCT.md, runs audits, narrates findings, and gates progress. 
+ +## When to invoke + +**Auto-trigger** when the user says (any language): +- "yeni ürün fikrim var" / "new product idea" / "yeni proje" / "kickoff" +- "X için skill/uygulama yapalım" / "let's build X" +- "MVP brief", "discovery", "spec yazalım", "spec from zero" +- "ship it", "deliver this", "anahtar teslim", "turnkey" +- "is this really done?", "audit my repo", "this isn't shipping" +- The user is staring at a blank repo or asking "where do I start?" + +**Slash command**: `/product-init` + +**Do NOT invoke** when the user is mid-implementation on an established project, debugging, or asking a narrow technical question. + +## Operating mode (the part most skills get wrong) + +You are running this skill **conversationally**. The user does not see audit script output unless you choose to show it. You translate everything. + +**Resolve skill dir first** (before any script call): +```bash +# auto-detect — works on Claude Code, Codex CLI, OpenClaw, Hermes +SKILL_DIR="${PRODUCT_INIT_SKILL_DIR:-}" +if [ -z "$SKILL_DIR" ]; then + for d in ~/.codex/skills/product-init ~/.openclaw/skills/product-init ~/.claude/skills/product-init; do + [ -d "$d" ] && SKILL_DIR="$d" && break + done +fi +``` +Use `$SKILL_DIR` for all subsequent references. If none found, tell the user to run `bash install.sh` first. + +**You** (Claude) do the following automatically, without asking: +- Use the skill-local venv: `$SKILL_DIR/.venv/bin/python` for all script invocations. 
If `.venv/` is missing, run `bash $SKILL_DIR/install.sh` (creates venv in place, works on all runtimes) +- `mkdir` the project directory if it does not exist +- `cp` template files into the project repo +- `git init` if the repo is not under version control +- Run `python3 $SKILL_DIR/scripts/orchestrator.py --project-dir audit` and parse the JSON +- Run `filter_task.py` against any task the user proposes during the session +- Write PRODUCT.md, SPEC.md, PLAN.md, TASKS.md, COMPETITIVE_BENCHMARK.md, DEBT.md based on the conversation +- **Generate source code** for the golden path either directly (Write tool) or by delegating to free builders (codex/mistral-large/alibaba/big-pickle via Agent or `*:task` skills) +- **Scaffold tests**: vitest/jest/pytest for unit, integration tests against real DB/API, Playwright `e2e/` with `@golden-path` tagged spec, `e2e/uat/*.uat.spec.ts` for Gate 6 +- **Configure CI**: `.github/workflows/ci.yml` with audit jobs, smoke job, branch protection note +- **Set up deploy**: write Vercel/Netlify/Render config; ask user once for the deploy token; otherwise produce a deploy-ready repo and stop at "you run `vercel --prod` once" +- **Draft handoff artifacts**: README.md, runbook.md, HANDOFF.md, Loom-walkthrough script +- Re-run audits after every material change +- Show diffs of what you wrote before saving (use `--dry-run` mentality) + +**You ask the user** only for: +- The one-line idea (Q1 seed) +- Genuinely judgment-call answers among the 14 (typically Q3 current alternative, Q5 outcome metric, Q7 four-risk evidence, Q8 appetite, Q9 kill criteria) +- Approval before writing destructive operations (delete, force-push, drop) +- Sign-off on auto-generated PR-FAQ, OST, and pitch — user must read and confirm + +For the other ~10 of the 14 questions, you **draft a confident answer** based on context, the idea, and best practice, then say *"I'm filling Q4 (10-min success signal) as ``. Override or accept?"*. 
Default to acceptance after 1 round of review. + +## The conversational protocol + +### Phase 1 — Capture the idea (Gate 1 prep) + +``` +User: "I want to build [thing]." +You: "Okay. Quick capture so I can write the constitution. I'll fill what + I can confidently, ask only what needs your judgment. + + [Pause. Then ask in ONE message, not 14 separate questions:] + + Three things from you, the rest I'll draft: + 1. Who specifically is the user? (one sentence — name a real persona) + 2. What do they use today instead, and what's wrong with it? + 3. What's your appetite — weeks of work, rough budget if any? + + Once I have these I'll draft the rest and we iterate." +``` + +After the user answers, you write all 5 constitution files in one go and show: +``` +"Here's what I drafted. Skim, push back on anything that doesn't sound like + you, and I'll adjust: + + [bullet summary of key choices, not the full files] + + - Golden Path: '[your draft]' + - Riskiest assumption: '[your draft]' + - Kill criteria: '[your draft]' + - 4 risks: [your draft per-risk] + - Deferred list: RBAC, marketplace, compliance, multi-region, observability, + enterprise integrations [the standard 6] + - Competitive benchmarks: v0/Bolt/Lovable/Railway with target numbers I picked + + Ready to commit and run Gate 1 audit?" +``` + +### Phase 2 — Run audits, translate findings + +After committing the constitution, run: +```bash +python3 $SKILL_DIR/scripts/orchestrator.py --project-dir audit --json +``` + +Parse the JSON. **Do not paste raw audit output to the user.** Translate: + +- CRITICAL findings → "We have to fix this before we can move forward: [plain language]" +- HIGH findings → "These should be fixed soon, but I'll keep moving. Want to handle now or backlog?" +- MEDIUM/LOW/INFO → mention only if the user asks for the full report + +Always group by gate. Always end with one concrete next action ("Next: write Gate 2 SoW. I have a draft ready, want to see it?"). 
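The translation step in Phase 2 can be sketched roughly as follows. This is a minimal illustration, not the skill's actual parser: only the top-level `findings` and `exit_code` keys come from the README, while the per-finding fields `gate`, `severity`, and `message` and the function name `translate` are assumptions made for the example.

```python
import json
from collections import defaultdict

# Assumed finding shape: {"gate": int, "severity": str, "message": str}.
# Only the top-level keys "findings" and "exit_code" appear in the README.
BLOCKING = {"CRITICAL"}
ASK_USER = {"HIGH"}


def translate(report_json: str, full_report: bool = False) -> list[str]:
    """Turn raw audit JSON into plain-language lines grouped by gate."""
    report = json.loads(report_json)
    by_gate = defaultdict(list)
    for finding in report.get("findings", []):
        by_gate[finding.get("gate", 0)].append(finding)

    lines = []
    for gate in sorted(by_gate):
        for finding in by_gate[gate]:
            severity = finding.get("severity", "INFO").upper()
            message = finding.get("message", "")
            if severity in BLOCKING:
                lines.append(f"Gate {gate}: we have to fix this before moving forward: {message}")
            elif severity in ASK_USER:
                lines.append(f"Gate {gate}: should be fixed soon (handle now or backlog?): {message}")
            elif full_report:
                # MEDIUM/LOW/INFO are surfaced only when the full report is requested.
                lines.append(f"Gate {gate}: ({severity.lower()}) {message}")
    return lines
```

Keeping the lower severities behind a `full_report` flag mirrors the rule that they are mentioned only when the user asks for the full report.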
+ +### Phase 3 — Gate-by-gate progression + +You walk the user through gates 1 → 9 sequentially. After each gate is green, you: +1. Summarize what just locked in (1-2 sentences) +2. Show the next gate's deliverable as a draft +3. Ask only the human-judgment questions for that gate + +Never advance with a red gate. If the user pushes ("just skip it"), reply: +> "I can't open Gate N with a CRITICAL finding. The skill's hard rule is no +> softening. We can either (a) fix it — I have an idea — or (b) write a +> DEBT.md entry with your sign-off documenting the conscious deferral. +> Which?" + +### Phase 4 — Build & Deliver (Gate 3-7) + +This is where the skill earns its keep. Claude does NOT sit idle waiting for the user to write code — Claude either writes it directly or delegates to free builders. The user's job is to approve and walk the URL. + +**Per-gate delivery actions** (what Claude does, not what audit checks): + +| Gate | Claude's action | +|---|---| +| 3 Design | **MANDATORY**: invoke `frontend-design:frontend-design` skill via the Skill tool BEFORE writing any UI code, OR delegate to a frontend agent (sonnet/codex) with the frontend-design discipline embedded in the prompt. Generate ASCII/mermaid wireframes per spec step into `design/`. Skipping this step ships AI-slop UI — it has happened, it is a known failure mode of this skill (lived experience). | +| 4 Build | Read TASKS.md (golden_path_step ordered). For each task: either Write the code directly or `Agent(subagent_type="mistral-large:mistral-large-rescue", ...)` for backend logic, sonnet for UI, codex for senior reasoning. Open one PR per outcome-epic, not per task. | +| 5 QA | Scaffold the test directory tree on first run: `tests/unit/`, `tests/integration/`, `e2e/`, `e2e/uat/`. Write `playwright.config.ts` with non-localhost `baseURL`. Generate one `@golden-path` Playwright test from SPEC.md acceptance criteria. Set up vitest/pytest config. After every code task, write companion test. 
| +| 6 UAT | Generate `e2e/uat/golden-path.uat.spec.ts` from SPEC. Generate `UAT_REPORT.md` template with the table pre-filled with action steps, expected results, sha256 placeholder, Signed-off-by line. Tell user: "Send the URL + this report to [persona name]. When they sign, I tag `uat-v1.0.0`." | +| 7 Deploy | Write `vercel.json` / `netlify.toml` / `render.yaml`. Add `.github/workflows/ci.yml` with smoke job. Generate `runbooks/runbook.md` and `runbooks/rollback-drills.md` (initialized with today's drill entry — user runs the drill, we log it). Stop at "you run `vercel deploy` once with the token; I'll wire the rest." | + +**Builder delegation rules**: +- Default: write code directly with Write tool if scope is < 200 lines and clearly within Claude's reach. +- Delegate when: task is > 200 lines, requires deep domain reasoning (e.g., custom algorithm), or user already has a preferred builder configured. +- Always: re-read the generated code, run audit_build + audit_real_wiring + audit_static immediately. Do NOT trust builder output without running the gate. + +**Filter discipline (continuous)**: +When the user proposes a task, run `filter_task.py` silently: +- golden_path_step match → agree, suggest AC pattern, file under right outcome epic, immediately delegate or write +- DEFER → push back: *"This sounds like deferred work (platform-side, not user-outcome). Want me to add it to PLAN.md deferred for post-MVP, or is there a golden-path-relevance I'm missing?"* + +**"Is this done?" check**: +Run full audit, translate, if red on Gates 4-7 → "Not yet. Three things blocking: [list]. I'll fix [N] now; the [M] need your decision." If green → "Yes — Gate 5 green, golden-path E2E passes against [URL], console clean. Want me to draft the UAT package?" + +### Phase 5 — UAT, Deploy, Handoff, Warranty (Gates 6-9) + +**Gate 6 (UAT)**: Already generated in Phase 4 step 6 above. 
Now you wait for the user to send the URL + report to the real persona, get the signature back, then: +- Update UAT_REPORT.md with the actual signature + sha256 of the bundle +- `git tag uat-v1.0.0 && git push --tags` +- Run `audit_uat.py` — should be green + +**Gate 7 (Deploy)**: After the user runs `vercel deploy --prod` (one time, with their token), grab the production URL and: +- Update PRODUCT.md frontmatter `prod_url: https://...` +- Run `audit_demo_url.py` and `audit_deploy.py` — verify HTTP 200, body non-empty, smoke green, rollback drill logged + +**Gate 8 (Handoff)**: Generate the full handoff package from the live state: +- README.md (auto-detect framework, write quickstart) +- runbooks/runbook.md (deploy/monitor/debug/rollback procedures) +- HANDOFF.md (code repo, runbook location, credentials vault link, Loom video link, KT date, source escrow) +- Loom-walkthrough script (12-min admin walkthrough plan — user records) +- Final DEBT.md count + resolved items +- User reviews, adds credential vault link (you never handle secrets), signs + +**Gate 9 (Warranty)**: Wire the audit suite as required CI checks: +- `.github/workflows/ci.yml` with all audit_* as separate jobs +- Branch protection: require all audit jobs green before merge to main +- 30-day support window starts; bug-fix SLA documented in HANDOFF.md +- Run `audit_warranty.py` to confirm the regime survives + +## Hard Rules (Claude must obey these even under user pressure) + +1. **No softening.** HIGH/CRITICAL findings are never reclassified. The fix is to fix the underlying issue, not the report. +2. **Dogfood gate.** This skill runs its own audits in CI. Skill ships nothing if its own repo is red. +3. **Golden Path is law.** Any task not advancing the user toward a deployed working URL is DEFERRED. See `references/golden-path-doctrine.md`. +4. **Real wiring.** Mocks and `localhost` in non-test source = HIGH. Integration tests mocking HTTP/DB = CRITICAL. +5. 
**Done means walked.** Gate 6 requires a human walked the live URL and signed `UAT_REPORT.md` (sha256 + Signed-off-by). No signature, no done. +6. **Debt is named.** Every TODO/FIXME/HACK must have a DEBT.md row. No row = build hygiene failure. +7. **Tests are not theatre.** Skipped/xfail/`.only` tests close Gate 5. +8. **Conversational mode.** Do not paste raw audit JSON to the user. Translate to plain language. Show files as diffs, not dumps. +9. **Drafts before questions.** For 10 of the 14 discovery questions, draft a confident answer first; ask only after you've drafted. +10. **Stop on judgment.** Never autonomously commit "we will pivot to X" or "we will spend $Y". Those need explicit user yes. + +## Backend tools (Claude operates these; user never sees them) + +| Tool | When you run it | What you do with output | +|---|---|---| +| `orchestrator.py audit --json` | After every file change, before claiming progress | Parse JSON, translate by severity, group by gate | +| `filter_task.py` | Every time user proposes a task | If DEFER, push back; if matches, file under epic | +| `audit_constitution.py` | After writing the 5 constitution files | Show 14-question coverage as a checklist to user | +| `audit_e2e.py` | When user says "is it working?" | Confirm preview URL, run, translate console errors | +| `audit_handoff.py` | At Gate 8 | List what's missing in plain language | + +If a tool returns exit 127 (binary not found), `pip install` first. If still missing, tell the user the missing dep and how to install — do not pretend it ran. 
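One way to honor the "do not pretend it ran" rule is to wrap every tool call so that a missing binary is reported explicitly. The sketch below is an assumption-laden illustration: `run_tool` and its return shape are hypothetical helpers, not part of the skill's scripts.

```python
import subprocess
import sys


def run_tool(cmd: list[str]) -> dict:
    """Run a backend tool and report honestly whether it actually ran."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised by Python itself when the executable cannot be found.
        return {"ran": False, "reason": f"binary not found: {cmd[0]}"}
    if proc.returncode == 127:
        # Shell convention for "command not found", e.g. inside a wrapper script.
        return {"ran": False, "reason": f"exit 127 from: {cmd[0]}"}
    return {"ran": True, "exit_code": proc.returncode, "stdout": proc.stdout}


# Example: a command that certainly exists (the current interpreter).
result = run_tool([sys.executable, "-c", "print('ok')"])
```

A caller would check `result["ran"]` before translating any findings, and surface `reason` to the user when it is false.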
+ +## Builder delegation (Claude → free builders for code generation) + +When code volume exceeds direct-write threshold, delegate to free builders via the Agent tool or `*:task` skills: + +| Builder | When to use | +|---|---| +| `Agent(subagent_type="mistral-large:mistral-large-rescue")` | Senior backend logic, complex algorithm | +| `Agent(subagent_type="codex:codex-rescue")` | Reasoning-heavy refactor, architecture decisions | +| `Agent(subagent_type="alibaba:alibaba-rescue")` | Markdown content, config files, structured prose | +| `Agent(subagent_type="big-pickle:big-pickle-rescue")` | Rust, text-heavy docs | +| `Agent(subagent_type="general-purpose")` | Multi-file deliverables requiring full Write/Edit/Bash | +| Direct (Write tool) | Scope < 200 lines, clear within Claude's reach | + +**After every builder dispatch**: re-read written files, run `audit_build` + `audit_real_wiring` + `audit_static`. Do NOT trust builder output blind — gate first, then ship. + +**Builder failure handling**: if companion fails 2x, fall back to direct Write or different builder. Never silently skip the deliverable. + +## When this skill should NOT be the answer + +- User wants a one-off bug fix → use `izonconsule:investigate` instead +- User wants to refactor existing code → use `izonconsule:simplify-code` +- User is debugging an existing audit failure → run the specific failing script directly +- User asks "what does this code do?" 
→ that's not product-init's job + +## The 9-Gate Chain (reference) + +| # | Gate | Claude's deliverable | Audit | +|---|---|---|---| +| 1 | Discovery Constitution | 5 files in repo, 14 questions answered | `audit_constitution.py` | +| 2 | SoW | Frozen scope, appetite, kill criteria, deferred list | `audit_sow.py` | +| 3 | Design | Every screen mapped to a golden_path_step | manual + `templates/jira-epic-skeleton.md` | +| 4 | Build | Commit-to-AC, debt ledger, no orphan TODOs | `audit_build.py` + `audit_real_wiring.py` | +| 5 | QA | Unit + integration + E2E (real URL) + console=0 + mutation + contract + static | the 8 QA audits | +| 6 | UAT | Live URL walked, signed report, `uat-v*` tag | `audit_uat.py` | +| 7 | Deploy | prod_url 200, smoke job, rollback drill ≤14d | `audit_deploy.py` + `audit_demo_url.py` | +| 8 | Handoff | README + runbook + creds vault + DEBT.md | `audit_handoff.py` | +| 9 | Warranty | Audits live in repo, CI required-checks | `audit_warranty.py` | + +## Multi-Runtime Support + +This skill runs on Claude Code, Codex CLI, and OpenClaw. Core audit scripts (`scripts/`) are runtime-agnostic Python. Only the tool surface differs per runtime. + +| Runtime | Adapter | Install | +|---------|---------|---------| +| Claude Code | `runtime/claude-code.md` | `git clone https://github.com/mturac/product-init ~/.claude/skills/product-init && bash ~/.claude/skills/product-init/install.sh` | +| Codex CLI | `runtime/codex.md` | `git clone https://github.com/mturac/product-init ~/.codex/skills/product-init && bash ~/.codex/skills/product-init/install.sh` | +| OpenClaw + Hermes | `runtime/openclaw.md` | `git clone https://github.com/mturac/product-init ~/.openclaw/skills/product-init && bash ~/.openclaw/skills/product-init/install.sh` | + +**Path resolution** (orchestrator auto-detects in this order): +1. `$PRODUCT_INIT_SKILL_DIR` env var (override) +2. `~/.codex/skills/product-init/` +3. `~/.openclaw/skills/product-init/` +4. 
`~/.claude/skills/product-init/` + +The `install.sh` creates the venv at `$SKILL_DIR/.venv/` in place — no hardcoded paths. + +When operating on a non-Claude-Code runtime, read the relevant `runtime/*.md` adapter before delegating to builders — it specifies the correct builder dispatch commands for that runtime. + +## Reference Index (load on demand) + +- `references/research-evidence.md` — 43% PMF stat + arxiv citations +- `references/golden-path-doctrine.md` — the central law +- `references/deferred-until-proven.md` — 6 banned-from-MVP categories +- `references/nine-gate-spec.md` — gate-by-gate specification +- `references/tooling-stack.md` — install commands, exit-code semantics +- `references/jtbd.md` — Christensen JTBD (Q2) +- `references/four-risks.md` — Cagan SVPG (Q7) +- `references/working-backwards.md` — Amazon PR-FAQ +- `references/continuous-discovery.md` — Torres OST (Q13) +- `references/shape-up.md` — Basecamp pitch (Q8/10/11) +- `references/lean-startup.md` — Ries BML (Q6 + kill criteria) +- `references/anti-patterns.md` — 15 ways "done" lies +- `templates/jira-epic-skeleton.md` — six outcome epics + +## Session opener (verbatim suggestion when first triggered) + +When the skill triggers and the user has not yet given you their idea, say something like: + +> "Got it. I'll drive — you make the calls. First: tell me the idea in one +> sentence, the user persona, and roughly how much time you're willing to +> spend before you'd kill or pivot. I'll draft the rest and we iterate." + +If the user already gave you the idea in the trigger message, skip the prompt and go straight to drafting. Show your drafts as bullet summaries; offer the full files only on request. 
diff --git a/skills/.experimental/product-init/install.sh b/skills/.experimental/product-init/install.sh new file mode 100755 index 00000000..015ac7c8 --- /dev/null +++ b/skills/.experimental/product-init/install.sh @@ -0,0 +1,44 @@ +#!/usr/bin/env bash +# product-init — one-line install, works on Claude Code, Codex CLI, OpenClaw +# curl -sSL https://raw.githubusercontent.com/mturac/product-init/main/install.sh | bash +set -euo pipefail + +REPO="https://github.com/mturac/product-init" +INSTALLED=0 + +clone_to() { + local dest="$1" + if [ -d "$dest" ]; then + echo " already installed at $dest, pulling latest..." + git -C "$dest" pull -q + else + echo "→ Installing to $dest ..." + mkdir -p "$(dirname "$dest")" + git clone -q "$REPO" "$dest" + fi + make_venv "$dest" + INSTALLED=$((INSTALLED + 1)) +} + +make_venv() { + local dir="$1" + if [ ! -f "$dir/.venv/bin/python" ]; then + echo " → creating venv..." + python3 -m venv "$dir/.venv" + "$dir/.venv/bin/pip" install -q -r "$dir/scripts/requirements.txt" + fi +} + +# Auto-detect installed runtimes and install there +[ -d "$HOME/.claude" ] && clone_to "$HOME/.claude/skills/product-init" +[ -d "$HOME/.codex" ] && clone_to "$HOME/.codex/skills/product-init" +[ -d "$HOME/.openclaw" ] && clone_to "$HOME/.openclaw/skills/product-init" + +# Fallback: nothing detected → install for Claude Code (most common) +if [ "$INSTALLED" -eq 0 ]; then + echo "No runtime detected — installing for Claude Code (default)" + clone_to "$HOME/.claude/skills/product-init" +fi + +echo "" +echo "product-init ready. Type /product-init in your AI tool to start." 
diff --git a/skills/.experimental/product-init/logo.svg b/skills/.experimental/product-init/logo.svg new file mode 100644 index 00000000..236370e5 --- /dev/null +++ b/skills/.experimental/product-init/logo.svg @@ -0,0 +1,49 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/skills/.experimental/product-init/references/anti-patterns.md b/skills/.experimental/product-init/references/anti-patterns.md new file mode 100644 index 00000000..0c7e295d --- /dev/null +++ b/skills/.experimental/product-init/references/anti-patterns.md @@ -0,0 +1,163 @@ +--- +name: anti-patterns +description: Fifteen ways "done" lies; per pattern the symptom, root cause, programmatic detection, and counter-clamp. +type: reference +--- + +# Fifteen Anti-Patterns + +Each anti-pattern is documented as: **symptom** (what you see), **root cause** (why it happens), **programmatic detection** (which audit script catches it), **counter-clamp** (the practice that prevents it). + +## 1. AI-Plan-Bloat + +**Symptom.** The AI produces a beautiful 40-step plan. Demo of the plan goes well. Nothing is built. + +**Root cause.** The plan is treated as evidence of progress. Plans are cheap; real outputs are expensive. The team rewards the cheap thing. + +**Detection.** `audit_build.py` -- if commits are sparse but TASKS.md is bursting, plan-bloat is in play. `audit_demo_url.py` -- prod URL is missing, returns fewer than 500 bytes, or has no `<title>`. + +**Counter-clamp.** Gate 5/Gate 7. No gate closes on a plan; only on a working live URL. + +## 2. Platform-vs-Product Drift + +**Symptom.** The team ships "platform features" (queue, observability, multi-backend abstraction) instead of product features (user can complete the Golden Path). + +**Root cause.** Platform work feels infinite, low-risk, internally legible. Product work is high-risk and externally judged. The team retreats to platform. 
+ +**Detection.** `filter_task.py` -- platform tasks score below 0.3 against the seven golden_path_steps and return DEFER. + +**Counter-clamp.** Run `filter_task.py` on every ticket before sprinting. Anything that DEFERs goes to PLAN.md deferred list. + +## 3. Capability Epic Trap + +**Symptom.** Jira epics are organised by capability ("Auth Epic", "Generation Epic", "Deploy Epic") rather than by user outcome ("Idea Captured", "Spec Approved", "Preview Deployed"). + +**Root cause.** Capability epics map cleanly to engineering teams; outcome epics force cross-team work and accountability. Capability is the easy default. + +**Detection.** Manual inspection. The orchestrator's `gate 3` review and `templates/jira-epic-skeleton.md` define the six outcome-epics. + +**Counter-clamp.** Re-organise epics around the six outcome-epics in `templates/jira-epic-skeleton.md`. If an epic does not map to a user outcome, it is deferred or merged. + +## 4. Vanity Done + +**Symptom.** Tickets are closed at high rate. Sprint completion looks healthy. The product still does not work end-to-end. + +**Root cause.** "Done" is defined per-component, not per-outcome. A ticket that closes "the auth flow is implemented" passes even when no user can authenticate against the deployed system. + +**Detection.** `audit_e2e.py` -- the `@golden-path` test is failing or absent even though tickets are closed. + +**Counter-clamp.** Per-outcome AC tied to a passing E2E test against the real preview URL. No ticket is "done" until the E2E assertion that proves it is green. + +## 5. Test Theatre + +**Symptom.** Test coverage is high. Mutation score is low. CI is green. Real bugs ship. + +**Root cause.** Tests assert on irrelevant outputs (return values from mocks; HTTP status codes that the test fixture forces). The tests prove the test, not the code. + +**Detection.** `audit_mutation.py` -- mutation score < 60% on changed code. `audit_integration.py` -- HTTP/DB layers are mocked. 
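The 60% mutation threshold in the Detection note above could be checked with something as small as the sketch below. It assumes the usual definition of mutation score (killed mutants divided by total mutants); the function name and the decision to fail when no mutants exist are illustrative choices, not confirmed behaviour of `audit_mutation.py`.

```python
def mutation_gate(killed: int, total: int, threshold: float = 0.60) -> tuple[float, bool]:
    """Return (score, passed), where score is killed mutants / total mutants."""
    if total <= 0:
        # Assumption: no mutants generated counts as a failure, not a vacuous pass.
        return 0.0, False
    score = killed / total
    return score, score >= threshold
```

Failing on zero mutants is deliberate in this sketch: a suite that produces no mutants proves nothing about the changed code.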
+ +**Counter-clamp.** Mutation testing in CI. Integration tests hit real services in a docker-compose or hosted preview environment. + +## 6. Wired-But-Broken Frontend + +**Symptom.** The frontend renders. Buttons exist. Clicking them does nothing useful, or fails silently, or shows a fake success message. + +**Root cause.** Frontend is built against a mocked API client. The mock is never replaced. Or the API exists but the response shape drifts and nobody notices because no end-to-end test runs. + +**Detection.** `audit_real_wiring.py` -- `import.*mock|fakeApi|stubApi|MockAdapter` in non-test source. `audit_e2e.py` -- no `@golden-path` test or it fails. + +**Counter-clamp.** Frontend integration tests against the real backend in CI; ban mock clients in `src/`. + +## 7. Mock-Only Path + +**Symptom.** The integration test suite is large and green; production fails on the first real request. + +**Root cause.** Integration tests import the production HTTP/DB clients but stub them. The tests verify behaviour against the stub, not against the wire. + +**Detection.** `audit_integration.py` -- `vi.mock|jest.mock|@patch|MagicMock` near `requests|axios|prisma|psycopg` import lines. + +**Counter-clamp.** Integration tests are reserved for tests that hit real services. Anything that mocks the wire is a unit test (and named accordingly). At least one E2E test (`@golden-path`) hits the deployed preview URL. + +## 8. Local-Works-Prod-Doesn't + +**Symptom.** "Works on my machine" is the standing joke. Production keeps failing in ways the dev environment cannot reproduce. + +**Root cause.** Configuration drift between local and prod (env vars, feature flags, API base URLs, secrets). Local uses fixtures; prod uses real services with different latency, auth, rate limits. + +**Detection.** `audit_real_wiring.py` -- `localhost|127.0.0.1` in non-test source. `audit_e2e.py` -- `baseURL` in `playwright.config.*` contains `localhost`. 
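A scan in the spirit of the `localhost|127.0.0.1` check might look like the sketch below. The host pattern is taken from the Detection note (with the dots escaped); the `TEST_PATH` heuristic for what counts as test code and the function name `find_local_refs` are assumptions for illustration, and the real `audit_real_wiring.py` may decide this differently.

```python
import re

# Pattern from the Detection note, with dots escaped so "127x0x0x1" does not match.
LOCAL_HOST = re.compile(r"localhost|127\.0\.0\.1")
# Hypothetical heuristic for test code; adjust per repository layout.
TEST_PATH = re.compile(
    r"(^|/)(tests?|e2e|__tests__)/|\.(test|spec)\.[jt]sx?$|(^|/)test_[^/]+\.py$"
)


def find_local_refs(files: dict[str, str]) -> list[tuple[str, int]]:
    """Return (path, line_number) for localhost references in non-test source."""
    hits = []
    for path, text in files.items():
        if TEST_PATH.search(path):
            continue  # test files may legitimately point at local services
        for number, line in enumerate(text.splitlines(), start=1):
            if LOCAL_HOST.search(line):
                hits.append((path, number))
    return hits
```

The function takes an in-memory mapping of path to file content so it can be exercised without touching the filesystem; a real audit would walk the repository instead.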
+ +**Counter-clamp.** Preview URLs per branch (Vercel/Netlify); E2E tests run against preview, never localhost; environment differences captured in a single `infra/env.md`. + +## 9. Console Pollution + +**Symptom.** The browser console is full of warnings and errors during normal use. Users see them; engineers learned to ignore them. + +**Root cause.** The team treats console errors as cosmetic. They are not -- they are usually misconfigured imports, deprecated APIs, missing keys, or unhandled promise rejections that mask real bugs. + +**Detection.** `audit_console_clean.py` -- counts `type: console` events with level `error` or `warning` in Playwright trace JSON. + +**Counter-clamp.** Console error count = 0 on the golden path is a CI gate. Each existing error is named in DEBT.md or fixed. + +## 10. Schema Drift FE/BE + +**Symptom.** The frontend expects field `userName`. The backend returns `user_name`. The frontend renders `undefined`. No test catches it. + +**Root cause.** No contract pinned between FE and BE. TypeScript types and Pydantic schemas drift independently. + +**Detection.** `audit_contract.py` -- oasdiff/graphql-inspector flags breaking changes against `origin/main`. + +**Counter-clamp.** Single source of truth (OpenAPI YAML or GraphQL schema or proto). FE and BE generate types from it. Breaking changes require a DEBT.md row and version bump. + +## 11. Skipped Tests Graveyard + +**Symptom.** The test suite has dozens of `.skip`, `.only`, `it.todo`, or `xfail`. Each was added with the intent to fix; none ever was. + +**Root cause.** "Just unblock CI for now" is the default response to a flaky test. The fix never returns to the top of the priority list. + +**Detection.** `audit_build.py` -- new `\.skip|\.only|it\.todo|xfail|@pytest\.mark\.skip` in test files in the diff. `audit_unit.py` -- skipped count > 0. + +**Counter-clamp.** Skipped tests block Gate 5. Either fix or delete; do not commit a skip. + +## 12. 
TODO/FIXME Accretion + +**Symptom.** `rg TODO` returns hundreds of hits. None of them have owners or dates. + +**Root cause.** TODOs are cheap to add and have no enforcement. They become a passive-aggressive backlog. + +**Detection.** `audit_build.py` -- new TODO/FIXME/XXX/HACK in the diff that is not referenced in DEBT.md. + +**Counter-clamp.** Every TODO must have a DEBT.md row referencing `<file>:<line>`. The DEBT.md row has owner + date + acceptance condition. + +## 13. No-Real-User-Walked-It + +**Symptom.** The team demos to itself. The demo passes. No external user has used the product. + +**Root cause.** Internal demos are easy to schedule; external user sessions require recruiting and confront the team with reality. + +**Detection.** `audit_uat.py` -- `UAT_REPORT.md` missing or unsigned; no `uat-v*` git tag. + +**Counter-clamp.** Gate 6 requires a signed UAT report from a real user against the live URL. Internal sign-off does not count. + +## 14. Demo URL Rot + +**Symptom.** PRODUCT.md says the prod URL is X. Hitting X returns 503, or 404, or a 200 with an empty body, or a 200 with an old version. + +**Root cause.** Deploys are intermittent; the URL is never monitored; rot accumulates between demos. + +**Detection.** `audit_demo_url.py` -- HTTP 200 + body length > 500 bytes + non-empty `<title>`. `audit_deploy.py` -- prod_url HEAD fails or no `<title>`. + +**Counter-clamp.** Smoke job in CI hits prod_url after every deploy. Gate 7 audit runs on a schedule (daily) to detect rot between deploys. + +## 15. BC Theatre (Backward-Compat Theatre) + +**Symptom.** API breaking changes ship with a soothing changelog ("we improved the response shape"). Customers' integrations break silently. + +**Root cause.** "Backward-compat" is asserted in prose but not enforced. The schema diff is hand-eyeballed. + +**Detection.** `audit_contract.py` -- oasdiff flags breaking change without a DEBT.md row documenting the breakage and migration plan. 
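
The pairing of "breaking change" with "DEBT.md row" in the Detection line above can be sketched as a simple cross-check. This is an illustration only: the `BreakingChange` record, the `"METHOD path"` key, and the assumption that a DEBT.md table row mentions that key verbatim are all hypothetical, and the sketch does not call or parse oasdiff itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakingChange:
    """Hypothetical, simplified record of one breaking change from a schema diff."""

    operation: str  # e.g. "GET"
    path: str       # e.g. "/users/{id}"
    summary: str

    @property
    def key(self) -> str:
        # Assumption: DEBT.md rows reference a change as "METHOD path".
        return f"{self.operation} {self.path}"


def undocumented_breaking_changes(
    changes: list[BreakingChange], debt_md: str
) -> list[BreakingChange]:
    """Return the changes whose key appears in no Markdown table row of DEBT.md."""
    rows = [line for line in debt_md.splitlines() if line.lstrip().startswith("|")]
    return [c for c in changes if not any(c.key in row for row in rows)]
```

Any change returned by such a check is one that shipped without a named debt row, which is the condition this pattern describes.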
+ +**Counter-clamp.** Breaking changes require a versioned endpoint or a feature flag, plus a DEBT.md row, plus a customer-comms plan in HANDOFF.md. + +## How to use this list + +These fifteen are not exhaustive; they are the ones with the highest signal in the failure mode this skill was built to prevent. Print them. Tape them above the standup board. When a gate goes green-but-suspicious, walk the list before declaring victory. Each pattern's counter-clamp is enforced by an audit script; the audits exist because human discipline alone fails on tired Friday afternoons in week six of a six-week cycle. diff --git a/skills/.experimental/product-init/references/continuous-discovery.md b/skills/.experimental/product-init/references/continuous-discovery.md new file mode 100644 index 00000000..78d32111 --- /dev/null +++ b/skills/.experimental/product-init/references/continuous-discovery.md @@ -0,0 +1,82 @@ +--- +name: continuous-discovery +description: Teresa Torres's Opportunity Solution Tree and weekly cadence; how Question 13 locks the rhythm in. +type: reference +--- + +# Continuous Discovery + +## Origin + +Teresa Torres's _Continuous Discovery Habits_ (2021, Product Talk Press) is the canonical reference. Torres had been running discovery coaching at Product Talk since 2014, and the book consolidated practices observed across hundreds of product teams. The blog archive at https://www.producttalk.org is the running, opinionated commentary; the book is the structured frame. + +Two ideas anchor the practice: + +1. **Continuous discovery cadence.** A product trio (PM + engineer + designer) interviews at least one customer every week. Not every quarter, not every "discovery sprint". Every week. The cadence matters more than the technique. +2. **Opportunity Solution Tree (OST).** A visual map that ties the desired outcome (top) to opportunities (gaps in customer experience), to candidate solutions, to assumption tests. 
The tree is updated weekly as new interviews surface new opportunities or kill old ones. + +## Why weekly + +Torres's empirical claim: teams running monthly or quarterly discovery underweight customer evidence relative to internal opinion, because the gap between "what the engineer believes today" and "what the customer told us last quarter" widens with time. Weekly closes the gap. The corollary: weekly cadence forces the team to recruit a sustainable interview pipeline (Calendly+ recruiting service, a partner like UserInterviews.com, or in-product prompts) instead of one-shot recruiting drives. Sustainable recruiting is itself a forcing function for "are we still selling to a segment that exists?". + +## The Opportunity Solution Tree + +The OST has four levels: + +1. **Outcome.** The business or user outcome the team is responsible for, e.g. "First-time users complete a deployed product within 10 minutes". Outcomes are not features; they are measurable changes in user behaviour. +2. **Opportunities.** Customer pains, desires, or unmet needs surfaced through interviews. Each opportunity is phrased in the customer's voice and is mappable to one outcome. "I tried four AI builders this month and gave up because each one half-finished the project" is an opportunity. +3. **Solutions.** Candidate ways to address the opportunity. Two or more per opportunity, to force comparison. A single-solution opportunity is a sign the team has stopped exploring. +4. **Assumption tests.** For each candidate solution, the smallest experiment that would falsify the most-load-bearing assumption. Tests are sized in hours, not weeks. + +The OST is updated every week. Opportunities die when interviews stop surfacing them. Solutions die when assumption tests falsify them. Outcomes change rarely (quarterly at most). The tree is shared with the whole team and serves as the single source of truth for "what we are working on and why". 
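
The four levels described above map naturally onto a small nested data structure. The following is a minimal sketch, assuming an in-memory representation; the class and function names are hypothetical and the skill does not prescribe any particular file format for the OST.

```python
from dataclasses import dataclass, field


@dataclass
class AssumptionTest:
    assumption: str
    hours: float  # tests are sized in hours, not weeks


@dataclass
class Solution:
    name: str
    tests: list[AssumptionTest] = field(default_factory=list)


@dataclass
class Opportunity:
    quote: str  # phrased in the customer's voice
    solutions: list[Solution] = field(default_factory=list)


@dataclass
class Outcome:
    statement: str  # a measurable change in user behaviour
    opportunities: list[Opportunity] = field(default_factory=list)


def single_solution_opportunities(outcome: Outcome) -> list[str]:
    """Return the quotes of opportunities with fewer than two candidate solutions."""
    return [o.quote for o in outcome.opportunities if len(o.solutions) < 2]


tree = Outcome(
    statement="First-time users complete a deployed product within 10 minutes",
    opportunities=[
        Opportunity(
            quote="Each builder half-finished the project",
            solutions=[Solution("resume from last gate"), Solution("smaller scope")],
        ),
        Opportunity(
            quote="I cannot tell whether the deploy worked",
            solutions=[Solution("post-deploy smoke check")],
        ),
    ],
)
```

Running `single_solution_opportunities(tree)` on this example flags the second opportunity, which has only one candidate solution and so offers nothing to compare.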
+ +## How Question 13 locks it in + +Question 13 of the 14 mandatory discovery questions reads: "Discovery cadence (weekly)?". The audit (`audit_constitution.py`) checks for the keyword `discovery cadence` or `weekly touchpoint` in PRODUCT.md or PLAN.md. The substantive answer must include: + +- **Who interviews.** Named PM + engineer + designer trio, or a documented stand-in. +- **How often.** Weekly is the bar. +- **Where it lives.** The OST file or board path; an OST that is nowhere is an OST that does not exist. +- **What feeds the funnel.** Recruiting source (existing user list, Calendly+, partner). A weekly cadence with no recruiting plan is fiction. + +A team that says "we will interview as needed" is failing Q13. As-needed = never. + +## Cadence rituals + +Torres recommends three artefacts per cadence cycle: + +1. **Interview snapshot.** A one-page summary per interview: who, when, the verbatim "moment of struggle" (Bob Moesta's term, JTBD), the inferred opportunity, the surprise. Snapshots are filed in a discoverable location (Notion DB, GitHub, a shared Drive folder). +2. **OST diff.** Weekly: which opportunities did we add, which did we kill, which solutions advanced, which assumption tests ran. The diff is a 5-minute standup item, not a 1-hour meeting. +3. **Outcome trend.** The outcome metric (Q5) plotted against time. Without this, the OST has no feedback loop. + +For an AI product builder MVP, a healthy week might look like: 2 customer interviews (one new, one returning), 1 prototype tested with 1 user, 1 OST diff (added "users want a one-click rollback after a generated deploy fails"), and the outcome metric (10-minute success rate) tracked against the previous week. + +## Common anti-patterns + +**The interview drought.** "We are heads-down this sprint, we will interview after launch." The launch never comes; the interviews never resume. By the time the team looks up, they have built features no current user has asked for. 
The defence: weekly cadence is a non-negotiable team commitment, not a phase. + +**The opportunity hoard.** A 200-node opportunity tree where nothing is ever removed. Opportunities should expire if no interview surfaces them for two cycles. The OST is a living tree, not a graveyard. + +**The single-solution opportunity.** Every opportunity has exactly one candidate solution, and that candidate is "the thing the engineer wants to build". The tree is now a roadmap with extra steps. Force two or more candidate solutions per opportunity. + +**The unmeasured outcome.** The outcome at the top of the tree is "delight users" or "be the best". Unmeasurable outcomes mean the tree has no kill criteria; everything below it is justifiable forever. Outcomes must be a measurable user behaviour with a baseline and a target. + +**The interview-without-listening.** The PM runs the interview, asks five leading questions, and gets the answers they expected. Torres's _Continuous Discovery Habits_ has a chapter on interview craft; the short version is "ask about specific past behaviour, not opinion or speculation". + +## How OST integrates with the 9 gates + +| Gate | OST artefact | +| --- | --- | +| 1 Discovery | Outcomes (PRODUCT.md), top-level opportunities (PLAN.md kill criteria mapping), Q13 cadence commitment. | +| 2 SoW | Selected solutions with assumption tests; appetite tied to the test. | +| 4 Build | Each in-flight ticket maps to a solution node on the tree. | +| 5 QA | E2E `@golden-path` tests verify the outcome at the top of the tree. | +| 7 Deploy | Outcome metric tracked post-launch; tree feedback loop closes. | + +## Reading list + +- Torres, _Continuous Discovery Habits_, Product Talk Press, 2021. +- Torres blog: https://www.producttalk.org -- especially the OST series (https://www.producttalk.org/opportunity-solution-tree/). +- Bob Moesta with Greg Engle, _Demand-Side Sales 101_, Lioncrest, 2020 -- interview craft.
+- Steve Portigal, _Interviewing Users_, 2nd ed., 2023 -- interview craft. +- Erika Hall, _Just Enough Research_, 2nd ed., 2019 -- recruiting and bias. diff --git a/skills/.experimental/product-init/references/deferred-until-proven.md b/skills/.experimental/product-init/references/deferred-until-proven.md new file mode 100644 index 00000000..39951930 --- /dev/null +++ b/skills/.experimental/product-init/references/deferred-until-proven.md @@ -0,0 +1,73 @@ +--- +name: deferred-until-proven +description: Six capability categories that are banned from MVP scope until specific evidence unlocks them; the why and the unlock condition for each. +type: reference +--- + +# Deferred Until Proven + +Six categories of work are banned from MVP scope by default. They are not banned forever. They are banned **until specific evidence** justifies their inclusion. Until that evidence exists, every hour spent on them is an hour not spent moving the user along the golden path. + +`audit_sow.py` checks PLAN.md's deferred list and requires at least three of these six categories to be explicitly named. `audit_sow.py` also scans TASKS.md for any of these terms and flags CRITICAL if found inside MVP scope. + +The six categories below are listed with: **definition**, **why it leaks early**, **what unlocks it**. + +## 1. RBAC (Role-Based Access Control) + +**Definition.** Per-user, per-resource permission systems with roles, groups, policies, scopes, OAuth scopes, attribute-based rules, organisation/team hierarchies, delegated admin, audit-grade access logs. + +**Why it leaks early.** RBAC is intellectually seductive because it sounds like "just data modelling". In practice it is six months of edge cases (impersonation, soft-delete, cross-org sharing, expired roles, transient grants, service accounts) that interact with every feature you build afterwards. 
Worse, RBAC built before the user's actual workflow is known almost always gets ripped out: the roles you guessed are not the roles real customers need. + +**What unlocks it.** A paying or actively-piloting customer asks for it in writing, names two specific roles, and the lack of RBAC is blocking purchase or churning the account. Until then, two hardcoded roles ("user" and "admin") satisfy 95% of MVP needs. If you absolutely must, ship `is_admin: boolean` and move on. + +## 2. Compliance / audit chain + +**Definition.** SOC 2, ISO 27001, HIPAA, PCI, GDPR DSAR pipelines, immutable audit logs, evidence collection systems, data classification frameworks, data retention policies enforced in code. + +**Why it leaks early.** Engineers and PMs treat compliance as a checklist they can pre-empt. They cannot. Real compliance is contextual: the auditor's interpretation, the customer's risk team, the actual data flows you have, not the ones you guess. Pre-built compliance scaffolding is almost always wrong on at least three axes by the time the auditor walks in. + +**What unlocks it.** A signed customer contract that requires it, with an attestation deadline, with budget. Or a board mandate with a date. Or your data flows touch regulated data (PHI, cardholder, EU PII) for a real user, not a hypothetical one. Until then, write a one-page security note, encrypt at rest, encrypt in transit, do not log secrets, and ship. + +## 3. Marketplace + +**Definition.** Multi-tenant catalogues of third-party plugins, themes, integrations, extensions, with submission flows, review processes, revenue share, partner portals, sandbox environments. + +**Why it leaks early.** Marketplaces are second-order products. They require (a) enough first-order customers to justify a marketplace and (b) enough second-order developers to populate it. Building one before either condition is met produces an empty marketplace, which is worse than no marketplace because it signals abandonment. 
Salesforce, Shopify, and Atlassian all built their marketplaces years after their core product had clear PMF. + +**What unlocks it.** Three or more independent third parties have built unofficial integrations against your API and asked for an official channel. Or a pilot customer has explicitly said they will buy if and only if their existing partner is in your marketplace. Until then, a documented webhook + REST API is the marketplace. + +## 4. Multi-region + +**Definition.** Active-active or active-passive deployments across two or more cloud regions, geo-routing, region-local data residency, region-aware failover, replicated databases. + +**Why it leaks early.** "What if we get popular in Asia?" is a great question to think about and a terrible question to build for. Multi-region adds 40-100% to operational cost, doubles your incident-blast-radius surface area, and requires data engineering you do not have until you have product. Most products that prematurely went multi-region rolled back to single-region within 18 months because the latency wins did not justify the operational cost. + +**What unlocks it.** A specific paying customer in a specific region requires data residency by contract, OR your latency SLA is being broken for >5% of traffic from a specific geography. Until then, single region with a CDN in front gets you to ~95% of the world at acceptable latency. + +## 5. Observability stack + +**Definition.** Distributed tracing, full-fidelity APM, custom Prometheus / Datadog / Honeycomb dashboards, custom alerting taxonomies, log aggregation pipelines, error budgets, SLI/SLO frameworks. + +**Why it leaks early.** Observability is "platform work that feels like product work". It is genuinely useful at scale and genuinely a distraction before scale. Building a Honeycomb-grade pipeline before you have users to observe produces dashboards no one looks at and alerts that fire on nothing. 
+ +**What unlocks it.** First production incident that took >30 minutes to root-cause because logs were insufficient. OR sustained traffic above the threshold where eyeballing logs stops working (~10 req/s sustained, ~1 incident/week). Until then, structured JSON logs to stdout + your platform's free log viewer + Sentry-equivalent error tracking is enough. + +## 6. Enterprise integrations + +**Definition.** SAML SSO, SCIM provisioning, custom audit log exports, IP allowlisting, on-prem deployment options, custom data residency, BAA-signing, custom contract negotiations baked into the product. + +**Why it leaks early.** Enterprise features are high-margin and high-effort. Building them before you have an enterprise pipeline is building margin you cannot capture. Each enterprise integration is a six-week project that interacts non-trivially with auth, audit, and data layers. Doing them speculatively (a) ages badly because enterprise standards shift and (b) produces unused complexity in everyone else's experience. + +**What unlocks it.** A specific enterprise lead with a specific integration requirement, a deal size that justifies six weeks, and a procurement timeline you can match. Or a partnership where the integration is the wedge. Until then, OIDC + email/password covers 90% of the SMB market that you should be selling to first. + +## Common failure mode: the deferred list as theatre + +The deferred list is only useful if it is honoured. The failure mode is "we put it on the deferred list and then quietly built it anyway". `audit_sow.py` catches the easy form (TASKS.md mentions a deferred term) but not the subtle form (the team rebuilds RBAC under a different name like "permissions service"). The defence is the weekly Gate 1 audit run by someone outside the build team, treating the list as a contract. + +## Why three of six is the threshold + +`audit_sow.py` requires at least three of these six categories to be named in PLAN.md's deferred list. 
The number is not arbitrary. Empirically, teams that name fewer than three of these categories tend to be teams that have not actually done the deferral work; they have just written "TBD" against scope. Three forces the team to think about which two or three of the six are most tempting for their product and to name them out loud, which is what the discipline of deferral requires. + +## Re-litigating the deferral + +Deferred items are not closed forever. Each gate-review meeting includes one question per deferred item: "Has anything changed?". If yes (a customer asked, a contract was signed, an SLA was broken), the item is moved into a future increment with a date. If no, it stays deferred. This is the discipline that prevents both premature inclusion and amnesia-driven re-implementation. diff --git a/skills/.experimental/product-init/references/four-risks.md b/skills/.experimental/product-init/references/four-risks.md new file mode 100644 index 00000000..6d81b22a --- /dev/null +++ b/skills/.experimental/product-init/references/four-risks.md @@ -0,0 +1,89 @@ +--- +name: four-risks +description: Marty Cagan's four product risks (Value, Usability, Feasibility, Viability) and how Question 7 enforces them. +type: reference +--- + +# The Four Risks + +## Source + +Marty Cagan, founder of Silicon Valley Product Group (SVPG) and author of _Inspired_ (2008, revised 2017) and _Empowered_ (2020, with Chris Jones), articulated the four risks that every product feature must address. The canonical post is "The Four Big Risks": https://www.svpg.com/four-big-risks/. The framework underlies the Sequoia Arc programme materials (https://www.sequoiacap.com/article/company-building-arc/) and is widely adopted across product orgs whose teams operate as "missionaries, not mercenaries" (Cagan's phrase). + +The four risks are: + +1. **Value risk.** Will the customer buy this, or use it? +2. **Usability risk.** Can the customer figure out how to use it? +3. 
**Feasibility risk.** Can our engineers build it, with the time, skills, and tech we have? +4. **Viability risk.** Does this work for our business -- legal, sales, marketing, finance, support? + +Cagan's argument is that every feature has all four risks at all times. Failure in any one kills the feature. Most teams obsess over feasibility (because engineers are loud about it) and viability (because finance is loud about it) and underweight value and usability (because customers are not in the standup). + +## Why Q7 demands a four-risk ledger + +Question 7 of the 14 mandatory discovery questions requires the team to fill a four-row ledger for the MVP scope: + +| Risk | Rating | Mitigation | +| --- | --- | --- | +| Value | high / med / low | how we will test | +| Usability | ... | ... | +| Feasibility | ... | ... | +| Viability | ... | ... | + +The audit (`audit_constitution.py`) looks for "value", "usability", "feasibility", "viability" tokens in PRODUCT.md/PLAN.md and flags absence. The deeper review is human: does each risk row have a real mitigation, or is "high / TBD" being used to wave the risk past? + +## Risk by risk + +### Value risk + +The hardest risk and the most-skipped. "Will customers buy?" cannot be answered by an exec saying yes; it can only be answered by customers behaving in a way that costs them something (signing up, paying, returning, telling a friend). Mitigations that count: pre-sales, paid pilot, signed LOI, working prototype with measured retention, Sean Ellis 40% PMF survey above threshold. Mitigations that do not count: positive feedback in user interviews ("interest is not intent" -- Erika Hall), exec gut feel, a Slack channel of fans. + +For an AI product builder MVP, the value risk pivots on "would a real founder pay $50/month if the tool produced a working deploy by minute 10?". Until you have one founder paying, value risk is high. + +### Usability risk + +A subset of value risk but operationally distinct. 
Even if the customer wants the outcome, can they reach it? An AI builder where the user must write Cypress tests to verify their site has lost the usability fight. Mitigations: high-fidelity prototype with five-user usability studies, per Nielsen Norman Group's empirical guideline that five users surface about 85% of usability issues (https://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/). Mitigations that do not count: the team using the product themselves (selection bias). + +### Feasibility risk + +The risk engineers usually own. "Can we build this in the appetite (Q8)?" Mitigations: spike, prototype, third-party tech evaluation, capability check on the AI provider, latency test under realistic load. Feasibility risk often hides in AI products as "we assume the model will handle this" until the eval shows it does not. Run the eval before committing to scope. + +### Viability risk + +The "rest of the business" risk. Will sales price this without losing margin? Will support handle the volume? Will legal sign off on the data flows? Will marketing find the audience? The deferred list in `deferred-until-proven.md` is mostly viability-risk debt: RBAC, compliance, marketplace, etc. are deferred because tackling them speculatively is a viability mis-bet, not because they do not matter. + +## Common failure mode: the lopsided ledger + +The most common Q7 failure is a ledger where three of the four risks read "low" and one reads "high". This usually means the team has thought about one risk and bracketed the others. Cagan's empirical claim is that if the ledger looks lopsided, you have not interrogated the low-rated risks; you have just assumed them. The audit cannot catch this; the human review must.
+ +Healthy MVP ledgers tend to look like: + +| Risk | Rating | Why | +| --- | --- | --- | +| Value | high | no paying customer yet; paid pilot is Q1 milestone | +| Usability | medium | five-user prototype study scheduled week 2 | +| Feasibility | medium | model latency under 8s on benchmark prompts; needs 3s; spike planned | +| Viability | medium | pricing TBD; legal review of generated-code IP scheduled | + +If you write "low" anywhere, you owe a sentence per row explaining the evidence that makes it low. "Our team has built this before" is not evidence; it is selection bias. + +## Risk and appetite: the cross-table + +The four-risk ledger composes with the appetite (Q8). A small appetite plus a high-value-risk feature is a kill candidate: the appetite is too small to test the value question. A large appetite plus an all-low-risk feature is overspending: you are using six weeks where two would do. The Shape Up pitch (Q8/10/11) and the four-risk ledger should be read together at the start of every cycle. + +## Mapping summary + +| Risk | Owner | MVP mitigation pattern | +| --- | --- | --- | +| Value | PM + customer | paid pilot, pre-sales, retention measurement | +| Usability | designer + customer | 5-user prototype study, click tests | +| Feasibility | engineering | spike, eval, latency test | +| Viability | exec + cross-functional | pricing, legal, support load model | + +## Reading list + +- Cagan, _Inspired_, 2nd ed., 2017. +- Cagan, _Empowered_, 2020. +- Cagan, "The Four Big Risks", svpg.com. +- Erika Hall, _Just Enough Research_, 2nd ed., 2019. +- Sequoia, "Company Building: Arc", sequoiacap.com. 
diff --git a/skills/.experimental/product-init/references/golden-path-doctrine.md b/skills/.experimental/product-init/references/golden-path-doctrine.md new file mode 100644 index 00000000..6fc3d76c --- /dev/null +++ b/skills/.experimental/product-init/references/golden-path-doctrine.md @@ -0,0 +1,73 @@ +--- +name: golden-path-doctrine +description: The central question every task in this skill must answer; the doctrine the filter script enforces. +type: reference +--- + +# Golden Path Doctrine + +## The central question + +> **"Bu is kullaniciyi fikirden calisan deploy edilmis urune yaklastiriyor mu? Hayirsa MVP disi."** +> +> _Does this work move a user closer to a working, deployed product? If not, it is out of MVP scope._ + +This is the only question that matters. Every task, every story, every PR, every architectural decision is filtered through it. If the answer is "yes, and I can name the user, the product step, and the next deploy", the task is on path. If the answer requires a paragraph of justification, the task is off path and goes to the deferred list. + +## Why this doctrine exists + +The skill was born out of a specific failure mode. A team built an "AI product builder" platform. Jira showed dozens of closed tickets each sprint. Demos showed AI plans being generated, capabilities being unlocked, integrations going live. The team felt productive. The system was active. + +The product never actually worked end-to-end. A real user could not start at "I have an idea" and finish at "I have a deployed working product". The frontend rendered, but it was wired to plans that were never executed. The tests were green, but they tested mocks of the things that had never been built. The Jira board was a fiction the team had agreed to believe in. + +The root cause was not laziness. The root cause was that the team had stopped asking the central question. They had switched from "are we delivering the product?" to "are we shipping capabilities of the platform?". 
Every closed ticket was a real piece of work. None of them, in aggregate, produced the product. + +The Golden Path Doctrine is the antidote. It is a forcing function: if a task cannot be tied to one of the seven golden path steps, it is not allowed to consume MVP time. + +## The seven golden path steps + +A user, in their own words, walks from idea to live product through exactly seven steps: + +1. **Intake.** I tell the system my idea, my persona, my pain. +2. **Spec.** The system gives me back a specification I recognise as my idea, in a form I can edit. +3. **Code.** The system writes the code that implements that spec. +4. **Test.** The system tests the code against the spec, on real infrastructure, end-to-end. +5. **Deploy.** The system deploys the code to a URL the user can hit. +6. **Handoff.** The user gets the source, docs, credentials, and a walk-through. +7. **Support.** When something breaks, the user has a path to fix or escalate. + +Anything that does not advance one of these seven steps is deferred. RBAC, compliance, marketplace, multi-region, observability stack, enterprise integrations -- all are post-MVP unless they are blocking the seven-step walk for the actual user being served. + +## How `filter_task.py` enforces it + +The filter script is intentionally simple. 
It tokenises the task description and scores tokens against keyword sets for each of the seven golden path steps: + +| Step | Keywords (sample) | +| --- | --- | +| 1 Intake | intake, discovery, interview, persona, idea, brief, kickoff | +| 2 Spec | spec, specification, requirements, AC, acceptance, design, wireframe | +| 3 Code | code, implement, build, refactor, feature, endpoint, component | +| 4 Test | test, unit, integration, e2e, playwright, vitest, pytest, coverage | +| 5 Deploy | deploy, release, preview, staging, production, vercel, netlify | +| 6 Handoff | handoff, documentation, runbook, credentials, walkthrough | +| 7 Support | support, incident, monitor, alert, bugfix, hotfix, maintenance | + +If the maximum score across all seven steps is below 0.3, the script prints `DEFER: no golden_path_step match` and exits non-zero. This is deliberately blunt. It is better to have a noisy false positive that forces a human to write a one-line justification than a silent off-path drift that costs a sprint. + +The filter is not a substitute for judgement. It is a tripwire. When it fires, the human in the loop reads the task, decides whether the keywords are missing because the work is off-path or because the description is vague, and either rewrites the task description (so the filter passes) or moves the task to the deferred list. The filter cannot be made smarter without making it less useful. + +## What the doctrine does not say + +The doctrine does not say "ship sloppy". It says "ship narrow". The standard for what is on the path is high; the test of whether something is on the path is yes/no, not "kind of". A polished onboarding screen for a feature the user does not yet need is off-path. A flaky deploy of the feature the user does need is on-path-but-broken, which is a Gate 5 problem, not a Gate 1 problem. + +The doctrine does not say "no platforms". Platforms are legitimate when the platform is the product. 
The failure mode is platform-as-displacement-activity: building platform features that no current user needs because they feel like progress. The doctrine cuts that. + +The doctrine does not say "no debt". It says debt must be named. Every TODO/FIXME/HACK in a diff requires a DEBT.md row referencing `<file>:<line>` (`audit_build.py`). Naming the debt makes it possible to pay it; hiding it makes it compound. + +## Daily ritual + +At the start of every working session, the lead reads the Golden Path sentence from PRODUCT.md aloud. Yes, aloud. The doctrine is a habit, not a slogan. Every standup ends with a one-line answer to the central question for each task in flight: yes, on step N; or no, deferring. Anything that cannot answer in one line is decomposed until it can. + +## Failure mode if the doctrine is dropped + +When the Golden Path Doctrine is dropped, the visible signature is fast: backlog inflation, increasing rate of closed tickets, decreasing rate of working golden-path runs, a demo URL that becomes an empty shell, console errors that no one fixes, integration tests that mock the things that should be real. `audit_real_wiring.py`, `audit_console_clean.py`, `audit_integration.py`, and `audit_demo_url.py` exist precisely to detect that drift before it becomes the next post-mortem. diff --git a/skills/.experimental/product-init/references/jtbd.md b/skills/.experimental/product-init/references/jtbd.md new file mode 100644 index 00000000..e3b81960 --- /dev/null +++ b/skills/.experimental/product-init/references/jtbd.md @@ -0,0 +1,70 @@ +--- +name: jtbd +description: Christensen Jobs-to-be-Done framework, the milkshake study, and how it shapes Question 2's persona+pain answer. 
+type: reference +--- + +# Jobs to be Done (JTBD) + +## Origin + +The Jobs-to-be-Done framework was articulated by Clayton Christensen in his 2003 book _The Innovator's Solution_ (with Michael Raynor) and the 2005 _Harvard Business Review_ article "Marketing Malpractice: The Cause and the Cure" (https://hbr.org/2005/12/marketing-malpractice-the-cause-and-the-cure). The central reframe is that customers do not "buy products"; they "hire products to do a job". The unit of analysis is the job, not the demographic. + +## The milkshake study + +Christensen's most cited example is the McDonald's milkshake. McDonald's had spent years optimising the milkshake based on demographic feedback (sweeter, thicker, chunkier) with no measurable impact on sales. A JTBD-led re-analysis revealed two distinct jobs being hired: + +- **Morning commute job (40% of milkshake sales).** Adults driving to work needed something to occupy a long, boring drive that would last them most of the commute, was easy to consume one-handed, would not stain a suit, and would suppress hunger until lunch. They were hiring the milkshake against bagels (too dry, crumbly), donuts (sticky), and fruit (gone in two minutes). The thicker the milkshake, the better -- it lasted longer through the straw. +- **Afternoon parent-treat job (separate market).** Parents bringing kids in for a treat needed something fast, kid-portion-sized, parent-permissible. Thickness mattered less; speed of service mattered more. + +Demographic analysis ("our buyers are 35-45 male commuters") missed the entire causal chain. Job analysis surfaced it in one round of interviews. The redesigns that followed (thicker mornings, smaller-portion afternoons) doubled milkshake category revenue. + +The pattern generalises: products succeed when they win a job; they fail when they target a demographic. The job-to-be-done is the causal force; demographics are correlations. 
+ +## The three dimensions of a job + +Christensen and later JTBD practitioners (notably Bob Moesta and Chris Spiek; see Moesta, _Demand-Side Sales 101_, 2020) decompose every job into three dimensions: + +1. **Functional job.** What measurable outcome does the user achieve? "Get a working web app deployed in 10 minutes." "Submit a tax return without errors." "Find a song to match this mood." The functional job is the easiest to articulate and the easiest to test. +2. **Social job.** How does the user want to be seen by others while doing this job? "Look like a competent technical founder to my non-technical co-founder." "Demonstrate to my team that I tried the AI tools we discussed." "Send my designer a clean Figma file." The social job is what makes the user choose the more visible alternative even when it is functionally weaker. +3. **Emotional job.** How does the user want to feel? "Feel that I am making real progress." "Feel that I have not wasted my afternoon." "Feel that I am not going to be embarrassed by tomorrow's demo." The emotional job is what drives churn when unmet and ten-out-of-ten loyalty when met. + +Skipping any of the three produces a product that is locally optimal but globally rejected. Many AI products win the functional job ("we can generate the code") and lose the emotional job ("the user does not feel they have a working product"), which is why retention craters. + +## Why JTBD is Question 2 + +Question 2 of the 14 mandatory discovery questions reads: "Persona + pain (3)". The 3 is not arbitrary. The team is required to write three pains, structured implicitly along the three JTBD dimensions: + +- **Functional pain.** "I cannot get a working URL in under an hour using existing tools." +- **Social pain.** "My co-founder thinks AI tools are not real engineering, and I want to disprove that with a demo I can hand him." +- **Emotional pain.** "Every time I try a new builder, I end up with a half-finished project that mocks me from my desktop."
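The functional/social/emotional structure of the three pains can be checked mechanically once each pain carries a dimension label. The sketch below is hypothetical: the `Pain` type, the lowercase dimension labels, and `check_pains` are illustrative names and are not part of `audit_constitution.py`; it also assumes pains have already been parsed out of PRODUCT.md.

```python
from __future__ import annotations

from dataclasses import dataclass

# The three JTBD dimensions named in this document.
DIMENSIONS = ("functional", "social", "emotional")


@dataclass(frozen=True)
class Pain:
    dimension: str  # assumed label; the real PRODUCT.md format may differ
    text: str


def check_pains(pains: list[Pain]) -> list[str]:
    """Return a list of problems; an empty list means the Q2 bar is met."""
    problems = []
    # Q2 asks for three pains in total.
    if len(pains) < 3:
        problems.append(f"expected at least 3 pains, found {len(pains)}")
    # Each dimension must be covered by at least one pain.
    covered = {p.dimension for p in pains}
    for dim in DIMENSIONS:
        if dim not in covered:
            problems.append(f"no {dim} pain")
    return problems
```

A check like this can only confirm that the labels are present; whether a pain labelled "emotional" really is one still needs a human reader.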
+ +A persona statement with one functional pain is failing JTBD. Two pains in the same dimension is failing JTBD. Three pains across the three dimensions is the minimum bar. + +## What JTBD is not + +JTBD is not a replacement for personas; it is a constraint on them. A persona without a job is a horoscope. A job without a persona is an abstract noun. The pair is what travels. + +JTBD is not a license to skip user research. The jobs are discovered, not deduced. The Torres Continuous Discovery cadence (`continuous-discovery.md`) is the practice that surfaces real jobs over time; JTBD is the analysis frame for what surfaces. + +JTBD is not the same as "user stories". A user story is a deliverable contract ("As a / I want / so that"). A job is a causal claim about the user. The story is downstream of the job. + +## Anti-patterns + +**The featurised job.** "The user wants to use our AI orchestrator." This is not a job; it is the product wearing a job costume. If the user could hire a different product to do the same job, name it. If they cannot, the job is too narrow. + +**The aspirational job.** "The user wants to ship a billion-dollar SaaS." Real jobs are immediate, instrumented, and provable in a 10-minute success signal (Question 4). Billion-dollar is not a job; it is a brochure. + +**The internal job.** "The user wants efficient code generation." Internal jobs describe how the team thinks about its tech stack, not what the user is hiring the product for. The user hires the product to ship a working URL today. + +## How `audit_constitution.py` checks Q2 + +The audit looks for the keyword `persona` (case-insensitive) in PRODUCT.md. This is a low bar by design; the heavy lifting is human review. The audit is a tripwire that catches "we forgot to fill it in"; it cannot catch "we filled it in with a featurised job". The Gate 1 review meeting is where a teammate reads the persona+pain entry aloud and asks: "Could a different product do this job? 
Is the pain functional, social, AND emotional? Could we write this paragraph for any other product and have it still be true?". If yes to any, the answer goes back for revision. + +## Reading list + +- Christensen, _The Innovator's Solution_, ch. 3. +- Christensen, "Marketing Malpractice", _HBR_ 2005. +- Bob Moesta, _Demand-Side Sales 101_, Lioncrest, 2020. +- Tony Ulwick, _Jobs to be Done: Theory to Practice_, 2016 -- the more rigorous, outcomes-driven (ODI) version of JTBD. +- Alan Klement, _When Coffee and Kale Compete_, 2016 -- a working-day-to-day JTBD primer. diff --git a/skills/.experimental/product-init/references/lean-startup.md b/skills/.experimental/product-init/references/lean-startup.md new file mode 100644 index 00000000..6cf0486e --- /dev/null +++ b/skills/.experimental/product-init/references/lean-startup.md @@ -0,0 +1,91 @@ +--- +name: lean-startup +description: Eric Ries's Build-Measure-Learn loop, validated learning, and innovation accounting; the source of Q6 and the kill criteria discipline. +type: reference +--- + +# Lean Startup + +## Origin + +Eric Ries published _The Lean Startup_ in 2011 (Crown Business). The book synthesised three lineages: Toyota's lean manufacturing (Womack and Jones), Steve Blank's customer development (_The Four Steps to the Epiphany_, 2005), and the agile/XP software movement. Ries had previously been CTO and co-founder of IMVU, where the practices that became Lean Startup were field-tested under conditions of extreme uncertainty. + +Three central ideas anchor the framework: + +1. **Validated learning.** Progress is measured in "what we have learned about what customers want", not in "code shipped" or "features built". +2. **Build-Measure-Learn loop.** The smallest possible product is built, measured against a hypothesis, and the result feeds the next loop. The loop is the unit of work. +3. 
**Innovation accounting.** A reporting framework that tracks per-cohort metrics so a team can tell whether its product is genuinely improving or whether headline metrics are being inflated by mix shift, vanity selection, or growth. + +## Why Lean Startup powers Q6 and kill criteria (Q9) + +Question 6 of the 14 mandatory discovery questions reads: "What is the riskiest assumption?" The answer is the seed of the next Build-Measure-Learn loop. The riskiest assumption is the one that, if false, kills the product; everything else is downstream of it. For an AI product builder MVP the riskiest assumption is often: "A solo technical founder, given a working URL in 10 minutes, will pay $50/month within a week." If that assumption is false, no amount of better orchestration matters. + +Question 9 reads: "Kill criteria?" Lean Startup's contribution is the insistence that the criteria be falsifiable and pre-committed. Ries's argument in the "Innovation Accounting" chapter is that without pre-committed kill criteria, teams will rationalise survival of every feature post hoc. The criteria must be: numeric, time-bound, and tied directly to the riskiest assumption. + +`audit_sow.py` enforces the structural form (at least one kill criterion bullet under `## Kill Criteria` in PLAN.md). The substantive review is human: is the criterion actually falsifiable? "Users will love it" is not falsifiable. "If fewer than 3 of 10 paid pilot users complete the golden path within 10 minutes, we kill the v1 architecture and re-shape" is falsifiable. + +## The Build-Measure-Learn loop + +Each loop is a structured experiment: + +1. **Build.** The smallest product or change that exposes the riskiest assumption to a real test. +2. **Measure.** Cohort-based, leading-indicator metrics that distinguish the experiment cohort from the baseline. +3. **Learn.** A written, signed-off conclusion: assumption confirmed, refuted, or inconclusive (and what we will do next). + +The cycle is short by design. 
Ries advocates loops measured in days for early-stage products and weeks for mature ones. Teams that run quarterly loops have stopped being lean; they are doing waterfall in a hoodie. + +For this skill, the loop maps onto the 9 gates as follows. Gates 1-2 set the hypothesis. Gates 3-5 build and instrument. Gates 6-7 measure with real users on the live URL. Gate 8 captures the learning in HANDOFF.md and DEBT.md. Gate 9 makes the regime survive. + +## Validated learning vs. activity + +Ries's most-cited observation: "If we are building the wrong thing, then optimising the product or its marketing won't yield significant results." The corollary is that hours spent building the wrong thing are not just wasted; they are negative-progress, because they entrench commitments to the wrong thing. + +Validated learning means the team explicitly distinguishes: + +- **Output.** Tickets closed, lines of code, sprint velocity. Internal-facing. +- **Outcome.** Cohort retention, conversion, NPS, the Sean Ellis 40% test, golden-path completion rate. Customer-facing. +- **Learning.** Which assumptions were confirmed or refuted; what the team now believes that it did not believe last cycle. + +A high-output, low-outcome, no-learning quarter is the catastrophic case. The 9-gate regime is built to make it loud: Gate 1's outcome metric, Gate 5's real-URL E2E, Gate 6's signed UAT, and Gate 7's prod_url 200 check are all outcome anchors. + +## Innovation accounting + +Ries proposes three steps to graduating from "leap-of-faith" to "validated": + +1. **Use a minimum viable product to establish real data on where the company is right now.** Not what you think; what is true today. +2. **Tune the engine** -- iterate on the product to improve the metrics from baseline toward the goal. +3. **Pivot or persevere** -- if iteration is not closing the gap, change strategy. + +The "pivot or persevere" decision is made on a fixed cadence (Ries suggests monthly; this skill recommends per-cycle). 
Without a fixed cadence, the persevere choice becomes a default and the pivot never happens. + +For Gate 1's outcome metric (Q5), the team writes the baseline AND the target AND the cadence on which the team will look at the gap. "Improve conversion" without those three is not an outcome metric; it is a wish. + +## Pivot taxonomy + +Ries names ten pivot types in chapter 8. The most operationally useful for AI product builders are: + +- **Zoom-in pivot.** A single feature of the product becomes the whole product. (Most successful AI tool pivots are zoom-ins.) +- **Customer-segment pivot.** Same product, different customer. (When the engineer-tool became a designer-tool.) +- **Customer-need pivot.** Different problem. (The hardest pivot; usually a near-restart.) +- **Platform pivot.** App-to-platform or platform-to-app. (Often capability-led; often wrong.) +- **Engine-of-growth pivot.** Switch growth model (viral / sticky / paid). Late-stage. + +Naming the candidate pivot in the kill-criteria section makes the pivot a real option, not a panic move. "If we miss the kill criteria, we pivot to a customer-segment pivot toward [specific segment]" is a pre-committed plan; the team is not improvising under stress. + +## Common anti-patterns + +**Vanity metrics.** Total signups, total page views, total stars. None of them tell you whether you are making validated progress on the riskiest assumption. Replace with cohort retention, paid conversion, golden-path completion rate. + +**The MVP that is not minimum.** "MVP" gets pattern-matched to "v1.0 with three features". A real MVP is the smallest experiment that exposes the riskiest assumption. A landing page can be an MVP. A Figma click-through can be an MVP. The actual product, if it costs six months, is not an MVP; it is the bet. + +**Persevere by default.** The team has not articulated kill criteria, so every cycle the answer to "should we pivot?" is "let's give it another cycle". 
After four cycles, the sunk cost is too large to confront. Pre-committed kill criteria break this. + +**Learning without writing it down.** A loop that ends with "we learned a lot" but produces no written conclusion did not learn anything; it ran a vibe. Each loop ends with a one-page learnings doc filed alongside the OST. + +## Reading list + +- Ries, _The Lean Startup_, Crown Business, 2011. +- Steve Blank, _The Four Steps to the Epiphany_, 2nd ed., 2013. +- Steve Blank and Bob Dorf, _The Startup Owner's Manual_, 2012. +- Ash Maurya, _Running Lean_, 3rd ed., 2022 (operational primer). +- Ries, "Innovation Accounting" -- chapter 7 of _The Lean Startup_, also summarised at http://theleanstartup.com/principles. diff --git a/skills/.experimental/product-init/references/nine-gate-spec.md b/skills/.experimental/product-init/references/nine-gate-spec.md new file mode 100644 index 00000000..e21a1dbb --- /dev/null +++ b/skills/.experimental/product-init/references/nine-gate-spec.md @@ -0,0 +1,145 @@ +--- +name: nine-gate-spec +description: Per-gate purpose, deliverable, audit script, and tooling for the 9-gate delivery regime. +type: reference +--- + +# Nine-Gate Specification + +Each gate is a unit of work whose closure is conditioned on a programmatic audit returning zero HIGH/CRITICAL findings. A gate that is "manually approved" without an audit is not closed; it is wished closed. The audits live in `scripts/` and are wired into CI by Gate 9. + +## Gate 1 -- Discovery Constitution + +**Purpose.** Convert a one-line idea into a constitution the team can be held to. The output is the contract every later gate refers back to. + +**Deliverables.** +- `PRODUCT.md` with frontmatter, one-sentence Golden Path, persona+pain, outcome metric, riskiest assumption, four-risk ledger, golden_path_step 1-7, prod_url placeholder. +- `SPEC.md` with scope and acceptance criteria. +- `PLAN.md` with appetite, kill criteria, deferred list (>=3 items, >=3 from the banned list). 
+- `TASKS.md` initial backlog. +- `COMPETITIVE_BENCHMARK.md` with at least one numeric comparison vs v0/Bolt/Lovable/Railway. + +**Audit.** `audit_constitution.py`. Checks file presence, frontmatter, required sections, single-sentence Golden Path (regex on terminators), and the 14 mandatory questions. + +**Tools.** `python-frontmatter`, `pyyaml`, regex. + +## Gate 2 -- Statement of Work + +**Purpose.** Freeze scope. The SoW is the appetite and the kill criteria, not a spec. + +**Deliverables.** +- Numeric appetite in PLAN.md (e.g. "4 weeks" or "$15k" or "1 sprint"). +- At least one falsifiable kill criterion. +- Deferred list with >=3 of: RBAC, compliance, marketplace, multi-region, observability, integrations. +- TASKS.md must not contain any banned-list term in MVP scope. + +**Audit.** `audit_sow.py`. + +**Tools.** Regex against PLAN.md/TASKS.md. + +## Gate 3 -- Design + +**Purpose.** Every screen the user touches is mapped to a `golden_path_step`. Off-step screens are killed before pixels are pushed. + +**Deliverables.** A `design/` directory with one file per screen, each carrying a `golden_path_step: N` frontmatter field and a one-line user goal. Wireframes/mockups checked in or linked. No screen exists without a step. + +**Audit.** Manual review (programmatic audit deferred to v1.1). The orchestrator's `gate 3` command currently prints a manual-review notice. The skill remains red on Gate 3 until a designer signs off in `design/SIGNOFF.md`. + +**Tools.** Figma, Excalidraw, or plain markdown wireframes; whatever leaves a versioned artefact. + +## Gate 4 -- Build + +**Purpose.** Every commit traces to an AC; debt is named, not hidden; new TODO/FIXME without a DEBT.md row is a hygiene failure. + +**Deliverables.** +- Every commit on the branch contains a ticket id (`[A-Z]+-\d+`). +- Every new TODO/FIXME/XXX/HACK has a DEBT.md row referencing `<file>:<line>`. +- No new `.skip`, `.only`, `it.todo`, or `xfail` in test files. 
+- No mock/stub/localhost references in non-test source (`audit_real_wiring.py`). + +**Audit.** `audit_build.py`, `audit_real_wiring.py`. + +**Tools.** `git diff origin/main...HEAD`, `git log --oneline`, ripgrep. + +## Gate 5 -- QA + +**Purpose.** The product works for a real user against real infrastructure. Tests are not theatre. + +**Deliverables.** +- Unit tests pass with zero skips (`audit_unit.py`). +- Integration tests do not mock HTTP/DB layers (`audit_integration.py`). +- E2E tests run against a non-localhost preview URL with at least one `@golden-path` test green (`audit_e2e.py`). +- API contract: oasdiff/graphql-inspector against origin/main shows no breaking change OR a DEBT.md entry (`audit_contract.py`). +- Diff coverage >= 80% (`audit_coverage.py`). +- Mutation score >= 60% if Stryker/mutmut configured (`audit_mutation.py`). +- Static analysis (eslint/tsc/ruff/mypy) clean with strict flags (`audit_static.py`). +- Console error/warning count = 0 on golden path (`audit_console_clean.py`). + +**Tools.** vitest/jest/pytest, Playwright, oasdiff, graphql-inspector, diff-cover, Stryker/mutmut, eslint/tsc/ruff/mypy. + +## Gate 6 -- UAT + +**Purpose.** A human walks the product on the live URL and signs off in writing. + +**Deliverables.** +- `e2e/uat/` with at least one `*.uat.spec.{ts,js,py}`. +- `UAT_REPORT.md` with `sha256:` of the artefact under test and `Signed-off-by:` line. +- Git tag matching `uat-v*`. + +**Audit.** `audit_uat.py`. + +**Tools.** Playwright (UAT scripts), `git tag`, sha256. + +## Gate 7 -- Deploy + +**Purpose.** Production is real, monitored, and rollback-tested. + +**Deliverables.** +- `prod_url` in PRODUCT.md frontmatter resolves with HTTP 200 and non-empty `<title>`. +- A workflow file under `.github/workflows/*.yml` references `smoke`. +- Either a git tag `rollback-drill-YYYY-MM-DD` within 14 days OR `runbooks/rollback-drills.md` modified within 14 days. + +**Audit.** `audit_deploy.py`, `audit_demo_url.py`. 
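The rollback-drill recency rule in the Gate 7 deliverables can be sketched as a pure function over a list of tag names. This is a minimal sketch, not the logic of `audit_deploy.py`, which is not shown here: `has_recent_drill` is a hypothetical name, and the inclusive 14-day window and the choice to read the drill date from the tag name (rather than the tag's creation time) are assumptions.

```python
import re
from datetime import date, timedelta

# Tag shape taken from the Gate 7 deliverable: rollback-drill-YYYY-MM-DD.
TAG_RE = re.compile(r"^rollback-drill-(\d{4})-(\d{2})-(\d{2})$")


def has_recent_drill(tags, today, max_age_days=14):
    """Return True if any rollback-drill tag is dated within the window."""
    for tag in tags:
        match = TAG_RE.match(tag)
        if not match:
            continue
        try:
            drilled = date(*map(int, match.groups()))
        except ValueError:
            # Well-formed tag name but impossible calendar date; ignore it.
            continue
        age = today - drilled
        # Assumption: future-dated tags do not count, and day 14 still passes.
        if timedelta(0) <= age <= timedelta(days=max_age_days):
            return True
    return False
```

The tag names could be collected with `git tag --list 'rollback-drill-*'` and passed in along with `date.today()`.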
+ +**Tools.** `requests`, GitHub Actions, `git tag`. + +## Gate 8 -- Handoff + +**Purpose.** The user receives source, runbook, credentials path, walkthrough, and a debt ledger. + +**Deliverables.** +- `README.md`, `runbooks/runbook.md`, `DEBT.md`, `HANDOFF.md` exist and are non-empty. +- `HANDOFF.md` has filled sections: Credentials Vault Link, Admin Walkthrough Video, Knowledge Transfer Date. + +**Audit.** `audit_handoff.py`. + +**Tools.** Filesystem checks, regex. + +## Gate 9 -- Warranty + +**Purpose.** The audit regime travels with the code. Future contributors cannot ship around it. + +**Deliverables.** +- `.github/workflows/*.yml` references `audit_constitution`, `audit_build`, `audit_qa` (or the full set). +- Branch protection on main includes those audit jobs as required checks. + +**Audit.** `audit_warranty.py`. + +**Tools.** YAML parsing (`pyyaml`), `gh api repos/:owner/:repo/branches/main/protection`. + +## Tooling stack quick reference + +| Tool | Purpose | Where used | +| --- | --- | --- | +| ripgrep | fast grep for debt markers, mock patterns | Gate 4, cross-cutting | +| diff-cover | diff-only line coverage threshold | Gate 5 | +| Stryker (Node) / mutmut (Py) | mutation testing | Gate 5 | +| Playwright | E2E + UAT against real preview URL | Gate 5, 6 | +| oasdiff | OpenAPI breaking-change detection | Gate 5 | +| graphql-inspector | GraphQL breaking-change detection | Gate 5 | +| eslint, tsc, ruff, mypy | static analysis (strict) | Gate 5 | +| `gh` CLI | branch protection inspection | Gate 9 | +| markdownlint, vale | doc hygiene | Gate 1, 8 | +| repolinter | repo structural conformance | Gate 9 | + +The tools must exist on `$PATH` for the audit to run. If a tool is missing, the audit emits an `INFO` finding with the install hint and continues. Missing tools are not pass-by-default at Gate 9; the warranty audit checks that CI itself has them installed. 
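The missing-tool behaviour described above (emit an `INFO` finding with an install hint, then continue) can be sketched with the standard library's `shutil.which`. This is an illustration under assumptions, not the skill's own code: `check_tools` and the finding keys `severity`, `tool`, and `message` are hypothetical names, and the install hints are placeholders supplied by the caller.

```python
import shutil


def check_tools(tools, which=shutil.which):
    """Return one INFO finding per tool that is not on PATH.

    tools: mapping of executable name -> install hint (caller-supplied).
    which: lookup function, injectable so the check can be tested.
    """
    findings = []
    for name, hint in tools.items():
        if which(name) is None:
            findings.append(
                {
                    "severity": "INFO",  # assumed finding shape
                    "tool": name,
                    "message": f"{name} not found on PATH; install hint: {hint}",
                }
            )
    # The caller continues with whatever tools are present.
    return findings
```

Because the findings are `INFO` rather than HIGH or CRITICAL, a local run is not blocked by a missing tool; the stricter requirement that CI has the tools installed is left to the Gate 9 check.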
diff --git a/skills/.experimental/product-init/references/research-evidence.md b/skills/.experimental/product-init/references/research-evidence.md new file mode 100644 index 00000000..e0d2424c --- /dev/null +++ b/skills/.experimental/product-init/references/research-evidence.md @@ -0,0 +1,70 @@ +--- +name: research-evidence +description: Empirical grounding for the 9-gate regime; PMF base rates, discovery-debt arxiv evidence, and the gate each finding maps to. +type: reference +--- + +# Research Evidence + +This skill is opinionated, but the opinions are grounded. The regime is hard because the base rates of failure for AI-built and agency-built products are catastrophic, and because the failure modes have well-documented signatures in the public record. This file is the citation map; every gate in `nine-gate-spec.md` traces back to at least one finding here. + +## 1. The PMF base rate is not a forecast, it is gravity + +CB Insights, "The Top 12 Reasons Startups Fail" (2024 update, n = 431 startup post-mortems) finds that **42-43% of startups fail because there is no market need** for what they built. This is the single largest cause and dwarfs "ran out of cash" (29%) and "got outcompeted" (19%). Source: https://www.cbinsights.com/research/startup-failure-reasons-top/. The implication is operational: any process that reduces "no-market-need" risk by even a few percentage points pays for itself many times over. Gate 1 (Discovery Constitution) and Gate 2 (Statement of Work) exist primarily for this reason. The 14 mandatory discovery questions in `audit_constitution.py` force the team to stand at the centre of "no market need" and stare at it before a single line is written. + +## 2. Sean Ellis 40% test for product-market fit + +Sean Ellis's PMF survey question -- "How would you feel if you could no longer use this product?" with the 40% "very disappointed" threshold -- has been replicated across hundreds of B2B and B2C products. 
Source: https://www.startup-marketing.com/the-startup-pyramid/ and Ellis's work at Dropbox, LogMeIn, and Eventbrite. The Superhuman PMF Engine (Rahul Vohra, First Round Review, 2018) operationalised this as a weekly tracked metric tied to product changes: https://review.firstround.com/how-superhuman-built-an-engine-to-find-product-market-fit/. **Gate 1 Question 5 (outcome metric) and Question 13 (weekly discovery cadence)** trace directly to these papers; the requirement is that the team writes down the metric and the cadence before they build, not after. + +## 3. Discovery-debt has a measurable signature in the academic record + +Several arxiv papers establish that requirements/discovery work skipped early shows up later as integration failures, rework, and abandoned features: + +- arXiv:1709.04749 -- "Software Engineering Antipatterns" (Brown et al.) catalogues "Analysis Paralysis", "Premature Implementation", and "Death March" as recurring patterns whose root cause is shipping into ill-defined problem spaces. Each maps to symptoms `audit_constitution.py` and `audit_sow.py` are designed to surface (vague golden path, missing kill criteria, undocumented deferred list). +- arXiv:2103.07999 -- studies on requirements completeness in agile contexts find a strong negative correlation between explicit acceptance criteria written before sprint start and post-release defect density. **Gate 4** (`audit_build.py`) enforces commit-to-AC tickets so that the agile loop actually has the AC it claims to have. +- arXiv:1712.00674 -- empirical study of test smells finds that skipped/disabled tests, mocked external dependencies, and "test theatre" (high green-rate, low mutation-score) correlate strongly with defect escape rate. **Gate 5** (`audit_mutation.py`, `audit_integration.py`, `audit_unit.py`) operationalises these findings: skipped tests are HIGH, mocked HTTP/DB in integration tests is CRITICAL, and mutation score < 60% is HIGH. + +## 4. 
Sequoia Arc, Cagan, and the four risks + +Sequoia's Arc programme materials (https://www.sequoiacap.com/article/company-building-arc/) and Marty Cagan's _Inspired_ and _Empowered_ (SVPG) define four product risks: **Value, Usability, Feasibility, Viability**. Cagan's blog post "The Four Big Risks" (https://www.svpg.com/four-big-risks/) is the canonical source. Empirically, teams that name the four risks per feature ship measurably fewer "wrong-thing-built" rollbacks. **Gate 1 Question 7** forces the four-risk ledger; if a feature has unanswered risk on any axis, it cannot pass Gate 2. + +## 5. JTBD: Christensen's milkshake study + +Clayton Christensen's "Jobs to be Done" framework (HBR, "Marketing Malpractice", 2005, https://hbr.org/2005/12/marketing-malpractice-the-cause-and-the-cure) and the McDonald's milkshake study reframe the user from demographic to job-hirer. The functional/social/emotional dimensions of the job are what survive contact with reality; demographics rarely do. **Gate 1 Question 2** demands a persona statement plus three pains, structured as JTBD-style functional, social, and emotional outcomes. + +## 6. Continuous Discovery and the Opportunity Solution Tree + +Teresa Torres, _Continuous Discovery Habits_ (2021) and her ProductTalk archive (https://www.producttalk.org/opportunity-solution-tree/) prescribe a weekly touchpoint cadence with at least one user, mapped against an opportunity solution tree. Torres's data from Product Talk Academy cohorts shows weekly cadence teams ship 2-3x more validated features per quarter than monthly cadence teams. **Gate 1 Question 13** locks the cadence in writing. + +## 7. Shape Up and the appetite-first contract + +Basecamp's _Shape Up_ (Ryan Singer, 2019, https://basecamp.com/shapeup) replaces estimates with appetites: a number you would actually spend if the answer is "yes". Empirically, fixed-appetite teams kill more bad bets earlier because the appetite is the kill threshold. 
**Gate 1 Questions 8 (appetite), 9 (kill criteria), 10 (rabbit holes), 11 (no-gos / deferred list)** are the Shape Up pitch shape. `audit_sow.py` enforces a numeric appetite and at least three deferred items, with at least three from the banned list (RBAC, compliance, marketplace, multi-region, observability, integrations). + +## 8. Lean Startup and validated learning + +Eric Ries, _The Lean Startup_ (2011) and the Build-Measure-Learn loop frame every release as an experiment. The "innovation accounting" chapter argues that without explicit kill criteria, teams will rationalise survival of every feature. **Gate 1 Question 6 (riskiest assumption)** and **Question 9 (kill criteria)** make this explicit. `audit_sow.py` requires falsifiable kill conditions in PLAN.md. + +## 9. Working Backwards: the Amazon PR-FAQ + +Colin Bryar and Bill Carr, _Working Backwards_ (2021), document Amazon's PR-FAQ as the gating ritual: write the press release before the project starts. This forces the customer-facing outcome to be writable in plain English, surfacing capability-only features as embarrassing prose. The PR-FAQ template lives in `templates/pr-faq.md` and Question 4 (10-minute success signal) is the PR's headline distilled. + +## 10. Anti-patterns from lived industry experience + +Beyond the academic record, the immediate trigger for this skill is a single repeating field signature: a team built an "AI product builder", marked Jira tickets closed for months, and the actual product never worked end-to-end. AI plans gave false comfort; "done" was per-component; the frontend was wired but broken; the tests were green but mutation-dead; the demo URL was an empty shell. Every one of those failure modes is in `references/anti-patterns.md` with a programmatic counter-clamp. The 9-gate regime is built so that next time, the audits go red before the celebration goes out. 
+ +## Mapping summary + +| Finding | Gate(s) | +| --- | --- | +| CB Insights 43% no-market-need | 1, 2 | +| Sean Ellis 40% / Superhuman engine | 1 (Q5, Q13) | +| arXiv 1709.04749 antipatterns | 1, 2, 4 | +| arXiv 2103.07999 AC completeness | 4 | +| arXiv 1712.00674 test smells | 5 | +| Sequoia Arc / Cagan four risks | 1 (Q7) | +| Christensen JTBD | 1 (Q2) | +| Torres OST | 1 (Q13) | +| Shape Up appetite | 1 (Q8-11), 2 | +| Lean Startup kill criteria | 1 (Q6, Q9), 2 | +| Working Backwards PR-FAQ | 1 (Q1, Q4) | +| Field anti-patterns | 4, 5, 6, 7 | diff --git a/skills/.experimental/product-init/references/shape-up.md b/skills/.experimental/product-init/references/shape-up.md new file mode 100644 index 00000000..8ff17e43 --- /dev/null +++ b/skills/.experimental/product-init/references/shape-up.md @@ -0,0 +1,94 @@ +--- +name: shape-up +description: Basecamp's Shape Up pitch (Problem, Appetite, Solution, Rabbit Holes, No-gos) and how it powers Q8/10/11. +type: reference +--- + +# Shape Up + +## Origin + +_Shape Up_ by Ryan Singer (2019, free at https://basecamp.com/shapeup) documents the product development methodology Basecamp evolved over fifteen years and used to build Basecamp 3 and Hey. Singer was Head of Product Strategy at Basecamp and wrote the book to externalise practices that were, until then, informal. The methodology has since been adopted by hundreds of small product teams who reject Scrum's estimate-driven cadence in favour of fixed-time, variable-scope cycles. + +## The five-section pitch + +The Shape Up pitch is the artefact every initiative produces before being scheduled into a six-week cycle. It has five sections: + +1. **Problem.** The raw user pain or business need. One paragraph, in user-vocabulary, with a specific instance. +2. **Appetite.** How much time the team is willing to spend on this. A small batch is two weeks; a big batch is six weeks. The appetite is fixed; the scope flexes around it. +3. 
**Solution.** A "fat-marker sketch" of the solution: enough fidelity to communicate the shape, not so much that engineering loses optionality. Singer is explicit that the sketch is intentionally underspecified. +4. **Rabbit holes.** The places where the team could spend the entire appetite if not warned off. Listed explicitly so the team can route around them. +5. **No-gos.** What is explicitly out of scope. The deferred list. + +Together, the five sections make the pitch a contract: the team will spend X time on this problem, attempt this shape of solution, avoid these rabbit holes, and not build these no-gos. If the appetite is exceeded, the work is killed -- not extended. + +## Why Shape Up powers Q8, Q10, and Q11 + +The 14 mandatory discovery questions adopt three Shape Up sections directly: + +- **Q8 -- numeric appetite.** "What is your appetite in weeks or budget?" `audit_sow.py` regex-greps PLAN.md for `appetite\s*[:=]?\s*\d+\s*(week|day|sprint|hour|\$)`. Without a number, the appetite is a wish. +- **Q10 -- rabbit holes.** "What rabbit holes will eat the appetite?" Examples for an AI product builder: state synchronisation between agent runs, cross-cloud deploy abstractions, semantic diffing of generated code. Naming them upfront lets the team build trip-wires (timeboxes, escape hatches, "just hardcode it for v1"). +- **Q11 -- deferred list / no-gos.** "What is explicitly out of v1?" Must include three or more of the banned MVP categories (`deferred-until-proven.md`). + +## The fixed-time, variable-scope contract + +The deepest Shape Up commitment is that **time is fixed and scope is flexible**, never the inverse. The team does not "estimate how long it will take and then do all of it". The team takes the appetite, picks the scope it can ship in that appetite, and ships exactly that. If the appetite runs out, the work is shipped at whatever-state-it-is-in or killed. There is no "we just need two more weeks". 
Two more weeks is a new cycle, requiring a new pitch. + +This contract is incompatible with most velocity-based agile rituals. Shape Up has no story points, no burndown chart, no daily standup as ceremony. It has the pitch, the betting table (where pitches are selected for the next cycle), the cycle itself (six weeks of focused work), and the cool-down (two weeks of cleanup, exploration, debt). The cycle is the unit; the sprint is not. + +For this skill, the appetite (Q8) is the kill threshold. If the team is consistently shipping past appetite, either the appetites are wrong or the scope is wrong; either way, the pitch contract is broken. + +## Hill chart, not burndown + +Shape Up replaces burndown with the **hill chart**: every work item is plotted on an "uphill / downhill" curve. Uphill = "still figuring out how to do it". Downhill = "now executing the known plan". The chart is updated by ICs themselves, not by a project manager pulling tickets. Items stuck uphill are the rabbit holes (Q10) that the pitch warned about; they are escalation candidates. + +This skill does not enforce hill charts (that would be over-prescription), but the pattern matters: the audit suite is built so that gates fail when the work is "obviously stuck" -- a stuck Gate 5 is the audit's version of "stuck uphill on testing". The signal is the same; the artefact differs. + +## Common anti-patterns + +**The infinite appetite.** "We will spend whatever it takes." This is not Shape Up; it is a death march. Real appetites are uncomfortable; they force scope choices. + +**The estimate disguised as appetite.** "Appetite: 6 weeks (we estimate it will take 6 weeks)." The appetite is what you are willing to spend, not what you predict it will cost. If the estimate equals the appetite, you have not made a bet; you have copied a guess. + +**The expanding solution.** The fat-marker sketch becomes a Figma mock, becomes a 30-page spec. By the time engineering starts, the optionality has been spent. 
Singer's rule: keep the sketch genuinely fat-marker; let the team make local decisions. + +**The unspecified no-gos.** "Out of scope: TBD." The no-gos must be enumerated. If it is not on the no-gos list, an engineer is allowed to assume it is in scope. Audit (`audit_sow.py`) catches the empty deferred list. + +## Worked example: AI product builder MVP pitch + +```yaml +problem: | + Solo technical founders trying to ship a B2B SaaS MVP spend 4-6 weekends + on infrastructure and CRUD scaffolding before any customer-facing + feature. They give up before they reach a paying customer. + +appetite: 6 weeks + +solution_sketch: | + A single-page intake form -> AI generates spec -> user edits -> AI + generates code -> tests run on a real preview URL -> one-click deploy + to Vercel/Fly. The user gets a working URL within 10 minutes of intake. + +rabbit_holes: + - Multi-cloud deploy abstraction. Use Vercel only in v1. + - Generic spec editor. Pre-fill from a small template library. + - Custom auth. Use NextAuth or equivalent provider in v1. + - Cross-agent state sync. Use a single agent in v1; multi-agent v2. + +no_gos: + - RBAC; one user per project. + - Compliance / SOC2 evidence collection. + - Marketplace of templates; ship 5 hardcoded templates. + - Multi-region. + - Observability dashboard; structured logs only. + - SAML / SCIM enterprise integrations. +``` + +This pitch passes `audit_sow.py`: numeric appetite, named rabbit holes, deferred list with five of the six banned categories. + +## Reading list + +- Singer, _Shape Up_, Basecamp, 2019, free online: https://basecamp.com/shapeup. +- "Shape Up FAQ" on Basecamp's site for adoption questions. +- "Three Levels of Product Hierarchy", Singer's follow-up on Substack, for the appetite -- pitch -- cycle relationship. +- David Heinemeier Hansson's "Reconsider" essay (https://world.hey.com/dhh/reconsider-41f44e9c) for the cultural backdrop. 
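As an illustration of the Q8 check described under "Why Shape Up powers Q8, Q10, and Q11", here is a minimal Python sketch of the appetite pattern. The regex is quoted from that section; the function name `has_numeric_appetite` and the case-insensitive flag are assumptions made for this example, not necessarily what `audit_sow.py` does.

```python
import re

# Pattern quoted from the Q8 description above.
# Assumption: matching is case-insensitive so that "Appetite" headings also count.
APPETITE_RE = re.compile(
    r"appetite\s*[:=]?\s*\d+\s*(week|day|sprint|hour|\$)",
    re.IGNORECASE,
)


def has_numeric_appetite(plan_text: str) -> bool:
    """Return True when the text states an appetite as a number followed by a unit.

    Hypothetical helper for illustration only.
    """
    return APPETITE_RE.search(plan_text) is not None
```

Under these assumptions, `appetite: 6 weeks` matches and `Appetite: as long as it takes` does not. The pattern as quoted expects the unit after the number, so a budget written as `appetite: $5000` would not match, while `appetite: 5000 $` would.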
diff --git a/skills/.experimental/product-init/references/tooling-stack.md b/skills/.experimental/product-init/references/tooling-stack.md new file mode 100644 index 00000000..8beefcef --- /dev/null +++ b/skills/.experimental/product-init/references/tooling-stack.md @@ -0,0 +1,162 @@ +--- +name: tooling-stack +description: Exact installation commands, invocation patterns, and exit-code semantics for every tool the audit suite calls. +type: reference +--- + +# Tooling Stack + +Every tool the audit suite calls is documented here with: **install**, **invocation**, **exit-code semantics**. If a tool is missing, `lib/tool_runner.py` returns exit code 127 and the calling audit emits an `INFO` finding with the install hint. + +## Doc and prose hygiene + +### markdownlint +- **Install.** `npm i -g markdownlint-cli` (Node) or `markdownlint-cli2`. +- **Invocation.** `markdownlint "**/*.md" --ignore node_modules` +- **Exit codes.** `0` clean; `1` lint failures; `2` config error. +- **Used at.** Gate 1 (PRODUCT.md, SPEC.md, etc.), Gate 8 (HANDOFF.md, README.md). + +### vale +- **Install.** `brew install vale` or download from https://vale.sh. +- **Invocation.** `vale --output=line .` +- **Exit codes.** `0` clean; `1` issues found. +- **Used at.** Gate 1 / 8 prose review (optional). + +### gitlint +- **Install.** `pipx install gitlint`. +- **Invocation.** `gitlint --commits origin/main..HEAD` +- **Exit codes.** `0` clean; `1` violation. +- **Used at.** Gate 4 (commit message hygiene; complementary to `audit_build.py` ticket-id check). + +## Search and static analysis + +### ripgrep (rg) +- **Install.** `brew install ripgrep`. +- **Invocation.** `rg -n 'TODO|FIXME|XXX|HACK' --type py --type ts` +- **Exit codes.** `0` matches found; `1` no matches; `2` error. +- **Used at.** Gate 4 debt-marker scan, Gate 4 real-wiring scan. Audits use Python `re` directly so ripgrep is optional. + +### eslint +- **Install.** project-local `npm i -D eslint`. +- **Invocation.** `npx eslint . 
--max-warnings 0` +- **Exit codes.** `0` clean; `1` lint errors; `2` config error. +- **Used at.** `audit_static.py`. Strict flag is `--max-warnings 0`. + +### tsc +- **Install.** project-local `npm i -D typescript`. +- **Invocation.** `npx tsc --noEmit` +- **Exit codes.** `0` clean; non-zero on type errors. +- **Used at.** `audit_static.py`. + +### ruff +- **Install.** `pipx install ruff` or `uv pip install ruff`. +- **Invocation.** `ruff check .` +- **Exit codes.** `0` clean; `1` lint findings; `2` config. +- **Used at.** `audit_static.py`. + +### mypy +- **Install.** `pipx install mypy`. +- **Invocation.** `mypy --strict .` +- **Exit codes.** `0` clean; `1` type errors; `2` usage. +- **Used at.** `audit_static.py`. + +### ts-prune / vulture +- **ts-prune install.** `npm i -D ts-prune`. Detects unused TypeScript exports. +- **vulture install.** `pipx install vulture`. Detects unused Python code. +- **Invocation.** `npx ts-prune --error` / `vulture src/`. +- **Exit codes.** ts-prune exits `1` on findings only with `--error` (otherwise `0`); vulture exits `3` on findings. +- **Used at.** Optional Gate 4 dead-code check; not yet wired but recommended. + +## Test and coverage + +### vitest +- **Install.** `npm i -D vitest`. +- **Invocation.** `npx vitest run --reporter=json`. +- **Exit codes.** `0` all pass; `1` failures. +- **Used at.** `audit_unit.py`. + +### jest +- **Install.** `npm i -D jest`. +- **Invocation.** `npx jest --json --ci`. +- **Exit codes.** `0` pass; `1` failures. +- **Used at.** `audit_unit.py`. + +### pytest +- **Install.** `pip install pytest pytest-json-report` in the project environment (pipx would isolate the plugin from pytest). +- **Invocation.** `pytest --tb=no -q --json-report --json-report-file=-`. +- **Exit codes.** `0` pass; `1` failures; `2` interrupted; `3` internal error; `4` usage error; `5` no tests. +- **Used at.** `audit_unit.py`. + +### diff-cover +- **Install.** `pipx install diff-cover`. +- **Invocation.** `diff-cover coverage.xml --compare-branch=origin/main --fail-under=80 --json-report diff-cover.json`. +- **Exit codes.** `0` met threshold; `1` below. +- **Used at.** `audit_coverage.py`. 
Falls back to coverage.xml line-rate or `coverage-final.json` if diff-cover is missing. + +### Stryker (Node mutation testing) +- **Install.** `npm i -D @stryker-mutator/core @stryker-mutator/vitest-runner`. +- **Invocation.** `npx stryker run`. +- **Output.** `reports/mutation/mutation-report.json` with `mutationScore`. +- **Used at.** `audit_mutation.py`. Threshold: 60%. + +### mutmut (Python mutation testing) +- **Install.** `pipx install mutmut`. +- **Invocation.** `mutmut run` then `mutmut results`. +- **Used at.** `audit_mutation.py`. Threshold: 60%. + +## E2E and contract + +### Playwright +- **Install.** `npm i -D @playwright/test && npx playwright install`. +- **Invocation.** `npx playwright test --reporter=json`. +- **Exit codes.** `0` pass; `1` failures. +- **Used at.** `audit_e2e.py`. Requires baseURL to be a real preview URL (not localhost). Tests must include at least one `@golden-path` tag. + +### oasdiff +- **Install.** `go install github.com/tufin/oasdiff@latest` or `brew install oasdiff`. +- **Invocation.** `oasdiff breaking <old.yaml> <new.yaml>`; add `--fail-on ERR` to get a failing exit code. +- **Exit codes.** `0` by default, even when breaking changes are reported; with `--fail-on ERR`, `1` when breaking changes are found. +- **Used at.** `audit_contract.py`. The audit writes `origin/main:openapi.yaml` to a temp file for comparison. + +### graphql-inspector +- **Install.** `npm i -g @graphql-inspector/cli`. +- **Invocation.** `graphql-inspector diff <old.graphql> <new.graphql>`. +- **Used at.** `audit_contract.py`. + +## Deploy and operations + +### Vercel CLI +- **Install.** `npm i -g vercel`. +- **Invocation.** `vercel --prod`, `vercel inspect <url>`. +- **Used at.** Gate 7 deploy (manual or scripted). + +### Netlify CLI +- **Install.** `npm i -g netlify-cli`. +- **Invocation.** `netlify deploy --prod`. +- **Used at.** Gate 7. + +### Argo CD / Argo Rollouts +- **Install.** Cluster-side; `kubectl argo rollouts` plugin. +- **Used at.** Gate 7 (canary rollouts and rollback drills) for k8s shops. 
+ +### gh CLI +- **Install.** `brew install gh && gh auth login`. +- **Invocation.** `gh api repos/:owner/:repo/branches/main/protection`. +- **Exit codes.** `0` ok; non-zero on auth/permission. +- **Used at.** `audit_warranty.py` to verify branch protection required checks. + +## Repo conformance + +### repolinter +- **Install.** `npm i -g repolinter`. +- **Invocation.** `repolinter lint .`. +- **Exit codes.** `0` clean; `1` violations. +- **Used at.** Optional Gate 9 enforcement; verifies that LICENSE, README, etc. are present and conformant. + +## Tool-runner contract + +`scripts/lib/tool_runner.py::run(cmd, cwd=None, timeout=120)` returns a `ToolResult(exit_code, stdout, stderr)`. On `FileNotFoundError`, exit code is `127` with stderr `binary not found: <name>`. On timeout, exit code is `124`. Audits MUST treat 127 as `INFO` (tool optional/missing), not as `CRITICAL`. The exception is `audit_warranty.py`, which expects CI to have the tools and treats their absence as a HIGH finding because Gate 9 is the warranty. + +## Why exit codes matter + +Audits compose. The orchestrator's `audit` subcommand merges every script's `Report` into a single aggregate. Exit code 0 means no HIGH/CRITICAL findings; anything else fails the gate. Tools that exit 127 because they are not installed locally would otherwise close gates by default; the `INFO` mapping prevents that. CI should fail fast on missing tools by adding an explicit "preflight" job that runs `command -v <tool>` for each tool the audit suite expects. diff --git a/skills/.experimental/product-init/references/working-backwards.md b/skills/.experimental/product-init/references/working-backwards.md new file mode 100644 index 00000000..efdee71d --- /dev/null +++ b/skills/.experimental/product-init/references/working-backwards.md @@ -0,0 +1,79 @@ +--- +name: working-backwards +description: Amazon's PR-FAQ method, the five customer questions, and how it forces customer-first scope at Gate 1. 
+type: reference +--- + +# Working Backwards + +## Origin + +"Working Backwards" is the Amazon product development practice articulated publicly by Colin Bryar and Bill Carr (former Amazon executives) in their 2021 book _Working Backwards: Insights, Stories, and Secrets from Inside Amazon_ (St. Martin's Press). The technique was institutionalised at Amazon in the early 2000s by Jeff Bezos and S Team. The thesis is the inverse of capability-led product development: instead of starting from "what can we build with the technology we have?", start from "what would the customer-facing announcement read like the day this ships?". + +The artefact is the **PR-FAQ**: a press release plus a frequently-asked-questions document. The team writes the PR-FAQ before any code, design, or roadmap commitment. The PR-FAQ becomes the gate; if you cannot write a credible one-page PR, the product is not yet ready to be built. + +## The PR-FAQ format + +A canonical Amazon PR-FAQ is structured in seven sections: + +1. **Heading.** Product name + tagline. Twelve words or fewer. +2. **Sub-heading.** The customer for the product and the problem solved, in one sentence. +3. **Summary paragraph.** What it does, who it is for, and why it matters. Four sentences. +4. **Problem paragraph.** The customer's pain in their own words. +5. **Solution paragraph.** How this product solves the pain. +6. **Quotes.** One quote from a leader at the company; one quote from a hypothetical customer. The customer quote is the load-bearing one. +7. **Call to action.** How to get started, where to learn more. + +The FAQ portion supplements the PR with the hard questions: "Why now?", "How is this different from competitor X?", "What does it cost?", "What is the riskiest assumption?", "What is the v1 scope and what is explicitly not in v1?". + +## The five customer questions + +Bryar and Carr distil the PR-FAQ ritual into five customer-anchored questions every team must answer before building: + +1. 
**Who is the customer?** Specific, named, segmentable. Not "developers" but "solo technical founders building a B2B SaaS MVP in 2026 who have not yet incorporated". +2. **What is the customer problem or opportunity?** Stated in the customer's words, with the cost of the problem quantified where possible. +3. **What is the most important customer benefit?** One benefit. Force-ranked. The PR-FAQ headline tests whether you can pick. +4. **How do you know what the customer needs or wants?** Evidence: interviews, sign-ups, paid pilots, retention data, support tickets. "We just know" is not an answer. +5. **What does the customer experience look like?** The first session, the first success, the first re-visit, the first share with a colleague. Concrete, narrative. + +The five questions overlap with the 14 mandatory discovery questions in this skill. Q1 (Golden Path), Q2 (persona + pain), Q4 (10-min success signal), Q5 (outcome metric), and Q12 (numeric competitive benchmark) all trace back to the Amazon PR-FAQ. + +## Why Working Backwards prevents capability-bloat + +The capability-led failure mode goes: "We have a vector database. We have an LLM. We have a deploy pipeline. Let us combine them and announce the result." The PR-FAQ forces the inverse: write the announcement first. If the announcement reads "We combined three commodity components and produced an output the customer did not ask for", the team sees the failure on day one instead of month four. + +The AI-product-builder example from this skill's origin story is exactly this trap. A capability-led PR-FAQ would have read: "Today we are launching the Multi-Agent Orchestrator that can plan, generate, test, and deploy code with five backends and a queue." A customer-first PR-FAQ would have read: "Today, a solo technical founder went from idea to deployed SaaS at solofounder.example in twelve minutes, paid $50, and is in production." The first sentence sells the team. The second sentence sells the customer. 
Working Backwards forces the second. + +## How the skill operationalises Working Backwards + +`templates/pr-faq.md` (already shipped under `templates/`) is the PR-FAQ skeleton the team fills before code starts. Gate 1's `audit_constitution.py` requires: + +- A one-sentence Golden Path -- the equivalent of the PR sub-heading. +- A persona+pain -- the equivalent of the customer + problem section. +- A 10-minute success signal (Q4) -- the equivalent of the PR-FAQ headline benefit. +- A numeric competitive benchmark (Q12) -- the equivalent of the FAQ "How is this different from X?" answer. + +The audit cannot judge prose quality, but it catches "the team has not written this" vs. "the team has written this". The Gate 1 review meeting reads the PR-FAQ aloud and asks: "If TechCrunch picked this up tomorrow, would the customer recognise themselves in it?". If not, back to the PR-FAQ. + +## The "no-PowerPoint" rule + +Bezos famously banned PowerPoint in S Team meetings in 2004, replacing it with six-page narrative memos. The reason: bullet points let writers hide. A PR-FAQ written as bullets reads "we will support X, Y, Z capabilities". A PR-FAQ written as prose forces the writer to make causal claims ("because X, Y, Z, the customer feels..."), which are falsifiable. Falsifiable claims are the prerequisite for kill criteria. + +This skill applies the same rule: PRODUCT.md and PLAN.md are markdown narratives, not slide decks. The audits look for headings and prose, not for bullet-density. + +## Common anti-patterns + +**The aspirational PR.** "Today we changed the way humanity builds software." The CEO loves this; the customer skips it. Replace with a specific outcome: "Today, Sarah, a B2B SaaS founder, deployed her CRM idea to production in 12 minutes." + +**The capability PR.** "Multi-agent orchestrator with five model backends, queue, and observability." Capability-led, customer-empty. Replace with a benefit-led outcome. 
+ +**The roadmap-FAQ.** A PR-FAQ that reads "v1 will support... v2 will add... v3 will introduce...". A PR-FAQ should describe a single moment: launch day. Roadmap belongs in PLAN.md, not in the PR-FAQ. + +**The vague quote.** "I love it!" -- John Doe, customer. Replace with a quote that names a specific outcome at a specific time: "I went from idea to a paying customer in 48 hours. The previous quarter I had spent six weekends on a half-built version with [competitor]." + +## Reading list + +- Bryar and Carr, _Working Backwards_, St. Martin's, 2021. +- Bezos, 2017 shareholder letter (published April 2018; describes the six-page narrative-memo practice), aboutamazon.com. +- The "Day 1 vs Day 2" framing -- Bezos's 2016 shareholder letter (published April 2017). +- Templates collection: https://www.workingbackwards.com. diff --git a/skills/.experimental/product-init/runtime/claude-code.md b/skills/.experimental/product-init/runtime/claude-code.md new file mode 100644 index 00000000..67efff44 --- /dev/null +++ b/skills/.experimental/product-init/runtime/claude-code.md @@ -0,0 +1,37 @@ +--- +runtime: claude-code +skill: product-init +--- + +# Runtime Adapter — Claude Code + +## Path Resolution + +``` +SKILL_DIR=~/.claude/skills/product-init +VENV=$SKILL_DIR/.venv/bin/python +SCRIPTS=$SKILL_DIR/scripts/ +``` + +## Builder Map + +| Task Type | Agent Call | +|-------------------|-----------| +| Backend/API/logic | `Agent(subagent_type="mistral-large:mistral-large-rescue", prompt="...")` | +| Frontend/UI | `Skill(skill="frontend-design:frontend-design")` → `Agent(subagent_type="general-purpose")` | +| Senior reasoning | `Agent(subagent_type="codex:codex-rescue", prompt="...")` | +| Config/docs | `Agent(subagent_type="alibaba:alibaba-rescue", prompt="...")` | +| Small fixes | `Agent(subagent_type="general-purpose")` | + +## Sub-Skill Invocation + +- **frontend-design**: `Skill(skill="frontend-design:frontend-design")` +- **filter_task**: `Bash("$VENV $SCRIPTS/filter_task.py <args>")` + +## File Operations + +Write, Edit, Read, and Bash tools 
are natively available. No shims required. + +## Auto-Trigger + +Activated by `Skill` tool invocation or the `/product-init` slash command. diff --git a/skills/.experimental/product-init/runtime/codex.md b/skills/.experimental/product-init/runtime/codex.md new file mode 100644 index 00000000..a246189e --- /dev/null +++ b/skills/.experimental/product-init/runtime/codex.md @@ -0,0 +1,43 @@ +--- +runtime: codex +skill: product-init +--- + +# Runtime Adapter — Codex + +## Path Resolution + +Resolution order: +1. `$PRODUCT_INIT_SKILL_DIR` (env var override) +2. `~/.claude/skills/product-init/` (shared install fallback) + +``` +VENV=$SKILL_DIR/.venv/bin/python +SCRIPTS=$SKILL_DIR/scripts/ +``` + +## Builder Map + +No model selection available — Codex handles routing internally. + +| Task Type | Command | +|-----------|---------| +| All tasks | `node "${CODEX_PLUGIN_ROOT}/scripts/codex-companion.mjs" task "<prompt>"` | + +## Sub-Skill Fallback + +No `Skill` tool available. Embed frontend-design discipline inline in every builder prompt: + +> "Design with editorial typography, avoid generic AI-slop aesthetics, use real working +> code with exceptional attention to creative details. No hero gradients, no card-grid spam." + +**filter_task**: `$VENV $SCRIPTS/filter_task.py <args>` — same Bash call as claude-code runtime. + +## File Operations + +Write files via `apply_patch` or direct file write. `Bash` is available. + +## Limitations + +- No parallel Agent calls — execute builders sequentially. +- No Skill tool — all sub-skill behavior embedded inline in prompts. diff --git a/skills/.experimental/product-init/runtime/openclaw.md b/skills/.experimental/product-init/runtime/openclaw.md new file mode 100644 index 00000000..5e5bb57b --- /dev/null +++ b/skills/.experimental/product-init/runtime/openclaw.md @@ -0,0 +1,44 @@ +--- +runtime: openclaw +skill: product-init +--- + +# Runtime Adapter — OpenClaw (+ Hermes) + +## Path Resolution + +Resolution order: +1. 
`$PRODUCT_INIT_SKILL_DIR` (env var override) +2. `~/.openclaw/skills/product-init/` (symlinked from `~/.claude/skills/product-init/`) +3. `~/.claude/skills/product-init/` (fallback) + +``` +VENV=$SKILL_DIR/.venv/bin/python +SCRIPTS=$SKILL_DIR/scripts/ +``` + +## Builder Map + +| Task Type | Command | +|-----------------|---------| +| Backend/logic | `openclaw agent dispatch --model mistral-large/mistral-large-instruct-2411 --task "<prompt>"` | +| Heavy reasoning | `openclaw agent dispatch --model openrouter/nousresearch/hermes-3-llama-3.1-405b --task "<prompt>"` | +| Codex-style | `openclaw agent dispatch --model openai/gpt-5.4 --task "<prompt>"` | +| Fast/small | `openclaw agent dispatch --model xiaomi/mimo-v2-flash --task "<prompt>"` | + +## Sub-Skill Fallback + +No `Skill` tool. Embed frontend-design discipline inline (same text as codex runtime). + +## Model Context + +- Primary: `claude-sonnet-4-6` +- Fallbacks: `hermes3` (`openrouter/nousresearch/hermes-3-llama-3.1-405b`), `kimi-k2.5` + +## Install + +Run once to link the shared skill into OpenClaw: + +```bash +ln -sf ~/.claude/skills/product-init ~/.openclaw/skills/product-init +``` diff --git a/skills/.experimental/product-init/scripts/audit_build.py b/skills/.experimental/product-init/scripts/audit_build.py new file mode 100644 index 00000000..eb2b0f84 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_build.py @@ -0,0 +1,109 @@ +#!/usr/bin/env python3 +"""Gate 4 - Build hygiene audit. + +- New TODO/FIXME/XXX/HACK markers must have corresponding DEBT.md entries. +- New skipped tests are HIGH. +- Every commit message must contain a ticket id like ABC-123. 
+""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + +DEBT_RE = re.compile(r"\b(TODO|FIXME|XXX|HACK)\b") +SKIP_RE = re.compile(r"(\.skip\b|\.only\b|it\.todo\b|xfail\b|@pytest\.mark\.skip)") +TEST_FILE_RE = re.compile(r"(test|spec|__tests__)", re.IGNORECASE) +TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b") + + +def is_git(project: Path) -> bool: + return (project / ".git").exists() or run(["git", "-C", str(project), "rev-parse", "--git-dir"]).ok + + +def parse_diff_added(diff_text: str): + cur_file = None + cur_line = 0 + out = [] + for raw in diff_text.splitlines(): + if raw.startswith("+++ b/"): + cur_file = raw[6:] + cur_line = 0 + elif raw.startswith("@@"): + m = re.search(r"\+(\d+)", raw) + cur_line = int(m.group(1)) if m else 0 + elif raw.startswith("+") and not raw.startswith("+++"): + out.append((cur_file, cur_line, raw[1:])) + cur_line += 1 + elif not raw.startswith("-"): + cur_line += 1 + return out + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate4-build") + gate = "Gate 4: Build" + + if not is_git(project_dir): + rep.add_finding(Severity.LOW, gate, "git-repo", "not a git repo", + "Initialize git so build hygiene can be enforced.") + return rep + + diff = run(["git", "-C", str(project_dir), "diff", "origin/main...HEAD"]) + if diff.exit_code != 0: + diff = run(["git", "-C", str(project_dir), "diff", "HEAD"]) + added = parse_diff_added(diff.stdout) + + debt_text = "" + debt_path = project_dir / "DEBT.md" + if debt_path.exists(): + debt_text = debt_path.read_text(encoding="utf-8", errors="ignore") + + for fpath, lineno, content in added: + if not fpath: + continue + if DEBT_RE.search(content): + needle = f"{fpath}:{lineno}" + if needle not in debt_text: + rep.add_finding( + Severity.HIGH, gate, "todo-without-debt-ledger", 
+ f"{needle} adds TODO/FIXME but no row in DEBT.md", + f"Add row to DEBT.md referencing `{needle}` or remove the marker.", + ) + if TEST_FILE_RE.search(fpath) and SKIP_RE.search(content): + rep.add_finding( + Severity.HIGH, gate, "skipped-test", + f"{fpath}:{lineno} adds skipped/only test", + "Re-enable the test or delete it; do not check in skipped tests.", + ) + + log = run(["git", "-C", str(project_dir), "log", "--oneline", "origin/main..HEAD"]) + if log.exit_code == 0 and log.stdout.strip(): + for line in log.stdout.splitlines(): + sha, _, msg = line.partition(" ") + if not TICKET_RE.search(msg): + rep.add_finding( + Severity.MEDIUM, gate, "commit-ticket", + f"commit {sha} '{msg}' missing TICKET-123 reference", + "Amend commits to include the ticket id (e.g. ABC-123).", + ) + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 4: Build audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_console_clean.py b/skills/.experimental/product-init/scripts/audit_console_clean.py new file mode 100644 index 00000000..8a23e981 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_console_clean.py @@ -0,0 +1,68 @@ +#!/usr/bin/env python3 +"""Cross-cutting: golden path must produce 0 console errors/warnings. + +Parses Playwright trace JSON files for `type: console` entries with level +error/warning. 
+""" +from __future__ import annotations + +import argparse +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + + +def walk_obj(obj, count): + if isinstance(obj, dict): + if obj.get("type") == "console" and obj.get("level") in ("error", "warning"): + count[0] += 1 + for v in obj.values(): + walk_obj(v, count) + elif isinstance(obj, list): + for v in obj: + walk_obj(v, count) + + +def audit(project_dir: Path) -> Report: + rep = Report(name="cross-console-clean") + gate = "Cross: Console Clean" + total = [0] + found_any = False + for d in ("test-results", "playwright-report"): + root = project_dir / d + if not root.exists(): + continue + for f in root.rglob("*.json"): + found_any = True + try: + data = json.loads(f.read_text(encoding="utf-8", errors="ignore")) + except Exception: + continue + walk_obj(data, total) + if not found_any: + rep.add_finding(Severity.INFO, gate, "no-traces", + "no Playwright trace files in test-results/ or playwright-report/", + "Run `npx playwright test` first.") + return rep + if total[0] > 0: + rep.add_finding(Severity.HIGH, gate, "console-pollution", + f"{total[0]} console error/warning events on golden path", + "Fix console errors/warnings; required count is 0.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Cross: console clean") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_constitution.py b/skills/.experimental/product-init/scripts/audit_constitution.py new file mode 100644 index 00000000..85748a4a --- /dev/null +++ 
b/skills/.experimental/product-init/scripts/audit_constitution.py @@ -0,0 +1,178 @@ +#!/usr/bin/env python3 +"""Gate 1 - Discovery Constitution audit. + +Checks PRODUCT.md, SPEC.md, PLAN.md, TASKS.md, COMPETITIVE_BENCHMARK.md exist, +have valid frontmatter, required sections, single-sentence Golden Path, +and that all 14 mandatory discovery questions are answered. +""" +from __future__ import annotations + +import argparse +import os +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +try: + import frontmatter # python-frontmatter +except ImportError: + frontmatter = None + + +REQUIRED_FILES = { + "PRODUCT.md": ["Golden Path", "Persona", "Outcome Metric"], + "SPEC.md": ["Scope", "Acceptance"], + "PLAN.md": ["Appetite", "Kill Criteria", "Deferred"], + "TASKS.md": ["Tasks"], + "COMPETITIVE_BENCHMARK.md": ["Benchmark"], +} + +QUESTIONS = [ + ("Q1", r"golden\s*path"), + ("Q2", r"persona"), + ("Q3", r"current\s*alternative"), + ("Q4", r"10[- ]?min(ute)?\s*success"), + ("Q5", r"outcome\s*metric"), + ("Q6", r"riskiest\s*assumption"), + ("Q7", r"(four[- ]?risk|value.*usability.*feasibility.*viability)"), + ("Q8", r"appetite"), + ("Q9", r"kill\s*criteria"), + ("Q10", r"rabbit\s*hole"), + ("Q11", r"deferred"), + ("Q12", r"(competitive\s*benchmark|v0|bolt|lovable|railway)"), + ("Q13", r"(discovery\s*cadence|weekly\s*touchpoint)"), + ("Q14", r"golden_path_step"), +] + + +def load_text(path: Path) -> str: + try: + return path.read_text(encoding="utf-8") + except OSError: + return "" + + +def has_frontmatter(text: str) -> bool: + if frontmatter is not None: + try: + post = frontmatter.loads(text) + return bool(post.metadata) + except Exception: + return False + return text.lstrip().startswith("---") + + +def extract_section(text: str, heading: str) -> str: + # Match any heading line whose visible text contains `heading` (case-insensitive substring). 
+ heading_line_re = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE) + needle = heading.lower() + match = None + for m in heading_line_re.finditer(text): + title = m.group(2).strip() + if needle in title.lower(): + match = m + break + if not match: + return "" + matched_level = len(match.group(1)) + start = match.end() + # Next "section boundary" is a heading at the same or higher level (fewer-or-equal #s). + boundary_pattern = r"^#{1," + str(matched_level) + r"}[ \t]+" + nxt = re.search(boundary_pattern, text[start:], re.MULTILINE) + return text[start: start + nxt.start()] if nxt else text[start:] + + +def is_filled(body: str) -> bool: + cleaned = re.sub(r"\[FILL[^\]]*\]", "", body, flags=re.IGNORECASE).strip() + return len(cleaned) > 5 + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate1-discovery-constitution") + gate = "Gate 1: Discovery" + + for fname, sections in REQUIRED_FILES.items(): + path = project_dir / fname + if not path.exists(): + rep.add_finding( + Severity.CRITICAL, + gate, + f"file-exists:{fname}", + f"missing {path}", + f"Create {fname} from templates/", + ) + continue + text = load_text(path) + if not has_frontmatter(text): + rep.add_finding( + Severity.HIGH, + gate, + f"frontmatter:{fname}", + "no YAML frontmatter detected", + "Add `---` block with name/description/version", + ) + for sec in sections: + body = extract_section(text, sec) + if not body or not is_filled(body): + rep.add_finding( + Severity.HIGH, + gate, + f"section:{fname}#{sec}", + "section missing or unfilled", + f"Fill `## {sec}` in {fname}", + ) + + # Golden Path single-sentence check + product_path = project_dir / "PRODUCT.md" + if product_path.exists(): + body = extract_section(load_text(product_path), "Golden Path") + sentence = body.strip().split("\n") + sentence = [s for s in sentence if s.strip() and not s.strip().startswith("#")] + joined = " ".join(sentence).strip() + # one sentence: exactly one sentence terminator + terminators = 
re.findall(r"[.!?](\s|$)", joined) + if not joined: + rep.add_finding( + Severity.CRITICAL, gate, + "golden-path:one-sentence", + "Golden Path empty", + "Write exactly one sentence describing user-to-deployed-product flow.", + ) + elif len(terminators) != 1: + rep.add_finding( + Severity.HIGH, gate, + "golden-path:one-sentence", + f"found {len(terminators)} sentence terminators", + "Reduce Golden Path to exactly one sentence.", + ) + + # 14 questions + aggregate = "\n\n".join( + load_text(project_dir / f) for f in REQUIRED_FILES if (project_dir / f).exists() + ).lower() + for qid, pat in QUESTIONS: + if not re.search(pat, aggregate, re.IGNORECASE): + rep.add_finding( + Severity.HIGH, gate, + f"question:{qid}", + f"no match for /{pat}/ in discovery docs", + f"Answer {qid} explicitly in PRODUCT.md/PLAN.md.", + ) + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 1: Discovery Constitution audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_contract.py b/skills/.experimental/product-init/scripts/audit_contract.py new file mode 100644 index 00000000..e90cdf9a --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_contract.py @@ -0,0 +1,101 @@ +#!/usr/bin/env python3 +"""Gate 5 - API contract audit. 
oasdiff for OpenAPI, graphql-inspector for GraphQL.""" +from __future__ import annotations + +import argparse +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + + +def find_first(project: Path, names): + for n in names: + p = project / n + if p.exists(): + return p + return None + + +def get_main_version(project: Path, rel_path: str) -> str: + res = run(["git", "-C", str(project), "show", f"origin/main:{rel_path}"]) + return res.stdout if res.ok else "" + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate5-contract") + gate = "Gate 5: QA / Contract" + + debt_text = "" + debt = project_dir / "DEBT.md" + if debt.exists(): + debt_text = debt.read_text(encoding="utf-8", errors="ignore") + + openapi = find_first(project_dir, ["openapi.yaml", "openapi.yml", "openapi.json"]) + if openapi: + rel = openapi.relative_to(project_dir).as_posix() + old_text = get_main_version(project_dir, rel) + if old_text: + tmp_old = project_dir / ".audit_openapi_main.tmp" + tmp_old.write_text(old_text, encoding="utf-8") + try: + res = run(["oasdiff", "breaking", str(tmp_old), str(openapi)], cwd=str(project_dir)) + if res.exit_code == 127: + rep.add_finding(Severity.INFO, gate, "oasdiff-missing", + "oasdiff binary not installed", + "Install oasdiff: https://github.com/Tufin/oasdiff") + elif res.exit_code != 0 and res.stdout.strip(): + if "openapi" not in debt_text.lower(): + rep.add_finding(Severity.CRITICAL, gate, "openapi-breaking", + f"breaking changes detected:\n{res.stdout[:500]}", + "Document the breaking change in DEBT.md or revert the API change.") + finally: + try: + tmp_old.unlink() + except OSError: + pass + graphql = find_first(project_dir, ["schema.graphql", "schema.gql"]) + if graphql: + rel = graphql.relative_to(project_dir).as_posix() + old_text = get_main_version(project_dir, rel) + if old_text: + tmp_old = 
project_dir / ".audit_schema_main.tmp" + tmp_old.write_text(old_text, encoding="utf-8") + try: + res = run(["graphql-inspector", "diff", str(tmp_old), str(graphql)], cwd=str(project_dir)) + if res.exit_code == 127: + rep.add_finding(Severity.INFO, gate, "graphql-inspector-missing", + "graphql-inspector not installed", + "npm i -g @graphql-inspector/cli") + elif res.exit_code != 0 and "BREAKING" in (res.stdout + res.stderr).upper(): + if "graphql" not in debt_text.lower(): + rep.add_finding(Severity.CRITICAL, gate, "graphql-breaking", + "breaking GraphQL change without DEBT.md entry", + "Document or revert.") + finally: + try: + tmp_old.unlink() + except OSError: + pass + + if not openapi and not graphql: + rep.add_finding(Severity.INFO, gate, "no-contract", + "no openapi.* or schema.graphql found", + "If your service has an API, ship a contract file.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: Contract audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_coverage.py b/skills/.experimental/product-init/scripts/audit_coverage.py new file mode 100644 index 00000000..29b3697c --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_coverage.py @@ -0,0 +1,87 @@ +#!/usr/bin/env python3 +"""Gate 5 - diff coverage audit (>= 80% on changed lines).""" +from __future__ import annotations + +import argparse +import json +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + +THRESHOLD = 80.0 + + +def audit(project_dir: Path) -> Report: + rep = 
Report(name="gate5-coverage") + gate = "Gate 5: QA / Coverage" + + res = run(["diff-cover", "coverage.xml", "--compare-branch=origin/main", "--json-report", "diff-cover.json"], + cwd=str(project_dir)) + if res.exit_code == 127: + cov_xml = project_dir / "coverage.xml" + cov_json = project_dir / "coverage" / "coverage-final.json" + if cov_xml.exists(): + text = cov_xml.read_text(encoding="utf-8", errors="ignore") + m = re.search(r'line-rate="([\d.]+)"', text) + pct = float(m.group(1)) * 100 if m else 0 + if pct < THRESHOLD: + rep.add_finding(Severity.HIGH, gate, "coverage-line-rate", + f"coverage.xml line-rate {pct:.1f}% < {THRESHOLD}%", + "Raise coverage on changed code or install diff-cover.") + elif cov_json.exists(): + try: + data = json.loads(cov_json.read_text(encoding="utf-8")) + covered = total = 0 + for f in data.values(): + s = f.get("s", {}) if isinstance(f, dict) else {} + total += len(s) + covered += sum(1 for v in s.values() if v) + pct = (covered / total * 100) if total else 0 + if pct < THRESHOLD: + rep.add_finding(Severity.HIGH, gate, "coverage-istanbul", + f"overall coverage {pct:.1f}% < {THRESHOLD}%", + "Add tests; install diff-cover for diff-only thresholds.") + except Exception: + rep.add_finding(Severity.INFO, gate, "coverage-parse", + "could not parse coverage-final.json", + "Verify coverage report format.") + else: + rep.add_finding(Severity.INFO, gate, "coverage-tool-missing", + "diff-cover not installed and no coverage report present", + "pip install diff-cover, run with coverage.xml") + return rep + + report_path = project_dir / "diff-cover.json" + if report_path.exists(): + try: + data = json.loads(report_path.read_text(encoding="utf-8")) + pct = data.get("total_percent_covered", 100) + if pct < THRESHOLD: + rep.add_finding(Severity.HIGH, gate, "diff-cover", + f"diff coverage {pct:.1f}% < {THRESHOLD}%", + "Add tests for changed lines.") + except Exception: + pass + elif res.exit_code != 0: + rep.add_finding(Severity.HIGH, gate, 
"diff-cover-failed", + f"diff-cover failed: {res.stderr[:200]}", + "Inspect diff-cover output.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: Coverage audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_demo_url.py b/skills/.experimental/product-init/scripts/audit_demo_url.py new file mode 100644 index 00000000..d4159215 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_demo_url.py @@ -0,0 +1,122 @@ +#!/usr/bin/env python3 +"""Cross-cutting: prod_url returns HTTP 200, body > 500 bytes, has non-empty <title>.""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +try: + import frontmatter +except ImportError: + frontmatter = None + +try: + import requests +except ImportError: + requests = None + + +def get_prod_url(project_dir: Path): + p = project_dir / "PRODUCT.md" + if not p.exists(): + return None + text = p.read_text(encoding="utf-8", errors="ignore") + if frontmatter: + try: + post = frontmatter.loads(text) + return post.metadata.get("prod_url") + except Exception: + pass + m = re.search(r"^prod_url\s*:\s*(\S+)\s*$", text, re.MULTILINE) + return m.group(1) if m else None + + +TBD_RE = re.compile(r"^TBD[-_].+", re.IGNORECASE) + + +def find_debt_ack(project_dir: Path): + """Return (debt_id, row_text) of an infra DEBT row that acknowledges the + prod_url placeholder, or None.""" + p = project_dir / "DEBT.md" + if not p.exists(): + return None + keyword_re = re.compile(r"prod_url|gate\s*7|deploy", re.IGNORECASE) + for line in 
p.read_text(encoding="utf-8", errors="ignore").splitlines(): + if not line.startswith("|"): + continue + cells = [c.strip() for c in line.strip().strip("|").split("|")] + if len(cells) < 4: + continue + if cells[3].lower() != "infra": + continue + if keyword_re.search(line): + return cells[0], line.strip() + return None + + +def audit(project_dir: Path) -> Report: + rep = Report(name="cross-demo-url") + gate = "Cross: Demo URL" + + url = get_prod_url(project_dir) + if not url: + rep.add_finding(Severity.HIGH, gate, "prod-url-missing", + "no prod_url in PRODUCT.md frontmatter", + "Add prod_url to PRODUCT.md.") + return rep + if isinstance(url, str) and TBD_RE.match(url.strip()): + ack = find_debt_ack(project_dir) + if ack: + debt_id, _ = ack + rep.add_finding(Severity.HIGH, gate, "prod-url-tbd", + f"prod_url is placeholder `{url}`; acknowledged by DEBT row {debt_id} (infra).", + "Replace prod_url with a real URL after Gate 7 deploy; close DEBT row.") + else: + rep.add_finding(Severity.CRITICAL, gate, "prod-url-tbd", + f"prod_url is placeholder `{url}` and no infra DEBT row acknowledges it", + "Either deploy and set prod_url, or add an infra DEBT row referencing prod_url/gate 7/deploy.") + return rep + if not requests: + rep.add_finding(Severity.MEDIUM, gate, "requests-missing", + "requests library not installed", + "pip install requests") + return rep + try: + r = requests.get(url, timeout=15) + except Exception as e: + rep.add_finding(Severity.CRITICAL, gate, "fetch-error", + f"{url}: {e}", "Make prod URL reachable.") + return rep + if r.status_code != 200: + rep.add_finding(Severity.CRITICAL, gate, "http-status", + f"{url} returned {r.status_code}", "Fix prod deploy.") + return rep + if len(r.content) < 500: + rep.add_finding(Severity.HIGH, gate, "empty-shell", + f"body length {len(r.content)} bytes < 500", + "Looks like an empty shell; verify rendered content.") + m = re.search(r"<title[^>]*>(.*?)</title>", r.text, re.IGNORECASE | re.DOTALL) + if not m or not 
m.group(1).strip(): + rep.add_finding(Severity.HIGH, gate, "empty-title", + f"{url} has no <title>", + "Set the <title>.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Cross: demo URL") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_deploy.py b/skills/.experimental/product-init/scripts/audit_deploy.py new file mode 100644 index 00000000..887da6da --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_deploy.py @@ -0,0 +1,166 @@ +#!/usr/bin/env python3 +"""Gate 7 - Deploy audit. prod_url returns 200, has <title>, smoke job present, rollback drill <14d.""" +from __future__ import annotations + +import argparse +import datetime as dt +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + +try: + import frontmatter +except ImportError: + frontmatter = None + +try: + import requests +except ImportError: + requests = None + + +def get_prod_url(project_dir: Path): + p = project_dir / "PRODUCT.md" + if not p.exists(): + return None + text = p.read_text(encoding="utf-8", errors="ignore") + if frontmatter: + try: + post = frontmatter.loads(text) + return post.metadata.get("prod_url") + except Exception: + pass + m = re.search(r"^prod_url\s*:\s*(\S+)\s*$", text, re.MULTILINE) + return m.group(1) if m else None + + +TBD_RE = re.compile(r"^TBD[-_].+", re.IGNORECASE) + + +def find_debt_ack(project_dir: Path): + """Return (debt_id, row_text) of an infra DEBT row that acknowledges the + prod_url placeholder, or None.""" + p = project_dir / "DEBT.md" + if not p.exists(): + return 
None + keyword_re = re.compile(r"prod_url|gate\s*7|deploy", re.IGNORECASE) + for line in p.read_text(encoding="utf-8", errors="ignore").splitlines(): + if not line.startswith("|"): + continue + cells = [c.strip() for c in line.strip().strip("|").split("|")] + if len(cells) < 4: + continue + if cells[3].lower() != "infra": + continue + if keyword_re.search(line): + return cells[0], line.strip() + return None + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate7-deploy") + gate = "Gate 7: Deploy" + + prod_url = get_prod_url(project_dir) + is_tbd = isinstance(prod_url, str) and bool(TBD_RE.match(prod_url.strip())) + if is_tbd: + ack = find_debt_ack(project_dir) + if ack: + debt_id, _ = ack + rep.add_finding(Severity.HIGH, gate, "prod-url-tbd", + f"prod_url is placeholder `{prod_url}`; acknowledged by DEBT row {debt_id} (infra).", + "Replace prod_url with a real URL after Gate 7 deploy; close DEBT row.") + else: + rep.add_finding(Severity.CRITICAL, gate, "prod-url-tbd", + f"prod_url is placeholder `{prod_url}` and no infra DEBT row acknowledges it", + "Either deploy and set prod_url, or add an infra DEBT row referencing prod_url/gate 7/deploy.") + prod_url = None # skip live fetch checks below + if not prod_url and not is_tbd: + rep.add_finding(Severity.CRITICAL, gate, "prod-url", + "no `prod_url` in PRODUCT.md frontmatter", + "Add `prod_url: https://...` to PRODUCT.md frontmatter.") + elif not prod_url: + pass # TBD case already reported above + elif not requests: + rep.add_finding(Severity.MEDIUM, gate, "requests-missing", + "requests library not installed", + "pip install requests") + else: + ok = False + last_err = "" + for _ in range(3): + try: + r = requests.head(prod_url, timeout=10, allow_redirects=True) + if r.status_code == 200: + ok = True + break + last_err = f"HTTP {r.status_code}" + except Exception as e: + last_err = str(e) + if not ok: + rep.add_finding(Severity.CRITICAL, gate, "prod-url-200", + f"{prod_url}: {last_err}", + "Make 
production reachable; HEAD must return 200.") + else: + try: + r = requests.get(prod_url, timeout=15) + m = re.search(r"<title[^>]*>(.*?)</title>", r.text, re.IGNORECASE | re.DOTALL) + if not m or not m.group(1).strip(): + rep.add_finding(Severity.HIGH, gate, "prod-title", + f"{prod_url}: empty <title>", + "Production page has no title; likely an empty shell.") + except Exception as e: + rep.add_finding(Severity.HIGH, gate, "prod-get", + f"GET failed: {e}", + "Investigate prod URL.") + + workflows = list((project_dir / ".github" / "workflows").glob("*.yml")) if (project_dir / ".github" / "workflows").exists() else [] + smoke_seen = any("smoke" in w.read_text(encoding="utf-8", errors="ignore").lower() for w in workflows) + if not smoke_seen: + rep.add_finding(Severity.HIGH, gate, "smoke-job", + "no `smoke` keyword in .github/workflows/*.yml", + "Add a post-deploy smoke job that hits prod_url.") + + today = dt.date.today() + drill_recent = False + res = run(["git", "-C", str(project_dir), "tag", "--list", "rollback-drill-*"]) + if res.ok: + for tag in res.stdout.split(): + m = re.match(r"rollback-drill-(\d{4}-\d{2}-\d{2})", tag) + if m: + try: + d = dt.date.fromisoformat(m.group(1)) + if (today - d).days <= 14: + drill_recent = True + break + except ValueError: + pass + if not drill_recent: + runbook = project_dir / "runbooks" / "rollback-drills.md" + if runbook.exists(): + mtime = dt.date.fromtimestamp(runbook.stat().st_mtime) + if (today - mtime).days <= 14: + drill_recent = True + if not drill_recent: + rep.add_finding(Severity.HIGH, gate, "rollback-drill", + "no `rollback-drill-YYYY-MM-DD` tag or recent runbooks/rollback-drills.md within 14 days", + "Run a rollback drill, document it, and tag/commit the result.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 7: Deploy audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") 
+ args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_e2e.py b/skills/.experimental/product-init/scripts/audit_e2e.py new file mode 100644 index 00000000..cf3e24d8 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_e2e.py @@ -0,0 +1,171 @@ +#!/usr/bin/env python3 +"""Gate 5 - E2E (Playwright) audit. Demands a non-localhost baseURL and a green @golden-path test.""" +from __future__ import annotations + +import argparse +import json +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + + +def find_pw_config(project: Path): + for name in ("playwright.config.ts", "playwright.config.js", "playwright.config.mjs", "playwright.config.cjs"): + p = project / name + if p.exists(): + return p + return None + + +def parse_baseurl(text: str): + """Return (literal_url, env_var_name) tuple. Either may be None.""" + # Capture the full RHS of `baseURL:` (or `=`) up to a comma/newline/closing brace. 
+ m = re.search(r"baseURL\s*[:=]\s*([^,\n}]+)", text) + if not m: + return (None, None) + rhs = m.group(1).strip().rstrip(",") + literal = None + env_var = None + lit_m = re.search(r"['\"]([^'\"]+)['\"]", rhs) + if lit_m: + literal = lit_m.group(1) + env_m = re.search(r"process\.env\.([A-Z_][A-Z0-9_]*)", rhs) + if env_m: + env_var = env_m.group(1) + return (literal, env_var) + + +def env_var_documented(project_dir: Path, var: str) -> bool: + # Check .env.example or any .env* file + for envfile in list(project_dir.glob(".env*")): + try: + txt = envfile.read_text(encoding="utf-8", errors="ignore") + if re.search(rf"^{re.escape(var)}\s*=", txt, re.MULTILINE): + return True + except OSError: + pass + # Check workflow files + wf_dir = project_dir / ".github" / "workflows" + if wf_dir.exists(): + for wf in wf_dir.glob("*.yml"): + try: + txt = wf.read_text(encoding="utf-8", errors="ignore") + if re.search(rf"{re.escape(var)}\s*:\s*https?://", txt): + return True + except OSError: + pass + for wf in wf_dir.glob("*.yaml"): + try: + txt = wf.read_text(encoding="utf-8", errors="ignore") + if re.search(rf"{re.escape(var)}\s*:\s*https?://", txt): + return True + except OSError: + pass + return False + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate5-e2e") + gate = "Gate 5: QA / E2E" + cfg = find_pw_config(project_dir) + if not cfg: + rep.add_finding(Severity.HIGH, gate, "playwright-config", + "no playwright.config.* found", + "Add Playwright with a real preview/staging baseURL.") + return rep + cfg_text = cfg.read_text(encoding="utf-8", errors="ignore") + literal, env_var = parse_baseurl(cfg_text) + if not literal and not env_var: + rep.add_finding(Severity.CRITICAL, gate, "baseurl", + "baseURL not set in playwright config", + "Set baseURL to a real preview deploy URL.") + elif literal and ("localhost" in literal or "127.0.0.1" in literal): + rep.add_finding(Severity.CRITICAL, gate, "baseurl-localhost", + f"baseURL is local: {literal}", + "Point baseURL at 
the deployed preview URL, not localhost.") + elif literal and literal.startswith(("http://", "https://")): + # literal HTTPS (or HTTP non-localhost) URL — accepted; env-var fallback is fine + pass + elif env_var: + # Bare env-var reference: require documentation in .env* or workflow + if not env_var_documented(project_dir, env_var): + rep.add_finding(Severity.HIGH, gate, "baseurl-env-undocumented", + f"baseURL uses process.env.{env_var} with no documented value", + f"Document {env_var} in .env.example or set it in a workflow.") + + pw_bin = project_dir / "node_modules" / ".bin" / "playwright" + if pw_bin.exists(): + res = run([str(pw_bin), "test", "--reporter=json"], cwd=str(project_dir), timeout=900) + try: + data = json.loads(res.stdout) + except Exception: + rep.add_finding(Severity.HIGH, gate, "playwright-run", + "could not parse playwright JSON", + "Run `npx playwright test --reporter=json` locally to debug.") + data = None + if data: + golden_seen = False + failed = 0 + for suite in data.get("suites", []): + _walk(suite, rep, gate, locals_obj := {"golden": False, "failed": 0}) + golden_seen = golden_seen or locals_obj["golden"] + failed += locals_obj["failed"] + if not golden_seen: + rep.add_finding(Severity.CRITICAL, gate, "golden-path-tag", + "no test tagged @golden-path", + "Tag the end-to-end happy path test with @golden-path.") + if failed: + rep.add_finding(Severity.CRITICAL, gate, "e2e-failed", + f"{failed} failing e2e tests", "Fix all e2e failures.") + else: + rep.add_finding(Severity.MEDIUM, gate, "playwright-not-installed", + "node_modules/.bin/playwright missing", + "Install Playwright: `npm i -D @playwright/test`.") + + # console error scan in trace dirs + for d in ("test-results", "playwright-report"): + root = project_dir / d + if not root.exists(): + continue + for f in root.rglob("*.json"): + try: + txt = f.read_text(encoding="utf-8", errors="ignore") + except OSError: + continue + if re.search(r'"type"\s*:\s*"console".*"(error|warning)"', 
txt): + rep.add_finding(Severity.HIGH, gate, "console-pollution", + f"console error/warning in {f.relative_to(project_dir)}", + "Drive console.error/warn count to 0 on the golden path.") + break + return rep + + +def _walk(suite, rep, gate, acc): + for t in suite.get("specs", []): + title = t.get("title", "") + if "@golden-path" in title: + acc["golden"] = True + for run_obj in t.get("tests", []): + for r in run_obj.get("results", []): + if r.get("status") not in ("passed", "skipped"): + acc["failed"] += 1 + for child in suite.get("suites", []): + _walk(child, rep, gate, acc) + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: E2E audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_handoff.py b/skills/.experimental/product-init/scripts/audit_handoff.py new file mode 100644 index 00000000..1456a73c --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_handoff.py @@ -0,0 +1,77 @@ +#!/usr/bin/env python3 +"""Gate 8 - Handoff package audit.""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +REQUIRED = ["README.md", "runbooks/runbook.md", "DEBT.md", "HANDOFF.md"] +HANDOFF_SECTIONS = ["Credentials Vault Link", "Admin Walkthrough Video", "Knowledge Transfer Date"] + + +def extract_section(text: str, heading: str) -> str: + # Match any heading line whose visible text contains `heading` (case-insensitive substring). 
+ heading_line_re = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE) + needle = heading.lower() + match = None + for m in heading_line_re.finditer(text): + title = m.group(2).strip() + if needle in title.lower(): + match = m + break + if not match: + return "" + matched_level = len(match.group(1)) + start = match.end() + boundary_pattern = r"^#{1," + str(matched_level) + r"}[ \t]+" + nxt = re.search(boundary_pattern, text[start:], re.MULTILINE) + return text[start: start + nxt.start()] if nxt else text[start:] + + +def section_filled(text: str, heading: str) -> bool: + body = extract_section(text, heading) + if not body: + return False + cleaned = re.sub(r"\[FILL[^\]]*\]|TBD|TODO", "", body, flags=re.IGNORECASE).strip() + return len(cleaned) > 5 + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate8-handoff") + gate = "Gate 8: Handoff" + + for rel in REQUIRED: + path = project_dir / rel + if not path.exists() or path.stat().st_size == 0: + rep.add_finding(Severity.CRITICAL, gate, f"file:{rel}", + f"{rel} missing or empty", + f"Create {rel} with real content.") + + handoff = project_dir / "HANDOFF.md" + if handoff.exists(): + text = handoff.read_text(encoding="utf-8", errors="ignore") + for sec in HANDOFF_SECTIONS: + if not section_filled(text, sec): + rep.add_finding(Severity.HIGH, gate, f"handoff-section:{sec}", + f"`## {sec}` missing or unfilled", + f"Fill `## {sec}` in HANDOFF.md.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 8: Handoff audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_integration.py b/skills/.experimental/product-init/scripts/audit_integration.py new file mode 100644 
index 00000000..90ee1abe --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_integration.py @@ -0,0 +1,69 @@ +#!/usr/bin/env python3 +"""Gate 5 - Integration test audit. Hunts for forbidden HTTP/DB mocks.""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +MOCK_RE = re.compile(r"vi\.mock|jest\.mock|unittest\.mock|@patch|MagicMock|sinon\.stub") +NETWORK_RE = re.compile( + r"\b(requests|httpx|aiohttp|axios|fetch|node-fetch|got|undici|urllib)\b" +) +DB_RE = re.compile( + r"\b(sqlite|psycopg|psycopg2|prisma|sequelize|knex|sqlalchemy|mongoose|mongo|typeorm|pg|mysql)\b" +) +INT_FILE_RE = re.compile(r"(integration[/\\]|\.integration\.(test|spec)\.|\.integration_test\.)", re.IGNORECASE) + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate5-integration") + gate = "Gate 5: QA / Integration" + found = 0 + for path in project_dir.rglob("*"): + if not path.is_file(): + continue + if any(part in {"node_modules", ".git", "dist", "build", ".venv", "venv"} for part in path.parts): + continue + if not INT_FILE_RE.search(str(path)): + continue + found += 1 + try: + text = path.read_text(encoding="utf-8", errors="ignore") + except OSError: + continue + if MOCK_RE.search(text): + mocks_network = NETWORK_RE.search(text) or DB_RE.search(text) + sev = Severity.CRITICAL if mocks_network else Severity.MEDIUM + ev = ( + f"{path.relative_to(project_dir)} mocks " + f"{'network/DB layer' if mocks_network else 'something'}" + ) + rep.add_finding( + sev, gate, "integration-mock", + ev, + "Integration tests must hit real HTTP/DB. 
Move to unit suite or use real backend.", + ) + if found == 0: + rep.add_finding(Severity.HIGH, gate, "integration-coverage", + "no integration tests found", + "Add tests under integration/ or *.integration.test.* hitting real services.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: Integration audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_mutation.py b/skills/.experimental/product-init/scripts/audit_mutation.py new file mode 100644 index 00000000..1f7a15f2 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_mutation.py @@ -0,0 +1,83 @@ +#!/usr/bin/env python3 +"""Gate 5 - Mutation testing audit (Stryker / mutmut).""" +from __future__ import annotations + +import argparse +import json +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + +THRESHOLD = 60.0 + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate5-mutation") + gate = "Gate 5: QA / Mutation" + + stryker_cfg = None + for name in ("stryker.conf.mjs", "stryker.conf.js", "stryker.conf.json", "stryker.config.mjs"): + if (project_dir / name).exists(): + stryker_cfg = project_dir / name + break + mutmut_cfg = None + for name in ("mutmut.cfg", "setup.cfg", "pyproject.toml"): + p = project_dir / name + if p.exists() and "mutmut" in p.read_text(encoding="utf-8", errors="ignore").lower(): + mutmut_cfg = p + break + + if not stryker_cfg and not mutmut_cfg: + rep.add_finding(Severity.INFO, gate, "mutation-not-configured", + "no Stryker or mutmut config", + "Optional but 
recommended: configure mutation testing.") + return rep + + if stryker_cfg: + res = run(["npx", "--no", "stryker", "run"], cwd=str(project_dir), timeout=1800) + json_report = project_dir / "reports" / "mutation" / "mutation-report.json" + if json_report.exists(): + try: + data = json.loads(json_report.read_text(encoding="utf-8")) + score = data.get("mutationScore") or data.get("score") or 0 + except Exception: + score = 0 + else: + m = re.search(r"mutation score[:\s]+(\d+\.?\d*)", res.stdout, re.IGNORECASE) + score = float(m.group(1)) if m else 0 + if score < THRESHOLD: + rep.add_finding(Severity.HIGH, gate, "stryker-score", + f"Stryker mutation score {score:.1f}% < {THRESHOLD}%", + "Strengthen tests so they kill more mutants.") + + if mutmut_cfg: + res = run(["mutmut", "run"], cwd=str(project_dir), timeout=1800) + result = run(["mutmut", "results"], cwd=str(project_dir)) + text = result.stdout + killed = len(re.findall(r"killed", text, re.IGNORECASE)) + survived = len(re.findall(r"survived", text, re.IGNORECASE)) + total = killed + survived + score = (killed / total * 100) if total else 0 + if total and score < THRESHOLD: + rep.add_finding(Severity.HIGH, gate, "mutmut-score", + f"mutmut score {score:.1f}% < {THRESHOLD}%", + "Strengthen tests; kill more mutants.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: Mutation audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_qa.py b/skills/.experimental/product-init/scripts/audit_qa.py new file mode 100644 index 00000000..8a4d760d --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_qa.py @@ -0,0 +1,84 @@ +#!/usr/bin/env python3 +"""Gate 5 - QA 
aggregator. Runs all 8 sub-audits and merges findings.""" +from __future__ import annotations + +import argparse +import json +import subprocess +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +SUB_AUDITS = [ + "audit_unit", + "audit_integration", + "audit_e2e", + "audit_contract", + "audit_coverage", + "audit_mutation", + "audit_static", + "audit_console_clean", +] + + +def run_sub(name: str, project_dir: str) -> tuple[dict | None, int]: + script = Path(__file__).resolve().parent / f"{name}.py" + if not script.exists(): + return (None, 0) + proc = subprocess.run( + [sys.executable, "-B", str(script), "--project-dir", project_dir, "--json"], + capture_output=True, + text=True, + timeout=1200, + ) + try: + data = json.loads(proc.stdout) + except Exception: + data = None + return (data, proc.returncode) + + +def audit(project_dir: Path) -> tuple[Report, int]: + rep = Report(name="gate5-qa-aggregate") + max_exit = 0 + for sub in SUB_AUDITS: + data, rc = run_sub(sub, str(project_dir)) + max_exit = max(max_exit, rc) + if not data: + rep.add_finding( + Severity.HIGH, + "Gate 5: QA", + f"sub-audit:{sub}", + f"could not parse JSON output from {sub}", + f"Run `python3 scripts/{sub}.py --project-dir <dir> --json` to debug.", + ) + continue + for f in data.get("findings", []): + try: + sev = Severity(f.get("severity", "INFO")) + except ValueError: + sev = Severity.INFO + rep.add_finding( + sev, + f.get("gate", "Gate 5: QA"), + f"{sub}:{f.get('check', '')}", + f.get("evidence", ""), + f.get("fix", ""), + ) + return rep, max_exit + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: QA aggregate audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep, sub_max = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return 
max(rep.exit_code, sub_max) + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_real_wiring.py b/skills/.experimental/product-init/scripts/audit_real_wiring.py new file mode 100644 index 00000000..5501e8d1 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_real_wiring.py @@ -0,0 +1,71 @@ +#!/usr/bin/env python3 +"""Cross-cutting: detect mock/stub/localhost references in non-test source.""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +ROOTS = ["src", "app", "frontend", "backend", "lib"] +SKIP_PARTS = {"node_modules", ".git", "dist", "build", ".venv", "venv", "__pycache__", "coverage"} +TEST_RE = re.compile(r"(\.test\.|\.spec\.|__tests__|/tests?/)", re.IGNORECASE) +SUSPECT_RE = re.compile( + r"\b(import\s+.*\bmock\b|fakeApi|stubApi|MockAdapter|localhost|127\.0\.0\.1)", + re.IGNORECASE, +) + + +def audit(project_dir: Path) -> Report: + rep = Report(name="cross-real-wiring") + gate = "Cross: Real Wiring" + scanned = 0 + hits = 0 + for root in ROOTS: + rdir = project_dir / root + if not rdir.exists(): + continue + for p in rdir.rglob("*"): + if not p.is_file(): + continue + if any(part in SKIP_PARTS for part in p.parts): + continue + if TEST_RE.search(str(p)): + continue + if p.suffix not in {".ts", ".tsx", ".js", ".jsx", ".py", ".mjs", ".cjs", ".vue", ".svelte"}: + continue + try: + text = p.read_text(encoding="utf-8", errors="ignore") + except OSError: + continue + scanned += 1 + for m in SUSPECT_RE.finditer(text): + lineno = text[: m.start()].count("\n") + 1 + hits += 1 + rep.add_finding( + Severity.HIGH, gate, "mock-or-localhost-in-source", + f"{p.relative_to(project_dir)}:{lineno} -> {m.group(0)[:80]}", + "Replace with real client; route via env-configured base URL.", + ) + if scanned == 0: + 
rep.add_finding(Severity.INFO, gate, "no-source-roots", + "no src/app/frontend/backend/lib directory found", + "Confirm project layout; nothing to scan.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Cross: real-wiring audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_sow.py b/skills/.experimental/product-init/scripts/audit_sow.py new file mode 100644 index 00000000..9e8116f4 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_sow.py @@ -0,0 +1,128 @@ +#!/usr/bin/env python3 +"""Gate 2 - Statement of Work audit.""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 + +BANNED = ["RBAC", "compliance", "marketplace", "multi-region", "observability", "integrations"] + +DEFERRED_HEADING_MARKERS = ("deferred", "post-mvp", "post mvp", "out of scope", "out-of-scope", "backlog", "not in scope") + + +def split_active_vs_deferred(text: str) -> tuple[str, str]: + """Partition markdown text into (active_mvp, deferred) by scanning headings.""" + heading_re = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE) + matches = list(heading_re.finditer(text)) + if not matches: + return text, "" + active_parts: list[str] = [] + deferred_parts: list[str] = [] + # Prefix before first heading is always active. 
+ active_parts.append(text[: matches[0].start()]) + for i, m in enumerate(matches): + end = matches[i + 1].start() if i + 1 < len(matches) else len(text) + block = text[m.start(): end] + title = m.group(2).strip().lower() + is_deferred = any(marker in title for marker in DEFERRED_HEADING_MARKERS) + if is_deferred: + deferred_parts.append(block) + else: + active_parts.append(block) + return "".join(active_parts), "".join(deferred_parts) + + +def extract_block(text: str, heading_substr: str) -> str: + """Return body of the first heading whose title contains heading_substr (ci).""" + heading_re = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE) + needle = heading_substr.lower() + matches = list(heading_re.finditer(text)) + for i, m in enumerate(matches): + title = m.group(2).strip().lower() + if needle in title: + start = m.end() + end = matches[i + 1].start() if i + 1 < len(matches) else len(text) + return text[start:end] + return "" + + +def read(path: Path) -> str: + try: + return path.read_text(encoding="utf-8") + except OSError: + return "" + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate2-sow") + gate = "Gate 2: SoW" + plan_path = project_dir / "PLAN.md" + tasks_path = project_dir / "TASKS.md" + + if not plan_path.exists(): + rep.add_finding(Severity.CRITICAL, gate, "plan-exists", "PLAN.md missing", + "Create PLAN.md from templates/") + return rep + + plan = read(plan_path) + + # Bug 4 fix: Find ## Appetite section, then look for any numeric value inside. 
+ appetite_block = extract_block(plan, "appetite") + appetite_numeric_re = re.compile( + r"\d+\s*(week|day|sprint|hour|month)s?\b|\$\s*\d|\d+\s*(usd|tl|eur|gbp)\b", + re.IGNORECASE, + ) + if not appetite_block or not appetite_numeric_re.search(appetite_block): + rep.add_finding(Severity.HIGH, gate, "appetite-numeric", + "no numeric appetite (e.g., '4 weeks', '$15k')", + "Add `Appetite: N weeks` under ## Appetite.") + + # Bug 5 fix: kill criteria bullets — accept dash/star OR numbered list. + kill_block = extract_block(plan, "kill criteria") + bullets = re.findall(r"^\s*(?:[-*]|\d+\.)\s+\S", kill_block, re.MULTILINE) + if len(bullets) < 1: + rep.add_finding(Severity.HIGH, gate, "kill-criteria", + "no kill criteria bullets", + "Add at least one falsifiable kill condition.") + + deferred_block = extract_block(plan, "deferred") + deferred_bullets = re.findall(r"^\s*(?:[-*]|\d+\.)\s+(.+)", deferred_block, re.MULTILINE) + if len(deferred_bullets) < 3: + rep.add_finding(Severity.HIGH, gate, "deferred-list-size", + f"only {len(deferred_bullets)} deferred items", + "List at least 3 deferred items.") + hits = sum(1 for b in BANNED if b.lower() in deferred_block.lower()) + if hits < 3: + rep.add_finding(Severity.HIGH, gate, "deferred-list-coverage", + f"only {hits} of {BANNED} mentioned in deferred", + "Defer at least 3 of: RBAC, compliance, marketplace, multi-region, observability, integrations.") + + # Bug 3 fix: only flag BANNED terms in the active MVP partition of TASKS.md. 
+ if tasks_path.exists(): + tasks = read(tasks_path) + active_tasks, _deferred_tasks = split_active_vs_deferred(tasks) + for term in BANNED: + if re.search(rf"\b{re.escape(term)}\b", active_tasks, re.IGNORECASE): + rep.add_finding(Severity.CRITICAL, gate, f"mvp-bans:{term}", + f"'{term}' present in TASKS.md", + f"Move {term} to PLAN.md deferred list.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 2: SoW audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_static.py b/skills/.experimental/product-init/scripts/audit_static.py new file mode 100644 index 00000000..094543ce --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_static.py @@ -0,0 +1,110 @@ +#!/usr/bin/env python3 +"""Gate 5 - Static analysis audit. 
eslint/ruff/mypy/tsc all-strict.""" +from __future__ import annotations + +import argparse +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + + +def has_node_dep(project: Path, name: str) -> bool: + pkg = project / "package.json" + if not pkg.exists(): + return False + try: + data = json.loads(pkg.read_text(encoding="utf-8")) + except Exception: + return False + deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})} + return name in deps + + +def has_python_tool(project: Path, name: str) -> bool: + for f in ("pyproject.toml", "requirements.txt", "setup.cfg"): + p = project / f + if p.exists() and name in p.read_text(encoding="utf-8", errors="ignore").lower(): + return True + return False + + +def _missing_pkg(res) -> bool: + blob = ((res.stdout or "") + "\n" + (res.stderr or "")).lower() + needles = ( + "npx canceled due to missing packages", + "could not determine executable", + "command not found", + "binary not found", + "cannot find module", + "enoent", + ) + return any(n in blob for n in needles) + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate5-static") + gate = "Gate 5: QA / Static" + any_run = False + + if has_node_dep(project_dir, "eslint"): + any_run = True + res = run(["npx", "--no", "eslint", ".", "--max-warnings", "0"], cwd=str(project_dir), timeout=600) + if res.exit_code not in (0, 127): + if _missing_pkg(res): + rep.add_finding(Severity.MEDIUM, gate, "eslint-not-installed", + "eslint binary missing locally; run `npm ci` to install before audit.", + "Install dependencies; CI is expected to satisfy this.") + else: + rep.add_finding(Severity.CRITICAL, gate, "eslint", + (res.stdout or res.stderr)[:500], + "Fix all eslint errors and warnings.") + if has_node_dep(project_dir, "typescript"): + any_run = True + res = run(["npx", "--no", "tsc", "--noEmit"], 
cwd=str(project_dir), timeout=600) + if res.exit_code not in (0, 127): + if _missing_pkg(res): + rep.add_finding(Severity.MEDIUM, gate, "tsc-not-installed", + "tsc binary missing locally; run `npm ci` to install before audit.", + "Install dependencies; CI is expected to satisfy this.") + else: + rep.add_finding(Severity.CRITICAL, gate, "tsc-noemit", + (res.stdout or res.stderr)[:500], + "Fix TypeScript type errors.") + if has_python_tool(project_dir, "ruff"): + any_run = True + res = run(["ruff", "check", "."], cwd=str(project_dir), timeout=300) + if res.exit_code not in (0, 127): + rep.add_finding(Severity.CRITICAL, gate, "ruff", + (res.stdout or res.stderr)[:500], + "Fix ruff lint errors.") + if has_python_tool(project_dir, "mypy"): + any_run = True + res = run(["mypy", "--strict", "."], cwd=str(project_dir), timeout=600) + if res.exit_code not in (0, 127): + rep.add_finding(Severity.CRITICAL, gate, "mypy-strict", + (res.stdout or res.stderr)[:500], + "Fix mypy --strict errors.") + + if not any_run: + rep.add_finding(Severity.MEDIUM, gate, "static-analysis-missing", + "no eslint/tsc/ruff/mypy detected", + "Add at least one strict static analyzer.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: Static audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_uat.py b/skills/.experimental/product-init/scripts/audit_uat.py new file mode 100644 index 00000000..7efa0450 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_uat.py @@ -0,0 +1,70 @@ +#!/usr/bin/env python3 +"""Gate 6 - UAT audit. 
Looks for uat specs, signed UAT report, and uat-v* tag.""" +from __future__ import annotations + +import argparse +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + + +UAT_FILE_RE = re.compile(r".*\.uat\.spec\.(ts|js|py)$", re.IGNORECASE) + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate6-uat") + gate = "Gate 6: UAT" + + uat_dir = project_dir / "e2e" / "uat" + matches = [] + if uat_dir.exists(): + matches = [p for p in uat_dir.rglob("*") if p.is_file() and UAT_FILE_RE.match(p.name)] + if not matches: + rep.add_finding(Severity.HIGH, gate, "uat-spec", + "no e2e/uat/*.uat.spec.{ts,js,py} found", + "Add at least one UAT spec walking the live URL.") + + report_path = project_dir / "UAT_REPORT.md" + if not report_path.exists(): + rep.add_finding(Severity.CRITICAL, gate, "uat-report", + "UAT_REPORT.md missing", + "Generate UAT_REPORT.md with sha256 of build and Signed-off-by line.") + else: + text = report_path.read_text(encoding="utf-8", errors="ignore") + if not re.search(r"^sha256:", text, re.MULTILINE | re.IGNORECASE): + rep.add_finding(Severity.HIGH, gate, "uat-sha256", + "no `sha256:` line in UAT_REPORT.md", + "Add `sha256: <hash>` of the artifact under test.") + if not re.search(r"^Signed-off-by:", text, re.MULTILINE | re.IGNORECASE): + rep.add_finding(Severity.HIGH, gate, "uat-signoff", + "no `Signed-off-by:` line", + "Have the user sign off explicitly.") + + res = run(["git", "-C", str(project_dir), "tag", "--list", "uat-v*"]) + if res.ok and not res.stdout.strip(): + rep.add_finding(Severity.HIGH, gate, "uat-tag", + "no git tag matching uat-v*", + "Tag the commit accepted by the user, e.g. 
`git tag uat-v1.0.0`.") + elif res.exit_code == 127: + rep.add_finding(Severity.LOW, gate, "git-missing", + "git not available", + "Install git to enable tag verification.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 6: UAT audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_unit.py b/skills/.experimental/product-init/scripts/audit_unit.py new file mode 100644 index 00000000..516f7113 --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_unit.py @@ -0,0 +1,138 @@ +#!/usr/bin/env python3 +"""Gate 5 - Unit test audit. Detects vitest/jest/pytest, runs, parses results.""" +from __future__ import annotations + +import argparse +import json +import re +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + + +def detect_node(project: Path): + pkg = project / "package.json" + if not pkg.exists(): + return None + try: + data = json.loads(pkg.read_text(encoding="utf-8")) + except Exception: + return None + deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})} + if "vitest" in deps: + return "vitest" + if "jest" in deps: + return "jest" + return None + + +def detect_python(project: Path): + for f in ("pyproject.toml", "setup.cfg", "pytest.ini", "tox.ini"): + p = project / f + if p.exists() and "pytest" in p.read_text(encoding="utf-8", errors="ignore").lower(): + return "pytest" + if any(project.rglob("conftest.py")): + return "pytest" + return None + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate5-unit") + gate = "Gate 5: QA 
/ Unit" + + runner_node = detect_node(project_dir) + runner_py = detect_python(project_dir) + + if not runner_node and not runner_py: + rep.add_finding(Severity.MEDIUM, gate, "test-runner", + "no vitest/jest/pytest detected", + "Add a unit test runner; this gate cannot pass without one.") + return rep + + if runner_node == "vitest": + res = run(["npx", "--no", "vitest", "run", "--reporter=json"], cwd=str(project_dir), timeout=600) + parse_node_json(res.stdout, rep, gate, res.stderr) + elif runner_node == "jest": + res = run(["npx", "--no", "jest", "--json", "--ci"], cwd=str(project_dir), timeout=600) + parse_node_json(res.stdout, rep, gate, res.stderr) + if runner_py: + res = run(["pytest", "--tb=no", "-q", "--json-report", "--json-report-file=-"], + cwd=str(project_dir), timeout=600) + parse_pytest(res.stdout, res.stderr, rep, gate) + return rep + + +def _looks_like_missing_package(text: str, stderr: str = "") -> bool: + blob = (text or "") + "\n" + (stderr or "") + needles = ( + "npx canceled due to missing packages", + "could not determine executable", + "command not found", + "binary not found", + "Cannot find module", + "ENOENT", + ) + return any(n.lower() in blob.lower() for n in needles) + + +def parse_node_json(text: str, rep: Report, gate: str, stderr: str = "") -> None: + try: + data = json.loads(text) + except Exception: + if _looks_like_missing_package(text, stderr): + rep.add_finding(Severity.MEDIUM, gate, "unit-runner-not-installed", + "test runner not installed locally (sandbox/CI may differ)", + "Run `npm ci` so vitest/jest is available, then re-run.") + else: + rep.add_finding(Severity.HIGH, gate, "unit-runner", + "could not parse JSON output from test runner", + "Re-run locally and inspect output.") + return + failed = data.get("numFailedTests", 0) + skipped = data.get("numPendingTests", 0) + data.get("numTodoTests", 0) + if failed: + rep.add_finding(Severity.CRITICAL, gate, "unit-failed", + f"{failed} failing tests", "Fix all failing unit 
tests.") + if skipped: + rep.add_finding(Severity.HIGH, gate, "unit-skipped", + f"{skipped} skipped/todo tests", + "Skipped tests are forbidden at gate close; re-enable or remove.") + + +def parse_pytest(stdout: str, stderr: str, rep: Report, gate: str) -> None: + blob = stdout + try: + data = json.loads(blob) + summary = data.get("summary", {}) + failed = summary.get("failed", 0) + skipped = summary.get("skipped", 0) + summary.get("xfailed", 0) + except Exception: + text = stdout + "\n" + stderr + m = re.search(r"(\d+)\s+failed", text) + failed = int(m.group(1)) if m else 0 + m = re.search(r"(\d+)\s+skipped", text) + skipped = int(m.group(1)) if m else 0 + if failed: + rep.add_finding(Severity.CRITICAL, gate, "pytest-failed", + f"{failed} failing pytest tests", "Fix failing tests.") + if skipped: + rep.add_finding(Severity.HIGH, gate, "pytest-skipped", + f"{skipped} skipped/xfail", + "Re-enable or remove; skipped tests block the gate.") + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 5: Unit test audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/audit_warranty.py b/skills/.experimental/product-init/scripts/audit_warranty.py new file mode 100644 index 00000000..a0b7ac4d --- /dev/null +++ b/skills/.experimental/product-init/scripts/audit_warranty.py @@ -0,0 +1,91 @@ +#!/usr/bin/env python3 +"""Gate 9 - Warranty: audit scripts wired into CI + branch protection.""" +from __future__ import annotations + +import argparse +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from lib.report import Report, Severity # noqa: E402 +from lib.tool_runner import run # noqa: E402 + +try: + 
import yaml +except ImportError: + yaml = None + +REQUIRED_SCRIPTS = ["audit_constitution", "audit_build", "audit_qa"] + + +def audit(project_dir: Path) -> Report: + rep = Report(name="gate9-warranty") + gate = "Gate 9: Warranty" + + wf_dir = project_dir / ".github" / "workflows" + if not wf_dir.exists(): + rep.add_finding(Severity.CRITICAL, gate, "workflows-missing", + ".github/workflows missing", + "Add CI workflows that run the audit suite.") + return rep + + seen = set() + for f in wf_dir.glob("*.yml"): + text = f.read_text(encoding="utf-8", errors="ignore") + for s in REQUIRED_SCRIPTS + ["audit_unit", "audit_e2e", "audit_integration", + "audit_deploy", "audit_handoff", "audit_uat", "audit_sow"]: + if s in text: + seen.add(s) + if yaml: + try: + yaml.safe_load(text) + except Exception as e: + rep.add_finding(Severity.HIGH, gate, f"yaml-syntax:{f.name}", + str(e)[:200], + "Fix YAML syntax in workflow.") + for req in REQUIRED_SCRIPTS: + if req not in seen: + rep.add_finding(Severity.HIGH, gate, f"ci-missing:{req}", + f"{req} not referenced in any workflow", + f"Wire {req}.py into a CI job.") + + gh = run(["gh", "--version"]) + if gh.exit_code == 0: + info = run(["gh", "api", "repos/:owner/:repo/branches/main/protection"], cwd=str(project_dir)) + if info.ok: + try: + data = json.loads(info.stdout) + contexts = data.get("required_status_checks", {}).get("contexts", []) + for req in REQUIRED_SCRIPTS: + if not any(req in c for c in contexts): + rep.add_finding(Severity.HIGH, gate, f"branch-protection:{req}", + f"{req} not in required status checks", + f"Add {req} to main branch protection required checks.") + except Exception: + rep.add_finding(Severity.LOW, gate, "branch-protection-parse", + "could not parse gh api output", + "Check `gh api repos/:owner/:repo/branches/main/protection` manually.") + else: + rep.add_finding(Severity.LOW, gate, "branch-protection-fetch", + "gh api failed (auth or repo); manual check required", + "Run `gh auth login` and `gh api 
.../branches/main/protection`.") + else: + rep.add_finding(Severity.LOW, gate, "gh-cli-missing", + "gh CLI not installed; cannot verify branch protection", + "Install gh or verify branch protection manually.") + return rep + + +def main() -> int: + p = argparse.ArgumentParser(description="Gate 9: Warranty audit") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + args = p.parse_args() + rep = audit(Path(args.project_dir).resolve()) + print(rep.to_json() if args.json else rep.to_markdown()) + return rep.exit_code + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/skills/.experimental/product-init/scripts/filter_task.py b/skills/.experimental/product-init/scripts/filter_task.py new file mode 100644 index 00000000..30f8d45a --- /dev/null +++ b/skills/.experimental/product-init/scripts/filter_task.py @@ -0,0 +1,61 @@ +#!/usr/bin/env python3 +"""Golden-path filter. Score a task description against 7 golden_path_steps. + +Step 1 intake; 2 spec; 3 code; 4 test; 5 deploy; 6 handoff; 7 support. +If max score < 0.3 -> DEFER. Else print matched step + confidence. 
+""" +from __future__ import annotations + +import argparse +import re +import sys + +KEYWORDS = { + 1: {"intake", "discovery", "interview", "persona", "idea", "brief", "kickoff", "intake-form"}, + 2: {"spec", "specification", "requirements", "ac", "acceptance", "user-story", "design", "wireframe", "mockup", "figma"}, + 3: {"code", "implement", "build", "refactor", "feature", "endpoint", "component", "frontend", "backend", "api"}, + 4: {"test", "tests", "unit", "integration", "e2e", "playwright", "vitest", "jest", "pytest", "coverage", "mutation"}, + 5: {"deploy", "release", "preview", "staging", "production", "vercel", "netlify", "ship", "rollback", "smoke"}, + 6: {"handoff", "documentation", "docs", "runbook", "credentials", "walkthrough", "knowledge", "transfer", "onboarding"}, + 7: {"support", "incident", "warranty", "monitor", "alert", "bugfix", "hotfix", "patch", "maintenance", "follow-up"}, +} + +STEP_NAMES = { + 1: "Intake", + 2: "Spec", + 3: "Code", + 4: "Test", + 5: "Deploy", + 6: "Handoff", + 7: "Support", +} + + +def tokenize(text: str): + return [t.lower() for t in re.findall(r"[A-Za-z][A-Za-z0-9-]+", text)] + + +def score(tokens, keys): + if not tokens: + return 0.0 + hits = sum(1 for t in tokens if t in keys) + return hits / max(len(tokens), 1) * 5 # boost so single hit on short input passes + + +def main() -> int: + p = argparse.ArgumentParser(description="Golden Path filter") + p.add_argument("task", nargs="+", help="task description") + args = p.parse_args() + text = " ".join(args.task) + tokens = tokenize(text) + scored = {step: score(tokens, kws) for step, kws in KEYWORDS.items()} + best_step, best_score = max(scored.items(), key=lambda kv: kv[1]) + if best_score < 0.3: + print("DEFER: no golden_path_step match") + return 1 + print(f"golden_path_step={best_step} ({STEP_NAMES[best_step]}) confidence={best_score:.2f}") + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git 
a/skills/.experimental/product-init/scripts/lib/__init__.py b/skills/.experimental/product-init/scripts/lib/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/skills/.experimental/product-init/scripts/lib/report.py b/skills/.experimental/product-init/scripts/lib/report.py new file mode 100644 index 00000000..00eb9181 --- /dev/null +++ b/skills/.experimental/product-init/scripts/lib/report.py @@ -0,0 +1,98 @@ +"""Finding/Report primitives shared by every audit script.""" +from __future__ import annotations + +import json +from dataclasses import asdict, dataclass, field +from enum import Enum +from typing import List + + +class Severity(str, Enum): + INFO = "INFO" + LOW = "LOW" + MEDIUM = "MEDIUM" + HIGH = "HIGH" + CRITICAL = "CRITICAL" + + @property + def rank(self) -> int: + return ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"].index(self.value) + + +@dataclass +class Finding: + severity: Severity + gate: str + check: str + evidence: str + fix: str + + def to_dict(self) -> dict: + d = asdict(self) + d["severity"] = self.severity.value + return d + + +@dataclass +class Report: + name: str = "audit" + findings: List[Finding] = field(default_factory=list) + + def add_finding( + self, + severity: Severity, + gate: str, + check: str, + evidence: str, + fix: str, + ) -> None: + self.findings.append(Finding(severity, gate, check, evidence, fix)) + + def merge(self, other: "Report") -> None: + self.findings.extend(other.findings) + + @property + def exit_code(self) -> int: + for f in self.findings: + if f.severity in (Severity.HIGH, Severity.CRITICAL): + return 1 + return 0 + + def counts(self) -> dict: + out = {s.value: 0 for s in Severity} + for f in self.findings: + out[f.severity.value] += 1 + return out + + def to_markdown(self) -> str: + lines = [f"# Audit Report: {self.name}", ""] + c = self.counts() + lines.append( + f"**Counts**: CRITICAL={c['CRITICAL']} HIGH={c['HIGH']} MEDIUM={c['MEDIUM']} " + f"LOW={c['LOW']} INFO={c['INFO']}" + ) + 
lines.append(f"**Exit code**: {self.exit_code}") + lines.append("") + if not self.findings: + lines.append("_No findings._") + return "\n".join(lines) + lines.append("| Severity | Gate | Check | Evidence | Fix |") + lines.append("| --- | --- | --- | --- | --- |") + for f in self.findings: + ev = f.evidence.replace("|", "\\|").replace("\n", " ") + fx = f.fix.replace("|", "\\|").replace("\n", " ") + lines.append( + f"| {f.severity.value} | {f.gate} | {f.check} | {ev} | {fx} |" + ) + return "\n".join(lines) + + def to_json(self) -> str: + return json.dumps( + { + "name": self.name, + "exit_code": self.exit_code, + "counts": self.counts(), + "findings": [f.to_dict() for f in self.findings], + }, + indent=2, + ) diff --git a/skills/.experimental/product-init/scripts/lib/tool_runner.py b/skills/.experimental/product-init/scripts/lib/tool_runner.py new file mode 100644 index 00000000..032a8258 --- /dev/null +++ b/skills/.experimental/product-init/scripts/lib/tool_runner.py @@ -0,0 +1,41 @@ +"""Subprocess wrapper that swallows missing-binary errors into a structured result.""" +from __future__ import annotations + +import subprocess +from dataclasses import dataclass +from typing import Optional, Sequence + + +@dataclass +class ToolResult: + exit_code: int + stdout: str + stderr: str + + @property + def ok(self) -> bool: + return self.exit_code == 0 + + +def run(cmd: Sequence[str], cwd: Optional[str] = None, timeout: int = 120) -> ToolResult: + """Run a command and return ToolResult. + + Returns exit_code=127 with stderr "binary not found: <cmd>" if the executable is missing. + Returns exit_code=124 on timeout. 
+ """ + try: + proc = subprocess.run( + list(cmd), + cwd=cwd, + timeout=timeout, + capture_output=True, + text=True, + check=False, + ) + return ToolResult(proc.returncode, proc.stdout or "", proc.stderr or "") + except FileNotFoundError: + return ToolResult(127, "", f"binary not found: {cmd[0]}") + except subprocess.TimeoutExpired as e: + return ToolResult(124, e.stdout or "", (e.stderr or "") + f"\ntimeout after {timeout}s") + except OSError as e: + return ToolResult(126, "", f"os error: {e}") diff --git a/skills/.experimental/product-init/scripts/orchestrator.py b/skills/.experimental/product-init/scripts/orchestrator.py new file mode 100644 index 00000000..122374e7 --- /dev/null +++ b/skills/.experimental/product-init/scripts/orchestrator.py @@ -0,0 +1,172 @@ +#!/usr/bin/env python3 +"""product-init orchestrator. Subcommands: init, gate, filter, audit.""" +from __future__ import annotations + +import argparse +import json +import os +import shutil +import subprocess +import sys +from pathlib import Path + +# Runtime-portable skill dir resolution (order matches the README): +# 1. $PRODUCT_INIT_SKILL_DIR env var +# 2. ~/.codex/skills/product-init/ (if exists and is a real dir) +# 3. ~/.openclaw/skills/product-init/ (if exists and is a real dir) +# 4. ~/.claude/skills/product-init/ (canonical install) +# 5. 
parent of this script (fallback for dev/testing) +def _resolve_skill_dir() -> Path: + if env := os.environ.get("PRODUCT_INIT_SKILL_DIR"): + return Path(env).resolve() + for candidate in ( + Path.home() / ".codex" / "skills" / "product-init", + Path.home() / ".openclaw" / "skills" / "product-init", + Path.home() / ".claude" / "skills" / "product-init", + ): + resolved = candidate.resolve() + if resolved.is_dir(): + return resolved + return Path(__file__).resolve().parent.parent + +SCRIPTS_DIR = Path(__file__).resolve().parent +SKILL_DIR = _resolve_skill_dir() +TEMPLATES_DIR = SKILL_DIR / "templates" + +sys.path.insert(0, str(SCRIPTS_DIR)) +from lib.report import Report, Severity # noqa: E402 + +GATE_SCRIPTS = { + 1: ["audit_constitution.py"], + 2: ["audit_sow.py"], + 3: [], # design - manual review (no programmatic audit yet) + 4: ["audit_build.py", "audit_real_wiring.py"], + 5: [ + "audit_unit.py", + "audit_integration.py", + "audit_e2e.py", + "audit_contract.py", + "audit_coverage.py", + "audit_mutation.py", + "audit_static.py", + "audit_console_clean.py", + ], + 6: ["audit_uat.py"], + 7: ["audit_deploy.py", "audit_demo_url.py"], + 8: ["audit_handoff.py"], + 9: ["audit_warranty.py"], +} + + +def run_script(name: str, project_dir: Path, as_json: bool = True): + cmd = [sys.executable, str(SCRIPTS_DIR / name), "--project-dir", str(project_dir)] + if as_json: + cmd.append("--json") + proc = subprocess.run(cmd, capture_output=True, text=True) + return proc + + +def cmd_init(args) -> int: + project = Path(args.project_dir).resolve() + project.mkdir(parents=True, exist_ok=True) + if not TEMPLATES_DIR.exists(): + print(f"Templates not found at {TEMPLATES_DIR}", file=sys.stderr) + return 2 + copied = [] + for tpl in TEMPLATES_DIR.glob("*.md"): + dest = project / tpl.name + if dest.exists() and not args.force: + continue + shutil.copy2(tpl, dest) + copied.append(tpl.name) + print(f"Initialized product-init in {project}") + print(f"Idea: {args.idea}") + print(f"Copied {len(copied)} 
templates: {', '.join(copied) if copied else 
'(none new)'}") + print("Next: fill PRODUCT.md, SPEC.md, PLAN.md, TASKS.md, COMPETITIVE_BENCHMARK.md, then run `gate 1`.") + return 0 + + +def cmd_gate(args) -> int: + project = Path(args.project_dir).resolve() + n = args.n + scripts = GATE_SCRIPTS.get(n) + if scripts is None: + print(f"Unknown gate {n}", file=sys.stderr) + return 2 + if not scripts: + print(f"Gate {n}: manual review required (no programmatic audit).") + return 0 + aggregate = Report(name=f"gate{n}") + for s in scripts: + proc = run_script(s, project, as_json=True) + try: + data = json.loads(proc.stdout) + for f in data.get("findings", []): + aggregate.add_finding( + Severity(f["severity"]), f["gate"], f["check"], f["evidence"], f["fix"] + ) + except Exception: + aggregate.add_finding(Severity.HIGH, f"Gate {n}", f"audit-output:{s}", f"could not parse JSON output from {s}: {(proc.stdout or proc.stderr)[:200]}", f"Run `python3 scripts/{s} --project-dir <dir> --json` to debug.") + print(aggregate.to_json() if args.json else aggregate.to_markdown()) + return aggregate.exit_code + + +def cmd_filter(args) -> int: + proc = subprocess.run( + [sys.executable, str(SCRIPTS_DIR / "filter_task.py"), args.task], + capture_output=True, text=True, + ) + print(proc.stdout, end="") + if proc.stderr: + print(proc.stderr, file=sys.stderr, end="") + return proc.returncode + + +def cmd_audit(args) -> int: + project = Path(args.project_dir).resolve() + aggregate = Report(name="full-audit") + for n in sorted(GATE_SCRIPTS): + for s in GATE_SCRIPTS[n]: + proc = run_script(s, project, as_json=True) + try: + data = json.loads(proc.stdout) + for f in data.get("findings", []): + aggregate.add_finding( + Severity(f["severity"]), f["gate"], f["check"], f["evidence"], f["fix"] + ) + except Exception: + aggregate.add_finding(Severity.HIGH, f"Gate {n}", f"audit-output:{s}", f"could not parse JSON output from {s}: {(proc.stdout or proc.stderr)[:200]}", f"Run `python3 scripts/{s} --project-dir <dir> --json` to debug.") + 
print(aggregate.to_json() if args.json else aggregate.to_markdown()) + return aggregate.exit_code + + +def main() -> int: + p = argparse.ArgumentParser(prog="product-init") + p.add_argument("--project-dir", default=".") + p.add_argument("--json", action="store_true") + sub = 
p.add_subparsers(dest="cmd", required=True)
+
+    p_init = sub.add_parser("init", help="Bootstrap a new product project")
+    p_init.add_argument("idea", help="One-line product idea")
+    p_init.add_argument("--force", action="store_true")
+    p_init.set_defaults(func=cmd_init)
+
+    # default=SUPPRESS keeps a top-level --json from being reset by the subparser's default.
+    p_gate = sub.add_parser("gate", help="Run audits for gate N")
+    p_gate.add_argument("n", type=int, choices=range(1, 10))
+    p_gate.add_argument("--json", action="store_true", default=argparse.SUPPRESS)
+    p_gate.set_defaults(func=cmd_gate)
+
+    p_filter = sub.add_parser("filter", help="Filter a task against the golden path")
+    p_filter.add_argument("task", help="Task description")
+    p_filter.set_defaults(func=cmd_filter)
+
+    p_audit = sub.add_parser("audit", help="Run all audits")
+    p_audit.add_argument("--json", action="store_true", default=argparse.SUPPRESS)
+    p_audit.set_defaults(func=cmd_audit)
+
+    args = p.parse_args()
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/skills/.experimental/product-init/scripts/requirements.txt b/skills/.experimental/product-init/scripts/requirements.txt
new file mode 100644
index 00000000..8500d8d7
--- /dev/null
+++ b/skills/.experimental/product-init/scripts/requirements.txt
@@ -0,0 +1,4 @@
+pyyaml
+jsonschema
+python-frontmatter
+requests
diff --git a/skills/.experimental/product-init/templates/COMPETITIVE_BENCHMARK.md b/skills/.experimental/product-init/templates/COMPETITIVE_BENCHMARK.md
new file mode 100644
index 00000000..af2ed776
--- /dev/null
+++ b/skills/.experimental/product-init/templates/COMPETITIVE_BENCHMARK.md
@@ -0,0 +1,20 @@
+---
+name: Competitive Benchmark
+description: Benchmarks the product against competitors.
+gate: 1
+audit_script: audit_constitution.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Owner and Marketing Team at the start of the project because it helps in understanding the competitive landscape.
-->
+
+| Metric | v0.dev | Bolt | Lovable | Railway | [Yours] |
+|--------|--------|------|---------|---------|---------|
+| Prompt to UI (seconds) | [FILL: Number] | [FILL: Number] | [FILL: Number] | [FILL: Number] | [FILL: Number] |
+| Prompt to Full-Stack (minutes) | [FILL: Number] | [FILL: Number] | [FILL: Number] | [FILL: Number] | [FILL: Number] |
+| Prompt to Deploy URL (minutes) | [FILL: Number] | [FILL: Number] | [FILL: Number] | [FILL: Number] | [FILL: Number] |
+
+## Per-Sprint Review
+- **Sprint+1 Target:** [FILL: Target for Sprint+1]
+- **Sprint+4 Target:** [FILL: Target for Sprint+4]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/DEBT.md b/skills/.experimental/product-init/templates/DEBT.md
new file mode 100644
index 00000000..f2e3e2db
--- /dev/null
+++ b/skills/.experimental/product-init/templates/DEBT.md
@@ -0,0 +1,14 @@
+---
+name: Tech Debt Ledger
+description: Tracks technical debt.
+gate: 4
+audit_script: audit_build.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Engineering Team during the build phase because it helps in managing technical debt. -->
+
+| ID | commit_sha | file:line | category | reason | owner | due_date | status |
+|----|------------|-----------|----------|--------|-------|----------|--------|
+| [EXAMPLE] | [FILL: Commit SHA] | [FILL: File:Line] | [FILL: Category] | [FILL: Reason] | [FILL: Owner] | [FILL: Due Date] | [FILL: Status] |
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/PLAN.md b/skills/.experimental/product-init/templates/PLAN.md
new file mode 100644
index 00000000..a3848114
--- /dev/null
+++ b/skills/.experimental/product-init/templates/PLAN.md
@@ -0,0 +1,46 @@
+---
+name: Plan
+description: Outlines the six phases of the project (Phase 0 through Phase 5).
+gate: 2
+audit_script: audit_sow.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Project Manager and Team Leads after the initial planning because it outlines the project phases and success criteria. -->
+
+## Phase 0: Constitution
+- **Appetite:** [FILL: Weeks + Budget]
+- **Success Criteria:** [FILL: Criteria for success]
+- **Kill Criteria:** [FILL: Criteria for killing the phase]
+- **Deferred Items:** [FILL: Deferred items]
+
+## Phase 1: Single Golden Path
+- **Appetite:** [FILL: Weeks + Budget]
+- **Success Criteria:** [FILL: Criteria for success]
+- **Kill Criteria:** [FILL: Criteria for killing the phase]
+- **Deferred Items:** [FILL: Deferred items]
+
+## Phase 2: GitHub Native Delivery
+- **Appetite:** [FILL: Weeks + Budget]
+- **Success Criteria:** [FILL: Criteria for success]
+- **Kill Criteria:** [FILL: Criteria for killing the phase]
+- **Deferred Items:** [FILL: Deferred items]
+
+## Phase 3: Outcome Board
+- **Appetite:** [FILL: Weeks + Budget]
+- **Success Criteria:** [FILL: Criteria for success]
+- **Kill Criteria:** [FILL: Criteria for killing the phase]
+- **Deferred Items:** [FILL: Deferred items]
+
+## Phase 4: Builder Quality Loop
+- **Appetite:** [FILL: Weeks + Budget]
+- **Success Criteria:** [FILL: Criteria for success]
+- **Kill Criteria:** [FILL: Criteria for killing the phase]
+- **Deferred Items:** [FILL: Deferred items]
+
+## Phase 5: Platform Hardening
+- **Appetite:** [FILL: Weeks + Budget]
+- **Success Criteria:** [FILL: Criteria for success]
+- **Kill Criteria:** [FILL: Criteria for killing the phase]
+- **Deferred Items:** [FILL: Deferred items]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes.
-->
diff --git a/skills/.experimental/product-init/templates/PRODUCT.md b/skills/.experimental/product-init/templates/PRODUCT.md
new file mode 100644
index 00000000..8e07db93
--- /dev/null
+++ b/skills/.experimental/product-init/templates/PRODUCT.md
@@ -0,0 +1,37 @@
+---
+name: Product Constitution
+description: Defines the product's constitution and key metrics.
+gate: 1
+audit_script: audit_constitution.py
+golden_path_steps: ["Step 1", "Step 2", "Step 3", "Step 4", "Step 5", "Step 6", "Step 7"]
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Owner at the start of the project because it sets the foundation for the product. -->
+
+## Golden Path
+[FILL: One sentence describing the golden path]
+
+## Target User
+### Persona 1
+- **Pain Point:** [FILL: Pain point for Persona 1]
+
+### Persona 2
+- **Pain Point:** [FILL: Pain point for Persona 2]
+
+### Persona 3
+- **Pain Point:** [FILL: Pain point for Persona 3]
+
+## Current Alternative + Switching Cost
+- **Current Alternative:** [FILL: Current alternative solution]
+- **Switching Cost:** [FILL: Cost of switching to your product]
+
+## 10-Minute Success Signal
+[FILL: Describe what a user should be able to do in 10 minutes]
+
+## Outcome Metric
+- **Metric:** [FILL: Measurable outcome metric]
+
+## prod_url
+- **URL:** [FILL: Production URL]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/SPEC.md b/skills/.experimental/product-init/templates/SPEC.md
new file mode 100644
index 00000000..18863b45
--- /dev/null
+++ b/skills/.experimental/product-init/templates/SPEC.md
@@ -0,0 +1,10 @@
+---
+name: Specification
+description: Defines the main user flow and acceptance criteria.
+gate: 2
+audit_script: audit_sow.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Manager and Engineers after the initial planning because it details the user flow and acceptance criteria.
-->
+
+## Flow Diagram
diff --git a/skills/.experimental/product-init/templates/TASKS.md b/skills/.experimental/product-init/templates/TASKS.md
new file mode 100644
index 00000000..f6a054a9
--- /dev/null
+++ b/skills/.experimental/product-init/templates/TASKS.md
@@ -0,0 +1,20 @@
+---
+name: Tasks
+description: Task table for the project.
+gate: 4
+audit_script: audit_build.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Project Manager and Team Members during the build phase because it tracks the tasks and their status. -->
+
+| ID | Title | golden_path_step | Owner | Acceptance Criteria | DEBT introduced? | Status |
+|----|-------|------------------|-------|----------------------|------------------|--------|
+| T-1 | [FILL: Task Title] | 1 | [FILL: Owner] | [FILL: AC] | [FILL: Yes/No] | [FILL: Status] |
+| T-2 | [FILL: Task Title] | 2 | [FILL: Owner] | [FILL: AC] | [FILL: Yes/No] | [FILL: Status] |
+| T-3 | [FILL: Task Title] | 3 | [FILL: Owner] | [FILL: AC] | [FILL: Yes/No] | [FILL: Status] |
+
+## DEFERRED (post-MVP)
+- [ ] [FILL: Deferred Task 1]
+- [ ] [FILL: Deferred Task 2]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/definition-of-done.md b/skills/.experimental/product-init/templates/definition-of-done.md
new file mode 100644
index 00000000..eec0ddee
--- /dev/null
+++ b/skills/.experimental/product-init/templates/definition-of-done.md
@@ -0,0 +1,25 @@
+---
+name: Definition of Done
+description: Per-feature DoD and Golden Path release-gate.
+gate: 5
+audit_script: audit_qa.py
+---
+
+<!-- INSTRUCTION: This template is filled by the QA Team and Developers during the QA phase because it defines the criteria for feature completion and release.
-->
+
+## Per-Feature DoD
+- [ ] [FILL: Feature 1]
+  - [ ] Tied to golden_path_step: [FILL: Step]
+  - [ ] AC met: [FILL: AC ID]
+  - [ ] Tests green: [FILL: Yes/No]
+  - [ ] No new debt markers: [FILL: Yes/No]
+- [ ] [FILL: Feature 2]
+  - [ ] Tied to golden_path_step: [FILL: Step]
+  - [ ] AC met: [FILL: AC ID]
+  - [ ] Tests green: [FILL: Yes/No]
+  - [ ] No new debt markers: [FILL: Yes/No]
+
+## Golden Path Release Gate
+- [ ] A single end-to-end (E2E) test is green before any release
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/handoff-package.md b/skills/.experimental/product-init/templates/handoff-package.md
new file mode 100644
index 00000000..8654f97d
--- /dev/null
+++ b/skills/.experimental/product-init/templates/handoff-package.md
@@ -0,0 +1,18 @@
+---
+name: Handoff Package
+description: Checklist for handoff.
+gate: 8
+audit_script: audit_handoff.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Project Manager and Team Leads during the handoff phase because it ensures all necessary information is transferred. -->
+
+- [ ] **Code Repository link:** [FILL: Code Repository link]
+- [ ] **Runbook Location:** [FILL: Runbook Location]
+- [ ] **Credentials Vault (1Password/Doppler link):** [FILL: Credentials Vault link]
+- [ ] **Admin Walkthrough Video (Loom link):** [FILL: Loom link]
+- [ ] **Knowledge Transfer Session (date+attendees):** [FILL: Date and Attendees]
+- [ ] **DEBT.md State (count of open items):** [FILL: Count of open items]
+- [ ] **Source Escrow (location):** [FILL: Source Escrow location]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes.
-->
diff --git a/skills/.experimental/product-init/templates/jira-epic-skeleton.md b/skills/.experimental/product-init/templates/jira-epic-skeleton.md
new file mode 100644
index 00000000..4cd9b8be
--- /dev/null
+++ b/skills/.experimental/product-init/templates/jira-epic-skeleton.md
@@ -0,0 +1,140 @@
+---
+name: jira-epic-skeleton
+description: Six outcome-epics (not capability-epics) for the agentic delivery workflow, each mapped to a golden_path_step.
+type: template
+---
+
+# Jira Epic Skeleton -- Outcome Epics
+
+The single most damaging organisational anti-pattern in this skill's lineage is the **capability epic**: epics named "Auth Epic", "Generation Epic", "Deploy Epic". Capability epics close green while the user-facing outcome remains broken. This skeleton replaces them with six **outcome epics**, one per Golden Path step where a user-observable state changes. Each epic only closes when the named outcome holds for a real user on the live URL.
+
+> Rule of thumb: an outcome-epic name reads as a complete sentence in the past tense from the user's point of view. "Idea Captured" works. "Auth Epic" does not.
+
+---
+
+## Epic 1 -- Idea Captured
+
+**Outcome statement.** "I, the user, told the system my idea, my persona, and my pain. The system has stored it in a way I can re-read tomorrow."
+
+**Sample stories.**
+- US-1.1 As a solo founder I can paste a one-line idea and a 3-bullet persona+pain into the intake form.
+- US-1.2 As a returning user I can re-open my intake and edit it.
+- US-1.3 As a stakeholder I can read the intake as a markdown file in the repo.
+
+**AC pattern.**
+- Given an authenticated user, when they submit the intake form, then a `PRODUCT.md` is created/updated in the project repo with frontmatter `golden_path_step: 1` and the one-sentence Golden Path is present.
+- Given an existing intake, when the user re-opens it, then the form is pre-filled.
+- Given the intake submitted, when `audit_constitution.py` runs, then it reports zero CRITICAL findings.
+
+**golden_path_step:** 1.
+
+---
+
+## Epic 2 -- Spec Approved
+
+**Outcome statement.** "The system gave me back a specification I recognise as my idea. I edited it, approved it, and the approval is recorded."
+
+**Sample stories.**
+- US-2.1 As a user I see an AI-generated SPEC.md derived from my intake.
+- US-2.2 As a user I edit SPEC.md inline and save.
+- US-2.3 As a user I click Approve and a PR is opened on my behalf to merge SPEC.md.
+
+**AC pattern.**
+- Given an intake exists, when the spec generation completes, then SPEC.md exists with frontmatter `golden_path_step: 2`, scope, and acceptance criteria.
+- Given the user approves, when the approval workflow runs, then a git commit `docs(spec): approve <idea>` lands on the project default branch.
+- Given the approval, when `audit_sow.py` runs, then it reports zero HIGH/CRITICAL findings.
+
+**golden_path_step:** 2.
+
+---
+
+## Epic 3 -- Code Generated
+
+**Outcome statement.** "The system wrote the code that implements the approved spec. The code compiles, lints, types pass, and lives in version control."
+
+**Sample stories.**
+- US-3.1 As a user I see code generation begin within 5 seconds of approving the spec.
+- US-3.2 As a user I receive a link to the PR with the generated code.
+- US-3.3 As a user I can request changes; the system updates the PR.
+
+**AC pattern.**
+- Given an approved spec, when generation completes, then a PR exists with non-empty diff, eslint/tsc/ruff/mypy clean (per `audit_static.py`), and no mocks/localhost in non-test source (per `audit_real_wiring.py`).
+- Given the PR, when commits are inspected, then every commit message contains a ticket id matching `[A-Z]+-\d+` (per `audit_build.py`).
+- Given the PR, when `audit_build.py` runs, then there are no new TODO/FIXME without DEBT.md rows.
+
+**golden_path_step:** 3.
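
The ticket-id criterion in the Epic 3 AC pattern above can be sketched in a few lines of Python. This is a minimal illustration, not the actual `audit_build.py`: the helper name `commits_missing_ticket` and the sample messages are hypothetical, and only the pattern `[A-Z]+-\d+` comes from the text.

```python
from __future__ import annotations

import re

# Pattern quoted in the Epic 3 acceptance criteria.
TICKET_ID = re.compile(r"[A-Z]+-\d+")


def commits_missing_ticket(messages: list[str]) -> list[str]:
    """Return the commit messages that contain no ticket id (hypothetical helper)."""
    return [message for message in messages if not TICKET_ID.search(message)]


if __name__ == "__main__":
    # Sample messages are invented for illustration.
    sample = [
        "feat(intake): PROD-12 persist persona and pain",
        "fix typo in README",
    ]
    for message in commits_missing_ticket(sample):
        print(f"missing ticket id: {message}")
```

In a real audit the messages would presumably come from `git log`; they are passed in as a plain list here so the check stays testable without a repository.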
+
+---
+
+## Epic 4 -- Tests Passed
+
+**Outcome statement.** "Tests run and pass on real infrastructure. The user can read a green test report. Mutation, contract, and console-clean checks are green too."
+
+**Sample stories.**
+- US-4.1 As a user I see a test results page with unit, integration, and E2E sections.
+- US-4.2 As a user I see at least one `@golden-path` test green against a non-localhost preview URL.
+- US-4.3 As a user I see zero console errors and zero skipped tests.
+
+**AC pattern.**
+- Given the generated code, when CI runs, then `audit_unit.py`, `audit_integration.py`, `audit_e2e.py`, `audit_contract.py`, `audit_coverage.py`, `audit_mutation.py`, `audit_static.py`, `audit_console_clean.py` all return exit code 0.
+- Given a Playwright run, when the JSON report is parsed, then `@golden-path` test count >= 1 and all are passing.
+- Given the integration suite, when scanned, then no test mocks `requests/axios/fetch/prisma/psycopg`.
+
+**golden_path_step:** 4.
+
+---
+
+## Epic 5 -- Preview Deployed
+
+**Outcome statement.** "The product is reachable on a real URL. A user can hit it from a different machine. The page renders and has a non-empty title."
+
+**Sample stories.**
+- US-5.1 As a user I receive a preview URL within 60 seconds of tests passing.
+- US-5.2 As a user I open the URL in a fresh browser and the golden path completes end-to-end.
+- US-5.3 As a user I see structured logs in a viewer (cloud provider's free tier suffices).
+
+**AC pattern.**
+- Given tests are green, when deploy runs, then the prod_url in PRODUCT.md frontmatter responds HTTP 200 (per `audit_demo_url.py` and `audit_deploy.py`).
+- Given the URL, when curled, then body length > 500 bytes and `<title>` is non-empty.
+- Given the deploy, when `.github/workflows/*.yml` is inspected, then a smoke job exists.
+
+**golden_path_step:** 5.
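
The body-length and `<title>` criterion in the Epic 5 AC pattern above could be checked roughly as follows. This is a sketch under assumptions, not the shipped `audit_demo_url.py`: the names `page_problems` and `_TitleParser` are hypothetical, the response body is assumed to be fetched elsewhere, and only the 500-byte threshold and the non-empty title rule come from the text.

```python
from __future__ import annotations

from html.parser import HTMLParser

MIN_BODY_BYTES = 500  # threshold quoted in the Epic 5 acceptance criteria


class _TitleParser(HTMLParser):
    """Collects the text found inside <title> (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_problems(body: bytes) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []
    if len(body) <= MIN_BODY_BYTES:
        problems.append(f"body is {len(body)} bytes, expected more than {MIN_BODY_BYTES}")
    parser = _TitleParser()
    parser.feed(body.decode("utf-8", errors="replace"))
    if not parser.title.strip():
        problems.append("<title> is missing or empty")
    return problems
```

Taking the body as bytes keeps the HTTP call (for example via `requests`, which `requirements.txt` lists) separate from the checks, so the rules can be tested offline.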
+
+---
+
+## Epic 6 -- User Accepted
+
+**Outcome statement.** "A real human walked the live URL, recognised it as their product, and signed off in writing. We have the artefact tagged in git."
+
+**Sample stories.**
+- US-6.1 As a user I receive a UAT script and walk it on the live URL.
+- US-6.2 As a user I sign UAT_REPORT.md with my name and the date.
+- US-6.3 As a delivery lead I tag the accepted commit `uat-v1.0.0`.
+
+**AC pattern.**
+- Given a deployed preview, when the user completes the UAT script, then `UAT_REPORT.md` exists with `sha256:` and `Signed-off-by:` lines (per `audit_uat.py`).
+- Given the sign-off, when `git tag --list 'uat-v*'` runs, then a tag exists.
+- Given handoff, when `audit_handoff.py` runs, then README.md, runbooks/runbook.md, DEBT.md, HANDOFF.md exist and HANDOFF.md sections "Credentials Vault Link", "Admin Walkthrough Video", "Knowledge Transfer Date" are filled.
+
+**golden_path_step:** 6 (continues into 7 for warranty).
+
+---
+
+## Mapping summary
+
+| Epic | golden_path_step | Closing audit script(s) |
+| --- | --- | --- |
+| 1 Idea Captured | 1 | audit_constitution.py |
+| 2 Spec Approved | 2 | audit_sow.py |
+| 3 Code Generated | 3 | audit_build.py, audit_real_wiring.py, audit_static.py |
+| 4 Tests Passed | 4 | audit_unit.py, audit_integration.py, audit_e2e.py, audit_contract.py, audit_coverage.py, audit_mutation.py, audit_console_clean.py |
+| 5 Preview Deployed | 5 | audit_deploy.py, audit_demo_url.py |
+| 6 User Accepted | 6 (+ 7) | audit_uat.py, audit_handoff.py, audit_warranty.py |
+
+## Why six and not nine
+
+The 9-gate regime has nine gates. The Jira board has six epics. The mismatch is intentional: the six outcome-epics are user-facing milestones; the nine gates are internal quality bars. Multiple gates support a single outcome-epic; no outcome-epic is owned by a single gate. The user does not care about Gate 5 vs. Gate 7; the user cares about "did I get a working URL".
Organise visible work around the user; organise invisible quality work around the gates.
+
+## What this skeleton replaces
+
+Capability-style epics ("Auth", "Generation", "Deploy") are deleted from the backlog at Gate 1. If a piece of work cannot be filed under one of the six outcome-epics, it does not belong in MVP. It goes to PLAN.md's deferred list with a one-line justification. This is the operational form of the Golden Path Doctrine.
diff --git a/skills/.experimental/product-init/templates/opportunity-solution-tree.md b/skills/.experimental/product-init/templates/opportunity-solution-tree.md
new file mode 100644
index 00000000..61a657d1
--- /dev/null
+++ b/skills/.experimental/product-init/templates/opportunity-solution-tree.md
@@ -0,0 +1,31 @@
+---
+name: Opportunity Solution Tree
+description: Torres OST for the project.
+gate: 3
+audit_script: audit_design.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Owner and Design Team during the design phase because it helps in identifying opportunities and solutions. -->
+
+## Outcome
+[FILL: Desired outcome]
+
+## Opportunities
+- [FILL: Opportunity 1]
+  - **Solution 1:** [FILL: Solution 1]
+    - **Assumption Test 1:** [FILL: Assumption test 1]
+    - **Assumption Test 2:** [FILL: Assumption test 2]
+- [FILL: Opportunity 2]
+  - **Solution 1:** [FILL: Solution 1]
+    - **Assumption Test 1:** [FILL: Assumption test 1]
+    - **Assumption Test 2:** [FILL: Assumption test 2]
+- [FILL: Opportunity 3]
+  - **Solution 1:** [FILL: Solution 1]
+    - **Assumption Test 1:** [FILL: Assumption test 1]
+    - **Assumption Test 2:** [FILL: Assumption test 2]
+
+## Weekly Customer Touchpoint
+- [FILL: Weekly touchpoint 1]
+- [FILL: Weekly touchpoint 2]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes.
-->
diff --git a/skills/.experimental/product-init/templates/pr-faq.md b/skills/.experimental/product-init/templates/pr-faq.md
new file mode 100644
index 00000000..9e3452cc
--- /dev/null
+++ b/skills/.experimental/product-init/templates/pr-faq.md
@@ -0,0 +1,60 @@
+---
+name: Press Release and FAQ
+description: Amazon Working Backwards document.
+gate: 1
+audit_script: audit_constitution.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Owner and Marketing Team at the start of the project because it helps in aligning the team on the product vision. -->
+
+## Press Release
+### Heading
+[FILL: Heading]
+
+### Subheading
+[FILL: Subheading]
+
+### Opening Paragraph
+[FILL: Opening paragraph in customer voice]
+
+### Problem
+[FILL: Problem statement]
+
+### Solution
+[FILL: Solution description]
+
+### Customer Quote
+[FILL: Customer quote]
+
+### Leader Quote
+[FILL: Leader quote]
+
+### Getting Started
+[FILL: Instructions on how to get started]
+
+## Internal FAQ
+### Engineering
+- [FILL: Question 1]: [FILL: Answer 1]
+- [FILL: Question 2]: [FILL: Answer 2]
+
+### Business
+- [FILL: Question 1]: [FILL: Answer 1]
+- [FILL: Question 2]: [FILL: Answer 2]
+
+### Scope
+- [FILL: Question 1]: [FILL: Answer 1]
+- [FILL: Question 2]: [FILL: Answer 2]
+
+## External FAQ
+- [FILL: Customer Question 1]: [FILL: Answer 1]
+- [FILL: Customer Question 2]: [FILL: Answer 2]
+- [FILL: Customer Question 3]: [FILL: Answer 3]
+
+## 5 Core Questions
+- [FILL: Question 1]: [FILL: Answer 1]
+- [FILL: Question 2]: [FILL: Answer 2]
+- [FILL: Question 3]: [FILL: Answer 3]
+- [FILL: Question 4]: [FILL: Answer 4]
+- [FILL: Question 5]: [FILL: Answer 5]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes.
-->
diff --git a/skills/.experimental/product-init/templates/pr-template.md b/skills/.experimental/product-init/templates/pr-template.md
new file mode 100644
index 00000000..3f9be56b
--- /dev/null
+++ b/skills/.experimental/product-init/templates/pr-template.md
@@ -0,0 +1,23 @@
+---
+name: Pull Request Template
+description: Template for pull requests.
+gate: 4
+audit_script: audit_build.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Developers during the build phase because it standardizes the pull request process. -->
+
+- **golden_path_step:** [FILL: 1-7]
+- **AC ID (link to TASKS.md):** [FILL: AC ID]
+- **DEBT introduced? (yes->link to DEBT.md row):** [FILL: Yes/No]
+- **E2E test added? (link to test):** [FILL: Yes/No]
+- **Console errors triggered? (must be no):** [FILL: Yes/No]
+- **Preview URL (Vercel/Netlify link):** [FILL: Preview URL]
+
+## Self-Checklist
+- [ ] All tests pass
+- [ ] Code is reviewed
+- [ ] Documentation is updated
+- [ ] Changes are committed and pushed
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/riskiest-assumption-test.md b/skills/.experimental/product-init/templates/riskiest-assumption-test.md
new file mode 100644
index 00000000..33980d78
--- /dev/null
+++ b/skills/.experimental/product-init/templates/riskiest-assumption-test.md
@@ -0,0 +1,38 @@
+---
+name: Riskiest Assumption Test
+description: Single assumption test card.
+gate: 3
+audit_script: audit_design.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Owner and Design Team during the design phase because it helps in validating the riskiest assumptions.
-->
+
+## Assumption
+[FILL: One sentence describing the assumption]
+
+## Why this is the riskiest
+[FILL: Explanation of why this is the riskiest assumption]
+
+## Test Method
+- **Method:** [FILL: Interview/Landing Page/Wizard of Oz/Concierge]
+
+## Sample Size
+- **Size:** [FILL: Number of participants]
+
+## Success Threshold
+- **Threshold:** [FILL: Numeric threshold for success]
+
+## Kill Threshold
+- **Threshold:** [FILL: Numeric threshold for killing the assumption]
+
+## Timeline
+- **Start Date:** [FILL: Start date]
+- **End Date:** [FILL: End date]
+
+## Owner
+- **Owner:** [FILL: Owner name]
+
+## Result
+- **Outcome:** [FILL: Outcome of the test]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
diff --git a/skills/.experimental/product-init/templates/shape-up-pitch.md b/skills/.experimental/product-init/templates/shape-up-pitch.md
new file mode 100644
index 00000000..11c8e8d8
--- /dev/null
+++ b/skills/.experimental/product-init/templates/shape-up-pitch.md
@@ -0,0 +1,30 @@
+---
+name: Shape Up Pitch
+description: Basecamp pitch for the project.
+gate: 2
+audit_script: audit_sow.py
+---
+
+<!-- INSTRUCTION: This template is filled by the Product Owner and Design Team after the initial planning because it helps in pitching the project. -->
+
+## Problem
+[FILL: Raw motivation for the project]
+
+## Appetite
+- **Weeks:** [FILL: Number of weeks]
+- **People:** [FILL: Number of people]
+
+## Solution
+- **Sketches/Breadboards:** [FILL: Sketches or breadboards placeholder]
+
+## Rabbit Holes
+- [FILL: Specific patch 1]
+- [FILL: Specific patch 2]
+- [FILL: Specific patch 3]
+
+## No-Gos
+- [FILL: Explicit exclusion 1]
+- [FILL: Explicit exclusion 2]
+- [FILL: Explicit exclusion 3]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes.
-->
diff --git a/skills/.experimental/product-init/templates/uat-script.md b/skills/.experimental/product-init/templates/uat-script.md
new file mode 100644
index 00000000..0490057d
--- /dev/null
+++ b/skills/.experimental/product-init/templates/uat-script.md
@@ -0,0 +1,30 @@
+---
+name: UAT Script
+description: User Acceptance Testing walkthrough.
+gate: 6
+audit_script: audit_uat.py
+---
+
+<!-- INSTRUCTION: This template is filled by the QA Team and Users during the UAT phase because it provides a structured way to perform UAT. -->
+
+## Header
+- **Date:** [FILL: Date]
+- **Version:** [FILL: Version]
+
+## Steps
+1. **Action:** [FILL: Action 1]
+   - **Expected Result:** [FILL: Expected Result 1]
+   - **Actual Result:** [FILL: Actual Result 1]
+   - **Pass/Fail:** [FILL: Pass/Fail]
+2. **Action:** [FILL: Action 2]
+   - **Expected Result:** [FILL: Expected Result 2]
+   - **Actual Result:** [FILL: Actual Result 2]
+   - **Pass/Fail:** [FILL: Pass/Fail]
+
+## Footer
+- **SHA256 of this file:** [FILL: SHA256 hash]
+- **Loom Link:** [FILL: Loom link]
+- **Signed By:** [FILL: Signer]
+- **Signed At:** [FILL: Date and Time]
+
+<!-- DONE WHEN: All sections are filled, and the audit script passes. -->
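
The footer of the UAT script above asks for a SHA256 of the file itself, but a file cannot contain its own hash without changing that hash. One possible convention, which is an assumption here and is not confirmed by the patch or by `audit_uat.py`, is to hash the script with the hash line left out. The function name `uat_digest` is hypothetical; the line prefix is copied from the template.

```python
import hashlib

# Footer label copied from the uat-script.md template.
HASH_LINE_PREFIX = "- **SHA256 of this file:**"


def uat_digest(text: str) -> str:
    """SHA-256 of the UAT script with the hash line itself excluded (assumed convention)."""
    kept = [
        line
        for line in text.splitlines()
        if not line.strip().startswith(HASH_LINE_PREFIX)
    ]
    return hashlib.sha256("\n".join(kept).encode("utf-8")).hexdigest()
```

With this convention the signer fills in every other field first, computes the digest, and pastes it into the footer; recomputing later gives the same value as long as nothing else in the file has changed.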