Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
ca5539f
feat(skills): add maintainer loop skills for automated PR triage and …
cv Apr 7, 2026
0d37065
refactor(skills): consolidate maintainer skills into morning/day/evening
cv Apr 8, 2026
b9dcc41
fix(skills): restart ordered list numbering per section for markdownlint
cv Apr 8, 2026
200e78f
refactor(skills): extract deterministic steps into scripts, fix revie…
cv Apr 8, 2026
3a9f6ec
fix(skills): add computed ageDays to triage and progress script output
cv Apr 8, 2026
43002be
fix(skills): use ageHours instead of ageDays for PR age in script output
cv Apr 8, 2026
badbe78
refactor(skills): replace gh-pr-merge-now with built-in gh queries
cv Apr 8, 2026
ade03ce
fix(skills): reduce triage PR fetch limit to 50 to avoid GraphQL timeout
cv Apr 8, 2026
647c0a4
fix(skills): add review-ready bucket and fix merge state classification
cv Apr 8, 2026
91b71fb
feat(skills): paginate PR fetching via REST API to see all 243 open PRs
cv Apr 8, 2026
2f223b2
fix(skills): default to maintainer salvage, not contributor wait
cv Apr 8, 2026
bd9264c
fix(skills): block approval on DIRTY merge state in merge gate workflow
cv Apr 8, 2026
cd9acad
Merge branch 'main' into cv/maintainer-skills
cv Apr 8, 2026
4eb088f
feat(skills): boost triage score for security and priority:high labels
cv Apr 8, 2026
f07051c
docs: exempt nemoclaw-maintainer-* skills from autogenerated rule
cv Apr 8, 2026
aa855bf
docs: list autogenerated skills explicitly instead of broad wildcard
cv Apr 8, 2026
84ab75b
merge: resolve CLAUDE.md conflict with main's symlink restructure
cv Apr 8, 2026
49c96af
fix(skills): address CodeRabbit findings across maintainer scripts
cv Apr 8, 2026
9f90cc9
Merge branch 'main' into cv/maintainer-skills
cv Apr 8, 2026
14c0305
fix(skills): use semver comparison for straggler detection, log parse…
cv Apr 8, 2026
869170b
Merge branch 'main' into cv/maintainer-skills
cv Apr 8, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 46 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/HOTSPOTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# Hotspots Workflow

Find files hurting throughput and reduce their future blast radius.

## Step 1: Run the Hotspot Script

```bash
node --experimental-strip-types --no-warnings .agents/skills/nemoclaw-maintainer-day/scripts/hotspots.ts
```

This combines 30-day git churn on `main` with open PR file overlap, flags risky areas, and outputs a ranked JSON list.

Pipe into state:

```bash
node --experimental-strip-types --no-warnings .agents/skills/nemoclaw-maintainer-day/scripts/hotspots.ts | node --experimental-strip-types --no-warnings .agents/skills/nemoclaw-maintainer-day/scripts/state.ts set-hotspots
```

## Step 2: Prioritize

Review the ranked output. Most urgent: high `combinedScore` + `isRisky: true` + weak tests.

## Step 3: Choose Cooling Strategy

Smallest change to reduce future collisions:

- extract stable logic from giant file into tested helper
- split parsing from execution
- add regression tests around repeated breakage
- deduplicate workflow logic
- narrow interfaces with typed helpers

Prefer changes that also improve testability.

## Step 4: Keep Small

One file cluster per pass. Stop if next step is large redesign → follow [SEQUENCE-WORK.md](SEQUENCE-WORK.md).

## Step 5: Validate

Run relevant tests. If risky code, also follow [TEST-GAPS.md](TEST-GAPS.md).

## Notes

- Goal is lower future merge pain, not aesthetic cleanup.
- No giant refactors inside contributor PRs.
Comment on lines +1 to +46
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid direct edits to generated skill markdown.

Please make the change in the source docs and regenerate this artifact.

As per coding guidelines, "Files in .agents/skills/nemoclaw-/.md are autogenerated by scripts/docs-to-skills.py and must never be edited directly".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/nemoclaw-maintainer-day/HOTSPOTS.md around lines 1 - 46, Do
not edit HOTSPOTS.md directly; instead update the original Hotspots source
document that feeds the markdown generator, then run the docs-to-skills.py
generator to regenerate the skill artifact (ensure the change appears in the
generated HOTSPOTS.md), and commit both the source doc and the regenerated
output so CI remains consistent.

47 changes: 47 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/MERGE-GATE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# Merge Gate Workflow

Run the last maintainer check before approval. Never merge automatically.

## Gates

For the full priority list see [PR-REVIEW-PRIORITIES.md](PR-REVIEW-PRIORITIES.md). A PR is approval-ready only when **all** hard gates pass:

1. **CI green** — all required checks in `statusCheckRollup`.
2. **No conflicts** — `mergeStateStatus` clean.
3. **No major CodeRabbit** — ignore style nits; block on correctness/security bugs.
4. **Risky code tested** — see [RISKY-AREAS.md](RISKY-AREAS.md). Confirm tests exist (added or pre-existing).

## Step 1: Run the Gate Checker

```bash
node --experimental-strip-types --no-warnings .agents/skills/nemoclaw-maintainer-day/scripts/check-gates.ts <pr-number>
```

This checks all 4 gates programmatically and returns structured JSON with `allPass` and per-gate `pass`/`details`.

## Step 2: Interpret Results

The script handles the deterministic checks. You handle judgment calls:

- **Conflicts (DIRTY):** Do NOT approve — GitHub invalidates approvals when new commits are pushed. Salvage first (rebase), wait for CI, then re-run the gate checker. Follow [SALVAGE-PR.md](SALVAGE-PR.md).
- **CI failing but narrow:** Follow the salvage workflow in [SALVAGE-PR.md](SALVAGE-PR.md).
- **CI pending:** Wait and re-check. Do not approve while checks are still running.
- **CodeRabbit:** Script flags unresolved major/critical threads. Review the `snippet` to confirm it's a real issue vs style nit. If doubt, leave unapproved.
- **Tests:** If `riskyCodeTested.pass` is false, follow [TEST-GAPS.md](TEST-GAPS.md).

## Step 3: Approve or Report

**Approve only when:** `allPass` is true AND `mergeStateStatus` is not DIRTY. Approving a PR with conflicts is wasted effort — the rebase will invalidate the approval.

The correct sequence for a conflicted PR: **salvage (rebase) → CI green → approve → report ready for merge.**

**All pass + no conflicts:** Approve and summarize why.

**Any fail:**

| Gate | Status | What is needed |
|------|--------|----------------|
| CI | Failing | Fix flaky timeout test |
| Conflicts | DIRTY | Rebase onto main first — approval would be invalidated |

Use full GitHub links.
Comment on lines +1 to +47
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Generated skill file should not be edited directly.

This path is designated as autogenerated. Please apply content changes in the source docs and regenerate via scripts/docs-to-skills.py to avoid drift/overwrite on the next regeneration.

As per coding guidelines, "Files in .agents/skills/nemoclaw-/.md are autogenerated by scripts/docs-to-skills.py and must never be edited directly".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/nemoclaw-maintainer-day/MERGE-GATE.md around lines 1 - 47,
This file is autogenerated and must not be edited directly; revert any changes
to .agents/skills/nemoclaw-maintainer-day/MERGE-GATE.md and instead make your
content edits in the canonical source docs, then run the generator script
scripts/docs-to-skills.py to regenerate the skill file; reference the
autogenerated path (.agents/skills/nemoclaw-maintainer-day/MERGE-GATE.md) and
the generator (scripts/docs-to-skills.py) when updating so future regenerations
don't overwrite manual edits.

41 changes: 41 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/PR-REVIEW-PRIORITIES.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# PR Review Priorities

Ordered list of what NemoClaw maintainers look for in a pull request. Higher items block approval; lower items inform queue ranking.

## Hard gates (all must pass to approve)

1. **Security correctness** — no sandbox escape, SSRF, credential exposure, policy bypass, or installer trust violation. PRs touching risky areas (see [RISKY-AREAS.md](RISKY-AREAS.md)) get a deep security pass before anything else.
2. **CI green** — all required checks in `statusCheckRollup` must pass.
3. **No merge conflicts** — `mergeStateStatus` must be clean.
4. **No unresolved major/critical CodeRabbit findings** — correctness and safety findings block; style nits do not. Use judgment on borderline cases.
5. **Tests for touched risky code** — risky areas must have test coverage, either added in the PR or pre-existing. No exceptions.

## Quality expectations (block if violated, but fixable via salvage)

1. **Narrow scope** — each PR has one clear objective. Unrelated config changes, drive-by refactors, and tool setting diffs get reverted to `main`.
2. **Contributor intent preserved** — the fix must match what the contributor intended. Stop and ask when the diff would change semantics or when intent is unclear.
3. **Small, mergeable changes** — prefer substrate-first slicing: extract helper, add tests for current behavior, land fix on top. One file cluster per pass. If the next step is a large redesign, route to sequencing.

## Queue ranking signals (inform priority, not approval)

1. **Actionability** — PRs closest to done rank highest. A merge-ready PR outranks a near-miss; a near-miss outranks a blocked item.
2. **Security-sensitive and actionable** — PRs touching risky code get a priority bump, but only when they are not otherwise blocked.
3. **Staleness** — PRs idle for more than 7 days get a mild bump to prevent rot.
4. **Hotspot relief** — PRs that reduce future conflict pressure in high-churn files are preferred over equivalent work elsewhere.

## Daily cadence

The team follows a daily ship cycle. All maintainer skills operate within this rhythm.

1. **Morning** (`/nemoclaw-maintainer-morning`) — triage the backlog, pick items for the day, label them with the target version (e.g., `v0.0.8`).
2. **During the day** (`/nemoclaw-maintainer-day`) — land PRs using the maintainer loop. Version labels make progress visible on dashboards.
3. **Evening** (`/nemoclaw-maintainer-evening`) — check what shipped, bump open items to the next version (`v0.0.9`), generate a QA-focused summary, and cut the tag.
4. **Overnight** — QA team (different timezone) tests the tag. Any issues they file enter the next morning's triage like any other issue.

Version labels are living markers: they always mean "ship in this version." If an item doesn't make the cut, the label moves to the next patch version.

## Explicitly not priorities

- **Code style and formatting** — not a reason to block or delay. No opportunistic reformatting.
- **Documentation completeness** — not required for approval unless the PR changes user-facing behavior.
- **Architectural elegance** — the goal is lower future merge pain, not aesthetic cleanup.
14 changes: 14 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/RISKY-AREAS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# NemoClaw Risky Code Areas

PRs touching these areas need tests before approval.

| Area | Key paths |
|------|-----------|
| Installer / bootstrap shell | `install.sh`, `setup.sh`, `brev-setup.sh`, `scripts/*.sh` |
| Onboarding / host glue | `bin/lib/onboard.js`, `bin/*.js` |
| Sandbox / policy / SSRF | `nemoclaw/src/blueprint/`, `nemoclaw-blueprint/`, policy presets |
| Workflow / enforcement | `.github/workflows/`, prek hooks, DCO, signing, version/tag flows |
| Credentials / inference / network | credential helpers, inference provider routing, approval flows |

A PR in a risky area is only promoted in the queue when it is actually actionable.
If risky and under-tested, follow the test gaps or security sweep workflows.
84 changes: 84 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/SALVAGE-PR.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
# Salvage PR Workflow

Take one near-mergeable PR and make the smallest safe change to unblock it.

**Default to maintainer salvage.** When a maintainer picks an item from the queue, the assumption is they're doing the work now — rebase, fix conflicts, add missing tests, push. Do not default to "ask the contributor and wait" because that blocks the daily cadence. Only defer to the contributor when the fix requires understanding intent that isn't clear from the diff.

## Step 1: Gather Context

```bash
gh pr view <number> --repo NVIDIA/NemoClaw \
--json number,title,url,body,baseRefName,headRefName,author,files,commits,comments,reviews,statusCheckRollup,mergeStateStatus,reviewDecision

gh pr diff <number> --repo NVIDIA/NemoClaw
```

Also read: maintainer and CodeRabbit comments, linked issues, recent `main` changes in touched files. Understand the PR's purpose before coding.

## Step 2: Assess Fit

**Maintainer does it now:** rebase and resolve conflicts, add missing tests for risky code, fix one or two failing checks, apply small correctness fixes from review, narrow gate cleanup.

**Defer to contributor only when:** the fix requires a design change the maintainer can't judge from the diff, contributor intent is ambiguous and the wrong guess would change semantics, or the PR spans multiple subsystems the maintainer isn't familiar with.

## Step 3: Check Out and Reproduce

```bash
gh pr checkout <number>
git fetch origin --prune
```

Reproduce locally. Run narrowest relevant commands first.

## Step 4: Review PR Scope Before Fixing

Before fixing, review **all** changed files in the PR — not just the ones causing failures. Flag any files that expand the PR's scope unnecessarily (config changes, unrelated refactors, tool settings). Revert those to `main` if they aren't needed for the feature to work.

## Step 5: Fix Narrowly

Smallest change that clears the blocker. No opportunistic reformatting.

If risky code is touched (see [RISKY-AREAS.md](RISKY-AREAS.md)), treat missing tests as part of the fix — follow [TEST-GAPS.md](TEST-GAPS.md) when needed.

## Step 6: Conflicts

Resolve only mechanical conflicts (import ordering, adjacent additions, branch drift). Stop and summarize if the conflict changes behavior.

## Step 7: Validate

```bash
npm test # root integration tests
cd nemoclaw && npm test # plugin tests
npm run typecheck:cli # CLI type check
make check # all linters
```

Use only commands matching the changed area.

## Step 8: Push

Push when: fix is small, improves mergeability, validation passed, you have push permission. Never force-push. If you cannot push, prepare a comment describing the fix.

**Fork PRs:** Most PRs come from contributor forks. Check where to push:

```bash
gh pr view <number> --repo NVIDIA/NemoClaw --json headRepositoryOwner,headRepository,headRefName,maintainerCanModify
```

If `maintainerCanModify` is true, push directly to the fork:

```bash
git push [email protected]:<owner>/<repo>.git <local-branch>:<headRefName>
```

Do **not** push to `origin` — that creates a separate branch on NVIDIA/NemoClaw that won't appear in the PR.

## Step 9: Route to Merge Gate

If PR looks ready, follow [MERGE-GATE.md](MERGE-GATE.md).

## Notes

- Goal is safe backlog reduction, not finishing the PR at any cost.
- Never hide unresolved reviewer concerns.
- Use full GitHub links.
Comment on lines +1 to +84
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This skill markdown should be generated, not manually edited.

Please update source documentation and regenerate this file to keep skill artifacts canonical.

As per coding guidelines, "Files in .agents/skills/nemoclaw-/.md are autogenerated by scripts/docs-to-skills.py and must never be edited directly".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/nemoclaw-maintainer-day/SALVAGE-PR.md around lines 1 - 84,
The SALVAGE-PR.md content was edited manually but must be generated; revert
manual edits to the "Salvage PR Workflow" markdown and regenerate it from the
canonical source using the docs-to-skills generator (run the docs-to-skills.py
tool to rebuild the skill artifact), then commit the autogenerated output
instead of hand-editing; ensure the generator is run and its output replaces the
current SALVAGE-PR.md so the file matches the canonical source.

44 changes: 44 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/SECURITY-SWEEP.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
# Security Sweep Workflow

Review a security-sensitive item before it enters the normal fast path.

## Step 1: Identify the Security Item

The morning triage and `find-review-pr` already surface security-labeled PRs. Start from the item selected in the day loop's action step. If running standalone, check the triage queue for PRs touching risky areas (see [RISKY-AREAS.md](RISKY-AREAS.md)).

## Step 2: Gather Context

Read the PR or issue, all comments, linked items, changed files, diff, current checks, and recent relevant `main` commits.

## Step 3: Classify Risk

Which bucket applies?

- **escape or policy bypass**
- **credential or secret exposure**
- **installer or release integrity**
- **workflow or governance bypass**
- **input validation or SSRF weakness**
- **test gap in risky code**

If none apply, route back to normal action selection.

## Step 4: Deep Security Pass

Load `security-code-review` for the nine-category review whenever the item changes behavior in a security-sensitive area. Do not skip this step just because the diff is small.

## Step 5: Decide Action

### Salvage-now

All true: risk is understood, fix is small/local, required tests are clear, no unresolved design question. Follow [SALVAGE-PR.md](SALVAGE-PR.md) and [TEST-GAPS.md](TEST-GAPS.md).

### Blocked

Any true: fix changes core trust assumptions, review found real vulnerability needing redesign, PR adds risk without tests, reviewer disagreement. Summarize blocker clearly; do not approve.

## Notes

- Backlog reduction never outranks a credible security concern.
- No security-sensitive approvals without both deep review and tests.
- Use full GitHub links.
43 changes: 43 additions & 0 deletions .agents/skills/nemoclaw-maintainer-day/SEQUENCE-WORK.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
# Sequence Work Workflow

Turn a large problem into a short sequence of mergeable slices.

## Step 1: Read the Full Problem Surface

Read greedily: issue body and comments, linked issues, linked PRs with review comments and status, touched code/tests/docs, recent `main` changes if the area is active.

Do not sequence work from the title alone.

## Step 2: Identify What Is Moving

Inventory overlapping open PRs and recently merged changes. For each: blocker that must land first? Dependency to build on? Conflict to avoid? Noise?

## Step 3: Define Slices

Each slice should have: one core objective, short file list, explicit tests, merge dependency list, stop condition.

Prefer substrate-first sequencing:

1. Extract stable helper or type boundary
2. Add regression tests for current behavior
3. Land behavioral fix or refactor on top
4. Remove duplication afterward

## Step 4: Rank

Use repo priorities: (1) backlog reduction, (2) security, (3) test coverage, (4) hotspot cooling.

A slice that unblocks several PRs moves up. An elegant but non-urgent slice moves down.

## Step 5: Output

| Order | Slice | Why now | Depends on | Tests |
|-------|-------|---------|------------|-------|
| 1 | Extract timeout parsing from onboard | Enables safe tests, reduces conflicts | None | Unit tests for invalid env values |

Also include: outstanding blockers, which slices are safe for the maintainer loop, where human design decisions are needed.

## Notes

- Every slice maps to real files, tests, and merge behavior — not abstract architecture.
- Prefer small serially mergeable changes over one ambitious cleanup branch.
Loading
Loading