
Commit 62e111d

Merge branch 'main' into nmulepati/fix-590-remove-implicit-default-provider
2 parents d7ea07c + abe7667 commit 62e111d

313 files changed

Lines changed: 22652 additions & 8273 deletions


.agents/agents/docs-searcher.md

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ Brief summary of what was found and any recommendations for the user.
 - Only include results that are actually relevant to the search topic
 - If no relevant documentation is found, clearly state that
 - Keep excerpts concise but include enough context to be useful
-- Prioritize user guides and examples over API reference when both exist
+- Prioritize user guides, concepts, tutorials, and recipes according to the user's task
 - If the docs/ folder doesn't exist or is empty, report that clearly

 ## Search Strategy

.agents/recipes/_fix-policy.md

Lines changed: 38 additions & 19 deletions
@@ -25,7 +25,9 @@ A finding may be converted to a fix only if all hold:
 | `packages/data-designer-config` | `make test-config` |
 | `packages/data-designer-engine` | `make test-engine` |
 | `packages/data-designer` | `make test-interface` |
-- **Single concern**: one finding per PR.
+- **Single concern**: one finding per PR, except suite-declared batchable
+mechanical fixes. A batch must share one suite/category and satisfy the
+localized-fix bar as a single combined diff.
 - **Allowlisted paths**: matches the suite's path allowlist.

 If the top-ranked candidate fails the bar, try the next. If none of the top
@@ -79,6 +81,9 @@ Each daily recipe maintains two arrays in
 Also: `draft_until_proven` (boolean, per-suite, default `true` for
 code-quality and unset elsewhere) controls draft-PR mode.

+Batch PRs still record one `attempted_fixes` entry per finding. Multiple
+entries may point to the same `pr_number` and `branch`.
+
 ### `fix_backlog` rules (audit phase populates this)

 - Append every detected finding in an eligible category. If `id` is already
@@ -90,6 +95,9 @@ code-quality and unset elsewhere) controls draft-PR mode.
 - Cap at 200 entries (drop oldest by `first_seen`).
 - Populated **before** the `known_issues` filter so fixable findings persist
 even when their report row is suppressed for being unchanged.
+- Batchable categories must include enough information in `data` to group
+siblings safely. For package-scoped Python fixes, derive `test_target` from
+the package containing the source file.

 ### `attempted_fixes` rules

@@ -101,9 +109,9 @@ code-quality and unset elsewhere) controls draft-PR mode.
 `open` attempts that have a `pr_number`: query the PR and flip the
 attempt to `merged` or `closed` if it is no longer open. Then recover
 from crashes that left state un-updated: list open PRs (`gh pr list`)
-whose bodies contain the
-`<!-- agentic-ci finding=<id> suite=<suite> -->` marker, parse out
-each `<id>`, and back-fill any missing `attempted_fixes` entries with
+whose bodies contain one or more
+`<!-- agentic-ci finding=<id> suite=<suite> -->` markers, parse out
+every `<id>`, and back-fill any missing `attempted_fixes` entries with
 `outcome: "open"` and the parsed `pr_number` and `branch`.
 - Prune: drop `merged` entries older than 90 days. Do **not** prune
 `closed` or `abandoned` entries by age — pruning a single-strike entry
@@ -175,7 +183,7 @@ Earlier criteria override later ones:

 4. **Recency** — newer findings rank above long-standing ones.

-Record the chosen finding's id, scores, and rationale at the top of
+Record the chosen finding id(s), scores, and rationale at the top of
 `/tmp/audit-{{suite}}.md`.

 ## Standard fix procedure
@@ -191,29 +199,38 @@ declare only the parts that vary (eligible categories, branch type,
 `merged`; surface two-strike entries in the report's
 `Repeatedly-failed fix attempts` section and drop them from selection.
 3. Rank the remainder per the Ranking section.
-4. For each candidate, top 5 max:
-1. Re-verify the finding still applies (re-grep / re-read). If not,
-remove from `fix_backlog` and continue.
-2. Apply the fix. If the diff exceeds the localized-fix bar or touches
-a non-allowlisted path, abandon and continue.
-3. If the category sets `test_required: true`, run the per-package
+4. For each primary candidate, top 5 max:
+1. If the suite declares the category batchable, collect sibling
+`fix_backlog` entries for the same suite/category that share the same
+test target and branch type. Do not discover new findings; use only
+existing backlog entries. Batch at most 3 entries to stay within the
+localized-fix file cap.
+2. Re-verify every finding still applies (re-grep / re-read). If a
+sibling no longer applies, remove it from `fix_backlog`; if the
+primary no longer applies, remove it from `fix_backlog` and continue
+to the next primary candidate.
+3. Apply the fix or batch. If the combined diff exceeds the
+localized-fix bar or touches a non-allowlisted path, abandon and
+continue.
+4. If the category sets `test_required: true`, run the per-package
 test target (see the mapping table in "Localized fix bar" above)
-for the package containing the change. On failure: abandon and
+for the package containing the change(s). On failure: abandon and
 continue.
-4. Branch: `agentic-ci/<type>/<suite>-YYYYMMDD-<short-slug>`. Commit:
+5. Branch: `agentic-ci/<type>/<suite>-YYYYMMDD-<short-slug>`. Commit:
 `<type>(agentic-ci): <one-line>`. Push.
-5. Write the PR body to `/tmp/pr-body-{{suite}}.md`, including the
-hidden metadata block:
+6. Write the PR body to `/tmp/pr-body-{{suite}}.md`, including one
+hidden metadata block per fixed finding:
 `<!-- agentic-ci finding=<id> suite=<suite> -->`
-6. `gh pr create --body-file /tmp/pr-body-{{suite}}.md` with `--draft`
+7. `gh pr create --body-file /tmp/pr-body-{{suite}}.md` with `--draft`
 iff `draft_until_proven` is true for the suite.
-7. `gh pr edit <num> --add-label agentic-ci --add-label agentic-ci/<suite>`.
-8. Record `attempted_fixes` entry with `outcome: "open"` and exit.
+8. `gh pr edit <num> --add-label agentic-ci --add-label agentic-ci/<suite>`.
+9. Record one `attempted_fixes` entry per fixed finding with
+`outcome: "open"` and exit.
 5. If all 5 candidates were abandoned, append a one-line note to the
 report and exit cleanly. The state already reflects the abandonments.

 On any failure mid-flow: record `outcome: "abandoned"` for the chosen
-finding (with `pr_number: null`), leave any pushed branch in place
+finding(s) (with `pr_number: null`), leave any pushed branch in place
 (`pr-stale.yml` will reap it; branch deletion is forbidden), and continue
 to the next candidate.
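The branch and commit conventions in the procedure above (`agentic-ci/<type>/<suite>-YYYYMMDD-<short-slug>` and `<type>(agentic-ci): <one-line>`) could be produced by something like the sketch below. The helper names and the slug rules (lowercase, hyphen-separated, capped at 30 characters) are assumptions for illustration; the policy does not define how the slug is derived.

```python
import re
from datetime import date


def slugify(text: str, max_len: int = 30) -> str:
    """Hypothetical slug rule: lowercase, non-alphanumerics become hyphens, truncated."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def branch_name(fix_type: str, suite: str, summary: str, today: date) -> str:
    """Build agentic-ci/<type>/<suite>-YYYYMMDD-<short-slug>."""
    return f"agentic-ci/{fix_type}/{suite}-{today:%Y%m%d}-{slugify(summary)}"


def commit_title(fix_type: str, summary: str) -> str:
    """Build the conventional title <type>(agentic-ci): <one-line>."""
    return f"{fix_type}(agentic-ci): {summary}"


print(branch_name("chore", "structure", "Add future annotations import", date(2025, 1, 2)))
print(commit_title("chore", "add future annotations import"))
```

Passing the date in explicitly keeps the helper deterministic, which matches the policy's stated preference for determinism in CI.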

@@ -223,6 +240,8 @@ to the next candidate.
 interactive-only and shells the body inline; CI needs determinism.
 - **Title**: conventional, `<type>(agentic-ci): <one-line>`.
 - **Labels**: `agentic-ci`, `agentic-ci/<suite>`.
+- **Batch markers**: batch PRs include one hidden finding marker per fixed
+finding so crash recovery can reconstruct every `attempted_fixes` entry.
 - **Draft PRs**: `code-quality` opens draft until a maintainer flips
 `draft_until_proven` to `false` in runner-state, after at least two
 non-draft PRs from that suite have landed clean. This flip is
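Crash recovery in this policy depends on pulling every hidden finding marker out of a PR body, since a batch PR now carries one marker per fixed finding. A minimal sketch of that parsing follows. The helper name `parse_finding_markers` is hypothetical, and the regex assumes finding ids and suite names contain no whitespace, which the policy does not state.

```python
import re

# Matches the marker format quoted in the policy:
#   <!-- agentic-ci finding=<id> suite=<suite> -->
# Assumption: ids and suite names contain no whitespace.
MARKER_RE = re.compile(r"<!--\s*agentic-ci\s+finding=(\S+)\s+suite=(\S+)\s*-->")


def parse_finding_markers(pr_body: str) -> list[tuple[str, str]]:
    """Return every (finding_id, suite) pair found in a PR body, in order."""
    return MARKER_RE.findall(pr_body)


# Example body for a batch PR that fixed two findings (ids are made up).
body = """Adds the future import to two files.
<!-- agentic-ci finding=mf-001 suite=structure -->
<!-- agentic-ci finding=mf-002 suite=structure -->
"""
print(parse_finding_markers(body))
```

Each returned pair would map to one back-filled `attempted_fixes` entry sharing the same `pr_number` and `branch`.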

.agents/recipes/_phase-fix.md

Lines changed: 9 additions & 5 deletions
@@ -9,16 +9,20 @@ This invocation runs the **FIX** phase only.
 Do NOT redo audit work — that is, do NOT re-scan whole packages or
 rebuild `fix_backlog` from scratch. The "no re-scan" rule does NOT
 override the per-candidate re-verification step required by
-`_fix-policy.md` §"Standard fix procedure" step 4.1: when you pick a
-candidate, you MUST re-grep / re-read the specific file or symbol it
-points at to confirm the finding still applies before editing.
+`_fix-policy.md` §"Standard fix procedure": when you pick a candidate,
+you MUST re-grep / re-read the specific file or symbol it points at to
+confirm the finding still applies before editing.
 Re-verification of a single candidate is required; re-scanning the
 codebase to discover new findings is forbidden.
 - Pick the highest-ranked eligible candidate from `fix_backlog`, apply
 the fix, run the package's tests if applicable, commit, push, and open
-the PR using `gh pr create --body-file`.
+the PR using `gh pr create --body-file`. If the recipe and
+`_fix-policy.md` declare the category batchable, you may add sibling
+entries from the existing `fix_backlog` after re-verifying each one.
+Do not scan for findings that are not already in `fix_backlog`.
 - Record the attempt in `attempted_fixes` (whether successful, abandoned,
-or failed through the top-5 fallback) before exiting.
+or failed through the top-5 fallback) before exiting. Batch PRs record
+one attempt per fixed finding, all pointing to the same PR and branch.
 - If no candidate qualifies after trying up to 5 of them, exit cleanly,
 append a short note to `/tmp/audit-{{suite}}.md` describing what was
 tried, and update `attempted_fixes` accordingly. Do NOT open a PR.
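The sibling-batching rule described in this diff might be sketched as below. The entry shape is an assumption: the recipes only say that backlog entries carry an `id` and a `data` field with a `test_target`, so the `category` key and the function name `collect_batch` are hypothetical. The sketch also assumes the backlog belongs to a single suite and that branch type follows from the category, so matching category implies matching branch type.

```python
MAX_BATCH = 3  # the policy caps a batch at 3 entries


def collect_batch(primary: dict, backlog: list[dict], batchable: set[str]) -> list[dict]:
    """Return the primary finding plus eligible siblings from the existing backlog."""
    if primary["category"] not in batchable:
        return [primary]  # non-batchable categories stay one finding per PR
    batch = [primary]
    for entry in backlog:
        if len(batch) == MAX_BATCH:
            break
        if entry["id"] == primary["id"]:
            continue
        same_category = entry["category"] == primary["category"]
        same_target = entry["data"].get("test_target") == primary["data"].get("test_target")
        if same_category and same_target:
            batch.append(entry)
    return batch


# Made-up backlog entries for illustration only.
backlog = [
    {"id": "a", "category": "missing-future", "data": {"test_target": "make test-config"}},
    {"id": "b", "category": "missing-future", "data": {"test_target": "make test-config"}},
    {"id": "c", "category": "missing-future", "data": {"test_target": "make test-engine"}},
    {"id": "d", "category": "lazy-import", "data": {"test_target": "make test-config"}},
    {"id": "e", "category": "missing-future", "data": {"test_target": "make test-config"}},
    {"id": "f", "category": "missing-future", "data": {"test_target": "make test-config"}},
]
print([entry["id"] for entry in collect_batch(backlog[0], backlog, {"missing-future"})])
```

The sketch only selects candidates from entries already in the backlog; every selected entry would still need the per-finding re-verification the policy requires before any edit.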

.agents/recipes/_runner.md

Lines changed: 3 additions & 0 deletions
@@ -67,6 +67,9 @@ Rules:
 passwords) in your output, even if you encounter them in code.
 - **Stay in scope.** Only perform the task described in the recipe. Do not
 explore unrelated areas of the codebase.
+- **No subagents.** Do not use Task, Explore, or other delegated/local agents.
+The CI key may not have access to their default models; do the work in the
+main agent session.
 - **Cost awareness.** Minimize unnecessary file reads and tool calls. If you
 have the information you need, stop.

.agents/recipes/code-quality/recipe.md

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ description: Audit code quality gaps not covered by ruff - complexity trends, ex
 trigger: schedule
 tool: claude-code
 timeout_minutes: 20
-max_turns: 30
+max_turns: 50
 permissions:
 contents: write
 ---
@@ -152,7 +152,7 @@ Examples of things to test (pick 2-3 per run, and invent new ones):
 - Column names with special characters or very long strings
 - Recently changed validators (check `git log --oneline -10 -- packages/*/src/data_designer/config/`)

-**API reference:**
+**Useful imports:**

 ```python
 from data_designer.config.config_builder import DataDesignerConfigBuilder

.agents/recipes/docs-and-references/recipe.md

Lines changed: 29 additions & 9 deletions
@@ -33,11 +33,31 @@ even when their report row is suppressed for being unchanged.

 ## Instructions

+### Turn budget
+
+This suite must finish before the `max_turns` limit. Do not attempt a
+repo-wide audit in one run.
+
+1. Read runner memory.
+2. Write `/tmp/audit-{{suite}}.md` immediately with the required headings and
+empty tables. If the run is interrupted later, the workflow must still have
+a usable partial report.
+3. Use targeted searches to find candidates, then read only the files needed
+to verify a specific finding.
+4. Stop after either:
+- 20 tool calls
+- 2 new findings in a section
+- all sections have been sampled
+5. Finalize the report, update runner memory, and stop. If no new findings
+were verified, replace the report with `NO_FINDINGS`.
+
 ### 1. Docstring vs signature drift

 This repo uses Google-style docstrings (`Args:`, `Returns:`, `Raises:`).
-Scan public functions and methods in `packages/` for mismatches between the
-docstring and the actual function signature:
+Sample public functions and methods in `packages/` for mismatches between the
+docstring and the actual function signature. Do not scan every source file.
+Use `rg "Args:|Returns:|Raises:" packages/*/src/ --glob '*.py'` to find
+candidates, then inspect at most 5 high-value files:

 - Parameters in the `Args:` section that no longer exist in the signature
 - Parameters in the signature that are missing from `Args:`
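The two mismatch kinds listed above can be checked mechanically for a single function once it has been picked as a candidate. The following is a rough sketch using only the standard library. The helper names are hypothetical, and the parser assumes a conventional Google-style layout in which each `Args:` entry sits at one indentation level and the section ends at the next unindented line.

```python
import inspect
import re


def documented_args(doc: str) -> set[str]:
    """Collect parameter names listed under a Google-style `Args:` section."""
    names: set[str] = set()
    entry_indent = None
    in_args = False
    for line in inspect.cleandoc(doc).splitlines():
        if line.strip() == "Args:":
            in_args = True
            continue
        if not in_args or not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            break  # an unindented line such as "Returns:" ends the section
        if entry_indent is None:
            entry_indent = indent
        if indent == entry_indent:  # deeper lines are description continuations
            match = re.match(r"\*{0,2}(\w+)\s*(?:\([^)]*\))?:", line.strip())
            if match:
                names.add(match.group(1))
    return names


def docstring_drift(func) -> tuple[set[str], set[str]]:
    """Return (documented but not in signature, in signature but undocumented)."""
    params = {n for n in inspect.signature(func).parameters if n not in ("self", "cls")}
    documented = documented_args(func.__doc__ or "")
    return documented - params, params - documented


def scale(value, factor, offset=0):
    """Scale a value.

    Args:
        value: Input number.
        ratio: Old name of ``factor``.
    """
    return value * factor + offset


stale, missing = docstring_drift(scale)
print(sorted(stale), sorted(missing))
```

This only covers the `Args:` section; `Returns:` and `Raises:` drift would need separate handling.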
@@ -60,14 +80,17 @@ Check links in these locations:
 - `docs/` - MkDocs content links, code references, cross-page links
 - `CONTRIBUTING.md`, `DEVELOPMENT.md`, `STYLEGUIDE.md` - relative links

-For each link, verify the target file or anchor exists. Report broken links
-with the source file, line number, and broken target.
+Use targeted link extraction and inspect at most 10 candidate links. Prefer
+high-value docs and links changed recently. For each sampled link, verify the
+target file or anchor exists. Report broken links with the source file, line
+number, and broken target.

 ### 3. Architecture doc references

 The 10 files in `architecture/` reference specific classes, functions, files,
 and registries by name. These are high-value docs that agents and developers
-rely on for orientation. For each code reference:
+rely on for orientation. Sample at most 3 architecture files per run,
+prioritizing files changed recently. For each code reference:
 - Verify the referenced class, function, or module still exists at the stated
 location
 - If renamed or moved, flag with the old and new location
@@ -101,11 +124,8 @@ Review for accuracy against the current code:
 the most recent 3-5 posts for references to functions, classes, or
 architecture that have since been modified.

-**Code reference** (`docs/code_reference/`):
-- Check that autodoc module paths point to modules that still exist.
-
 **Prioritize by risk of drift**: pages with the most code symbols referenced
-are most likely to be stale. Don't read every page - sample 5-10 high-value
+are most likely to be stale. Don't read every page - sample 3-5 high-value
 pages and flag patterns.

 ## Output format

.agents/recipes/structure/recipe.md

Lines changed: 8 additions & 1 deletion
@@ -4,7 +4,7 @@ description: Audit structural integrity - import boundaries, lazy import complia
 trigger: schedule
 tool: claude-code
 timeout_minutes: 20
-max_turns: 30
+max_turns: 50
 permissions:
 contents: write
 ---
@@ -223,6 +223,13 @@ Follow the standard fix procedure in `_fix-policy.md`. Suite-specific bits:
 | missing-future | `chore` | yes | Insert `from __future__ import annotations` after the SPDX header block, before other imports. Fully deterministic. Tests required because `__future__` annotations can affect introspection-heavy code paths. |
 | lazy-import | `refactor` | yes | Move a top-level heavy import (pandas/numpy/polars/torch/duckdb/sqlfluff/faker) to the `data_designer.lazy_heavy_imports` accessor pattern. Eligible only when (a) file is under `packages/*/src/`, (b) the module is already wired in the lazy system, (c) the heavy module is used only inside function bodies. |

+`missing-future` is batchable: when the primary candidate is
+`missing-future`, include other `missing-future` backlog entries with the
+same `test_target` if each file still lacks the import and the combined
+diff remains within the localized-fix bar. Batch at most 3 files. Run the
+shared test target once. Use one hidden finding marker and one
+`attempted_fixes` entry per file.
+
 **Not eligible** — stays report-only:

 - Import boundary violations (architectural judgement).
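For reference, the `missing-future` fix from the table in this hunk (insert `from __future__ import annotations` after the SPDX header block, before other imports) might look roughly like the sketch below. It assumes the SPDX header is the run of `#` comment lines at the very top of the file, and it does not handle a module docstring, which would need to stay ahead of the future import to remain the module's docstring. The function name is hypothetical.

```python
FUTURE_IMPORT = "from __future__ import annotations"


def add_future_import(source: str) -> str:
    """Insert the future import after the leading comment header, if missing."""
    lines = source.splitlines()
    if FUTURE_IMPORT in (line.strip() for line in lines):
        return source  # already present, so the finding no longer applies
    # Assumption: the SPDX header is the run of leading `#` comment lines.
    index = 0
    while index < len(lines) and lines[index].startswith("#"):
        index += 1
    header, rest = lines[:index], lines[index:]
    # Drop blank lines directly after the header so spacing stays uniform.
    while rest and not rest[0].strip():
        rest.pop(0)
    new_lines = header + ([""] if header else []) + [FUTURE_IMPORT, ""] + rest
    return "\n".join(new_lines) + "\n"


example = "# SPDX-License-Identifier: Apache-2.0\n\nimport os\n"
print(add_future_import(example))
```

The early return doubles as the re-verification step for a batch: a file that already has the import is left untouched.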

.agents/recipes/test-health/recipe.md

Lines changed: 19 additions & 1 deletion
@@ -32,6 +32,24 @@ update `baselines` with current values and `known_issues` with new findings.

 ## Instructions

+### Turn budget
+
+This suite must finish before the `max_turns` limit. Do not attempt a
+repo-wide test audit in one run.
+
+1. Read runner memory.
+2. Write `/tmp/audit-{{suite}}.md` immediately with the required headings and
+empty tables. If the run is interrupted later, the workflow must still have
+a usable partial report.
+3. Use targeted searches to find candidates, then read only the files needed
+to verify a specific finding.
+4. Stop after either:
+- 20 tool calls
+- 2 new findings in a section
+- all sections have been sampled
+5. Finalize the report, update runner memory, and stop. If no new findings
+were verified, replace the report with `NO_FINDINGS`.
+
 ### 1. Test-to-source coverage mapping

 Map source files to their corresponding test files:
@@ -208,7 +226,7 @@ without at least one provider configured. Stick to config-layer checks
 (`DataDesignerConfigBuilder.build()`, column type resolution) which do
 not require providers.

-**API reference** for writing checks:
+**Useful imports** for writing checks:

 ```python
 from data_designer.config.config_builder import DataDesignerConfigBuilder
