
feat: add manifest input for paperbanana sweep #184

Open
eureka0928 wants to merge 1 commit into llmsresearch:main from eureka0928:feat/sweep-manifest

@eureka0928
Contributor

Closes #182

Summary

paperbanana sweep currently takes eight comma-separated axis flags plus --input and --caption, which adds up to a 200+ character shell command that is hard to diff, share, or commit alongside the paper. paperbanana batch already solves this with YAML/JSON manifests; sweep has no equivalent. This PR adds one:

paperbanana sweep --manifest examples/sweep_manifest.yaml

Example manifest:

input: sample_inputs/transformer_method.txt
caption: "Overview of our encoder-decoder architecture with sparse routing"
max_variants: 20

axes:
  vlm_providers: [gemini, openai]
  image_providers: [google_imagen, openai_imagen]
  refinement_iterations: [2, 3]
  optimize_inputs: [false, true]
  auto_refine: [false]

Implementation

  • load_sweep_manifest() in core/sweep.py (~50 lines): parses YAML/JSON, validates required keys (input, caption), enforces types on optional keys (pdf_pages: str, max_variants: int >= 1), rejects unknown axis keys, resolves input paths relative to the manifest's parent directory (same convention as load_batch_manifest).
  • --manifest / -m flag on sweep: mutually exclusive with the eight axis flags. When set, --input and --caption become optional and default from the manifest.
  • Invocation-level flags stay CLI-only (--output-dir, --config, --format, --dry-run, --verbose, --auto-download-data) because they describe how a single run is invoked, not what the sweep plan contains.
  • examples/sweep_manifest.yaml with inline commentary on what belongs in the manifest vs. on the CLI.
  • 15 new tests covering YAML + JSON, relative-path resolution, each validation branch, and unknown-axis rejection.
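
For reviewers who want the validation flow at a glance, here is a minimal sketch of what a loader like this might look like. It is not the code in this PR: the function name `load_sweep_manifest_sketch`, the error messages, and the returned dict shape are assumptions, and `KNOWN_AXES` lists only the axis keys that appear in the example manifest above. YAML parsing assumes PyYAML is installed; JSON manifests need only the standard library.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

# Illustrative subset: only the axis keys shown in the example manifest.
KNOWN_AXES = {
    "vlm_providers",
    "image_providers",
    "refinement_iterations",
    "optimize_inputs",
    "auto_refine",
}


def load_sweep_manifest_sketch(path: str | Path) -> dict[str, Any]:
    """Parse and validate a sweep manifest (hypothetical stand-in, not the PR code)."""
    manifest_path = Path(path)
    text = manifest_path.read_text(encoding="utf-8")
    if manifest_path.suffix.lower() == ".json":
        data = json.loads(text)
    else:
        import yaml  # PyYAML; imported lazily so JSON manifests need only the stdlib

        data = yaml.safe_load(text)
    if not isinstance(data, dict):
        raise ValueError("manifest must be a mapping at the top level")

    # Required keys must be non-empty strings.
    for key in ("input", "caption"):
        if not isinstance(data.get(key), str) or not data[key].strip():
            raise ValueError(f"manifest requires a non-empty string for {key!r}")

    # Optional keys are type-checked only when present.
    pdf_pages = data.get("pdf_pages")
    if pdf_pages is not None and not isinstance(pdf_pages, str):
        raise ValueError("pdf_pages must be a string")
    max_variants = data.get("max_variants")
    if max_variants is not None:
        # bool is a subclass of int, so rule it out explicitly.
        if (
            isinstance(max_variants, bool)
            or not isinstance(max_variants, int)
            or max_variants < 1
        ):
            raise ValueError("max_variants must be an integer >= 1")

    axes = data.get("axes") or {}
    if not isinstance(axes, dict):
        raise ValueError("axes must be a mapping of axis name to list")
    unknown = sorted(str(k) for k in axes if k not in KNOWN_AXES)
    if unknown:
        raise ValueError(f"unknown axis keys: {unknown}")
    for name, values in axes.items():
        if not isinstance(values, list):
            raise ValueError(f"axis {name!r} must be a list")

    # Relative input paths resolve against the manifest's parent directory.
    input_path = Path(data["input"])
    if not input_path.is_absolute():
        input_path = manifest_path.parent / input_path

    return {
        "input": input_path,
        "caption": data["caption"],
        "pdf_pages": pdf_pages,
        "max_variants": max_variants,
        "axes": axes,
    }
```

Resolving against the manifest's parent directory means a manifest committed next to its inputs keeps working regardless of the directory the command is run from.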

Design notes

  • Axis keys in the manifest use plural names (vlm_providers) to match CLI flags (--vlm-providers), not the singular internal axis names in build_sweep_variants (vlm_provider).
  • Unknown axis keys are rejected up front, so a misspelled key (for example typo_axis) raises an error instead of being silently ignored.
  • The manifest's pdf_pages overrides the CLI value only when it is set; if the manifest omits it, the CLI --pdf-pages value applies.
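
The precedence rules in these notes can be sketched as a small merge helper. This is an illustration under stated assumptions, not the PR's CLI code: `resolve_plan` and `PLURAL_TO_INTERNAL` are hypothetical names, only the `vlm_providers` to `vlm_provider` mapping is named in this PR (the `image_providers` entry is assumed by analogy), and the sketch assumes an explicit `--input` or `--caption` takes precedence over the manifest value.

```python
from __future__ import annotations

from typing import Any

# Plural manifest/CLI key -> singular internal axis name.
# Only vlm_providers -> vlm_provider is named in the PR; the second entry is assumed.
PLURAL_TO_INTERNAL = {
    "vlm_providers": "vlm_provider",
    "image_providers": "image_provider",
}


def resolve_plan(cli: dict[str, Any], manifest: dict[str, Any] | None) -> dict[str, Any]:
    """Merge parsed CLI values with an optional, already validated manifest."""
    if manifest is None:
        # No manifest: axes come straight from whichever axis flags were passed.
        axes = {k: cli[k] for k in PLURAL_TO_INTERNAL if cli.get(k) is not None}
        source: dict[str, Any] = {}
    else:
        # Mutual exclusion: any axis flag alongside --manifest is an error.
        clashing = [k for k in PLURAL_TO_INTERNAL if cli.get(k) is not None]
        if clashing:
            flags = ", ".join("--" + k.replace("_", "-") for k in clashing)
            raise ValueError(f"--manifest cannot be combined with axis flags: {flags}")
        axes = manifest.get("axes", {})
        source = manifest
    return {
        # Assumed: an explicit CLI value wins, the manifest supplies the default.
        "input": cli.get("input") or source.get("input"),
        "caption": cli.get("caption") or source.get("caption"),
        # Manifest pdf_pages wins only when set; otherwise the CLI value applies.
        "pdf_pages": source.get("pdf_pages") or cli.get("pdf_pages"),
        # Translate plural keys to the singular names the variant builder expects.
        "axes": {PLURAL_TO_INTERNAL[k]: v for k, v in axes.items()},
    }
```

Keeping the plural-to-singular translation in one table means the manifest schema and the CLI flags stay aligned, while the internal axis names remain free to differ.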

Test plan

  • pytest tests/test_core/test_sweep.py — 48 tests pass (15 new)
  • Full suite: 544 pass, 1 pre-existing unrelated failure deselected, 2 skipped (gradio/integration)
  • ruff check + ruff format --check — clean
  • paperbanana sweep --manifest examples/sweep_manifest.yaml --dry-run — plans 16 variants correctly, writes sweep_report.json
  • paperbanana sweep --manifest ... --vlm-providers gemini — correctly errors with the mutual-exclusion message
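
The 16-variant figure from the dry run follows from the example manifest: four axes with two values each and one axis with a single value give 2 × 2 × 2 × 2 × 1 = 16 combinations, which is under the max_variants cap of 20. A quick way to check the arithmetic, assuming variants are the Cartesian product of the axes and that max_variants simply truncates the list (how the real runner applies the cap is not shown in this PR):

```python
from itertools import product

# Axes copied from the example manifest in this PR.
axes = {
    "vlm_providers": ["gemini", "openai"],
    "image_providers": ["google_imagen", "openai_imagen"],
    "refinement_iterations": [2, 3],
    "optimize_inputs": [False, True],
    "auto_refine": [False],
}
max_variants = 20

# One variant per combination of axis values.
variants = [dict(zip(axes, combo)) for combo in product(*axes.values())]

# Assumed cap behavior: keep at most max_variants entries.
variants = variants[:max_variants]

print(len(variants))  # 16
```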

cc @dippatel1994 for review.

Closes llmsresearch#182.

Sweep today takes eight comma-separated axis flags plus --input and
--caption — a 200+ char line of bash that's miserable to diff, share,
or commit alongside the paper. Batch already solved this with YAML/JSON
manifests; sweep is the symmetric gap.

- New `load_sweep_manifest(path)` in `core/sweep.py` parses YAML/JSON
  manifests: required `input` + `caption`, optional `pdf_pages` and
  `max_variants`, optional `axes` object with the seven axis lists.
  Rejects unknown axis keys, wrong types, and missing requireds. Input
  paths resolve relative to the manifest's parent (mirrors
  `load_batch_manifest`).
- New `--manifest` / `-m` flag on `sweep`, mutually exclusive with the
  axis flags. When set, `--input` and `--caption` become optional and
  default from the manifest. Invocation-level flags (--output-dir,
  --config, --format, --dry-run, --verbose, --auto-download-data) stay
  as CLI flags because they're invocation concerns, not plan concerns.
- New `examples/sweep_manifest.yaml` and README callout.
- 15 new tests covering YAML + JSON, relative-path resolution, each
  validation branch, and unknown-axis rejection.
@eureka0928
Contributor Author

Hi @dippatel1994 — tagging you for review when you get a chance.

Quick summary:

  • Scope: Add --manifest / -m to paperbanana sweep so users can store the full sweep plan (input, caption, axes, max_variants) in YAML/JSON instead of eight comma-separated CLI flags. Symmetric with how batch already works.
  • Size: ~50 lines of functional code + ~160 lines of tests. One new core/sweep.py helper (load_sweep_manifest) and a small CLI wire-in. No changes to the sweep runner itself.
  • Mutual exclusion: --manifest cannot be combined with the axis flags — errors fast with a clear message.
  • CI: All 11 checks pass (lint, build, tests across Linux/macOS/Windows × Python 3.10/3.11/3.12).
  • Smoke tested: paperbanana sweep --manifest examples/sweep_manifest.yaml --dry-run plans 16 variants correctly.

Happy to squash, rebase, or adjust based on your feedback.

Development

Successfully merging this pull request may close these issues.

[Feature]: Add manifest input for paperbanana sweep to replace long CLI flag chains