
[Feature]: Add manifest input for paperbanana sweep to replace long CLI flag chains #182

@eureka0928


Problem or motivation

paperbanana sweep today takes eight comma-separated axis flags (--vlm-providers, --vlm-models, --image-providers, --image-models, --iterations, --optimize-modes, --auto-modes, --max-variants) plus --input and --caption. A realistic sweep command is a 200+ character line of bash that's miserable to retype, miserable to diff, and miserable to commit to a paper repo alongside the methodology source.

paperbanana batch already solved this with YAML/JSON manifests. Sweep is the obvious symmetric gap: the build_sweep_variants() helper already accepts lists, so the machinery is in place.

Proposed solution

Add a --manifest / -m option to the sweep command that loads a YAML or JSON file:

# examples/sweep_manifest.yaml
input: examples/sample_inputs/transformer_method.txt
caption: "Overview of encoder-decoder architecture with sparse routing"
pdf_pages: "1-5"            # optional
max_variants: 20            # optional

axes:
  vlm_providers: [gemini, openai]
  vlm_models: [gemini-2.5-flash, gpt-4o]
  image_providers: [google_imagen, openai_imagen]
  image_models: []
  refinement_iterations: [2, 3]
  optimize_inputs: [false, true]
  auto_refine: [false]

Then:

paperbanana sweep --manifest examples/sweep_manifest.yaml

Semantics

  • --manifest is mutually exclusive with the eight axis flags (fail early if both are set, mirroring how batch-report errors on --batch-dir + --batch-id).
  • input and caption are required keys; all other keys default to empty/None just like the CLI flags.
  • Paths inside the manifest resolve relative to the manifest's parent directory (same convention as load_batch_manifest).
  • --output-dir, --format, --config, --dry-run, --auto-download-data, --verbose stay as CLI flags (they are invocation concerns, not sweep-plan concerns).
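A minimal sketch of what the loader could look like under the semantics above. The key names come from the example manifest. The returned dict shape, the error messages, and the choice to reject unknown axis names are assumptions, not existing paperbanana behavior. YAML parsing assumes PyYAML is installed.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union

# Axis keys as they appear in the example manifest (assumed to be final).
AXIS_KEYS = (
    "vlm_providers", "vlm_models", "image_providers", "image_models",
    "refinement_iterations", "optimize_inputs", "auto_refine",
)


def load_sweep_manifest(path: Union[str, Path]) -> Dict[str, Any]:
    """Load a YAML or JSON sweep manifest into a plain dict (sketch)."""
    manifest_path = Path(path).resolve()
    text = manifest_path.read_text(encoding="utf-8")
    if manifest_path.suffix.lower() == ".json":
        data = json.loads(text)
    else:
        import yaml  # PyYAML; only needed for YAML manifests
        data = yaml.safe_load(text)
    if not isinstance(data, dict):
        raise ValueError("sweep manifest must be a mapping at the top level")

    # input and caption are the only required keys.
    for key in ("input", "caption"):
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"sweep manifest needs a non-empty string for {key!r}")

    raw_axes = data.get("axes")
    if raw_axes is None:
        raw_axes = {}
    if not isinstance(raw_axes, dict):
        raise ValueError("'axes' must map axis names to lists")
    unknown = sorted(set(raw_axes) - set(AXIS_KEYS))
    if unknown:
        raise ValueError(f"unknown axes in sweep manifest: {unknown}")

    # Omitted axes default to an empty list, like unset CLI flags.
    axes = {}
    for key in AXIS_KEYS:
        value = raw_axes.get(key)
        if value is None:
            value = []
        if not isinstance(value, list):
            raise ValueError(f"axis {key!r} must be a list")
        axes[key] = value

    # Relative paths resolve against the manifest's parent directory.
    input_path = Path(data["input"])
    if not input_path.is_absolute():
        input_path = manifest_path.parent / input_path

    return {
        "input": str(input_path),
        "caption": data["caption"],
        "pdf_pages": data.get("pdf_pages"),
        "max_variants": data.get("max_variants"),
        "axes": axes,
    }
```

The lists under "axes" could then be handed to build_sweep_variants(); how they map onto its parameters depends on that helper's actual signature, which is not shown here.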

Scope

  • New load_sweep_manifest(path) helper in paperbanana/core/sweep.py: ~25 LOC.
  • Wire --manifest into the sweep CLI command: ~20 LOC.
  • Tests in tests/test_core/test_sweep.py (valid YAML + JSON, missing required keys, wrong types, mutual-exclusion): ~50 LOC.
  • examples/sweep_manifest.yaml and one README line.
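The mutual-exclusion check in the CLI wiring could be as small as the sketch below. It is written framework-agnostically, since the proposal does not say how the sweep command parses its options. The function name and the flag-to-value mapping are hypothetical. The proposal also does not say whether --input and --caption should conflict with --manifest, so this sketch only covers the eight axis flags.

```python
from typing import Any, Mapping, Optional

# The eight axis flags listed in the motivation section.
AXIS_FLAGS = (
    "--vlm-providers", "--vlm-models", "--image-providers", "--image-models",
    "--iterations", "--optimize-modes", "--auto-modes", "--max-variants",
)


def check_manifest_exclusive(
    manifest: Optional[str], flag_values: Mapping[str, Any]
) -> None:
    """Fail early when --manifest is combined with any axis flag (sketch)."""
    if manifest is None:
        return
    # A flag counts as "set" when it carries anything other than None or "".
    conflicting = [
        flag for flag in AXIS_FLAGS if flag_values.get(flag) not in (None, "")
    ]
    if conflicting:
        raise ValueError(
            "--manifest cannot be combined with: " + ", ".join(conflicting)
        )
```

In the real command the ValueError would presumably be turned into whatever usage error the CLI framework provides, and the mutual-exclusion test listed above can exercise this function directly without invoking the CLI.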

Estimated functional total (helper plus CLI wiring, excluding tests): ~50 LOC.

Alternatives considered

  • CLI config profiles (shelf of pre-set flag combinations). Rejected because it's a new concept; manifest mirrors existing batch UX.
  • Reuse batch manifest shape with a sweep: section. Rejected: batches run different figures, sweeps run variants of one figure. Overloading the shape would confuse both paths.

Area

CLI

Willingness to contribute

  • I'd be willing to submit a PR for this feature
