
Python API

Dipkumar Patel edited this page Feb 4, 2026 · 1 revision

For programmatic usage, PaperBanana exposes a Python API alongside the CLI.

Basic Usage

import asyncio
from paperbanana import PaperBananaPipeline, GenerationInput, DiagramType
from paperbanana.core.config import Settings

settings = Settings(
    vlm_provider="gemini",
    image_provider="google_imagen",
    refinement_iterations=3,
)

pipeline = PaperBananaPipeline(settings=settings)

result = asyncio.run(pipeline.generate(
    GenerationInput(
        source_context="Our framework consists of...",
        communicative_intent="Overview of the proposed method.",
        diagram_type=DiagramType.METHODOLOGY,
    )
))

print(f"Output: {result.image_path}")

GenerationInput

| Field | Type | Description |
|---|---|---|
| source_context | str | The methodology section text |
| communicative_intent | str | Figure caption, or what the diagram should communicate |
| diagram_type | DiagramType | DiagramType.METHODOLOGY or DiagramType.PLOT |

GenerationResult

The generate() method returns a GenerationResult:

| Field | Type | Description |
|---|---|---|
| image_path | Path | Path to the final generated image |
| iterations | list[IterationResult] | Results from each refinement round |
| planning | PlanningResult | Retrieved examples and generated descriptions |
| metadata | dict | Run timing, config, provider details |
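As a minimal sketch of reading these fields after a run, the snippet below relies only on the field names in the table. `summarize_result` is a hypothetical helper, not part of PaperBanana, and the `SimpleNamespace` object (including its metadata keys) is a stand-in so the example runs without API keys.

```python
from pathlib import Path
from types import SimpleNamespace


def summarize_result(result) -> str:
    """Build a one-line summary from the GenerationResult fields listed above.

    Hypothetical helper: it only reads image_path, iterations and metadata.
    """
    image_path = Path(result.image_path)
    rounds = len(result.iterations)
    keys = ", ".join(sorted(result.metadata)) or "none"
    return f"{image_path.name}: {rounds} refinement round(s), metadata keys: {keys}"


# Stand-in object for illustration only; a real run would pass the value
# returned by pipeline.generate(). The metadata keys here are assumed.
fake_result = SimpleNamespace(
    image_path=Path("outputs/diagram.png"),
    iterations=[object(), object(), object()],
    planning=None,
    metadata={"timing": {}, "config": {}},
)
print(summarize_result(fake_result))
```

In real use, pass the object returned by `pipeline.generate()` in place of `fake_result`.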

Settings

Load settings from a YAML config file or construct them directly:

from paperbanana.core.config import Settings

# From YAML file
settings = Settings.from_yaml("configs/config.yaml")

# Direct construction
settings = Settings(
    vlm_provider="gemini",
    vlm_model="gemini-2.0-flash",
    image_provider="google_imagen",
    image_model="gemini-3-pro-image-preview",
    refinement_iterations=3,
    num_retrieval_examples=10,
    output_dir="outputs",
    save_iterations=True,
    save_metadata=True,
)

Plot Generation

import asyncio
from paperbanana import PlotInput

# `pipeline` is the PaperBananaPipeline instance created in Basic Usage
result = asyncio.run(pipeline.generate(
    PlotInput(
        data_path="results.csv",
        communicative_intent="Bar chart comparing accuracy across benchmarks",
    )
))

Evaluation

import asyncio
from paperbanana.evaluation import evaluate_diagram

# `settings` is a Settings instance, e.g. the one built in the Settings section
scores = asyncio.run(evaluate_diagram(
    generated_path="output.png",
    reference_path="human_reference.png",
    source_context="Our framework consists of...",
    caption="Overview of our method",
    settings=settings,
))

print(scores)
# Example output:
# {
#   "faithfulness": 0.82,
#   "readability": 0.75,
#   "conciseness": 0.88,
#   "aesthetics": 0.71,
#   "overall": 0.79
# }
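When several diagrams are evaluated, it can help to average each metric across runs. The sketch below assumes the scores behave like plain dictionaries with the keys shown above; `average_scores` is a hypothetical helper, and the numbers are illustrative placeholders, not real evaluation results.

```python
from statistics import mean


def average_scores(score_dicts):
    """Average each metric across several score dictionaries.

    Assumes every dictionary has the same keys as the first one.
    """
    if not score_dicts:
        return {}
    metrics = list(score_dicts[0].keys())
    return {m: round(mean(d[m] for d in score_dicts), 3) for m in metrics}


# Illustrative values only, not real evaluation results.
runs = [
    {"faithfulness": 0.82, "readability": 0.75},
    {"faithfulness": 0.70, "readability": 0.85},
]
averaged = average_scores(runs)
print(averaged)
```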

Batch Generation

To generate multiple diagrams in one run, reuse a single pipeline and process the inputs sequentially:

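import asyncio
from pathlib import Path

from paperbanana import PaperBananaPipeline, GenerationInput, DiagramType

# `settings` is a Settings instance, e.g. the one built in the Settings section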

inputs = [
    ("method_a.txt", "Overview of method A"),
    ("method_b.txt", "Architecture of method B"),
    ("method_c.txt", "Pipeline for method C"),
]

async def batch_generate():
    # Reuse one pipeline for every input
    pipeline = PaperBananaPipeline(settings=settings)
    results = []
    for index, (text_path, caption) in enumerate(inputs):
        if index > 0:
            # Respect rate limits between calls; no wait is needed before the first
            await asyncio.sleep(5)
        text = Path(text_path).read_text(encoding="utf-8")
        result = await pipeline.generate(
            GenerationInput(
                source_context=text,
                communicative_intent=caption,
                diagram_type=DiagramType.METHODOLOGY,
            )
        )
        results.append(result)
    return results

results = asyncio.run(batch_generate())

Note the asyncio.sleep(5) between calls: Gemini's free tier enforces per-minute rate limits, and each generation makes multiple API calls internally, so back-to-back runs can exhaust the quota quickly.
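A fixed pause may still not be enough when a quota is hit mid-batch. One common complement is to retry a failed call with exponentially growing pauses. The sketch below is generic and not part of PaperBanana: `retry_with_backoff` and `flaky_generate` are hypothetical names, and the exception type raised on a rate-limit error is an assumption, so `retry_on` defaults to any `Exception`.

```python
import asyncio


async def retry_with_backoff(make_call, attempts=3, base_delay=5.0, retry_on=(Exception,)):
    """Await make_call(), retrying with exponentially growing pauses on failure."""
    for attempt in range(attempts):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            # Pause base_delay, then 2x, 4x, ... before the next attempt
            await asyncio.sleep(base_delay * (2 ** attempt))


# Demo with a stand-in coroutine that fails twice, then succeeds.
calls = {"count": 0}


async def flaky_generate():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated rate-limit error")
    return "ok"


outcome = asyncio.run(retry_with_backoff(flaky_generate, base_delay=0))
print(outcome, calls["count"])
```

In a batch loop, the stand-in would be replaced by a zero-argument function wrapping the real call, for example `lambda: pipeline.generate(generation_input)`.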

Working Examples

See examples/ in the repo:

  • examples/generate_diagram.py - basic methodology diagram generation
  • examples/generate_plot.py - statistical plot generation
