Python API

For programmatic usage, PaperBanana exposes a Python API alongside the CLI.

Basic Usage

import asyncio
from paperbanana import PaperBananaPipeline, GenerationInput, DiagramType
from paperbanana.core.config import Settings

settings = Settings(
    vlm_provider="gemini",
    image_provider="google_imagen",
    refinement_iterations=3,
)

pipeline = PaperBananaPipeline(settings=settings)

result = asyncio.run(pipeline.generate(
    GenerationInput(
        source_context="Our framework consists of...",
        communicative_intent="Overview of the proposed method.",
        diagram_type=DiagramType.METHODOLOGY,
    )
))

print(f"Output: {result.image_path}")
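
The example above drives the coroutine with asyncio.run(), which raises RuntimeError when an event loop is already running, as is typically the case in a Jupyter notebook. The following is a minimal sketch of a check for that situation. `in_running_loop` and `probe` are hypothetical helper names, not part of PaperBanana.

```python
import asyncio


def in_running_loop() -> bool:
    """Return True if called while an asyncio event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop() raises RuntimeError when no loop is running.
        return False
    return True


async def probe() -> bool:
    # Called from inside a coroutine, so a loop is running here.
    return in_running_loop()


outside = in_running_loop()      # called from plain synchronous code
inside = asyncio.run(probe())    # called from within a running loop
print(outside, inside)
```

When the check returns True, awaiting the call directly (`result = await pipeline.generate(...)`) is the usual alternative to wrapping it in asyncio.run().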

GenerationInput

| Field | Type | Description |
| --- | --- | --- |
| `source_context` | `str` | The methodology section text |
| `communicative_intent` | `str` | Figure caption or what the diagram should communicate |
| `diagram_type` | `DiagramType` | `DiagramType.METHODOLOGY` or `DiagramType.PLOT` |

GenerationResult

The generate() method returns a GenerationResult:

| Field | Type | Description |
| --- | --- | --- |
| `image_path` | `Path` | Path to the final generated image |
| `iterations` | `list[IterationResult]` | Results from each refinement round |
| `planning` | `PlanningResult` | Retrieved examples and generated descriptions |
| `metadata` | `dict` | Run timing, config, provider details |
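
To make the fields above concrete, here is a minimal sketch of a helper that summarizes a result. `summarize_result` is a hypothetical name, not part of PaperBanana. The stand-in object built with `SimpleNamespace` only mimics the four documented fields. The `"total_seconds"` metadata key is an assumption for illustration, since the actual keys of `metadata` are not listed here.

```python
from pathlib import Path
from types import SimpleNamespace


def summarize_result(result) -> str:
    """Return a one-line summary using only the documented fields."""
    n_rounds = len(result.iterations)
    # metadata is a plain dict; use .get() because its exact keys are
    # not documented here ("total_seconds" is an assumed key).
    seconds = result.metadata.get("total_seconds")
    timing = f"{seconds:.1f}s" if seconds is not None else "timing unavailable"
    return f"{result.image_path.name}: {n_rounds} refinement round(s), {timing}"


# Stand-in object for illustration only; a real run would pass the
# GenerationResult returned by pipeline.generate().
fake_result = SimpleNamespace(
    image_path=Path("outputs/run_001/final.png"),
    iterations=[object(), object(), object()],
    planning=None,
    metadata={"total_seconds": 42.5},
)

summary = summarize_result(fake_result)
print(summary)
```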

Settings

Load settings from a YAML config file or construct them directly:

from paperbanana.core.config import Settings

# From YAML file
settings = Settings.from_yaml("configs/config.yaml")

# Direct construction
settings = Settings(
    vlm_provider="gemini",
    vlm_model="gemini-2.0-flash",
    image_provider="google_imagen",
    image_model="gemini-3-pro-image-preview",
    refinement_iterations=3,
    num_retrieval_examples=10,
    output_dir="outputs",
    save_iterations=True,
    save_metadata=True,
)

Plot Generation

import asyncio
from paperbanana import PlotInput

# Reuses the `pipeline` object created in Basic Usage.

result = asyncio.run(pipeline.generate(
    PlotInput(
        data_path="results.csv",
        communicative_intent="Bar chart comparing accuracy across benchmarks",
    )
))

Evaluation

import asyncio
from paperbanana.evaluation import evaluate_diagram

# Reuses the `settings` object created in the Settings section.

scores = asyncio.run(evaluate_diagram(
    generated_path="output.png",
    reference_path="human_reference.png",
    source_context="Our framework consists of...",
    caption="Overview of our method",
    settings=settings,
))

print(scores)
# {
#   "faithfulness": 0.82,
#   "readability": 0.75,
#   "conciseness": 0.88,
#   "aesthetics": 0.71,
#   "overall": 0.79
# }
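
One way to act on these scores is to flag the dimensions that fall below a chosen threshold. The sketch below assumes `scores` behaves like a plain dict with the keys printed above. `weakest_dimensions` and the thresholds are hypothetical choices for illustration, not part of PaperBanana.

```python
def weakest_dimensions(scores: dict, threshold: float = 0.75) -> list:
    """Return the names of dimensions scoring below threshold, lowest first."""
    # Exclude the aggregate so only per-dimension scores are compared.
    dims = {name: value for name, value in scores.items() if name != "overall"}
    below = [name for name, value in dims.items() if value < threshold]
    return sorted(below, key=lambda name: dims[name])


# The example scores shown above, used as sample input.
example_scores = {
    "faithfulness": 0.82,
    "readability": 0.75,
    "conciseness": 0.88,
    "aesthetics": 0.71,
    "overall": 0.79,
}

flagged = weakest_dimensions(example_scores)
flagged_strict = weakest_dimensions(example_scores, threshold=0.8)
print(flagged, flagged_strict)
```

The comparison is strict, so a score exactly equal to the threshold is not flagged.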

Batch Generation

To generate several diagrams in one run, reuse a single pipeline and process the inputs sequentially:

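import asyncio
from pathlib import Path

from paperbanana import PaperBananaPipeline, GenerationInput, DiagramType

# Reuses the `settings` object created in the Settings section.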

inputs = [
    ("method_a.txt", "Overview of method A"),
    ("method_b.txt", "Architecture of method B"),
    ("method_c.txt", "Pipeline for method C"),
]

async def batch_generate():
    pipeline = PaperBananaPipeline(settings=settings)
    results = []
    for text_path, caption in inputs:
        text = Path(text_path).read_text()
        result = await pipeline.generate(
            GenerationInput(
                source_context=text,
                communicative_intent=caption,
                diagram_type=DiagramType.METHODOLOGY,
            )
        )
        results.append(result)
        # Respect rate limits between calls
        await asyncio.sleep(5)
    return results

results = asyncio.run(batch_generate())

The `asyncio.sleep(5)` between calls keeps the batch under provider rate limits: Gemini's free tier enforces per-minute limits, and each generation makes multiple API calls internally.
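
A fixed pause does not help when a call still fails, for example because a rate limit was hit anyway. The following is a minimal sketch of retrying with exponential backoff. `call_with_backoff` is a hypothetical helper, not part of PaperBanana, and `flaky_generate` is a stand-in that simulates two failures. The sketch catches the broad `Exception` only because the exception types raised by the pipeline are not documented here; in real use, a narrower type would be preferable.

```python
import asyncio


async def call_with_backoff(make_call, max_attempts=4, base_delay=1.0):
    """Await make_call(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except Exception:
            # Assumption: any exception is treated as retryable.
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** attempt)


# Stand-in for a generation call: fails twice, then succeeds.
attempts = {"count": 0}


async def flaky_generate():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate-limit error")
    return "ok"


outcome = asyncio.run(call_with_backoff(flaky_generate, base_delay=0.01))
print(outcome, attempts["count"])
```

Inside the batch loop, this could wrap each call as `await call_with_backoff(lambda: pipeline.generate(generation_input))`, where `generation_input` is the `GenerationInput` for that item. A fresh coroutine is created on every attempt because a coroutine object can only be awaited once.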

Working Examples

See examples/ in the repo:

examples/generate_diagram.py - basic methodology diagram generation
examples/generate_plot.py - statistical plot generation
