Python API
Dipkumar Patel edited this page Feb 4, 2026 · 1 revision
For programmatic usage, PaperBanana exposes a Python API alongside the CLI.
```python
import asyncio

from paperbanana import PaperBananaPipeline, GenerationInput, DiagramType
from paperbanana.core.config import Settings

# Choose the providers and the number of refinement rounds
settings = Settings(
    vlm_provider="gemini",
    image_provider="google_imagen",
    refinement_iterations=3,
)

pipeline = PaperBananaPipeline(settings=settings)

# generate() is a coroutine, so run it with asyncio.run() from synchronous code
result = asyncio.run(pipeline.generate(
    GenerationInput(
        source_context="Our framework consists of...",
        communicative_intent="Overview of the proposed method.",
        diagram_type=DiagramType.METHODOLOGY,
    )
))

print(f"Output: {result.image_path}")
```

`GenerationInput` takes the following fields:

| Field | Type | Description |
|---|---|---|
| `source_context` | `str` | The methodology section text |
| `communicative_intent` | `str` | Figure caption or what the diagram should communicate |
| `diagram_type` | `DiagramType` | `DiagramType.METHODOLOGY` or `DiagramType.PLOT` |
The generate() method returns a GenerationResult:
| Field | Type | Description |
|---|---|---|
| `image_path` | `Path` | Path to the final generated image |
| `iterations` | `list[IterationResult]` | Results from each refinement round |
| `planning` | `PlanningResult` | Retrieved examples and generated descriptions |
| `metadata` | `dict` | Run timing, config, provider details |
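As a rough sketch of how these fields might be consumed after a run, the helper below builds a one-line summary. `summarize_result` is a hypothetical name, not part of PaperBanana. It reads only `image_path`, `iterations` and `metadata` from the table, and assumes nothing about what an `IterationResult` contains or which keys `metadata` holds. A `SimpleNamespace` with made-up values stands in for a real `GenerationResult` so the snippet runs without API keys.

```python
from pathlib import Path
from types import SimpleNamespace


def summarize_result(result) -> str:
    """Summarize a result using only image_path, iterations and metadata.

    Hypothetical helper for illustration; not part of the PaperBanana API.
    """
    image_path = Path(result.image_path)
    rounds = len(result.iterations)
    # The metadata keys are not documented on this page, so list whatever is present
    meta_keys = ", ".join(sorted(result.metadata)) or "none"
    return f"{image_path.name}: {rounds} refinement round(s); metadata keys: {meta_keys}"


# Stand-in with the same attribute names as GenerationResult (placeholder values)
fake_result = SimpleNamespace(
    image_path=Path("outputs/final.png"),
    iterations=["round-1", "round-2", "round-3"],
    planning=None,
    metadata={"provider": "gemini", "elapsed_s": 42.0},
)

print(summarize_result(fake_result))
```

With a real run, the same call would be `summarize_result(result)` on the value returned by `pipeline.generate()`.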
Load settings from a config file or construct them directly:

```python
from paperbanana.core.config import Settings

# From YAML file
settings = Settings.from_yaml("configs/config.yaml")

# Direct construction
settings = Settings(
    vlm_provider="gemini",
    vlm_model="gemini-2.0-flash",
    image_provider="google_imagen",
    image_model="gemini-3-pro-image-preview",
    refinement_iterations=3,
    num_retrieval_examples=10,
    output_dir="outputs",
    save_iterations=True,
    save_metadata=True,
)
```

To generate a statistical plot from a data file, pass a `PlotInput`:

```python
import asyncio

from paperbanana import PlotInput

# Reuses the pipeline created earlier
result = asyncio.run(pipeline.generate(
    PlotInput(
        data_path="results.csv",
        communicative_intent="Bar chart comparing accuracy across benchmarks",
    )
))
```

To score a generated diagram against a human reference, use `evaluate_diagram`:

```python
import asyncio

from paperbanana.evaluation import evaluate_diagram

scores = asyncio.run(evaluate_diagram(
    generated_path="output.png",
    reference_path="human_reference.png",
    source_context="Our framework consists of...",
    caption="Overview of our method",
    settings=settings,
))

print(scores)
# {
#   "faithfulness": 0.82,
#   "readability": 0.75,
#   "conciseness": 0.88,
#   "aesthetics": 0.71,
#   "overall": 0.79
# }
```

To generate multiple diagrams in one run:
```python
import asyncio
from pathlib import Path

from paperbanana import PaperBananaPipeline, GenerationInput, DiagramType

# (source text file, caption) pairs; `settings` is the Settings object built earlier
inputs = [
    ("method_a.txt", "Overview of method A"),
    ("method_b.txt", "Architecture of method B"),
    ("method_c.txt", "Pipeline for method C"),
]

async def batch_generate():
    pipeline = PaperBananaPipeline(settings=settings)
    results = []
    for text_path, caption in inputs:
        text = Path(text_path).read_text()
        result = await pipeline.generate(
            GenerationInput(
                source_context=text,
                communicative_intent=caption,
                diagram_type=DiagramType.METHODOLOGY,
            )
        )
        results.append(result)
        # Respect rate limits between calls
        await asyncio.sleep(5)
    return results

results = asyncio.run(batch_generate())
```

Note the `asyncio.sleep(5)` between calls. Gemini's free tier has per-minute rate limits, and each generation makes multiple API calls internally.
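If a fixed pause is not enough and a call still hits a rate limit, one option is to retry with exponential backoff. The sketch below is generic `asyncio` code, not a PaperBanana feature: `retry_with_backoff` is a hypothetical helper. This page does not say which exception the pipeline raises on a rate-limit error, so the helper retries on any `Exception`, which you would likely narrow in real use. The `flaky` coroutine only simulates a call that fails twice before succeeding.

```python
import asyncio


async def retry_with_backoff(make_call, max_attempts=4, base_delay=1.0):
    """Await make_call(), retrying failures with exponentially growing delays.

    make_call is a zero-argument callable that returns a fresh coroutine,
    because a coroutine object cannot be awaited twice.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


# Simulated call that fails twice before succeeding (stand-in for a real request)
attempts = {"count": 0}


async def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


outcome = asyncio.run(retry_with_backoff(flaky, base_delay=0.01))
print(outcome, attempts["count"])
```

Inside `batch_generate()`, the generation call could then be wrapped as `await retry_with_backoff(lambda: pipeline.generate(...))`, keeping the pause between items as a first line of defence.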
See `examples/` in the repo:

- `examples/generate_diagram.py` - basic methodology diagram generation
- `examples/generate_plot.py` - statistical plot generation