Adding Reference Examples

Adding reference examples is the most impactful contribution you can make to PaperBanana, because output quality improves directly with better reference examples.

What makes a good reference example

The diagram clearly illustrates system architecture, pipeline flow, or framework structure
Landscape layout with aspect ratio between 1.5 and 2.5 (width / height)
The methodology section is self-contained enough to describe the approach without the full paper
The diagram is not a results plot, ablation table, t-SNE visualization, or data sample
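
The layout criterion above can be checked mechanically once you know the figure's pixel dimensions. The following is a minimal sketch, not part of PaperBanana: the function name `has_reference_aspect_ratio` is hypothetical, and treating the 1.5 and 2.5 bounds as inclusive is an assumption.

```python
def has_reference_aspect_ratio(width_px: int, height_px: int,
                               low: float = 1.5, high: float = 2.5) -> bool:
    """Return True if width / height lies within [low, high].

    Assumption: the bounds are inclusive; the guideline only says "between".
    """
    if width_px <= 0 or height_px <= 0:
        raise ValueError("dimensions must be positive")
    ratio = width_px / height_px
    return low <= ratio <= high


print(has_reference_aspect_ratio(1680, 800))  # landscape figure, ratio 2.1
print(has_reference_aspect_ratio(800, 800))   # square figure, ratio 1.0
```

The content criteria (architecture diagram rather than results plot, self-contained methodology) still need a human judgment.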

Option 1: Submit a paper recommendation

This is the lowest-effort option. Open a Reference Example issue with:

arXiv link
Figure number
Which category it fits

We handle extraction and curation.

Option 2: Submit a complete reference via PR

Step 1: Extract the data

You need three things from the paper:

Methodology text: Copy the methodology/method section from the paper. Clean up any PDF extraction artifacts (broken words, stray headers). The text should describe what the diagram shows.
Diagram image: Extract the methodology figure as a PNG. Ensure it is clean, readable at 800px width, and has an aspect ratio (width / height) between 1.5 and 2.5.
Metadata: Paper title, arXiv ID, figure number, original caption, and category.
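
Part of the methodology text cleanup can be scripted. The sketch below is one possible approach, not a PaperBanana utility: the helper name `clean_extracted_text` is hypothetical, and it only handles words hyphenated across line breaks and hard-wrapped lines. Stray page headers still have to be removed by hand.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Repair common PDF copy-paste damage in a methodology section."""
    # Rejoin words split across a line break: "imple-\nmented" -> "implemented".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Blank lines are treated as paragraph breaks.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, collapse hard wraps and repeated spaces to one space.
    unwrapped = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in unwrapped if p)


sample = "The encoder is imple-\nmented as a\ntransformer.\n\nThe decoder  follows."
print(clean_extracted_text(sample))
```

Note that the dehyphenation rule also merges genuine compounds that happen to break at a line end (for example `self-` followed by `attention`), so review the result against the paper before committing it.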

Step 2: Create the directory

data/reference_sets/your_example_name/
├── methodology.txt
├── diagram.png
└── metadata.json

Use a short, descriptive directory name based on the paper's key concept (e.g., react_agent, diffusion_unet, graph_rag).
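
The layout above can be scaffolded and checked with a short script. This is a sketch under assumptions: the helper names (`slugify`, `create_example_dir`, `missing_files`) are hypothetical, and the lowercase-with-underscores naming rule is inferred from the examples react_agent, diffusion_unet, and graph_rag rather than stated by the project.

```python
import re
from pathlib import Path

# The three files every reference example directory is expected to contain.
REQUIRED_FILES = ("methodology.txt", "diagram.png", "metadata.json")


def slugify(concept: str) -> str:
    """Lowercase the concept and join its alphanumeric runs with underscores."""
    return "_".join(re.findall(r"[a-z0-9]+", concept.lower()))


def create_example_dir(repo_root: Path, concept: str) -> Path:
    """Create data/reference_sets/<slug>/ under repo_root and return its path."""
    example_dir = repo_root / "data" / "reference_sets" / slugify(concept)
    example_dir.mkdir(parents=True, exist_ok=True)
    return example_dir


def missing_files(example_dir: Path) -> list[str]:
    """List the required files that are not yet present in example_dir."""
    return [name for name in REQUIRED_FILES if not (example_dir / name).is_file()]


print(slugify("Graph-RAG"))  # graph_rag
```

Calling `create_example_dir(Path("."), "ReAct agent")` from the repository root would create `data/reference_sets/react_agent/`, and `missing_files` on that path reports what is still to be added.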

Step 3: Populate metadata.json

{
  "paper_title": "ReAct: Synergizing Reasoning and Acting in Language Models",
  "arxiv_id": "2210.03629",
  "figure_number": 1,
  "caption": "Overview of the ReAct framework combining reasoning traces and actions",
  "category": "agent_reasoning",
  "source_url": "https://arxiv.org/abs/2210.03629",
  "aspect_ratio": 2.1
}
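
The aspect_ratio value can be derived from diagram.png instead of being measured by hand. The sketch below reads the width and height from the PNG header using only the standard library. The function names are hypothetical, and rounding to one decimal place is an assumption based on the 2.1 in the example above.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path: Path) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not start with a valid PNG header")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def aspect_ratio(path: Path) -> float:
    """Return width / height, rounded to one decimal place (assumed precision)."""
    width, height = png_size(path)
    if height == 0:
        raise ValueError(f"{path} reports a height of zero")
    return round(width / height, 1)
```

For a 1680 by 800 pixel diagram, `aspect_ratio(Path("diagram.png"))` gives 2.1.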

Step 4: Verify

Before submitting the PR:

The methodology text matches what the diagram actually depicts
The diagram is clean (no scan artifacts, no cropping issues)
The aspect ratio is between 1.5 and 2.5
The paper is publicly available
metadata.json is valid JSON with all fields populated
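
The last checklist item lends itself to an automated check. The following is a minimal sketch, assuming the seven keys shown in the Step 3 example are the full set of required fields. The function name `metadata_problems` is hypothetical, and this is not the project's own validation.

```python
import json

# Assumption: the required fields are exactly those in the Step 3 example.
REQUIRED_FIELDS = (
    "paper_title", "arxiv_id", "figure_number", "caption",
    "category", "source_url", "aspect_ratio",
)


def metadata_problems(raw_json: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    try:
        meta = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(meta, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for field in REQUIRED_FIELDS:
        value = meta.get(field)
        # A field counts as unpopulated if it is absent, null, or a blank string.
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing or empty field: {field}")

    ratio = meta.get("aspect_ratio")
    is_number = isinstance(ratio, (int, float)) and not isinstance(ratio, bool)
    if is_number and not 1.5 <= ratio <= 2.5:
        problems.append(f"aspect_ratio {ratio} is outside 1.5-2.5")
    return problems
```

Run it on the file contents, for example `metadata_problems(Path("metadata.json").read_text())` with `pathlib.Path` imported. The remaining checklist items (text matches diagram, clean image, public availability) need manual review.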

Step 5: Submit PR

Open a pull request with the title [Reference]: <paper name>. The PR template will guide you through the checklist.

Categories we need most

Science & Applications: Domain-specific architectures outside core ML
Vision & Perception: Detection, segmentation, multimodal pipelines

Any category is welcome, but these two are currently underrepresented.
