Adding Reference Examples
This is the most impactful contribution you can make to PaperBanana. Output quality improves directly with better reference examples.
- The diagram clearly illustrates system architecture, pipeline flow, or framework structure
- Landscape layout with aspect ratio between 1.5 and 2.5 (width / height)
- The methodology section is self-contained enough to describe the approach without the full paper
- The diagram is not a results plot, ablation table, t-SNE visualization, or data sample
The lowest-effort way to contribute is to open a Reference Example issue with:
- arXiv link
- Figure number
- Which category it fits
We handle extraction and curation.
You need three things from the paper:
- Methodology text: Copy the methodology/method section from the paper. Clean up any PDF extraction artifacts (broken words, stray headers). The text should describe what the diagram shows.
- Diagram image: Extract the methodology figure as a PNG. Ensure it's clean, readable at 800px width, and has an aspect ratio between 1.5 and 2.5.
- Metadata: Paper title, arXiv ID, figure number, original caption, and category.
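The aspect ratio requirement can be checked before you commit the image. The following is a minimal sketch, not part of PaperBanana: the helper names `png_size` and `aspect_ratio_ok` are hypothetical, and treating the 1.5 and 2.5 bounds as inclusive is an assumption. It reads the width and height straight from the PNG header using only the standard library.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the first 24 bytes of a PNG file.

    The IHDR chunk follows the signature: 4 bytes of chunk length,
    the 4-byte type "IHDR", then width and height as big-endian
    unsigned 32-bit integers.
    """
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def aspect_ratio_ok(width: int, height: int,
                    low: float = 1.5, high: float = 2.5) -> bool:
    """True when width / height lies in [low, high] (bounds assumed inclusive)."""
    if height <= 0:
        return False
    return low <= width / height <= high


# Example usage (assumes diagram.png is in the current directory):
# with open("diagram.png", "rb") as f:
#     width, height = png_size(f.read(24))
# print(width / height, aspect_ratio_ok(width, height))
```

The computed ratio is also the value you would record in the `aspect_ratio` field of the metadata.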
data/reference_sets/your_example_name/
├── methodology.txt
├── diagram.png
└── metadata.json
Use a short, descriptive directory name based on the paper's key concept (e.g., react_agent, diffusion_unet, graph_rag).
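A quick local check of the layout above might look like the sketch below. Both helpers are hypothetical and not part of the repository; the naming rule (lowercase words joined by underscores) is only inferred from the examples react_agent, diffusion_unet, and graph_rag, so treat it as an assumption rather than an enforced convention.

```python
import re
from pathlib import Path

# The three files listed in the directory layout above.
REQUIRED_FILES = ("methodology.txt", "diagram.png", "metadata.json")


def missing_files(example_dir: Path) -> list[str]:
    """Return the required file names that are absent from example_dir."""
    root = Path(example_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def looks_like_example_name(name: str) -> bool:
    """Assumed convention: lowercase letters/digits separated by underscores."""
    return re.fullmatch(r"[a-z0-9]+(?:_[a-z0-9]+)*", name) is not None


# Example usage (the path is illustrative):
# example = Path("data/reference_sets/react_agent")
# print(missing_files(example), looks_like_example_name(example.name))
```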
{
"paper_title": "ReAct: Synergizing Reasoning and Acting in Language Models",
"arxiv_id": "2210.03629",
"figure_number": 1,
"caption": "Overview of the ReAct framework combining reasoning traces and actions",
"category": "agent_reasoning",
"source_url": "https://arxiv.org/abs/2210.03629",
"aspect_ratio": 2.1
}
Before submitting the PR:
- The methodology text matches what the diagram actually depicts
- The diagram is clean (no scan artifacts, no cropping issues)
- The aspect ratio is between 1.5 and 2.5
- The paper is publicly available
- metadata.json is valid JSON with all fields populated
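The last checklist item can be verified with a few lines of Python. This is a rough sketch under stated assumptions: the function name `metadata_problems` is hypothetical, and the required field list is taken from the ReAct example in this guide rather than from a published schema.

```python
import json

# Field names taken from the example metadata.json in this guide (assumed complete).
REQUIRED_FIELDS = (
    "paper_title", "arxiv_id", "figure_number", "caption",
    "category", "source_url", "aspect_ratio",
)


def metadata_problems(text: str) -> list[str]:
    """Return a list of problems found in metadata.json content; empty means OK."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        # A field counts as unpopulated when it is absent, null, or a blank string.
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing or empty field: {field}")
    return problems


# Example usage (the path is illustrative):
# with open("data/reference_sets/react_agent/metadata.json") as f:
#     print(metadata_problems(f.read()))
```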
Open a pull request with the title [Reference]: <paper name>. The PR template will guide you through the checklist.
- Science & Applications: Domain-specific architectures outside core ML
- Vision & Perception: Detection, segmentation, multimodal pipelines
Any category is welcome, but these two are currently underrepresented.