Skip to content

Latest commit

 

History

History
77 lines (55 loc) · 4.83 KB

File metadata and controls

77 lines (55 loc) · 4.83 KB

Baselines

Environment: hiyouga/verl:ngc-th2.7.1-cu12.6-vllm0.10.0

EasyR1 version: v0.3.2

Welcome to contribute new data points!

Algorithm Baselines

Size Algorithm Bits LR KL Test Accuracy
7B GRPO AMP 1e-6 1e-2 0.75 -> 0.77 (+0.02)
Size Algorithm Bits LR KL Test Accuracy
7B GRPO AMP 1e-6 1e-2 0.37 -> 0.48 (+0.11)
7B GRPO BF16 1e-6 1e-2 0.37 -> 0.48 (+0.11)
7B DAPO AMP 1e-6 1e-2 0.37 -> 0.50 (+0.13)
7B GSPO AMP 1e-6 0 0.37 -> 0.48 (+0.11)
7B CISPO AMP 1e-6 1e-2 0.37 -> 0.50 (+0.13)
3B GRPO AMP 1e-6 1e-2 0.24 -> 0.38 (+0.14)
32B GRPO BF16 1e-6 1e-2 0.50 -> 0.56 (+0.06)
Size Algorithm Bits LR KL Test Accuracy
30B-A3B GRPO BF16 1e-6 1e-2 0.55 -> 0.78 (+0.23)
Size Algorithm Bits LR KL Test Accuracy
30B-A3B GRPO BF16 1e-6 1e-2 0.49 -> 0.77 (+0.28)

Note

The hyper-parameters not listed are all the same as the default values.

Performance Baselines

Size GPU Type Bits Batch Size vLLM TP Peak Mem Peak VRAM Throughput Sec per step Actor MFU
3B 8 * H100 80GB AMP 1 / 2 2 120GB 54GB 1800 (+600) 120s 8.1%
7B 8 * H100 80GB AMP 1 / 2 2 120GB 68GB 1600 (+400) 145s 16.0%
7B 8 * H100 80GB AMP 4 / 8 2 200GB 72GB 2000 (+600) 120s 23.2%
7B 8 * L20 48GB AMP 1 / 2 2 120GB 42GB 410 (+0) 580s 26.5%
7B 8 * H100 80GB BF16 1 / 2 2 120GB 58GB 1600 (+320) 145s 16.0%
32B 8 * H100 80GB BF16 1 / 2 8 260GB 72GB 620 (+260) 530s 25.8%
Size GPU Type Bits Batch Size vLLM TP Peak Mem Peak VRAM Throughput Sec per step Actor MFU
30B-A3B 8 * H800 80GB BF16 1 / 2 8 170GB 50GB 80 4600s 1.8%
Size GPU Type Bits Batch Size vLLM TP Peak Mem Peak VRAM Throughput Sec per step Actor MFU
30B-A3B 8 * H800 80GB BF16 1 / 2 8 210GB 50GB 65 8000s 1.4%
  • Batch Size: micro_batch_size_per_device_for_update / micro_batch_size_per_device_for_experience
  • vLLM TP: rollout.tensor_parallel_size
  • Peak Mem: Peak CPU memory usage
  • Peak VRAM: Peak GPU memory usage
  • Throughput: Number of tokens per second per GPU by one training step (including the improvement compared to the previous version)
  • Sec per step: Average time per step in seconds

Note

The hyper-parameters not listed are all the same as the default values.