Fix paper link: use huggingface.co/papers/ instead of arxiv.org/abs/ …

cfc4e54
Closed

Fix spurious KL gradients for zero-std reward groups in GRPOTrainer #5640

Cursor / Cursor Bugbot completed Apr 26, 2026 in 8m 12s

Bugbot Review

Bugbot Analysis Progress (8m 15s elapsed)

✅ Gathered PR context (2s)
✅ Completed bug detection (8m 8s)
✅ Posted analysis results (5s)

Final Result: Bugbot completed review and found 1 potential issue.

Request ID: serverGenReqId_a48fb750-8c1e-4436-b2e1-658d7526562f

Details

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit cfc4e54.