add GARDO: Reinforcing Diffusion Models without Reward Hacking #216

Open
tinnerhrhe wants to merge 2 commits into yifan123:main from tinnerhrhe:main

@tinnerhrhe

GARDO is introduced in the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking" (https://arxiv.org/abs/2512.24138), which studies methods for mitigating reward hacking without compromising sample efficiency.

This PR adds recipes for GARDO training.

