fix: Added code to match interpolation of Google's ViT implementation… #38626
base: main
Conversation
@bot /style
Style fixes have been applied. View the workflow run here.
Hey @lerolynn, thanks for the PR!
Also, it would be super helpful if you could provide a link to the original code + line where the correct interpolation is specified, thanks!
Hey @qubvel, thanks for the review! I can check these models and make the changes if required. I didn't want to make too many changes in a single pull request initially, but if that's fine I can integrate them! There are a few questions I have:
Got it, I'll check the original implementations of all the models on the list and update the PR in a few days.
Fixes #28180
What does this PR do?
Fixes the interpolation method in ViT image processors to match the original Google ViT implementation. Changes the default resampling from BILINEAR to BICUBIC interpolation.
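As a rough illustration of the user-facing effect (a minimal sketch, not the PR's actual diff; the exact set of processors touched isn't enumerated here), the default `resample` of `ViTImageProcessor` would become `PILImageResampling.BICUBIC`, while the previous behaviour stays reachable by passing `resample` explicitly:

```python
import numpy as np
from PIL import Image
from transformers import ViTImageProcessor
from transformers.image_utils import PILImageResampling

# Dummy RGB image, just for demonstration.
image = Image.fromarray(np.random.randint(0, 255, (300, 300, 3), dtype=np.uint8))

processor = ViTImageProcessor()
# With this PR applied, the default is expected to be BICUBIC (enum value 3).
print(processor.resample)

# New default (bicubic) vs. explicitly requesting the old bilinear behaviour.
inputs_bicubic = processor(image, return_tensors="pt")
inputs_bilinear = processor(image, resample=PILImageResampling.BILINEAR, return_tensors="pt")
```

Bicubic is what the original Google ViT preprocessing uses, which is the mismatch the linked issue describes.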
Implementation Notes
This implementation follows @NielsRogge's comments from #28180:
- Verified that the outputs match the original implementation using `torch.allclose`, similar to the DINOv2 conversion (see the sketch below)
- Tests pass - can be updated in a follow-up PR if needed
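For context on the first note, here is a minimal sketch of that kind of check (the checkpoint, logits slice, and tolerance are placeholders, not values taken from this PR): run the standard test image through the processor and model, then compare a slice of the logits against reference values produced by the original Google code with `torch.allclose`, in the same spirit as the DINOv2 conversion script.

```python
import requests
import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

# COCO test image commonly used in transformers conversion scripts.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# `expected_slice` would hold logits from the original Google ViT code run on
# the same image -- the numbers below are placeholders, not real references.
expected_slice = torch.tensor([-0.2744, 0.8215, -0.0836])

# Conversion scripts typically assert this; it will only pass once
# `expected_slice` holds the real reference values.
print("matches reference:", torch.allclose(logits[0, :3], expected_slice, atol=1e-4))
```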
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@NielsRogge @amyeroberts @qubvel