adding Depth Anything v2 #458
added dephanything_v2 repo added remove extra assets moving imports add __init__.py attempting to fix imports
8db5f0d to f4509f5
@graemeniedermayer Hello, and thank you for this MR! Could you please update it so that all the available Depth Anything models are listed? If you could have a shot at half mode, that would be awesome (almost everyone uses half mode). Also, do you think there is a chance this new code could cause exceptions for users who won't even use Depth Anything v2? Currently, the code uses a modular approach: if one feature fails to load (e.g. inpainting), other features won't be harmed.
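A rough sketch of the kind of isolation described above, where one feature failing to load leaves the others usable. This is only an illustration under assumptions: `load_optional` and the module name are hypothetical and are not taken from the extension's code.

```python
import importlib


def load_optional(module_name):
    """Try to import a feature module and return (module, error).

    Exactly one of the two values is None, so a caller can disable a single
    feature instead of letting the failure take down everything else.
    """
    try:
        return importlib.import_module(module_name), None
    except Exception as exc:  # broad on purpose: a broken dependency may raise anything
        return None, exc


# Hypothetical module name, standing in for the Depth Anything v2 code.
module, error = load_optional("some_missing_depth_anything_v2_module")
if module is None:
    print(f"Depth Anything v2 unavailable, other models still work: {error!r}")
```

With this pattern the import error is reported only when the feature is actually requested, which is one way to keep users of other models unaffected.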
@thygate These models look good. Do you think we should set one of them as the default once this lands?
It is compartmentalised, so I don't expect it to cause any issues for other users. People appear to be using the pull request directly, so I would prefer it be merged before it causes too much confusion. Adding models should be straightforward, but because people are using this branch, maybe I could make a separate pull request for that? This is also one of the faster models, so I think it would be a good default.
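For reference, adding the other models mostly comes down to choosing a different encoder configuration. The sketch below lists the settings as I understand them from the upstream Depth Anything V2 README; they should be checked against that repository before use. `checkpoint_name` is a hypothetical helper, and the file name pattern it returns is an assumption based on the upstream examples.

```python
# Encoder settings as I understand them from the upstream Depth Anything V2
# README. Treat these values as assumptions to verify, not as authoritative.
MODEL_CONFIGS = {
    "vits": {"encoder": "vits", "features": 64, "out_channels": [48, 96, 192, 384]},
    "vitb": {"encoder": "vitb", "features": 128, "out_channels": [96, 192, 384, 768]},
    "vitl": {"encoder": "vitl", "features": 256, "out_channels": [256, 512, 1024, 1024]},
    "vitg": {"encoder": "vitg", "features": 384, "out_channels": [1536, 1536, 1536, 1536]},
}


def checkpoint_name(encoder):
    """Return the assumed checkpoint file name for an encoder (hypothetical helper)."""
    if encoder not in MODEL_CONFIGS:
        raise ValueError(f"unknown encoder: {encoder!r}")
    return f"depth_anything_v2_{encoder}.pth"


print(checkpoint_name("vitl"))
```

Keeping the variants in one table means the UI list and the loader can both be driven from the same keys.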
I was getting an error when trying to naively implement half mode. It runs at about 3.5 GB of VRAM for me, which is really surprising. But I'll look into this more.
I am making some changes based on this MR; it lands soon(tm*). *OK, soon for real, actually.
The half bug is "RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same"
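That message usually means the weights were converted with `.half()` while the input tensor stayed float32. A minimal sketch of one common fix is to cast the input to whatever dtype and device the model's parameters use. `match_model_dtype` is a hypothetical helper, and the small `Conv2d` only stands in for the real network; this is not necessarily how the merged code handles it.

```python
import torch


def match_model_dtype(model, x):
    """Cast `x` to the device and dtype of the model's first parameter."""
    param = next(model.parameters())
    return x.to(device=param.device, dtype=param.dtype)


# A tiny stand-in for the real network, converted to half precision.
model = torch.nn.Conv2d(3, 4, kernel_size=3).half()

image = torch.rand(1, 3, 8, 8)           # float32, as most preprocessing produces
image = match_model_dtype(model, image)  # now float16, matching the weights
print(image.dtype)
```

The forward pass is deliberately left out of the sketch, since half precision inference is normally run on the GPU.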
@graemeniedermayer Thanks one more time for the MR ❤️, I took the code and inspiration from there. I also added half mode and the other models. I would appreciate it if you could verify that I did not miss anything. This model (standing on the shoulders of giants, of course) is awesome and just destroys the res101 IMHO. The authors even provide the video mode, point clouds and metric values; I'm starting to feel silly about my video implementation. If you have any ideas for improvements, I'm all ears!
While I do appreciate your comments about technical debt, is there any chance you could add me as co-author on the Depth Anything v2 commit?
About technical debt, sorry, I meant to put it in another place: we have unexplainable numbers like this all over the project. You could not have prevented this mess; we had these magic numbers long before the API was even a thing. The core reason is actually how gradio treats options in the drop-down: as numbers. We also needed an entire class just to aggregate the UI elements and their settings together (gradio_args_transport.py). I moved it to get_default_net_size (the function with the cryptic set of numbers). I hoped the comment would be funny. I didn't mean it to be a dig at your code 😅😢
@graemeniedermayer Adding you as co-author, I forgot to do that... I am really sorry; how I handled this looked quite bad, not gonna lie.
This deals with #450.
This only adds the most basic functionality from Depth Anything v2. It has also not been tested with boost, tiling, or any extra features (although they should work).
Notes
Maybe we should make a guide for how to add new models, because they are coming out so fast right now.
This only uses the vitl model (maybe others should be included too). vitg is supposed to come out soon and will presumably be better.
Depth Anything has its own functions for point cloud conversions (right now this uses the ZoeDepth functions).
Depth Anything v2 also has metric information that is thrown out and that maybe could be included somewhere (the information is normalised away).
I'm also starting to think maybe we should invert the UI list so newer models are near the top.
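To illustrate the note about metric information being normalised away: min-max normalisation maps every depth map to [0, 1], so two scenes that differ only in scale become indistinguishable. The sketch below assumes a plain min-max normalisation, which may differ from what the extension actually does, and `normalise_depth_keep_range` is a hypothetical variant that returns the original range alongside the normalised map.

```python
import numpy as np


def normalise_depth(depth):
    """Min-max normalise a depth map to [0, 1]; the absolute scale is lost."""
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:  # flat map: avoid dividing by zero
        return np.zeros_like(depth)
    return (depth - lo) / (hi - lo)


def normalise_depth_keep_range(depth):
    """Hypothetical variant: also return (min, max) so the scale can be restored."""
    depth = np.asarray(depth, dtype=np.float64)
    return normalise_depth(depth), (float(depth.min()), float(depth.max()))


near = np.array([[1.0, 2.0], [3.0, 5.0]])  # made-up depth values
far = near * 10.0                          # same scene shape, ten times further away

# Both maps normalise to the same array, so the difference in scale is gone.
print(np.allclose(normalise_depth(near), normalise_depth(far)))

# Keeping the range allows the original values to be reconstructed.
norm, (lo, hi) = normalise_depth_keep_range(far)
restored = norm * (hi - lo) + lo
```

Storing the range next to the normalised map is one possible way to expose the metric values later without changing the existing output.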