adding Depth Anything v2 #458

Closed

Conversation

graemeniedermayer
Contributor

@graemeniedermayer graemeniedermayer commented Jul 9, 2024

This addresses #450.

This adds only the most basic functionality from Depth Anything v2. It has not been tested with boost, tiling, or any extra features (although they should work).

Notes

Maybe we should make a guide for how to add new models because they are coming out so fast right now.

This only uses the vitl model (maybe others should be included too). vitg is supposed to come out soon and will presumably be better.

Depth Anything has its own functions for point cloud conversions (right now this uses the zoedepth functions).

Depth Anything v2 also produces metric information that is currently thrown out (it is normalised away); maybe it could be included somewhere.

I'm also starting to think maybe we should invert the UI list so newer models are near the top.
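To illustrate the note above about metric information being normalised away, here is a minimal sketch. It assumes plain min-max normalisation over a flat list of depth values; the normalisation actually used in this PR may differ, and `normalize_depth` and the sample values are hypothetical.

```python
def normalize_depth(depth):
    """Min-max normalise a flat list of depth values to the range [0, 1]."""
    lo, hi = min(depth), max(depth)
    if hi == lo:
        # A constant depth map has no range to stretch; return all zeros.
        return [0.0 for _ in depth]
    return [(d - lo) / (hi - lo) for d in depth]


# Hypothetical metric depths: the second scene is twice as far away.
near_scene = [1.0, 2.0, 3.0]
far_scene = [2.0, 4.0, 6.0]

# After normalisation the two scenes are indistinguishable,
# which is why the absolute (metric) scale is lost.
print(normalize_depth(near_scene))
print(normalize_depth(far_scene))
```

Keeping the `lo` and `hi` values alongside the normalised map would be one way to preserve the metric scale, if the model output is indeed metric.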

added dephanything_v2 repo

added remove extra assets

moving imports

add __init__.py

attempting to fix imports
@graemeniedermayer
Contributor Author

@semjon00 @thygate have a quick moment to review this?

@semjon00
Collaborator

semjon00 commented Jul 21, 2024

@graemeniedermayer Hello and thank you for this MR! Could you please update it so that all the available depth anything models are listed? If you could have a shot at the half mode that would be awesome (almost everyone uses half mode).

Also, do you think there is a chance this new code could cause exceptions for users who won't even use Depth Anything v2? Currently, the code uses a modular approach: if one feature fails to load (e.g. inpainting), other features won't be harmed.
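The modular approach described above can be sketched roughly as follows. This is only an illustration of guarded loading with the standard library, not the project's actual loader; `try_import` and the missing module name are hypothetical.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if it cannot be loaded."""
    try:
        return importlib.import_module(module_name)
    except Exception as exc:
        # Covers ImportError as well as errors raised while the module initialises.
        print(f"Optional feature {module_name!r} unavailable: {exc}")
        return None


# A stdlib module stands in for a feature that loads fine.
json_mod = try_import("json")

# Hypothetical module name that does not exist, standing in for a broken feature.
missing = try_import("depth_anything_v2_not_installed")

# The failure above does not prevent the working feature from being used.
print(json_mod.dumps({"ok": True}))
```

With this pattern, a caller checks for `None` before offering the feature, so a broken optional model only disables itself.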

@semjon00
Collaborator

@thygate These models look good, do you think we should set one of them as a default once this lands?

@graemeniedermayer
Contributor Author

It is compartmentalised so I don't expect it to cause any issues for other users. People appear to be using the pull request directly so I would prefer it be merged before it causes too much confusion. Adding models should be straightforward, but because people are using this branch maybe I could make a separate pull request for that?

This is also one of the faster models so I think it would be a good default.

@graemeniedermayer
Contributor Author

I was getting an error when trying to naively implement half. It runs at about 3.5 GB of VRAM for me, which is really surprising. But I'll look into this more.

@semjon00
Collaborator

semjon00 commented Jul 21, 2024

I am making some changes based on this MR, lands soon(tm*)

*Ok soon for real actually.

@semjon00
Collaborator

The half bug is "RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same"
This means we should check that all the objects we pass around (the image input, for instance) are half as well.
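A minimal sketch of that check, assuming a PyTorch model whose weights were converted with `.half()`: cast the input to the dtype and device of the model's parameters before the forward pass. The helper name `match_model_dtype` and the tiny convolution standing in for the depth model are hypothetical.

```python
import torch
from torch import nn


def match_model_dtype(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Cast x to the device and dtype of the model's first parameter."""
    param = next(model.parameters())
    return x.to(device=param.device, dtype=param.dtype)


# A small convolution stands in for the depth model; .half() makes its weights float16.
model = nn.Conv2d(3, 8, kernel_size=3).half()

# Image tensors are typically float32, which is what triggers the mismatch error.
image = torch.rand(1, 3, 16, 16)
image = match_model_dtype(model, image)

print(image.dtype)
```

The forward pass itself is left out here because half-precision support on CPU varies between PyTorch versions; on a CUDA device the cast input can be passed straight to the model.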

@semjon00
Collaborator

semjon00 commented Jul 21, 2024

@graemeniedermayer Thanks one more time for the MR ❤️ , I took the code and inspiration from there. I also added half mode and the other models. I would appreciate it if you could verify that I did not miss anything. This model (standing on the shoulders of giants, of course) is awesome and just destroys the res101 IMHO. The authors even provide video mode, point clouds and metric values; I'm starting to feel silly about my video implementation.

If you can find an idea for improving, I'm all ears!

@semjon00 semjon00 closed this Jul 21, 2024
@graemeniedermayer
Contributor Author

While I do appreciate your comments about technical debt in the depth_api file, perhaps the comment should be shorter.

Any chance you could add me as co-author on the Depth Anything v2 commit?

@semjon00
Collaborator

semjon00 commented Jul 21, 2024

About the technical debt comment: sorry, I meant to put it in another place. We have unexplainable numbers like this all over the project. You could not have prevented this mess; we had these magic numbers long before the API was even a thing. The core reason is actually how gradio treats options in the drop-down: as numbers. We also needed an entire class just to aggregate the UI elements and their settings together (gradio_args_transport.py). I moved the comment to get_default_net_size (the function with the cryptic set of numbers). I hoped the comment would be funny. I didn't mean it as a dig at your code 😅😢
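One possible way to tame the magic numbers mentioned above is to name the drop-down indices with an enum. This is only a sketch: the `ModelKind` members, their indices, and the sizes in `DEFAULT_NET_SIZE` are placeholders, not the project's real values, and this `get_default_net_size` is a simplified stand-in for the real function.

```python
from enum import IntEnum


class ModelKind(IntEnum):
    """Hypothetical names for the numeric drop-down options."""
    RES101 = 0
    DEPTH_ANYTHING_V2_SMALL = 1
    DEPTH_ANYTHING_V2_LARGE = 2


# Placeholder (width, height) defaults; the real values live in the project.
DEFAULT_NET_SIZE = {
    ModelKind.RES101: (448, 448),
    ModelKind.DEPTH_ANYTHING_V2_SMALL: (518, 518),
    ModelKind.DEPTH_ANYTHING_V2_LARGE: (518, 518),
}


def get_default_net_size(model_index: int):
    """Map a numeric drop-down value to a named model and its default size."""
    # ModelKind(...) raises ValueError for an unknown index instead of failing silently.
    return DEFAULT_NET_SIZE[ModelKind(model_index)]


print(get_default_net_size(0))
print(ModelKind(2).name)
```

The UI can keep passing plain integers, while the rest of the code refers to models by name.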

@graemeniedermayer Adding you as co-author, forgot to do that...

I am really sorry; how I handled this looked quite bad, not gonna lie.
