Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use any as the default function label type #9050

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

SpecLad
Copy link
Contributor

@SpecLad SpecLad commented Feb 4, 2025

Motivation and context

Currently, the default is unknown, but allowing unknown as a function label type value creates some complications.

  • The sets of allowed label types of a task and of a function are not the same, so we have to use different enums to represent them.

  • any and unknown have similar meanings, and it's not clear what the difference between them is unless you search the code (AFAIK it's not documented anywhere).

The sole benefit of having a separate unknown label type seems to be that function labels of type any are only allowed to be mapped to task labels of type any, while function labels of type unknown can be mapped to task labels of any type. But it doesn't seem like a useful distinction. Functions with both unknown and any label types can produce any shape types, so if we allow one of them, it makes sense to also allow the other.

In addition, functions that can produce multiple shape types for a single label are probably going to be rare, so I think it's reasonable to assume that a function with an any-typed label will still produce a single shape type and therefore can be mapped to a label with a specific type (although the user is responsible for ensuring that the types actually match).

To sum up, I don't think the additional complexity introduced by the unknown type is worth it. So let's remove it and use any instead.

This PR keeps some unknown-related logic in the UI, since there's some code in the Enterprise backend that also uses unknown as a label type. If this patch is accepted, I'll replace those uses with any too, then go back and remove the remaining logic here.

How has this been tested?

Manual testing.

Checklist

  • I submit my changes into the develop branch
  • I have created a changelog fragment
  • [ ] I have updated the documentation accordingly
  • [ ] I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@SpecLad SpecLad requested a review from bsekachev as a code owner February 4, 2025 16:35
Currently, the default is `unknown`, but allowing `unknown` as a function
label type value creates some complications.

* The sets of allowed label types of a task and of a function are not the
  same, so we have to use different enums to represent them.

* `any` and `unknown` have similar meanings, and it's not clear what the
  difference between them is unless you search the code (AFAIK it's not
  documented anywhere).

The sole benefit of having a separate `unknown` label type seems to be that
function labels of type `any` are only allowed to be mapped to task labels
of type `any`, while function labels of type `unknown` can be mapped to task
labels of any type. But it doesn't seem like a useful distinction. Functions
with both `unknown` and `any` label types can produce any shape types, so if
we allow one of them, it makes sense to also allow the other.

In addition, functions that can produce _multiple_ shape types for a single
label are probably going to be rare, so I think it's reasonable to assume
that a function with an `any`-typed label will still produce a single shape
type and therefore can be mapped to a label with a specific type (although
the user is responsible for ensuring that the types actually match).

To sum up, I don't think the additional complexity introduced by the
`unknown` type is worth it. So let's remove it and use `any` instead.

This PR keeps some `unknown`-related logic in the UI, since there's
some code in the Enterprise backend that also uses `unknown` as a label
type. If this patch is accepted, I'll replace those uses with `any` too,
then go back and remove the remaining logic here.
Copy link

sonarqubecloud bot commented Feb 4, 2025

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.73%. Comparing base (908a47d) to head (eec9feb).
Report is 10 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9050      +/-   ##
===========================================
+ Coverage    73.43%   73.73%   +0.30%     
===========================================
  Files          419      428       +9     
  Lines        44351    44501     +150     
  Branches      3875     3875              
===========================================
+ Hits         32569    32813     +244     
+ Misses       11782    11688      -94     
Components Coverage Δ
cvat-ui 77.43% <ø> (ø)
cvat-server 70.68% <ø> (+0.56%) ⬆️

@bsekachev
Copy link
Member

The idea behind having "unknown" as a default value was: the function always return a certain type of objects. It can't return "car" being a rectangle on the first call and a polygon on the second call. Sounds like something unreal.

And it was generally done for backward compatibility. After years, may be to remove it at all, considering detectors, not providing their label types, as incorrect?

@SpecLad
Copy link
Contributor Author

SpecLad commented Feb 6, 2025

I'm against removing it, for a few reasons:

  1. I'd like to require as little configuration as possible when creating functions, to lower the entry barrier. The label types in a model are not necessary for it to function, they're just extra information that the UI can use to help the user, so it makes sense for them to be optional.

    The auto-annotation function interface in the SDK also does not require label types, and I'd like to keep it that way.

  2. I think it's premature to say that a model can never output multiple shape types. The models we have in the repo don't do that, but we don't know what kind of model we (or our users) will implement next. To not support this seems like an unnecessary, artificial restriction.

@SpecLad SpecLad merged commit fec040d into cvat-ai:develop Feb 6, 2025
34 checks passed
@SpecLad SpecLad deleted the unknown-any branch February 6, 2025 13:17
@cvat-bot cvat-bot bot mentioned this pull request Feb 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants