-
Notifications
You must be signed in to change notification settings - Fork 1
Add an API to describe variant using labels #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Marking this as draft until I've got meson-python counterpart ready. |
Add a minimal API that lets plugins process the variant description and add short labels that could be used to describe the wheel. The labels are entirely optional -- plugins may not return any, and clients may ignore them. The design also assumes that the client is responsible for choosing how many labels to use before truncating -- though I suppose I'll add a helper function to `variantlib` for that purpose. The design assumes that plugin get complete unfiltered `VariantDescription` -- and therefore also see variant metadata from other plugins. I don't think that's really a problem, though it assumes that the plugin must compare both against provider and key names. At this point, the protocol makes the API mandatory, even if it would only return an empty list unconditionally. Perhaps we should make the function optional somehow instead.
Damn it ! I didn't see that PR before the big refactoring ... |
@mgorny I read this PR (because I was trying to rebase it) and I think I disagree with what is being done here (as an objective - not how it's done). I don't think this is a good idea.I think that should be a package maintainer API:
Given that these names have no way to guarantee unicity (unlike the hash). We shouldn't pretend they are "part of the variant API". They are not. They are convenience identifiers - mostly for maintainers / power users. They are not part of either the variant resolution nor checked in any way for unicity / validity (beyond a regex and max length maybe). Plus different packages will want to "name the same variant" differently because they will focus on different aspects. And lastly I don't think the alias should include every "features" inside the alias: From my perspective, |
@rgommers, what do you think? |
Two main thoughts after seeing this:
I was indeed expecting
They should be free to override it of course, but forcing non-uniformity by design seems worse than having good defaults. In addition, the UX is worse without defaults, because you have to add something like The UX I'd hope we aim for on the build side is fairly extensively described in wheelnext/pep_xxx_wheel_variants#9. More discussion/review on that would be useful.
That seems a little far-fetched since we don't want that many variables to begin with. Specifically for these x86-64 SIMD labels I included the desired level of granularity in wheelnext/pep_xxx_wheel_variants#9. So this will be something like |
My idea is that |
Yeah I can see the case for that. It can/will work either way I think. And whatever the design is, it needs a documented and stable interface anyway, because |
That was the original idea, though I do agree with @mgorny here that we should increase the scope. It's much cleaner to provide a very strong, well tested complete implementation of the variant standard behavior. Rather than 50 projects implementing 15% of the complete standard. It's both a reference implementation and a support library for the entire ecosystem of tools that will build on top of wheel variant features.
I think here the "disconnect" is more on the "usecase/scenario" and we should answer that question first.
I think it's reasonable to say >95-98% of users will never see them - never be aware of them - never even understand them in most cases. I would argue if that statement is not correct, we need to re-engineer the "variant resolution algorithm" because that's not good. If the idea is to give a "unique name" to each [major ?] variant to allow people to fix "variant resolution issue" then I would argue that there is no need for correlation for feature / name. A high level name that can be documented is good enough. And that most importantly will be short because it's an alias (e.g. Now if the idea is to provide package maintainers a way to in local identify different variants to the "smallest degree of granularity", this is a completely use-case and they can be flagged as "development wheels" and not be published "as-is".
Depending on the "usecase" I am not convinced every variants need an "alias". Question: so now that clearly opens the door to alias conflicts. Because obviously I can have
I don't disagree. But unless you block it people will do it. My whole point is "if we consider a behavior undesirable - let's close the door to that behavior". I don't want to limit the number of features a variant can declare (arbitrary philosophy), but then consequently I need to block that feature to prevent the filenames to become arbitrarily long. My overall perspective on this variant alias feature is that we have to be careful. Adding this "variant alias" feature, a non-essential feature though very nice to have, implies to modify the filename a lot more than absolutely necessary ( I do not care much about how this is implemented. I don't care much if it's fully manual or auto + override. But we need a very very clear use-case, and be able to explain why we believe it's really important to have this. Because arguably - if the variant resolution algorithm is good - then most users will never see these aliases.
|
Yes, that's a good point in favor of adding it to
There are more types of users than "end users who use There's also a lot of precedent, e.g, for PyTorch (see https://download.pytorch.org/whl/torch/) and JAX (see https://storage.googleapis.com/jax-releases/jax_nightly_releases.html). Just imagine what those indexes would look like if you removed all
The whole argument here is backwards in my opinion. Hashes are essential to the internals to make things work, but are basically leaking an internal implementation detail to wide visibility in the filename. There's nothing else like this today, all tags are strings with a meaning. My expectation is grudging acceptance of hashes in filenames only because there's no good way to hide them - but people may try to come up with schemes to get rid of them.
You can give me this task to wordsmith it in docs/PEP. But essentially: it makes maintainers job significantly harder if we don't have string aliases, for wheel building/authoring, for debugging, and other maintenance/development tasks. Taking a step back: I think the resistance comes from arguing against arbitrary-length strings. I agree that's a bad idea (or at least, it's not going to be accepted, hence we should avoid it). I don't really want to have an opinion on whether no alias or super-long alias is worse. Both are bad. Let's pick a max length and stay with that, with the understanding that that limit can change in community review. Something like <=20 very clearly is better than no alias. |
I'd like to merge #20 finish, but then I'd like to revisit this, so I'd appreciate some guidance on what's your preferred course of action here. FWICS, the main options here are:
|
There is absolutely no consensus among us about I'm sympathetic to the objective being achieved here, but I'm not convinced at all:
Before we make a decision whether or not to add @mgorny @rgommers @msarahan @warsawnv @emmatyping-nv @aterrel @charliermarsh @konstin @seemethere @atalman Summary for those who discover the topic - discussions happened in different places.We are discussing if/how to give "human readable" aliases to variants. The "common current ground is the following": Obviously This could be addressed by:
There are very valid argument to pretty much defend each approach - It quite depends on which perspective you position yourself. It's a compromise you can't win everywhere. Obviously this "feature" is important to distinguish at a glance which package is which. However if we don't pick the right solution we might bump into other issues:
|
Thanks for the summary @DEKHTIARJonathan. Personally, I agree that this feels non-critical (and that we have larger points to align on) -- I would also prefer to defer this for now. |
I know @rgommers feels strongly that aliases should be included in the first (and only) PEP, but you also know that I agree that I think this will be a bone of contention that distracts from the core bits of the PEP. I think it would be perfectly fine to lay out the argument for aliases in an Open Questions section and explicitly state that a) it's not critical to the functional success of variants, b) enhances the usability of variants, c) may be proposed in a companion PEP. If we're wrong about that, then this will come up in the DPO thread and we can update the variants spec to include aliases pretty easily. |
Add a minimal API that lets plugins process the variant description and add short labels that could be used to describe the wheel. The labels are entirely optional -- plugins may not return any, and clients may ignore them. The design also assumes that the client is responsible for choosing how many labels to use before truncating -- though I suppose I'll add a helper function to
variantlib
for that purpose.The design assumes that plugin get complete unfiltered
VariantDescription
-- and therefore also see variant metadata from other plugins. I don't think that's really a problem, though it assumes that the plugin must compare both against provider and key names.At this point, the protocol makes the API mandatory, even if it would only return an empty list unconditionally. Perhaps we should make the function optional somehow instead.