fix: token suspicion adjustment #477

Open · wants to merge 2 commits into base: main
Conversation

tazlin (Member) commented Jan 17, 2025

Once upon a time, before batching and other optimizations, these were the speeds we considered unreasonable, but new paradigms, backends, and breakthroughs have made these numbers increasingly inaccurate or irrelevant.

While I do think there has to be some sort of longer-term solution (such as the one addressing the problem detailed in #463), there have been virtually only false positives, and the few true positives boiled down to innocent misconfigurations.

Further, it appears that certain types of worker-reported failures can artificially inflate token count, which may be its own issue.

For the time being, I am advocating that the number be increased to 100 t/s for most models, as recommended by henky, and that we respond to possible abuse of this relaxation with other, more complete and sound measures.
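To illustrate the kind of check being relaxed, here is a minimal sketch of a tokens-per-second plausibility test. The function name, the suspicion mechanics, and the exact way the threshold is applied are assumptions for illustration, not the project's actual code; only the 100 t/s figure comes from the discussion above.

```python
# Hypothetical sketch of a reported-speed plausibility check.
# All names here are illustrative assumptions, not AI Horde's real code;
# only the 100 t/s ceiling reflects the value proposed in this PR.

SUSPICION_THRESHOLD_TPS = 100.0  # raised ceiling for most models


def is_speed_suspicious(
    tokens_generated: int,
    seconds_elapsed: float,
    threshold_tps: float = SUSPICION_THRESHOLD_TPS,
) -> bool:
    """Flag a generation whose reported speed exceeds the plausible
    tokens-per-second ceiling."""
    if seconds_elapsed <= 0:
        # A zero or negative duration is itself implausible.
        return True
    return tokens_generated / seconds_elapsed > threshold_tps
```

Under this sketch, a worker reporting 1000 tokens in 5 seconds (200 t/s) would still be flagged, while 300 tokens in 5 seconds (60 t/s) would pass, whereas the old, lower thresholds flagged many legitimate modern backends.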

tazlin added the allow-ci label (a PR with this label will run through CI) on Jan 17, 2025
tazlin (Member Author) commented Jan 17, 2025

And just to clarify: this is to resolve the problem of new workers very consistently being flagged and shadow-banned. It happens with shocking regularity and, due to the nature of the sanction, goes undetected for long periods of time, reducing good compute and confusing users (who see a model as available when in reality it is not).

Further, I would invite any Horde moderator to review the suspicion of any text worker: it will be absurdly high, in the hundreds at least, and I have seen several examples in the tens of thousands. This is almost certainly a direct result of tripping this particular safety check, and it really is a testament to the idea that the values are much too low.
