Misleading documentation #174
Comments
I can also contribute, if you find it relevant. We interact with customers, so we might bring in a different perspective.
Maybe we should make it clearer in the README that not all features of TGI are supported on Gaudi and that the doc for this fork is the README.
I came here to chime in that the documentation is wrong: this example crashes during warmup: docker run -p 8080:80
@endomorphosis Can you please point me at where you find this example in the documentation? I can't find it.
Hi everyone,
I can see there has been a recent effort to add more documentation on TGI, and I appreciate it. However, there are some sections that are misleading, for example:
docs/source/conceptual/quantization.md
It describes Quantization with GPTQ and Quantization with bitsandbytes, but to the best of my knowledge, these do not work on Gaudi2 (we tested bitsandbytes, and CUDA calls are hardcoded in it).
My ask would be that you prune the bits that are not relevant for Gaudi.