Important changes
DeepSeek R1 is fully supported on both AMD and NVIDIA!
volume=$PWD/data  # assumed host directory for caching downloaded weights; adjust as needed
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
ghcr.io/huggingface/text-generation-inference:3.1.0 --model-id deepseek-ai/DeepSeek-R1
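Once the container is up, a quick way to check the deployment is to send a prompt to the `/generate` route. The snippet below is a minimal sketch, not an official client: it assumes the server is reachable at `http://localhost:8080` (matching the `-p 8080:80` mapping above), and the helper names `build_payload` and `generate` and the `TGI_URL` variable are hypothetical, introduced only for illustration.

```shell
# Assumption: the container started by the docker command above is running
# and listening on localhost:8080. Override TGI_URL if it lives elsewhere.
TGI_URL="${TGI_URL:-http://localhost:8080}"

# Build the JSON body for the /generate route.
#   $1 = prompt (this sketch does no JSON escaping, so avoid quotes and backslashes)
#   $2 = maximum number of new tokens to generate
build_payload() {
  printf '{"inputs": "%s", "parameters": {"max_new_tokens": %d}}' "$1" "$2"
}

# POST the prompt to the server and print the raw JSON response.
#   $1 = prompt, $2 = optional max_new_tokens (defaults to 64)
generate() {
  curl -s "$TGI_URL/generate" \
    -X POST \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1" "${2:-64}")"
}
```

For example, `generate "What is deep learning?" 32` should return a JSON object containing a `generated_text` field if the model loaded successfully.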
What's Changed
- Attempt to remove AWS S3 flaky cache for sccache by @mfuntowicz in #2953
- Update to attention-kernels 0.2.0 by @danieldk in #2950
- fix: Telemetry by @Hugoch in #2957
- Fixing the oom maybe with 2.5.1 change. by @Narsil in #2958
- Add backend name to telemetry by @Hugoch in #2962
- Add fp8 support moe models by @mht-sharma in #2928
- Update to moe-kernels 0.8.0 by @danieldk in #2966
- Hotfixing intel-cpu (not sure how it was working before). by @Narsil in #2967
- Add deepseekv3 by @Narsil in #2968
- doc: Update TRTLLM deployment doc. by @Hugoch in #2960
- Update moe-kernel to 0.8.2 for rocm by @mht-sharma in #2977
- Prepare for release 3.1.0 by @Narsil in #2972
Full Changelog: v3.0.2...v3.1.0