@@ -23,19 +23,14 @@ Our vision for KTransformers is to serve as a flexible platform for experimentin

 <h2 id="Updates">🔥 Updates</h2>

+* **July 26, 2025**: Support SmallThinker and GLM4-MoE. ([Tutorial](./doc/en/SmallThinker_and_Glm4moe.md))
 * **July 11, 2025**: Support Kimi-K2. ([Tutorial](./doc/en/Kimi-K2.md))
-
 * **June 30, 2025**: Support 3-layer (GPU-CPU-Disk) [prefix cache](./doc/en/prefix_cache.md) reuse.
-
 * **May 14, 2025**: Support Intel Arc GPU ([Tutorial](./doc/en/xpu.md)).
-
 * **Apr 29, 2025**: Support AMX-Int8, AMX-BF16 and Qwen3MoE ([Tutorial](./doc/en/AMX.md))

 https://github.com/user-attachments/assets/fafe8aec-4e22-49a8-8553-59fb5c6b00a2

-
-
-
 * **Apr 9, 2025**: Experimental support for LLaMA 4 models ([Tutorial](./doc/en/llama4.md)).
 * **Apr 2, 2025**: Support Multi-concurrency. ([Tutorial](./doc/en/balance-serve.md)).
@@ -65,7 +60,7 @@ https://github.com/user-attachments/assets/ebd70bfa-b2c1-4abb-ae3b-296ed38aa285
 </p>

 - **[NEW!!!] Local 671B DeepSeek-Coder-V3/R1:** Running its Q4_K_M version using only 14GB VRAM and 382GB DRAM ([Tutorial](./doc/en/DeepseekR1_V3_tutorial.md)).
-
+
     - Prefill Speed (tokens/s):
         - KTransformers: 54.21 (32 cores) → 74.362 (dual-socket, 2×32 cores) → 255.26 (optimized AMX-based MoE kernel, V0.3 only) → 286.55 (selectively using 6 experts, V0.3 only)
        - Compared to 10.31 tokens/s in llama.cpp with 2×32 cores, achieving up to **27.79× speedup**.
@@ -131,7 +126,6 @@ we have already supported vendors:
 - Kunpeng
 - AMD

-
 ### 📥 Installation

 To install KTransformers, follow the official [Installation Guide](https://kvcache-ai.github.io/ktransformers/en/install.html).
@@ -201,3 +195,4 @@ If you have any questions, feel free to open an issue. Alternatively, you can jo
 <h2 id="FAQ">🙋 FAQ</h2>

 Some common questions are answered in the [FAQ](doc/en/FAQ.md).
+