This directory is the main documentation hub for kvcompress.
- Spec and implementation notes: spec/IMPLEMENTATION_SPEC.md
- Current stack status: spec/STACK_NOTES.md
- Runtime architecture: runtime/OPENAI_COMPAT_COMPRESSED_KV.md
- Benchmark protocol and reproducibility checklist: benchmarks/BENCHMARK_PROTOCOL.md
- Benchmark protocol: benchmarks/BENCHMARK_PROTOCOL.md
- Hardware tiers: benchmarks/TWO_TIER_MATRIX.md
- Triton validation runbook: benchmarks/TRITON_KERNEL_RUNBOOK.md
- Baseline codec comparison: results/CURRENT_RESULTS.md
- TurboAngle MixedKV sweep: results/TURBOANGLE_MIXEDKV_RESULTS.md
- TurboQuant residual sweep: results/TURBOQUANT_RESULTS.md
- Adaptive hybrid policy: results/HYBRID_POLICY_RESULTS.md
- Fused hybrid ablation: results/FUSED_HYBRID_RESULTS.md
- Runtime prototype status: results/RUNTIME_PROTOTYPE_RESULTS.md
- Gemma-family model-path benchmark: results/GEMMA2_MODEL_PATH_RESULTS.md
- Overview figure: assets/kvcompress-overview.png
- Runtime flow figure: assets/kvcompress-runtime-flow.png
- Post-FWHT angle histogram: assets/post-fwht-angle-histogram.png