openvinotoolkit · MaximProshin · Apr 8, 2025 · Mar 26, 2025 · Mar 27, 2025 · Mar 27, 2025
@@ -1,5 +1,56 @@
 # Release Notes
 
+## New in Release 2.16.0
+
+Post-training Quantization:
+
+- Breaking changes:
+  - ...
+- General:
+  - ...
+- Features:
+  - (Torch) Introduced a novel weight compression method to significantly improve the accuracy of Large Language Models (LLMs) with int4 weights. Leveraging Quantization-Aware Training (QAT) and absorbable LoRA adapters, this approach can achieve a 2x reduction in accuracy loss during compression compared to the best post-training weight compression technique in NNCF (Scale Estimation + AWQ + GPTQ). The `nncf.compress_weights` API now includes a new `compression_format` option, `CompressionFormat.FQ_LORA`, for this QAT method, and a sample compression pipeline with preview support is available [here](examples/llm_compression/torch/qat_with_lora).
+  - (Torch) Added support for 4-bit weight compression, along with the AWQ and Scale Estimation data-aware methods to reduce quality loss after compression.
+- Fixes:
+  - Fixed occasional failures of the weight compression algorithm on ARM CPUs.
+  - (Torch) Fixed weight compression for float16/bfloat16 models.
+- Improvements:
+  - Reduced the run time and peak memory of the mixed-precision assignment procedure during weight compression in the OpenVINO backend. In the mixed-precision case, overall compression time is reduced by about 20-40% and peak memory by about 20%.
+  - (TorchFX, Experimental) Added quantization support for [TorchFX](https://pytorch.org/docs/stable/fx.html) models exported with dynamic shapes.
+- Deprecations/Removals:
+  - ...
+- Tutorials:
+  - ...
+- Known issues:
+  - ...
+
+Compression-aware training:
+
+- Breaking changes:
+  - ...
+- General:
+  - ...
+- Features:
+  - ...
+- Fixes:
+  - ...
+- Improvements:
+  - ...
+- Deprecations/Removals:
+  - ...
+- Tutorials:
+  - ...
+- Known issues:
+  - ...
+
+Deprecations/Removals:
+
+- ...
+
+Requirements:
+
+- Updated PyTorch (2.6.0) and Torchvision (0.21.0) versions.
+
 ## New in Release 2.15.0
 
 Post-training Quantization: