
Migrating from AffineQuantizedTensor + Layouts to new structure of tensor subclasses #2752

@jerryzh168

Update: Our team will evaluate this further before opening the migration up to more people in the community.

Context:
Previously, we used AffineQuantizedTensor for many of our use cases, including int4, float8, intx, and floatx. It introduces complicated abstractions such as Layout; people have said it is hard to understand, and the code has many indirections.

In an effort to simplify the code base and make it easier to contribute to, we have been adding new features with a different structure in mind. We now want to organize tensors by "dtype" and "packing_format": for example, we'll have Int4PreshuffledTensor, Int8Tensor, and Float8Tensor instead of AffineQuantizedTensor with multiple layouts.
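To make the "one class per dtype and packing format" idea concrete, here is a minimal pure-Python sketch. It is not the torchao implementation: `ToyInt8Tensor` and `ToyInt4PackedTensor` are hypothetical names, plain lists stand in for torch tensors, and the symmetric per-tensor scaling is an assumption chosen for brevity. The point is only that each class owns its own storage format and dequantization logic directly, with no shared Layout indirection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyInt8Tensor:
    """Hypothetical stand-in for a per-dtype tensor class (not a torchao class)."""
    qdata: List[int]  # quantized values in [-128, 127]
    scale: float      # single per-tensor scale (assumption for this sketch)

    @classmethod
    def from_hp(cls, values: List[float]) -> "ToyInt8Tensor":
        # Symmetric quantization: the largest magnitude maps to 127.
        max_abs = max((abs(v) for v in values), default=0.0)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        qdata = [max(-128, min(127, round(v / scale))) for v in values]
        return cls(qdata=qdata, scale=scale)

    def dequantize(self) -> List[float]:
        return [q * self.scale for q in self.qdata]


@dataclass
class ToyInt4PackedTensor:
    """Hypothetical 4-bit class whose packing format is two values per byte."""
    packed: bytes
    scale: float
    numel: int  # number of logical elements (packing may add one pad nibble)

    @classmethod
    def from_hp(cls, values: List[float]) -> "ToyInt4PackedTensor":
        # Symmetric quantization: the largest magnitude maps to 7.
        max_abs = max((abs(v) for v in values), default=0.0)
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        q = [max(-8, min(7, round(v / scale))) for v in values]
        nibbles = [x & 0xF for x in q]  # two's-complement 4-bit encoding
        if len(nibbles) % 2:
            nibbles.append(0)  # pad to a whole byte
        packed = bytes(lo | (hi << 4) for lo, hi in zip(nibbles[0::2], nibbles[1::2]))
        return cls(packed=packed, scale=scale, numel=len(values))

    def dequantize(self) -> List[float]:
        out = []
        for b in self.packed:
            for nib in (b & 0xF, b >> 4):
                # Undo the two's-complement encoding of each nibble.
                out.append(nib - 16 if nib >= 8 else nib)
        return [q * self.scale for q in out[: self.numel]]
```

Each class can be read top to bottom without consulting a separate layout object, which is the readability benefit the new structure is aiming for.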

Please check out our updated docs for the new tensor subclass organization structure and guide for design:

migration status

| inference config name | current status | plan | POC | status |
| --- | --- | --- | --- | --- |
| MXFPInferenceConfig | built on v2 | n/a | - | done |
| NVFP4InferenceConfig | built on v2 | n/a | - | done |
| Float8DynamicActivationInt4WeightConfig | built on v2 | n/a | - | done |
| Int4WeightOnlyConfig | v2 and v1 exists | deprecate v1 | ? | ? |
| Int8DynamicActivationIntxWeightConfig | v2 and v1 exists | deprecate v1 | ? | ? |
| Float8WeightOnlyConfig | v2 and v1 exists | deprecate v1 | ? | ? |
| Float8DynamicActivationFloat8WeightConfig | v2 and v1 exists | deprecate v1 | ? | ? |
| IntxWeightOnlyConfig | v2 and v1 exists | deprecate v1 | ? | ? |
| Float8DynamicActivationFloat8SemiSparseWeightConfig | v1 exists | create v2, then deprecate v1 | ? | ? |
| Int8WeightOnlyConfig | v1 exists | create v2, then deprecate v1 | ? | ? |
| Int8DynamicActivationInt8WeightConfig | v1 exists | create v2, then deprecate v1 | ? | ? |
| Int8DynamicActivationInt4WeightConfig | v1 exists | move to prototype | ? | ? |
| Int4DynamicActivationInt4WeightConfig | v1 exists | move to prototype | ? | ? |
| GemliteUIntXWeightOnlyConfig | v1 exists | move to prototype | ? | ? |
| Float8StaticActivationFloat8WeightConfig | v1 exists | move to prototype | ? | ? |
| UIntXWeightOnlyConfig | v1 exists | move to prototype | ? | ? |
| FPXWeightOnlyConfig | v1 exists | move to prototype | ? | ? |

appendix

List of things to migrate:
INT8

[migration done, TODO: delete old path after all migration is done] INT4 weight only

[move to prototype] INT4 weight + int8 activation

UINTx Weight Only

[migration done, TODO: delete old path after all migration is done] Int8DynamicActivationIntxWeightConfig

FP8

FPx
