Support MXFP8/NVFP4 tensors with pre-swizzled scales #2446

@timmoon10

Is your feature request related to a problem? Please describe.

Disentangle scale swizzling from GEMM.

Describe the solution you'd like

  • Add logic to the C++ tensor class to track whether the row-wise/col-wise scales are swizzled
  • Add logic to the PyTorch quantized tensors to track whether the row-wise/col-wise scales are swizzled
  • Add hints to the quantizer indicating whether pre-swizzling is helpful
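
A rough sketch of how the three pieces above might fit together, written in plain Python so it runs without a GPU. Every name here (`ScaledTensorSketch`, `QuantizerSketch`, `prefer_swizzled_scales`, `scales_for_gemm`, `placeholder_swizzle`) is hypothetical and not an existing API, and `placeholder_swizzle` is only a stand-in reordering, not the scale layout the GEMM kernels actually expect.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional

Scales = List[float]  # stand-in for a real scale-factor buffer


def placeholder_swizzle(scales: Scales) -> Scales:
    """Stand-in reordering. NOT the layout real GEMM kernels expect."""
    return scales[::2] + scales[1::2]


@dataclass(frozen=True)
class ScaledTensorSketch:
    """Hypothetical tensor: scales plus flags recording whether they are swizzled."""
    rowwise_scales: Optional[Scales] = None
    columnwise_scales: Optional[Scales] = None
    rowwise_scales_swizzled: bool = False
    columnwise_scales_swizzled: bool = False


@dataclass
class QuantizerSketch:
    """Hypothetical quantizer carrying a hint that pre-swizzling is worthwhile."""
    prefer_swizzled_scales: bool = False

    def quantize_scales(self, rowwise: Scales, columnwise: Scales) -> ScaledTensorSketch:
        if self.prefer_swizzled_scales:
            # Swizzle once at quantization time and record that fact.
            return ScaledTensorSketch(
                placeholder_swizzle(rowwise), placeholder_swizzle(columnwise), True, True
            )
        return ScaledTensorSketch(rowwise, columnwise)


def scales_for_gemm(
    t: ScaledTensorSketch,
    swizzle: Callable[[Scales], Scales] = placeholder_swizzle,
) -> ScaledTensorSketch:
    """Swizzle only the scales that are present and not already swizzled."""
    row, col = t.rowwise_scales, t.columnwise_scales
    row_flag, col_flag = t.rowwise_scales_swizzled, t.columnwise_scales_swizzled
    if row is not None and not row_flag:
        row, row_flag = swizzle(row), True
    if col is not None and not col_flag:
        col, col_flag = swizzle(col), True
    return replace(
        t,
        rowwise_scales=row,
        columnwise_scales=col,
        rowwise_scales_swizzled=row_flag,
        columnwise_scales_swizzled=col_flag,
    )
```

With flags like these, the GEMM path would only need to check them rather than always swizzling, so a tensor quantized with the hint set passes through `scales_for_gemm` unchanged.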

Describe alternatives you've considered

Additional context
