These divisibility requirements come from the FP8 Tensor Cores. The simplest fix is to pad to the nearest multiple of 32, but this computation also seems too small to reach full GPU utilization, so it may be better to disable FP8 for small layers and avoid the extra overhead.
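A minimal sketch of both workarounds is below. The padding helper `run_with_row_padding` is a hypothetical name for illustration, not a Transformer Engine API, and the second option assumes a nested `fp8_autocast(enabled=False)` context is honored for the wrapped layers:

```python
import torch
import torch.nn.functional as F
import transformer_engine.pytorch as te

# Option 1: pad the row (height) dimension up to the next multiple of 32 before
# the FP8 GEMM, then strip the padding from the output. Hypothetical helper.
def run_with_row_padding(layer, x, multiple=32):
    rows = x.shape[0]
    pad = (-rows) % multiple           # extra rows needed to reach the next multiple
    if pad:
        x = F.pad(x, (0, 0, 0, pad))   # zero-pad rows at the bottom of the 2D input
    out = layer(x)
    return out[:rows]                  # discard outputs for the padded rows

# Option 2: fall back to higher precision for the small block by nesting an
# fp8_autocast with enabled=False around it.
def time_embed_no_fp8(time_embedding, t):
    with te.fp8_autocast(enabled=False):
        return time_embedding(t)
```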
When I use xx, the Linear layers can be converted to te.Linear because the feature sizes are multiples of 8 and 16, as follows:
(time_embedding): Sequential(
  (0): Linear(in_features=256, out_features=5120, bias=True)
  (1): SiLU()
  (2): Linear(in_features=5120, out_features=5120, bias=True)
)
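For reference, a minimal sketch of what that conversion looks like, assuming te.Linear is used as a drop-in replacement here (copying the weights from the original nn.Linear modules is omitted):

```python
import torch
import transformer_engine.pytorch as te

# Rebuild the time_embedding block with Transformer Engine layers; both
# feature sizes (256 and 5120) satisfy the multiple-of-8/16 weight constraints.
time_embedding = torch.nn.Sequential(
    te.Linear(256, 5120, bias=True),
    torch.nn.SiLU(),
    te.Linear(5120, 5120, bias=True),
)
```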
But when I feed in the data, the input is a 2D tensor of shape (1, 256), which triggers the following error:
AssertionError: FP8 execution requires 2D input matrices with height divisible by 8 and width divisible by 16, but got tensor with dims=[1, 256]
Is there any other solution besides disabling TE?