This PR fixes a bug in the `rocm` path for llama. The bug was introduced in the LoRA adapter PR #2010, which incorrectly reads `.linear.weight` from `gate_up_proj`. This throws because `gate_up_proj` is now of type `TensorParallelMultiAdapterLinear`, which wraps its former type, `TensorParallelColumnLinear`. To get the raw weight, we need to access it via the `base_layer`.
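For illustration, here is a minimal runnable sketch of the access-pattern change. The classes below are stand-ins that only mirror the attribute layout described above; they are not the actual implementations.

```python
import torch.nn as nn

class TensorParallelColumnLinear:
    """Stand-in: exposes the raw linear module as .linear."""
    def __init__(self, in_features: int, out_features: int):
        self.linear = nn.Linear(in_features, out_features)

class TensorParallelMultiAdapterLinear:
    """Stand-in: since PR #2010, wraps the original layer as .base_layer."""
    def __init__(self, base_layer: TensorParallelColumnLinear):
        self.base_layer = base_layer

gate_up_proj = TensorParallelMultiAdapterLinear(TensorParallelColumnLinear(16, 32))

# Broken (pre-fix) access: raises AttributeError, because the wrapper
# has no .linear attribute of its own.
#   weight = gate_up_proj.linear.weight

# Fixed access: unwrap via .base_layer to reach the raw weight.
weight = gate_up_proj.base_layer.linear.weight
print(weight.shape)  # torch.Size([32, 16])
```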