Commit 9d9e44a

Removed redundant input_length column from dataset. (#549)
- This is required because, whenever dataset padding is performed, an additional input (the input_length column) is fed to the model. For causal LM models this did not raise an error, because their forward methods accept kwargs, which absorb the unnecessary argument. For BERT-style models, however, it raised an error.

Signed-off-by: meetkuma <[email protected]>
1 parent c67393b commit 9d9e44a

1 file changed (+1, -1)
QEfficient/finetune/utils/dataset_utils.py

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ def padding_dataset(train_config, dataset, batch_size):
         # Hugging Face Dataset transformation
         dataset = dataset.map(lambda x: {"input_length": len(x["input_ids"])})
         dataset = dataset.sort("input_length")
-
+        dataset = dataset.remove_columns("input_length")
     else:
         dataset = sorted(dataset, key=lambda x: len(x["input_ids"]))

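The failure mode described in the commit message can be reproduced without any model library. The sketch below is an illustration only: `causal_lm_forward`, `bert_like_forward`, and `sort_by_length` are hypothetical stand-ins, not the real model classes or the actual `padding_dataset` code. It assumes, as the commit message states, that one forward signature accepts `**kwargs` and the other does not.

```python
# Toy stand-ins (hypothetical), not the real Hugging Face model classes.
def causal_lm_forward(input_ids, attention_mask=None, labels=None, **kwargs):
    """Lenient signature: unexpected keyword arguments land in kwargs."""
    return {"num_tokens": len(input_ids), "ignored": sorted(kwargs)}


def bert_like_forward(input_ids, attention_mask=None, labels=None):
    """Strict signature: any unexpected keyword raises TypeError."""
    return {"num_tokens": len(input_ids)}


def sort_by_length(rows, drop_helper):
    """Sort rows by token count using a temporary input_length field."""
    rows = [dict(r, input_length=len(r["input_ids"])) for r in rows]
    rows.sort(key=lambda r: r["input_length"])
    if drop_helper:
        # Mirrors the intent of the fix: drop the helper field after sorting.
        rows = [{k: v for k, v in r.items() if k != "input_length"} for r in rows]
    return rows


rows = [{"input_ids": [1, 2, 3]}, {"input_ids": [4]}]
leaky = sort_by_length(rows, drop_helper=False)  # helper field still present
clean = sort_by_length(rows, drop_helper=True)   # helper field removed

# The lenient forward silently swallows the extra field.
causal_out = causal_lm_forward(**leaky[0])

# The strict forward rejects it.
try:
    bert_like_forward(**leaky[0])
    bert_error = None
except TypeError as exc:
    bert_error = str(exc)

# Once the helper field is dropped, the strict forward works.
bert_out = bert_like_forward(**clean[0])
```

Dropping the helper column right after sorting keeps the batch keys identical for both kinds of model, so the dataset code does not need to know which forward signature it will feed.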
