Fix model checkpoint saving issue when using PEFT (#727)
Fix model checkpoint saving issue when using PEFT: the directory creation did not tolerate the directory already existing, resulting in an error when using distributed training.

Co-authored-by: gioannides <[email protected]>
gioannides authored Nov 14, 2024
1 parent 2897a08 commit ab6be0d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion optimum/neuron/utils/peft_utils.py
@@ -176,7 +176,7 @@ def state_dict(self):

 adapter_shards_dir_model = os.path.join(output_dir, "adapter_shards", "model")
 if not os.path.isdir(adapter_shards_dir_model):
-    os.makedirs(adapter_shards_dir_model)
+    os.makedirs(adapter_shards_dir_model, exist_ok=True)
 
 dummy_mod = DummyModule()
 neuronx_distributed.trainer.save_checkpoint(
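
Note on the change above: checking `os.path.isdir` and then calling `os.makedirs` is a check-then-create sequence, so two workers can both pass the check and the slower one fails with `FileExistsError`. Passing `exist_ok=True` makes the call idempotent. The following is a minimal sketch of that behavior, not code from this repository; `ensure_dir` and `simulate_workers` are hypothetical names, and threads stand in for distributed ranks as an assumption for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def ensure_dir(path):
    """Create `path` (and parents) if needed; safe to call from many workers."""
    # exist_ok=True: if another worker created the directory first,
    # no FileExistsError is raised.
    os.makedirs(path, exist_ok=True)
    return path


def simulate_workers(output_dir, num_workers=8):
    """Hypothetical stand-in for several ranks saving into the same directory."""
    target = os.path.join(output_dir, "adapter_shards", "model")
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Every worker tries to create the same directory concurrently.
        return list(pool.map(ensure_dir, [target] * num_workers))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = simulate_workers(tmp)
        print(os.path.isdir(paths[0]))
```

With `exist_ok=True` in place, the preceding `os.path.isdir` check is no longer needed for correctness, although the commit keeps it.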
