
Conversation

@12010486 (Collaborator)

What does this PR do?

Refactor of #2128 after the v1.19-release branch was updated.
The original description is copied here for reference.

The latest version of datasets no longer supports `trust_remote_code`, and with it, any loading scripts.

As a consequence, when running

PT_HPU_LAZY_MODE=1 HF_DATASETS_TRUST_REMOTE_CODE=true QUANT_CONFIG=/root/optimum-habana/examples/text-generation/quantization_config/maxabs_measure.json TQDM_DISABLE=1 python3 run_lm_eval.py --model_name_or_path meta-llama/Llama-3.1-8B-Instruct --warmup 0 --use_hpu_graphs -o test_results_measure.json --bf16 --batch_size 1 --use_kv_cache --trim_logits --attn_softmax_bf16 --bucket_size=128 --bucket_internal --trust_remote_code --tasks hellaswag

with datasets==4.0.0 installed, we get:

`trust_remote_code` is not supported anymore.
Please check that the Hugging Face dataset 'hellaswag' isn't based on a loading script and remove `trust_remote_code`.
If the dataset is based on a loading script, please ask the dataset author to remove it and convert it to a standard format like Parquet.
07/10/2025 09:11:55 - ERROR - datasets.load - `trust_remote_code` is not supported anymore.
Please check that the Hugging Face dataset 'hellaswag' isn't based on a loading script and remove `trust_remote_code`.
If the dataset is based on a loading script, please ask the dataset author to remove it and convert it to a standard format like Parquet.
README.md: 6.84kB [00:00, 10.9MB/s]
hellaswag.py: 4.36kB [00:00, 8.86MB/s]
Traceback (most recent call last):
  File "/root/optimum-habana/examples/text-generation/run_lm_eval.py", line 384, in <module>
    main()
  File "/root/optimum-habana/examples/text-generation/run_lm_eval.py", line 347, in main
    results = evaluator.simple_evaluate(lm, tasks=args.tasks, limit=args.limit_iters, log_samples=log_samples)
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/utils.py", line 422, in _wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/evaluator.py", line 240, in simple_evaluate
    task_dict = get_task_dict(tasks, task_manager)
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/tasks/__init__.py", line 619, in get_task_dict
    task_name_from_string_dict = task_manager.load_task_or_group(
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/tasks/__init__.py", line 415, in load_task_or_group
    collections.ChainMap(*map(self._load_individual_task_or_group, task_list))
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/tasks/__init__.py", line 315, in _load_individual_task_or_group
    return _load_task(task_config, task=name_or_config)
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/tasks/__init__.py", line 281, in _load_task
    task_object = ConfigurableTask(config=config)
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/api/task.py", line 823, in __init__
    self.download(self.config.dataset_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lm_eval/api/task.py", line 934, in download
    self.dataset = datasets.load_dataset(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 1392, in load_dataset
    builder_instance = load_dataset_builder(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 1132, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 1031, in dataset_module_factory
    raise e1 from None
  File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 989, in dataset_module_factory
    raise RuntimeError(f"Dataset scripts are no longer supported, but found {filename}")
RuntimeError: Dataset scripts are no longer supported, but found hellaswag.py
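
In short, the failure above comes down to the following behavior; a minimal sketch, assuming datasets>=4.0.0 is installed and that hellaswag still ships a loading script:

```python
# Minimal reproduction sketch, assuming datasets>=4.0.0 is installed.
import datasets

try:
    # With datasets<4.0, `trust_remote_code=True` allowed the hellaswag.py
    # loading script to run and the dataset to load.
    datasets.load_dataset("hellaswag", trust_remote_code=True)
except RuntimeError as err:
    # datasets>=4.0 removed loading scripts (and `trust_remote_code`), so the
    # same call raises "Dataset scripts are no longer supported, but found hellaswag.py".
    print(f"load_dataset failed: {err}")
```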

@astachowiczhabana merged commit 4161cbe into huggingface:v1.19-release on Jul 18, 2025 (1 check passed).
@regisss (Collaborator) commented Jul 21, 2025

I think this should be done for LM eval only.
What are the datasets where this issue was raised for audio classification, stable diffusion training and text generation?

@12010486 (Collaborator, Author)

@regisss, for stable diffusion there was an issue in ControlNet training with the fusing/fill50k dataset. For text generation it was https://huggingface.co/datasets/JulesBelveze/tldr_news, and for audio classification it was the dataset you already added in your draft, plus superb.

@12010486 (Collaborator, Author)

I basically checked the examples that had --trust_remote_code, even the ones we don't explicitly test in CI, but only where the pinned datasets version had already been moved to >=3.0.2.
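
One way to express this kind of version-dependent handling is a small shim like the sketch below; the helper name and the 4.0.0 threshold are assumptions, not the exact diff in this PR:

```python
# Illustrative sketch only: forward `trust_remote_code` to datasets releases
# that still accept it, and drop it otherwise. The helper name and the 4.0.0
# threshold are assumptions, not the exact change made in this PR.
from packaging import version

import datasets


def load_dataset_compat(path, *args, trust_remote_code=False, **kwargs):
    if version.parse(datasets.__version__) < version.parse("4.0.0"):
        kwargs["trust_remote_code"] = trust_remote_code
    # On datasets>=4.0 the argument is gone, and script-based datasets must be
    # converted to a standard format (e.g. Parquet) before they can load.
    return datasets.load_dataset(path, *args, **kwargs)
```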

@regisss (Collaborator) commented Jul 21, 2025

After looking more into it, there is another issue with Datasets v4: it relies on torchcodec for audio decoding, which is compatible with Torch 2.7 only. So let's keep these changes, and we'll update everything once torchcodec is supported on Gaudi.
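
A minimal runtime guard expressing that constraint could look like the sketch below; the version numbers are taken from the comment above, and the check itself is only an illustration:

```python
# Sketch of a runtime guard, assuming datasets>=4.0 decodes audio via torchcodec
# and torchcodec requires torch>=2.7, as discussed above.
from packaging import version

import datasets
import torch

needs_torchcodec = version.parse(datasets.__version__) >= version.parse("4.0.0")
torch_release = version.parse(torch.__version__.split("+")[0])  # drop local build suffix

if needs_torchcodec and torch_release < version.parse("2.7"):
    raise RuntimeError(
        "datasets>=4.0 decodes audio via torchcodec, which requires torch>=2.7; "
        "pin datasets<4.0 until torchcodec is supported on Gaudi."
    )
```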

@12010486 deleted the trust_remote_unsupported_upd branch on July 31, 2025.
gplutop7 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request on Oct 15, 2025:

Fix for datasets based on a loading script (…huggingface#453)

Co-authored-by: Silvia Colabrese <[email protected]>