Open
Labels
bug (Something isn't working)
Description
Describe the bug
HfApi.dataset_info ignores the casing of the repository name, so it returns info for dataset names that do not exist.
For example, HfApi.dataset_info('MBZUAI/Bactrian-X') and HfApi.dataset_info('mbzuai/bactrian-x') return the same info, even though no dataset named mbzuai/bactrian-x exists. This causes unexpected behavior in the datasets library: it tries to load both names, with different loaders, although the second one should not load at all.
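One possible client-side guard is to compare the requested name with the id that comes back. This is only a minimal sketch: it assumes the returned id always carries the canonical casing (the reproduction is consistent with that, but it is not confirmed behavior). `checked_dataset_info` and `FakeApi` are hypothetical names used for illustration, not part of huggingface_hub.

```python
from types import SimpleNamespace


def checked_dataset_info(api, repo_id: str):
    """Hypothetical wrapper: fetch dataset info, then reject case-insensitive matches.

    Assumes `info.id` holds the canonical casing of the repository name.
    """
    info = api.dataset_info(repo_id)
    if info.id != repo_id:
        raise ValueError(
            f"Requested {repo_id!r} but the Hub resolved it to {info.id!r}"
        )
    return info


class FakeApi:
    """Offline stand-in for HfApi so the sketch runs without network access."""

    def dataset_info(self, repo_id):
        # Mimics the reported behavior: any casing resolves to the same dataset.
        return SimpleNamespace(id="MBZUAI/Bactrian-X")


info = checked_dataset_info(FakeApi(), "MBZUAI/Bactrian-X")  # exact casing passes
```

With a real HfApi instance in place of FakeApi, a call such as checked_dataset_info(api, 'mbzuai/bactrian-x') would raise ValueError instead of silently returning info for the canonical dataset, provided the assumption above holds.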
Reproduction
from huggingface_hub import HfApi
api = HfApi("https://huggingface.co")
info1 = api.dataset_info('MBZUAI/Bactrian-X')
info2 = api.dataset_info('mbzuai/bactrian-x')
info3 = api.dataset_info('MbZuAi/bactrian-X') # any random casing
info1.id == info2.id == info3.id
>> True
info1.sha == info2.sha == info3.sha
>> True
Logs
No response
System info
- huggingface_hub version: 0.14.1
- Platform: Linux-5.14.0-1059-oem-x86_64-with-glibc2.31
- Python version: 3.9.16
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /home/polina/.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: store
- FastAI: N/A
- Tensorflow: 2.11.0
- Torch: 1.13.1
- Jinja2: 3.1.2
- Graphviz: N/A
- Pydot: N/A
- Pillow: 9.4.0
- hf_transfer: N/A
- gradio: N/A
- ENDPOINT: https://huggingface.co
- HUGGINGFACE_HUB_CACHE: /home/polina/.cache/huggingface/hub
- HUGGINGFACE_ASSETS_CACHE: /home/polina/.cache/huggingface/assets
- HF_TOKEN_PATH: /home/polina/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False