Description

Running the llama2 `generate.py` example (GPU/HF-Transformers-AutoModels) with bigdl-llm 2.4.0b20231101 fails while loading the tokenizer with `OSError: [Errno 24] Too many open files`. Package versions, the command used, and the full log are below.
bigdl-core-xe 2.4.0b20231101
bigdl-core-xe-esimd 2.4.0b20231101
bigdl-llm 2.4.0b20231101
]# numactl -C 0-4 -m 0 python generate.py --repo-id-or-model-path ./pretrained-model/llama2-7b-half/ --n-predict 1024 --prompt "Once upon a time, there existed a little girl who liked to have adventures. She wanted to go to places and meet new people, and have fun"
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:06<00:00, 3.22s/it]
2023-11-01 22:27:43,883 - bigdl.llm.transformers.utils - INFO - Converting the current model to fp8 format......
Traceback (most recent call last):
File "/home/BigDL/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2/generate.py", line 66, in
File "/root/anaconda3/envs/bigdl-llm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1841, in from_pretrained
File "/root/anaconda3/envs/bigdl-llm/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1891, in _from_pretrained
OSError: [Errno 24] Too many open files: './pretrained-model/llama2-7b-half/tokenizer_config.json'
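Not part of the original report, but for context: `[Errno 24]` usually indicates the per-process open-file-descriptor limit (`RLIMIT_NOFILE`, i.e. `ulimit -n`) was exhausted by the time the tokenizer files were opened. Below is a minimal diagnostic sketch, assuming the descriptor limit is the culprit; the target value of 65536 is a hypothetical illustration, not a value recommended by BigDL.

```python
# Check and (optionally) raise the per-process open-file limit before loading
# the model and tokenizer. Linux-specific; the target value is illustrative.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"RLIMIT_NOFILE: soft={soft}, hard={hard}")

# Hypothetical target; cap at the hard limit unless the hard limit is unlimited.
target = 65536 if hard == resource.RLIM_INFINITY else min(65536, hard)
if soft != resource.RLIM_INFINITY and soft < target:
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    print(f"Raised soft limit to {target}")
```

Running `ulimit -n 65536` in the shell before launching `generate.py` would have the same effect for the child process. Whether the limit itself is the root cause, or descriptors are being leaked during the fp8 conversion step, cannot be determined from this log alone.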