Description
When I change the GPU script to run on CPU but keep ipex imported, chatglm2-6b throws the error below. After removing the ipex import, everything works fine. Llama2 is OK on CPU even with ipex imported.
This issue is low priority; I am just recording it here in case others hit the same problem.
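For reference, a minimal sketch of the failing setup (hedged: it assumes the BigDL-LLM 4-bit loading API used by the example script; the checkpoint path, prompt, and generation length are placeholders):

```python
import torch
import intel_extension_for_pytorch as ipex  # the import alone is enough to trigger the failure

from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM

model_path = "THUDM/chatglm2-6b"  # placeholder checkpoint path

# Load in 4-bit as in the GPU example, but keep the model and inputs on CPU.
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

input_ids = tokenizer.encode("hello", return_tensors="pt")  # CPU tensor

# Fails inside apply_rotary_pos_emb with:
#   RuntimeError: Could not run 'torch_ipex::mul_add' with arguments from the 'CPU' backend.
output = model.generate(input_ids, max_new_tokens=32)
```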
(Triggered internally at /build/intel-pytorch-extension/csrc/gpu/jit/fusion_pass.cpp:826.)
query_layer = apply_rotary_pos_emb(query_layer, rotary_pos_emb)
Traceback (most recent call last):
File "/home/arda/kai/BigDL/python/llm/example/gpu/hf-transformers-models/chatglm2/./generate.py", line 67, in <module>
output = model.generate(input_ids,
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/arda/kai/BigDL/python/llm/example/gpu/hf-transformers-models/chatglm2/benchmark_util.py", line 1564, in generate
return self.greedy_search(
File "/home/arda/kai/BigDL/python/llm/example/gpu/hf-transformers-models/chatglm2/benchmark_util.py", line 2385, in greedy_search
outputs = self(
File "/home/arda/kai/BigDL/python/llm/example/gpu/hf-transformers-models/chatglm2/benchmark_util.py", line 531, in __call__
return self.model(*args, **kwargs)
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arda/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 845, in forward
transformer_outputs = self.transformer(
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arda/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 741, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arda/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 588, in forward
hidden_states, kv_cache = layer(
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arda/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 510, in forward
attention_output, kv_cache = self.self_attention(
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arda/anaconda3/envs/kai-llm-pip-nf4/lib/python3.9/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 124, in chatglm2_attention_forward_8eb45c
query_layer = apply_rotary_pos_emb(query_layer, rotary_pos_emb)
NotImplementedError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Could not run 'torch_ipex::mul_add' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torch_ipex::mul_add' is only available for these backends: [XPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
XPU: registered at /build/intel-pytorch-extension/csrc/gpu/aten/operators/TripleOps.cpp:510 [kernel]
BackendSelect: fallthrough registered at /build/pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at /build/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:144 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /build/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:491 [backend fallback]
Functionalize: registered at /build/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:280 [backend fallback]
Named: registered at /build/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at /build/pytorch/aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at /build/pytorch/aten/src/ATen/native/NegateFallback.cpp:19 [backend fallback]
ZeroTensor: registered at /build/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:63 [backend fallback]
AutogradOther: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:30 [backend fallback]
AutogradCPU: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:34 [backend fallback]
AutogradCUDA: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:42 [backend fallback]
AutogradXLA: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:46 [backend fallback]
AutogradMPS: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:54 [backend fallback]
AutogradXPU: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:38 [backend fallback]
AutogradHPU: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:67 [backend fallback]
AutogradLazy: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:50 [backend fallback]
AutogradMeta: fallthrough registered at /build/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:58 [backend fallback]
Tracer: registered at /build/pytorch/torch/csrc/autograd/TraceTypeManual.cpp:294 [backend fallback]
AutocastCPU: fallthrough registered at /build/pytorch/aten/src/ATen/autocast_mode.cpp:487 [backend fallback]
AutocastXPU: registered at /build/intel-pytorch-extension/csrc/gpu/aten/operators/TripleOps.cpp:510 [kernel]
AutocastCUDA: fallthrough registered at /build/pytorch/aten/src/ATen/autocast_mode.cpp:354 [backend fallback]
FuncTorchBatched: registered at /build/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:815 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /build/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /build/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1073 [backend fallback]
VmapMode: fallthrough registered at /build/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /build/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:210 [backend fallback]
PythonTLSSnapshot: registered at /build/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:152 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /build/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:487 [backend fallback]
PythonDispatcher: registered at /build/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:148 [backend fallback]
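Judging from the traceback, importing ipex registers a JIT fusion pass (csrc/gpu/jit/fusion_pass.cpp) that rewrites the scripted apply_rotary_pos_emb graph to use torch_ipex::mul_add, which only has XPU kernels, so the fused graph cannot run on the CPU backend. A possible workaround, consistent with the observation that removing the import fixes the CPU run, is to import ipex only when actually targeting XPU (a sketch; the device switch is hypothetical):

```python
import torch

device = "cpu"  # hypothetical switch; set to "xpu" for the GPU path

if device == "xpu":
    # Importing ipex registers XPU kernels and JIT fusion passes
    # (including the mul+add fusion into torch_ipex::mul_add seen above),
    # so keep the import out of the pure-CPU path.
    import intel_extension_for_pytorch as ipex  # noqa: F401
```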