Closed
154 commits
e57aeda
Add ggml-openvino base files
YangleiZouIntel Oct 29, 2024
e169d0c
add openvino as optional backend for Llama.cpp ggml
zhanmyz Nov 13, 2024
5db7996
* Configure the device(default CPU) that uses OpenVINO to compile th…
zhanmyz Nov 19, 2024
d632239
Solve the issue of abnormal model output caused by using OpenVINO ADD…
zhanmyz Nov 21, 2024
f78c52a
Add OpenVINO MUL operator to GGML of Llama.cpp.
zhanmyz Dec 2, 2024
6d4d318
Add compile options
zhanmyz Dec 2, 2024
a0495c5
add OpenVINO frontend convert process steps
zhanmyz Dec 4, 2024
0d6cf3a
add get openvino available ops function
zhanmyz Dec 5, 2024
d191a77
Add PoC of integration of openvino frontend. Main changes: ggml-ov-fr…
yumengbo Nov 16, 2024
451e201
Implement GgmlOvDecoder. Add dump functions.
yumengbo Nov 19, 2024
42641b6
Convert subgraph with add, sub, mul, div op to ov model and do infer …
yumengbo Nov 22, 2024
a26a28b
Add GGML_OV_FRONTEND option. Add readme.
yumengbo Nov 22, 2024
59495bf
Change output for infer request to set output tensor. Support scale, …
yumengbo Dec 5, 2024
e17fcbf
add GET_ROWS operator of OpenVINO to GGML of llama.cpp
zhanmyz Dec 9, 2024
d844353
Update build.md and add operation mapping(GGML to OpenVINO)
zhanmyz Dec 10, 2024
13a1af6
add the rms_norm operator implemented using OpenVINO to the GGML back…
zhanmyz Dec 16, 2024
58cd50c
Fix issue for output memory copy of infer request
yumengbo Dec 12, 2024
108ac50
Change to implementation following pytorch frontend
yumengbo Dec 12, 2024
8a12ffa
Add support for UNARY SILU op . Fix pytorch impl bugs.
yumengbo Dec 17, 2024
18d53db
Support Softmax op
yumengbo Dec 18, 2024
f4856e3
Support Softmax op
yumengbo Dec 18, 2024
87ea61d
Support ROPE op.
yumengbo Dec 21, 2024
d200348
Add support for RMS_NORM OP
zhanmyz Dec 19, 2024
a582b3e
Add MUL_MAT,CPY,CONT as operators implemented in OpenVINO for GGML ba…
zhanmyz Jan 14, 2025
e32ac44
Move CPY from GGML OV Backend to OV Frontend
zhanmyz Jan 22, 2025
d1983ec
add implementation of MUL_MAT, CPY, CONT of GGML ops using OV ops
zhanmyz Feb 18, 2025
dadaa6a
add implementation of CPY when the output tensor is non-contiguous
zhanmyz Feb 19, 2025
83fe0fa
add tmp source code files
zhanmyz Feb 25, 2025
8f43a86
Execute single CONT operator is OK
zhanmyz Feb 25, 2025
3d292bc
Execute CONT & VIEW operators in OV Frontend is OK
zhanmyz Mar 1, 2025
ed4cfd2
OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT graph conversion o…
zhanmyz Mar 3, 2025
075b5c4
OV Frontend supports GET_ROWS/RMS_NORM/MUL/MUL_MAT/ROPE/SCALE/SOFTMAX…
zhanmyz Mar 5, 2025
b1c84c5
Change the input parameter shape of CONT operator
zhanmyz Mar 5, 2025
b92a66f
Change the input and output node shape of MUL_MAT operator
zhanmyz Mar 5, 2025
d13d0f4
Change the input and output node shape of MUL_MAT operator
zhanmyz Mar 5, 2025
2cedfe5
change CONT and MULMAT input node shape
zhanmyz Mar 6, 2025
7028d86
All adjacent ops can conversion but calculation result is wrong and n…
zhanmyz Mar 6, 2025
51e8cc1
1. All operators implemented using OpenVINO can be successfully execu…
zhanmyz Mar 9, 2025
ca2a25e
1. Update the implementation of CPY node when it's non-contiguous
zhanmyz Mar 11, 2025
3714f59
Minor Update
zhanmyz Mar 11, 2025
19ab059
Try to add VIEW node to OV Frontend and have some issues that need to…
zhanmyz Mar 12, 2025
96442b2
1. In the Prompt process and predict first token stage, the PERMUTE n…
zhanmyz Mar 15, 2025
77fbd63
add debug info
zhanmyz Mar 17, 2025
2df7186
Process Prompt and predict first token is OK
zhanmyz Mar 26, 2025
18cb804
1. Solve the AC issue of Permute+VIEW and MULMAL issue in the phase o…
zhanmyz Mar 31, 2025
fc69f10
1. Delete some comments
zhanmyz Mar 31, 2025
a14525e
* Use find_package in CMake to configure OpenVINO
wine99 Apr 14, 2025
e667b8a
change op mappings to list in openvino_supports_op
wine99 Apr 15, 2025
6d06144
2nd+ token correct by fix CPY in OV, remove single op backend compute…
wine99 Apr 15, 2025
30260f9
Arbitrary token len (>32) work; Fix bug in mulmat
wine99 Apr 17, 2025
acc633f
FEAT: do PERMUTE eagerly
wine99 Apr 21, 2025
1636125
FEAT: Add interleaved mode for ROPE
wine99 Apr 22, 2025
95feccb
REFACTOR: support weights as constant
wine99 Apr 28, 2025
c649ee6
STYLE: minor refactor
wine99 Apr 28, 2025
055be3c
PERF: share const nodes for weights for diff infer
wine99 Apr 28, 2025
10bd7a6
BUILD: update build doc, add cmake preset, add CACHE_DIR env var
wine99 Apr 29, 2025
ff18278
FEAT: improve debug capability
wine99 Apr 30, 2025
797662c
PERF: compile once (dynamic graph + cache)
wine99 May 8, 2025
a0d043e
Rebase - Bring up to date and fix build process
virajwad May 9, 2025
7a0ca42
fix build error
wine99 May 13, 2025
005972f
FIX: backend buffer type issue
wine99 May 13, 2025
e5d4a44
STYLE: clang-format
wine99 May 9, 2025
78f2dbe
FEAT: Add all conversion code from ov side
wine99 May 9, 2025
65b7a8b
PERF: favor low precision matmul
wine99 May 13, 2025
92bf764
STYLE and minor REFACTOR
wine99 May 13, 2025
5fb55c8
FIX: Re-add tensor names in cgraph, Add another case for RESHAPE
wine99 May 14, 2025
0c6844e
FIX: input shape of KQ_mask
wine99 May 14, 2025
b56ef38
PERF: add weight constant in parallel
wine99 May 14, 2025
472023b
FIX: set_max_token_len
wine99 May 16, 2025
d99d234
PERF: use Slice+Concat in writing cache_v
wine99 May 16, 2025
6665e73
Update build doc
wine99 May 20, 2025
699518e
Add cgraph tensor output name to OV op name
wine99 May 22, 2025
d6b6935
Update openvino build instructions
ravi9 May 29, 2025
8c964a7
Add initial NPU support
wine99 May 27, 2025
7aa40d0
draft NPU support version 2: prefill + kvcache
wine99 May 29, 2025
6f63cdf
NPU support version 2: prefill + kvcache
wine99 Jun 3, 2025
04dd48a
Change due to ggml cgraph changes, not correct yet
wine99 Jun 4, 2025
1817161
Change due to ggml cgraph changes, llama-3.2 CPU work
wine99 Jun 16, 2025
c11950a
Add AMD64 to CMakeLists
wine99 Jun 16, 2025
51f43b9
Change due to ggml cgraph changes, all device work
wine99 Jun 16, 2025
f4590a2
Refactor: clean, fix warning
wine99 Jun 20, 2025
f998a1b
Update clang-format
wine99 Jun 23, 2025
1de5dff
Stateful transformation for CPU GPU
wine99 Jun 26, 2025
9de368b
Add SwiGLU
wine99 Jul 3, 2025
1de0ea9
Fuse to SDPA
wine99 Jul 3, 2025
ccb8bc3
Replace Concat with Broadcast in MulMat for GQA
wine99 Jul 4, 2025
3510252
Pull out indices creation for kv cache update
wine99 Jul 6, 2025
e465d37
Refactor: remove past_token_len from extra_inputs
wine99 Jul 9, 2025
29f3345
Fix Phi3 SwiGLU and SoftMax
wine99 Jul 9, 2025
510d537
Pull out sin cos from rope
wine99 Jul 9, 2025
4b30837
Reduce memory: free ov weights node after graph conversion
wine99 Jul 11, 2025
33bd760
Fix CPY due to cgraph change
wine99 Jul 17, 2025
f601a69
Added OpenVINO CI/CD. Updated docs
ravi9 Jul 18, 2025
e1104d4
Fix llama-cli
wine99 Jul 23, 2025
5235f13
Fix Phi3 ROPE; Add test-backend-ops
wine99 Jul 21, 2025
3cc9b9d
Fix NPU
wine99 Jul 23, 2025
f24838c
Fix llama-bench; Clang-format
wine99 Jul 24, 2025
8140c6b
Fix llama-perplexity
wine99 Jul 24, 2025
cde44cc
temp. changes for mark decomp
cavusmustafa Jul 29, 2025
513d3a6
matmul in fp32
wine99 Jul 29, 2025
9be4c1d
mulmat input conversion fix
cavusmustafa Jul 30, 2025
a186425
mulmat type conversion update
cavusmustafa Jul 30, 2025
3ec63b7
add mark decomp pass
cavusmustafa Jul 30, 2025
4ae1ca8
Revert changes in fuse_to_sdpa
wine99 Jul 30, 2025
bba2fa9
Update build.md
ravi9 Jul 31, 2025
3204fa2
Fix test-backend-ops
wine99 Jul 31, 2025
7d0587d
Skip test-thread-safety; Run ctest only in ci/run.sh
wine99 Jul 31, 2025
d3ae51e
Use CiD for NPU
wine99 Aug 1, 2025
f99b831
Optimize tensor conversion, improve TTFT
wine99 Aug 4, 2025
453fe10
Support op SET_ROWS
wine99 Aug 13, 2025
0b3ca49
Fix NPU
wine99 Aug 14, 2025
913ac7b
Remove CPY
wine99 Aug 14, 2025
197ad20
Fix test-backend-ops
wine99 Aug 14, 2025
f0eb04f
Minor updates for raising PR
wine99 Aug 14, 2025
2ca9e51
Perf: RMS fused to OV internal RMS op
wine99 Aug 27, 2025
f8457d5
Fix after rebasing
wine99 Sep 4, 2025
2bf043d
Change openvino device_type to GPU; Enable flash_attn
wine99 Sep 5, 2025
7f11c4f
Update supports_buft and supports_op for quantized models
wine99 Aug 5, 2025
9fcab9d
Add quant weight conversion functions from genai gguf reader
wine99 Aug 5, 2025
84395c3
Quant models run with accuracy issue
wine99 Aug 6, 2025
9ddfaa3
Fix accuracy: disable cpu_repack
wine99 Aug 7, 2025
bfb1350
Fix CI; Disable test-backend-ops
wine99 Aug 7, 2025
b9114ad
Fix Q4_1
wine99 Aug 8, 2025
ec4491a
Fix test-thread-safety
wine99 Aug 8, 2025
49eca46
Fix test-backend-ops: Treat quantized tensors as weights
wine99 Aug 12, 2025
6d2f7af
Add NPU Q4_0 support
wine99 Aug 19, 2025
6cb00ed
NPU perf: eliminate zp
wine99 Aug 22, 2025
bf1c0eb
NPU perf: Faster compilation
wine99 Aug 26, 2025
53eda79
Dequantize q4_1 q4_k q6_k for NPU
wine99 Aug 29, 2025
7dfcb6b
Add custom quant type: q8_1_c, q4_0_128
wine99 Sep 2, 2025
b3bb37c
Set m_is_static=false as default in decoder
wine99 Sep 2, 2025
7b526f2
Simplify translation of get_rows
wine99 Sep 2, 2025
ad06c0b
Fix after rebasing
wine99 Sep 8, 2025
eb30874
Improve debug util; Eliminate nop ReshapeReshape
wine99 Sep 10, 2025
169d067
STYLE: make get_types_to_requant a function
wine99 Sep 10, 2025
d6db0ae
Support BF16 model
wine99 Sep 11, 2025
121113f
Fix NPU compile
wine99 Sep 12, 2025
382668f
WA for npu 1st token acc issue
wine99 Sep 12, 2025
ef88e6f
Apply EliminateZP only for npu
wine99 Sep 12, 2025
2774507
Add GeGLU
wine99 Sep 15, 2025
3c0c0c7
Fix Hunyuan
wine99 Sep 15, 2025
2e8b899
Support iSWA
wine99 Sep 16, 2025
158129f
Fix NPU accuracy
wine99 Sep 17, 2025
d00aa4c
Fix ROPE accuracy when freq_scale != 1
wine99 Sep 17, 2025
e855e09
Minor: not add attention_size_swa for non-swa model
wine99 Sep 17, 2025
8c95d78
Minor refactor
wine99 Sep 19, 2025
6db0773
Add Q5_K to support phi-3-q4_k_m
wine99 Sep 23, 2025
a436bc7
Requantize Q6_K (gs16) to gs32 on GPU
wine99 Sep 26, 2025
76ab76e
Fix after rebasing
wine99 Sep 26, 2025
c7ac35d
initial kvcachefusion support
cavusmustafa Sep 26, 2025
767f192
openvino backend 4d sdpa
cavusmustafa Sep 29, 2025
5d8f984
openvino backend fix for values shape issue
cavusmustafa Sep 30, 2025
a5fbb53
disable sdpa optimization
cavusmustafa Sep 30, 2025
94d64a8
Merge branch 'ravi9:master' into kvcachefusion_2
cavusmustafa Sep 30, 2025
134 changes: 134 additions & 0 deletions .devops/openvino.Dockerfile
@@ -0,0 +1,134 @@
ARG OPENVINO_VERSION_MAJOR=2025.2
ARG OPENVINO_VERSION_FULL=2025.2.0.19140.c01cd93e24d
ARG UBUNTU_VERSION=24.04

# Optional proxy build arguments - empty by default
ARG http_proxy=
ARG https_proxy=

## Build Image
FROM ubuntu:${UBUNTU_VERSION} AS build

# Pass proxy args to build stage
ARG http_proxy
ARG https_proxy

RUN apt-get update && \
apt-get install -y --no-install-recommends \
ca-certificates \
gnupg \
wget \
git \
cmake \
ninja-build \
build-essential \
libtbb12 \
libcurl4-openssl-dev && \
rm -rf /var/lib/apt/lists/*

# Install OpenVINO for Ubuntu 24.04
ARG OPENVINO_VERSION_MAJOR
ARG OPENVINO_VERSION_FULL
RUN mkdir -p /opt/intel && \
wget https://storage.openvinotoolkit.org/repositories/openvino/packages/${OPENVINO_VERSION_MAJOR}/linux/openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64.tgz && \
tar -xf openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64.tgz && \
mv openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64 /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} && \
cd /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} && \
echo "Y" | ./install_dependencies/install_openvino_dependencies.sh && \
cd - && \
ln -s /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} /opt/intel/openvino

ENV OpenVINO_DIR=/opt/intel/openvino

WORKDIR /app

COPY . .

# Build Stage
RUN bash -c "source ${OpenVINO_DIR}/setupvars.sh && \
cmake -B build/ReleaseOV -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_OPENVINO=ON && \
cmake --build build/ReleaseOV -j$(nproc)"

# Copy all necessary libraries
RUN mkdir -p /app/lib && \
find build/ReleaseOV -name '*.so*' -exec cp {} /app/lib \; && \
find ${OpenVINO_DIR}/runtime/lib/intel64 -name '*.so*' -exec cp -P {} /app/lib \; 2>/dev/null || \
find ${OpenVINO_DIR}/lib/intel64 -name '*.so*' -exec cp -P {} /app/lib \;

# Create runtime directories and copy binaries
RUN mkdir -p /app/full \
&& cp build/ReleaseOV/bin/* /app/full/ \
&& cp *.py /app/full \
&& cp -r gguf-py /app/full \
&& cp -r requirements /app/full \
&& cp requirements.txt /app/full \
&& cp .devops/tools.sh /app/full/tools.sh

## Base Runtime Image
FROM ubuntu:${UBUNTU_VERSION} AS base

# Pass proxy args to runtime stage
ARG http_proxy
ARG https_proxy

RUN apt-get update \
&& apt-get install -y libgomp1 libtbb12 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
&& find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete \
&& find /var/cache -type f -delete

COPY --from=build /app/lib/ /app/

### Full (all binaries)
FROM base AS full

ARG http_proxy
ARG https_proxy

COPY --from=build /app/full /app/

WORKDIR /app

RUN apt-get update && \
apt-get install -y --no-install-recommends \
git \
python3 \
python3-venv \
python3-pip && \
python3 -m venv /ov-venv && \
/ov-venv/bin/pip install --no-cache-dir --upgrade pip setuptools wheel && \
/ov-venv/bin/pip install --no-cache-dir -r requirements.txt && \
apt-get autoremove -y && \
apt-get clean && \
rm -rf /tmp/* /var/tmp/* && \
find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete && \
find /var/cache -type f -delete

ENTRYPOINT ["/bin/bash", "-c", "source /ov-venv/bin/activate && exec /app/tools.sh \"$@\"", "--"]


### Light, CLI only
FROM base AS light

COPY --from=build /app/full/llama-cli /app/

WORKDIR /app

ENTRYPOINT [ "/app/llama-cli" ]

### Server, Server only
FROM base AS server

ENV LLAMA_ARG_HOST=0.0.0.0

COPY --from=build /app/full/llama-server /app/

WORKDIR /app

HEALTHCHECK CMD [ "curl", "-f", "http://localhost:8080/health" ]

ENTRYPOINT [ "/app/llama-server" ]
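
The Dockerfile above ends in three final stages (`full`, `light`, `server`), each selectable with `docker build --target`. The following is a minimal sketch of how each stage could be built from the repository root. The image names (`llama-openvino:<stage>`) and the helper function are hypothetical, and the script only prints the commands rather than running Docker.

```shell
# Sketch: print one `docker build` command per final stage of
# .devops/openvino.Dockerfile. Nothing is executed against Docker.
# Assumption: image names "llama-openvino:<stage>" are illustrative only.
set -eu

print_build_cmds() {
    for target in full light server; do
        # --target picks the final stage; -f points at the Dockerfile from this diff.
        printf 'docker build -f .devops/openvino.Dockerfile --target %s -t llama-openvino:%s .\n' \
            "$target" "$target"
    done
}

print_build_cmds
```

The optional `http_proxy` and `https_proxy` build arguments declared at the top of the Dockerfile can be supplied with `--build-arg` when a proxy is needed.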
39 changes: 39 additions & 0 deletions .github/workflows/build.yml
@@ -629,6 +629,45 @@ jobs:
-DGGML_SYCL_F16=ON
cmake --build build --config Release -j $(nproc)

  ubuntu-24-cmake-openvino:
    runs-on: ubuntu-24.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v4

      - name: ccache
        uses: hendrikmuhs/ccache-action@v1.2.16
        with:
          key: ubuntu-24-cmake-openvino-no-preset-v1
          evict-old-files: 1d

      - name: Dependencies
        id: depends
        run: |
          export OPENVINO_VERSION_MAJOR=2025.2
          export OPENVINO_VERSION_FULL=2025.2.0.19140.c01cd93e24d
          sudo apt-get update
          sudo apt-get install -y build-essential libcurl4-openssl-dev libtbb12 cmake ninja-build python3-pip curl wget tar
          sudo mkdir -p /opt/intel
          wget -O openvino_${OPENVINO_VERSION_MAJOR}.tgz https://storage.openvinotoolkit.org/repositories/openvino/packages/${OPENVINO_VERSION_MAJOR}/linux/openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64.tgz
          tar -xf openvino_${OPENVINO_VERSION_MAJOR}.tgz
          sudo mv openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64 /opt/intel/openvino_${OPENVINO_VERSION_MAJOR}
          rm openvino_${OPENVINO_VERSION_MAJOR}.tgz
          cd /opt/intel/openvino_${OPENVINO_VERSION_MAJOR}
          echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh && cd -
          sudo ln -s /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} /opt/intel/openvino

      - name: Build
        id: cmake_build
        run: |
          source /opt/intel/openvino/setupvars.sh
          cmake -B build/ReleaseOV -G Ninja \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_OPENVINO=ON
          cmake --build build/ReleaseOV --config Release -j $(nproc)

build-linux-cross:
uses: ./.github/workflows/build-linux-cross.yml
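
The `Dependencies` step pins OpenVINO with two variables: `OPENVINO_VERSION_MAJOR` selects the directory on the package server and `OPENVINO_VERSION_FULL` selects the archive name. The following sketch rebuilds the download URL from those two values and checks that they agree before anything is fetched. The helper names `openvino_archive_url` and `versions_agree` are hypothetical, and no download is performed.

```shell
# Sketch: derive the OpenVINO archive URL from the two pinned version
# variables used in the workflow and verify that they are consistent.
# Assumption: helper function names are illustrative, not part of the PR.
set -eu

OPENVINO_VERSION_MAJOR=2025.2
OPENVINO_VERSION_FULL=2025.2.0.19140.c01cd93e24d

openvino_archive_url() {
    # $1 = major version (server directory), $2 = full version (file name)
    printf 'https://storage.openvinotoolkit.org/repositories/openvino/packages/%s/linux/openvino_toolkit_ubuntu24_%s_x86_64.tgz\n' "$1" "$2"
}

versions_agree() {
    # Succeeds only when the full version starts with "<major>."
    case "$2" in
        "$1".*) return 0 ;;
        *) return 1 ;;
    esac
}

if versions_agree "$OPENVINO_VERSION_MAJOR" "$OPENVINO_VERSION_FULL"; then
    openvino_archive_url "$OPENVINO_VERSION_MAJOR" "$OPENVINO_VERSION_FULL"
else
    echo "OPENVINO_VERSION_MAJOR and OPENVINO_VERSION_FULL disagree" >&2
    exit 1
fi
```

The same pair of values appears in the Dockerfile `ARG` lines and in the release workflow, so a version bump has to touch all three places.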

1 change: 1 addition & 0 deletions .github/workflows/docker.yml
@@ -45,6 +45,7 @@ jobs:
- { tag: "intel", dockerfile: ".devops/intel.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true }
- { tag: "vulkan", dockerfile: ".devops/vulkan.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false }
- { tag: "s390x", dockerfile: ".devops/s390x.Dockerfile", platforms: "linux/s390x", full: true, light: true, server: true, free_disk_space: false }
- { tag: "openvino", dockerfile: ".devops/openvino.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false }
# Note: the rocm images are failing due to a compiler error and are disabled until this is fixed to allow the workflow to complete
#- {tag: "rocm", dockerfile: ".devops/rocm.Dockerfile", platforms: "linux/amd64,linux/arm64", full: true, light: true, server: true, free_disk_space: true }
steps:
57 changes: 57 additions & 0 deletions .github/workflows/release.yml
@@ -241,6 +241,63 @@ jobs:
path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-vulkan-x64.zip
name: llama-bin-ubuntu-vulkan-x64.zip

  ubuntu-24-openvino:
    runs-on: ubuntu-24.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: ccache
        uses: hendrikmuhs/ccache-action@v1.2.16
        with:
          key: ubuntu-24-cmake-openvino-release-no-preset-v1
          evict-old-files: 1d

      - name: Dependencies
        id: depends
        run: |
          export OPENVINO_VERSION_MAJOR=2025.2
          export OPENVINO_VERSION_FULL=2025.2.0.19140.c01cd93e24d
          sudo apt-get update
          sudo apt-get install -y build-essential libcurl4-openssl-dev libtbb12 cmake ninja-build python3-pip curl wget tar
          sudo mkdir -p /opt/intel
          wget -O openvino_${OPENVINO_VERSION_MAJOR}.tgz https://storage.openvinotoolkit.org/repositories/openvino/packages/${OPENVINO_VERSION_MAJOR}/linux/openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64.tgz
          tar -xf openvino_${OPENVINO_VERSION_MAJOR}.tgz
          sudo mv openvino_toolkit_ubuntu24_${OPENVINO_VERSION_FULL}_x86_64 /opt/intel/openvino_${OPENVINO_VERSION_MAJOR}
          rm openvino_${OPENVINO_VERSION_MAJOR}.tgz
          cd /opt/intel/openvino_${OPENVINO_VERSION_MAJOR}
          echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh && cd -
          sudo ln -s /opt/intel/openvino_${OPENVINO_VERSION_MAJOR} /opt/intel/openvino

      - name: Build
        id: cmake_build
        run: |
          source /opt/intel/openvino/setupvars.sh
          cmake -B build/ReleaseOV -G Ninja \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_OPENVINO=ON
          cmake --build build/ReleaseOV --config Release -j $(nproc)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/ReleaseOV/bin/
          zip -r llama-${{ steps.tag.outputs.name }}-bin-ubuntu-openvino-x64.zip ./build/ReleaseOV/bin/*

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-openvino-x64.zip
          name: llama-bin-ubuntu-openvino-x64.zip

windows-cpu:
runs-on: windows-2025
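
The `Pack artifacts` step in the `ubuntu-24-openvino` job names the archive after the tag produced by the `get-tag-name` action. The following sketch shows how that name is composed and which commands a local reproduction of the step might run. The tag value `b0000` is a placeholder, the helper names are hypothetical, and the commands are only printed.

```shell
# Sketch: compose the release archive name used by the "Pack artifacts" step
# and print the packing commands without running them.
# Assumption: "b0000" stands in for the tag that get-tag-name would produce.
set -eu

artifact_name() {
    # $1 = tag name
    printf 'llama-%s-bin-ubuntu-openvino-x64.zip\n' "$1"
}

print_pack_cmds() {
    # Mirrors the two commands in the workflow's run block.
    printf 'cp LICENSE ./build/ReleaseOV/bin/\n'
    printf 'zip -r %s ./build/ReleaseOV/bin/*\n' "$(artifact_name "$1")"
}

print_pack_cmds "b0000"
```

The uploaded artifact keeps a tag-independent name (`llama-bin-ubuntu-openvino-x64.zip`), while the file inside it carries the tag.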
