
Commit 16cf76b

Merge branch 'main' into codegen/rag_agents

2 parents 7b848c0 + 0392610

15 files changed: +124 −983 lines

.github/workflows/dockerhub-description.yml (+84 −951)

Large diffs are not rendered by default.

AvatarChatbot/tests/test_compose_on_gaudi.sh (+5 −4)

@@ -86,15 +86,16 @@ function start_services() {
   docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
   n=0
   until [[ "$n" -ge 200 ]]; do
-    docker logs tgi-gaudi-server > $LOG_PATH/tgi_service_start.log
-    if grep -q Connected $LOG_PATH/tgi_service_start.log; then
+    docker logs tgi-gaudi-server > $LOG_PATH/tgi_service_start.log && docker logs whisper-service 2>&1 | tee $LOG_PATH/whisper_service_start.log && docker logs speecht5-service 2>&1 | tee $LOG_PATH/speecht5_service_start.log
+    if grep -q Connected $LOG_PATH/tgi_service_start.log && grep -q running $LOG_PATH/whisper_service_start.log && grep -q running $LOG_PATH/speecht5_service_start.log; then
       break
     fi
-    sleep 5s
+    sleep 10s
     n=$((n+1))
   done
   echo "All services are up and running"
-  sleep 5s
+  # sleep 5s
+  sleep 1m
 }
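The loop above is a readiness gate: it re-reads each container's logs until a known ready marker appears, up to 200 attempts. A minimal standalone sketch of that pattern follows; `wait_for_log` is a hypothetical helper (not part of the repository), and the service names and markers are simply the ones this script checks.

```bash
#!/usr/bin/env bash
# Sketch of the readiness-polling pattern above.
# wait_for_log is an illustrative helper, not part of the repo.
wait_for_log() {
  local container=$1 pattern=$2 retries=${3:-200}
  local n=0
  until [[ "$n" -ge "$retries" ]]; do
    # Re-read the container's logs and look for the ready marker.
    if docker logs "$container" 2>&1 | grep -q "$pattern"; then
      return 0
    fi
    sleep 10s
    n=$((n+1))
  done
  echo "Timed out waiting for '$pattern' in $container logs" >&2
  return 1
}

# Mirrors the checks in start_services():
wait_for_log tgi-gaudi-server Connected &&
  wait_for_log whisper-service running &&
  wait_for_log speecht5-service running
```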

AvatarChatbot/tests/test_compose_on_xeon.sh (+6 −10)

@@ -85,15 +85,16 @@ function start_services() {
   # Start Docker Containers
   docker compose up -d
   n=0
-  until [[ "$n" -ge 100 ]]; do
-    docker logs tgi-service > $LOG_PATH/tgi_service_start.log
-    if grep -q Connected $LOG_PATH/tgi_service_start.log; then
+  until [[ "$n" -ge 200 ]]; do
+    docker logs tgi-service > $LOG_PATH/tgi_service_start.log && docker logs whisper-service 2>&1 | tee $LOG_PATH/whisper_service_start.log && docker logs speecht5-service 2>&1 | tee $LOG_PATH/speecht5_service_start.log
+    if grep -q Connected $LOG_PATH/tgi_service_start.log && grep -q running $LOG_PATH/whisper_service_start.log && grep -q running $LOG_PATH/speecht5_service_start.log; then
       break
     fi
-    sleep 5s
+    sleep 10s
     n=$((n+1))
   done
   echo "All services are up and running"
+  sleep 1m
 }

@@ -104,6 +105,7 @@ function validate_megaservice() {
   if [[ $result == *"mp4"* ]]; then
     echo "Result correct."
   else
+    echo "Result wrong, print docker logs."
     docker logs whisper-service > $LOG_PATH/whisper-service.log
     docker logs speecht5-service > $LOG_PATH/speecht5-service.log
     docker logs tgi-service > $LOG_PATH/tgi-service.log

@@ -117,19 +119,13 @@ function validate_megaservice() {
 }


-#function validate_frontend() {
-
-#}
-
-
 function stop_docker() {
   cd $WORKPATH/docker_compose/intel/cpu/xeon
   docker compose down
 }


 function main() {
-
   stop_docker
   if [[ "$IMAGE_REPO" == "opea" ]]; then build_docker_images; fi
   start_services
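The new `else` branch follows the usual dump-logs-on-failure convention in these tests. A compact sketch of the same idea, assuming `$result` and `$LOG_PATH` are set as in the script (`dump_logs` is an illustrative helper, not in the repo):

```bash
# Illustrative: collect logs for each service when validation fails.
dump_logs() {
  local svc
  for svc in whisper-service speecht5-service tgi-service; do
    docker logs "$svc" > "$LOG_PATH/$svc.log" 2>&1
  done
}

if [[ $result == *"mp4"* ]]; then
  echo "Result correct."
else
  echo "Result wrong, print docker logs."
  dump_logs
  exit 1
fi
```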

ChatQnA/README.md (+8 −7)

@@ -15,13 +15,14 @@ RAG bridges the knowledge gap by dynamically fetching relevant information from

 ## 🤖 Automated Terraform Deployment using Intel® Optimized Cloud Modules for **Terraform**

-| Cloud Provider | Intel Architecture | Intel Optimized Cloud Module for Terraform | Comments |
-| -------------------- | --------------------------------- | ------------------------------------------ | -------- |
-| AWS | 4th Gen Intel Xeon with Intel AMX | [AWS Module](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) | Uses meta-llama/Meta-Llama-3-8B-Instruct by default |
-| AWS Falcon2-11B | 4th Gen Intel Xeon with Intel AMX | [AWS Module with Falcon11B](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna-falcon11B) | Uses TII Falcon2-11B LLM Model |
-| GCP | 5th Gen Intel Xeon with Intel AMX | [GCP Module](https://github.com/intel/terraform-intel-gcp-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) | Also supports Confidential AI by using Intel® TDX with 4th Gen Xeon |
-| Azure | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | Work-in-progress |
-| Intel Tiber AI Cloud | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | Work-in-progress |
+| Cloud Provider | Intel Architecture | Intel Optimized Cloud Module for Terraform | Comments |
+| -------------------- | ------------------------------------------------- | ------------------------------------------ | -------- |
+| AWS | 4th Gen Intel Xeon with Intel AMX | [AWS Deployment](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) | Uses meta-llama/Meta-Llama-3-8B-Instruct by default |
+| AWS Falcon2-11B | 4th Gen Intel Xeon with Intel AMX | [AWS Deployment with Falcon11B](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna-falcon11B) | Uses TII Falcon2-11B LLM Model |
+| AWS Falcon3 | 4th Gen Intel Xeon with Intel AMX | [AWS Deployment with Falcon3](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna-falcon3) | Uses TII Falcon3 LLM Model |
+| GCP | 4th/5th Gen Intel Xeon with Intel AMX & Intel TDX | [GCP Deployment](https://github.com/intel/terraform-intel-gcp-vm/tree/main/examples/gen-ai-xeon-opea-chatqna) | Supports Confidential AI by using Intel® TDX with 4th Gen Xeon |
+| Azure | 4th/5th Gen Intel Xeon with Intel AMX & Intel TDX | [Azure Deployment](https://github.com/intel/terraform-intel-azure-linux-vm/tree/main/examples/azure-gen-ai-xeon-opea-chatqna-tdx) | Supports Confidential AI by using Intel® TDX with 4th Gen Xeon |
+| Intel Tiber AI Cloud | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | Work-in-progress |

 ## Automated Deployment to Ubuntu based system (if not using Terraform) using Intel® Optimized Cloud Modules for **Ansible**
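Each linked module is a self-contained Terraform example, so deploying one follows the standard Terraform workflow. A hedged sketch for the AWS ChatQnA entry in the table above (the required input variables depend on that example's own `variables.tf`; consult its README before applying):

```bash
# Sketch: standard Terraform workflow against the AWS ChatQnA example.
git clone https://github.com/intel/terraform-intel-aws-vm.git
cd terraform-intel-aws-vm/examples/gen-ai-xeon-opea-chatqna

terraform init    # fetch providers and modules
terraform plan    # review the resources to be created
terraform apply   # provision the Xeon VM and deploy ChatQnA
```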

ChatQnA/docker_compose/intel/cpu/xeon/compose.yaml (+2 −1)

@@ -96,6 +96,7 @@ services:
       HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
       LLM_MODEL_ID: ${LLM_MODEL_ID}
       VLLM_TORCH_PROFILER_DIR: "/mnt"
+      VLLM_CPU_KVCACHE_SPACE: 40
     healthcheck:
       test: ["CMD-SHELL", "curl -f http://$host_ip:9009/health || exit 1"]
       interval: 10s

@@ -124,7 +125,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host
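Two settings change here: the vLLM container gains `VLLM_CPU_KVCACHE_SPACE: 40`, which vLLM's CPU backend reads as the KV-cache budget in GiB, and `LLM_SERVER_PORT` is pinned to 80 instead of honoring an environment override — the same pinning recurs in every compose variant below. One way to confirm what the containers will actually receive is to render the resolved compose file; a sketch (run from the compose directory; the `docker exec` line assumes the container is reachable by its service name):

```bash
# Render the fully substituted compose file and inspect the two changed settings.
cd ChatQnA/docker_compose/intel/cpu/xeon
docker compose -f compose.yaml config | grep -E "VLLM_CPU_KVCACHE_SPACE|LLM_SERVER_PORT"

# After the stack is up, read the value from the running vLLM container
# (assumes the container shares the service's name, vllm-service):
docker exec vllm-service env | grep VLLM_CPU_KVCACHE_SPACE
```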

ChatQnA/docker_compose/intel/cpu/xeon/compose_milvus.yaml (+1 −1)

@@ -183,7 +183,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host

ChatQnA/docker_compose/intel/cpu/xeon/compose_pinecone.yaml (+1 −1)

@@ -107,7 +107,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LOGFLAG=${LOGFLAG}
       - LLM_MODEL=${LLM_MODEL_ID}
     ipc: host

ChatQnA/docker_compose/intel/cpu/xeon/compose_qdrant.yaml (+1 −1)

@@ -113,7 +113,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host

ChatQnA/docker_compose/intel/cpu/xeon/compose_tgi.yaml (+1 −1)

@@ -113,7 +113,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=tgi-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host

ChatQnA/docker_compose/intel/cpu/xeon/compose_without_rerank.yaml (+1 −1)

@@ -94,7 +94,7 @@ services:
       - EMBEDDING_SERVER_PORT=${EMBEDDING_SERVER_PORT:-80}
       - RETRIEVER_SERVICE_HOST_IP=retriever
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
       - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_NO_RERANK}

ChatQnA/docker_compose/intel/hpu/gaudi/compose.yaml (+1 −1)

@@ -133,7 +133,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host

ChatQnA/docker_compose/intel/hpu/gaudi/compose_guardrails.yaml (+1 −1)

@@ -166,7 +166,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
       - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_GUARDRAILS}

ChatQnA/docker_compose/intel/hpu/gaudi/compose_tgi.yaml (+1 −1)

@@ -127,7 +127,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=tgi-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host

ChatQnA/docker_compose/intel/hpu/gaudi/compose_without_rerank.yaml (+1 −1)

@@ -99,7 +99,7 @@ services:
       - EMBEDDING_SERVER_PORT=${EMBEDDING_SERVER_PORT:-80}
       - RETRIEVER_SERVICE_HOST_IP=retriever
       - LLM_SERVER_HOST_IP=vllm-service
-      - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
       - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_NO_RERANK}

CodeGen/README.md (+10 −1)

@@ -89,7 +89,16 @@ flowchart LR
   DP <-.->VDB
 ```

-## Deploy CodeGen Service
+## 🤖 Automated Terraform Deployment using Intel® Optimized Cloud Modules for **Terraform**
+
+| Cloud Provider | Intel Architecture | Intel Optimized Cloud Module for Terraform | Comments |
+| -------------------- | --------------------------------- | ------------------------------------------ | -------- |
+| AWS | 4th Gen Intel Xeon with Intel AMX | [AWS Deployment](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-codegen) | |
+| GCP | 4th/5th Gen Intel Xeon | [GCP Deployment](https://github.com/intel/terraform-intel-gcp-vm/tree/main/examples/gen-ai-xeon-opea-codegen) | |
+| Azure | 4th/5th Gen Intel Xeon | Work-in-progress | |
+| Intel Tiber AI Cloud | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | |
+
+## Manual Deployment of CodeGen Service

 The CodeGen service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processor.
