Commit 27266cb

Add stable diffusion inference
1 parent 353a179 commit 27266cb

File tree

19 files changed (+1884, -2 lines)

.dockerignore

Lines changed: 4 additions & 2 deletions
@@ -11,14 +11,16 @@ venv/
 
 # Development
 models
-weights
-weights_tf
 models_tf
 models_sd
+weights
+weights_tf
+weights_sd
 .ipynb_checkpoints
 deb
 *plan
 *onnx
 *pt
 *pth
 inference_notebook
+bash_scripts

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ venv/
 # Development
 weights
 weights_tf
+weights_sd
 .ipynb_checkpoints
 *.plan
 *.onnx

README.md

Lines changed: 9 additions & 0 deletions
@@ -72,6 +72,10 @@ I have also prepared some notes here in README, you can explore them too.
 1. Make sure to use the same input and output names while creating the Onnx model and during client inference.
 1. Take care of the dtypes you use to compile to Onnx and the ones specified in `config.pbtxt`. For instance, the transformers tokenizer returns dtype int64; if you specify int32 (otherwise preferable) in `config.pbtxt`, it will fail.
 
+### Torchscript Backend
+
+1. You can configure the following [parameters](https://github.com/triton-inference-server/pytorch_backend#parameters) when using the torchscript platform.
+
 ### TensorRT Backend
 
 #### Installation
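The dtype and parameter notes in the hunk above both come down to entries in the model's `config.pbtxt`. A minimal sketch for a tokenizer-fed Torchscript model; the tensor name and dims are hypothetical, not taken from this commit:

```
input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
parameters: {
  key: "INFERENCE_MODE"
  value: { string_value: "true" }
}
```

`TYPE_INT64` matches what the transformers tokenizer emits, and `INFERENCE_MODE` is one of the pytorch_backend parameters linked above.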
@@ -91,6 +95,11 @@ Personal recommendation is to run this within a docker container.
 1. TensorRT does not support every operation, which can cause issues. In that case, try upgrading its version, but keep your system's CUDA and Triton versions in mind. If possible, update the CUDA version.
 1. The FP16 version takes time to compile, so take a break.
 
+## Stable Diffusion
+
+1. While compiling the Unet with ONNX, it will create multiple files because the model size is >2GB.
+1. If loading all of these models for a pipeline doesn't work and the logs don't show anything significant, try loading them individually with `--log-verbose=1`.
+
 ## Features
 
 ### Dynamic Batching
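The multi-file Unet export called out in the new Stable Diffusion section is ONNX's external-data format: protobuf caps a single file at 2GB, so exports above that write the weights as sibling files next to the graph. A minimal sketch of such an export, assuming diffusers is available and using a hypothetical checkpoint id and output path (neither is from this commit):

```python
import torch
from diffusers import UNet2DConditionModel  # assumed dependency


class UNetWrapper(torch.nn.Module):
    """Returns a plain tuple so torch.onnx.export can trace the forward."""

    def __init__(self, unet):
        super().__init__()
        self.unet = unet

    def forward(self, sample, timestep, encoder_hidden_states):
        return self.unet(sample, timestep, encoder_hidden_states, return_dict=False)


# Checkpoint id and output path are illustrative only.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
wrapper = UNetWrapper(unet.eval())

# Dummy inputs matching the SD v1 Unet: latents, timestep, text embeddings.
sample = torch.randn(1, 4, 64, 64)
timestep = torch.tensor([1.0])
hidden_states = torch.randn(1, 77, 768)

# Because the weights exceed protobuf's 2GB limit, the export falls back to
# ONNX's external-data format, writing extra files alongside unet.onnx.
torch.onnx.export(
    wrapper,
    (sample, timestep, hidden_states),
    "weights_sd/unet/unet.onnx",
    input_names=["sample", "timestep", "encoder_hidden_states"],
    output_names=["out_sample"],
    opset_version=14,
)
```

All of those sibling files have to be shipped into the Triton model repository together with the `.onnx` graph; a missing one is a plausible cause of the kind of silent pipeline load failure the notes mention.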

bash_scripts/triton_server_sd.sh

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
@@ -0,0 +1 @@
+docker run --shm-size=16g --ulimit memlock=-1 --ulimit stack=67108864 -p 8000:8000 -p 8001:8001 -p 8002:8002 --rm -it -v ${PWD}/models_sd/:/project/models_sd/ -v ${PWD}/weights_sd/:/project/weights_sd/ triton_cc_sd:0.0.1 tritonserver --model-repository models_sd/torchscript --log-verbose=2 --model-control-mode=poll
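Once that container is up, a quick way to confirm the polled repository actually loaded its models is a few lines of tritonclient from the host. A sketch, not part of this commit:

```python
import tritonclient.http as httpclient

# The HTTP port published by the docker run command above.
client = httpclient.InferenceServerClient(url="localhost:8000")

print("server ready:", client.is_server_ready())

# With --model-control-mode=poll, entries show up here as Triton
# picks them up from models_sd/torchscript.
for model in client.get_model_repository_index():
    print(model["name"], model.get("state"))
```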
File renamed without changes.

dockers/Dockerfile.cpu.sd

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+FROM nvcr.io/nvidia/tritonserver:22.08-py3
+
+ARG PROJECT_PATH=/project
+
+WORKDIR ${PROJECT_PATH}
+SHELL ["/bin/bash", "-c"]
+
+COPY requirements ${PROJECT_PATH}/requirements/
+
+RUN pip install --upgrade pip && \
+    pip install torch==1.13.1 --extra-index-url https://download.pytorch.org/whl/cpu && \
+    pip install -r ${PROJECT_PATH}/requirements/sd.txt
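This Dockerfile presumably produces the `triton_cc_sd:0.0.1` image used by `bash_scripts/triton_server_sd.sh`, i.e. built with something like `docker build -f dockers/Dockerfile.cpu.sd -t triton_cc_sd:0.0.1 .`; the build command itself is an assumption, as it is not part of this commit.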
