[PaddleV3] Update the Dockerfile to Paddle 3.0.0beta and add a blacklist for CI tests #1061

megemini · 2024-10-09T10:22:12Z

Create A Good Pull Request

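Update the Dockerfile to Paddle 3.0.0beta, and bump PyTorch, ONNX, and the other frameworks to their latest versions
test_benchmark uses black.list to filter tests

By default, nothing is tested
Later, each time a model is fixed, remove its entry from black.list so that CI tests it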
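The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the script from this PR: the file format (one model name per line, with `#` comments) and the helper names `load_blacklist` and `select_models` are assumptions.

```python
from pathlib import Path


def load_blacklist(path):
    """Read model names to skip from a text file.

    Assumed format (hypothetical): one name per line; blank lines and
    anything after '#' are ignored.
    """
    names = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.add(entry)
    return names


def select_models(all_models, blacklist):
    """Return the models CI should test: everything not on the blacklist."""
    return [model for model in all_models if model not in blacklist]
```

With this shape, starting from a black.list that names every model means nothing is tested, and deleting one entry re-enables exactly that model in CI.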

The Dockerfile builds without problems locally:

λ 483e70b23ef6 /home python
Python 3.9.18 (main, Aug 25 2023, 13:20:04) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
>>> import torch
>>> import tensorflow
2024-10-09 06:23:12.329805: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-09 06:23:12.360176: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-09 06:23:12.815496: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (2.0.7) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
>>> import onnx
>>> paddle.__version__
'3.0.0-beta1'
>>> torch.__version__
'2.4.1+cu121'
>>> tensorflow.__version__
'2.16.1'
>>> onnx.__version__
'1.17.0'
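The manual check above could also be scripted. The following is only a sketch: `find_mismatches` is a hypothetical helper, and the prefix comparison is a simplification (for example, the prefix `2.4.1` would also accept `2.4.10`).

```python
def find_mismatches(installed, expected):
    """Return {name: (installed_version, expected_prefix)} for every mismatch.

    A package counts as matching when its version string starts with the
    expected prefix, so local suffixes such as '+cu121' are tolerated.
    """
    mismatches = {}
    for name, prefix in expected.items():
        actual = installed.get(name)
        if actual is None or not actual.startswith(prefix):
            mismatches[name] = (actual, prefix)
    return mismatches


# Versions reported by the interactive session above.
installed = {
    "paddle": "3.0.0-beta1",
    "torch": "2.4.1+cu121",
    "tensorflow": "2.16.1",
    "onnx": "1.17.0",
}
# Versions the Dockerfile is meant to provide.
expected = {
    "paddle": "3.0.0",
    "torch": "2.4.1",
    "tensorflow": "2.16.1",
    "onnx": "1.17.0",
}

print(find_mismatches(installed, expected))
```

In a real image, the `installed` dictionary would be filled from each module's `__version__` attribute rather than hard-coded.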

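One small issue: the Python in the base Docker image is 3.9, not 3.8, the minimum version Paddle supports. It is not a big problem, so I kept it as is.

However, Caffe is not configured in the Dockerfile. How is it set up?

In CI, PyTorch appears to be tested separately from the other frameworks. I am not sure whether my script is correct, so I am submitting it first to see what happens.

Also, for later changes, there is no guarantee that the latest versions of all the frameworks above will pass. If adaptation turns out to be too difficult along the way, the Dockerfile may need to be modified again.

Related: #1060

@luotao1 please review.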

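Please keep the text below at the very end of the PR description, and after submitting the PR, tick the following items according to the actual situation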

Please check the following steps before merging this pull request

Python code style verification
Review all the code diff by yourself
All models (TensorFlow/Caffe/ONNX/PyTorch) testing passed
Details about your pull request, related issues

If this PR adds new model support, please update model_zoo.md and add the model to our test model zoos (@luotao1)

New Model Supported
No New Model Supported

megemini · 2024-10-11T04:38:30Z

Update 20241011

Add installation of protobuf 3.20.2 to the Dockerfile

For Caffe conversion, the protobuf version in the original Docker image is too new, so it needs to be downgraded

megemini · 2024-10-16T14:54:14Z

Update 20241016

Change the CUDA version in the Dockerfile to 11.2

The CUDA version on the CI server appears to be 11.2, judging from an earlier log:

2024-10-13 20:08:04 LD_LIBRARY_PATH=/usr/local/cuda-11.2/targets/x86_64-linux/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
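For illustration, the version hint in that log line can be extracted programmatically. This is a rough sketch under one assumption: the toolkit directory follows the `cuda-X.Y` naming seen in the log. It reflects only how the path is named, not which driver or toolkit is actually usable. `cuda_versions_in_path` is a hypothetical helper.

```python
import re


def cuda_versions_in_path(ld_library_path):
    """Return CUDA versions named by 'cuda-X.Y' path components.

    Versions are listed in order of first appearance, without duplicates.
    """
    versions = []
    for entry in ld_library_path.split(":"):
        match = re.search(r"cuda-(\d+\.\d+)", entry)
        if match and match.group(1) not in versions:
            versions.append(match.group(1))
    return versions


# Value copied from the CI log line above.
log_value = (
    "/usr/local/cuda-11.2/targets/x86_64-linux/lib:"
    "/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
)
print(cuda_versions_in_path(log_value))  # ['11.2']
```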

Could you please try building the image again? 😅😅😅

@luotao1

luotao1 · 2024-10-17T01:45:39Z

docker/Dockerfile

-    python -m pip install torchmetrics==0.10.2 pytorch_lightning==1.5.3 kornia==0.5.11 hypothesis pre-commit==2.17.0 && \
+    python -m pip install wget timm transformers pandas nose pytest opencv-python==4.6.0.66 allure-pytest && \
+    python -m pip install torch==2.4.1 torchvision torchaudio tensorflow==2.16.1 onnx==1.17.0 onnxruntime && \
+    python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/ && \


https://www.paddlepaddle.org.cn/packages/stable/cu118/

Can 3.0.0b1 be installed on the 11.2 image?

luotao1 · 2024-10-17T02:05:20Z

You can revert to the previous commit; exporting the following path is enough:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.8/compat

[Images: before and after]

I will not regenerate the image for now; I will add this line to the CI configuration instead.
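If the same adjustment were needed from a Python-driven test runner rather than from the CI configuration, it might look like the sketch below. `with_library_dir` is a hypothetical helper. Note that the dynamic linker reads `LD_LIBRARY_PATH` when a process starts, so the updated environment only affects child processes launched with it.

```python
import os


def with_library_dir(env, directory):
    """Return a copy of env whose LD_LIBRARY_PATH includes directory.

    The directory is appended at the end and is never added twice.
    """
    updated = dict(env)
    parts = [p for p in updated.get("LD_LIBRARY_PATH", "").split(":") if p]
    if directory not in parts:
        parts.append(directory)
    updated["LD_LIBRARY_PATH"] = ":".join(parts)
    return updated


# Path taken from the export line in the comment above.
child_env = with_library_dir(os.environ, "/usr/local/cuda-11.8/compat")
# child_env can then be passed to subprocess.run(..., env=child_env).
```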

This reverts commit 7ee86f7.

luotao1

LGTM. The version check will be completed in the next PR.

[PaddleV3] Fix the Paddle version check #1064

[Update] dockerfile and ci script

4b8c617

megemini mentioned this pull request Oct 9, 2024

[PaddleV3] x2paddle suite capability development Tracking Issue #1060

Open

14 tasks

[Update] dockerfile of protobuf version

87d1587

PaddlePaddle locked and limited conversation to collaborators Oct 11, 2024

PaddlePaddle unlocked this conversation Oct 11, 2024

megemini added 2 commits October 13, 2024 19:53

[Update] easyocr version

71376b7

[Update] easyocr version

1aeed0d

[Fix] docker cuda version

7ee86f7

luotao1 reviewed Oct 17, 2024

Revert "[Fix] docker cuda version"

01eb968

This reverts commit 7ee86f7.

luotao1 approved these changes Oct 17, 2024

luotao1 merged commit d432e06 into PaddlePaddle:develop Oct 17, 2024
3 of 4 checks passed

megemini mentioned this pull request Oct 17, 2024

[PaddleV3] Add an operator mapping for aten::list in PyTorch and fix the related models #1075

Merged

luotao1 added the contributor External developers label Oct 17, 2024

megemini mentioned this pull request Oct 18, 2024

[PaddleV3] Add an operator mapping for aten::aten_linalg_vector_norm in PyTorch and fix the related models #1076

Merged
