MindOCR Models Offline Inference - Quick Start

1. MindOCR Model Support List

1.1 Text Detection

Model    Backbone     Language  Dataset      F-score(%)  FPS    data shape (NCHW)  Config  Download
DBNet    MobileNetV3  en        IC15         76.96       26.19  (1,3,736,1280)     yaml    ckpt | mindir
DBNet    ResNet-18    en        IC15         81.73       24.04  (1,3,736,1280)     yaml    ckpt | mindir
DBNet    ResNet-50    en        IC15         85.00       21.69  (1,3,736,1280)     yaml    ckpt | mindir
DBNet    ResNet-50    ch + en   12 Datasets  83.41       21.69  (1,3,736,1280)     yaml    ckpt | mindir
DBNet++  ResNet-50    en        IC15         86.79       8.46   (1,3,1152,2048)    yaml    ckpt | mindir
DBNet++  ResNet-50    ch + en   12 Datasets  84.30       8.46   (1,3,1152,2048)    yaml    ckpt | mindir
EAST     ResNet-50    en        IC15         86.86       6.72   (1,3,720,1280)     yaml    ckpt | mindir
EAST     MobileNetV3  en        IC15         75.32       26.77  (1,3,720,1280)     yaml    ckpt | mindir
PSENet   ResNet-152   en        IC15         82.50       2.52   (1,3,1472,2624)    yaml    ckpt | mindir
PSENet   ResNet-50    en        IC15         81.37       10.16  (1,3,736,1312)     yaml    ckpt | mindir
PSENet   MobileNetV3  en        IC15         70.56       10.38  (1,3,736,1312)     yaml    ckpt | mindir
FCENet   ResNet-50    en        IC15         78.94       14.59  (1,3,736,1280)     yaml    ckpt | mindir

1.2 Text Recognition

Model          Backbone     Dict File      Dataset  Acc(%)  FPS     data shape (NCHW)  Config    Download
CRNN           VGG7         Default        IC15     66.01   465.64  (1,3,32,100)       yaml      ckpt | mindir
CRNN           ResNet34_vd  Default        IC15     69.67   397.29  (1,3,32,100)       yaml      ckpt | mindir
CRNN           ResNet34_vd  ch_dict.txt    /        /       /       (1,3,32,320)       yaml      ckpt | mindir
SVTR           Tiny         Default        IC15     79.92   338.04  (1,3,64,256)       yaml      ckpt | mindir
Rare           ResNet34_vd  Default        IC15     69.47   273.23  (1,3,32,100)       yaml      ckpt | mindir
Rare           ResNet34_vd  ch_dict.txt    /        /       /       (1,3,32,320)       yaml      ckpt | mindir
RobustScanner  ResNet-31    en_dict90.txt  IC15     73.71   22.30   (1,3,48,160)       yaml      ckpt | mindir
VisionLAN      ResNet-45    Default        IC15     80.07   321.37  (1,3,64,256)       yaml(LA)  ckpt(LA) | mindir(LA)

1.3 Text Direction Classification

Model        Backbone     Dataset  F-score(%)  FPS  data shape (NCHW)  Config  Download
MobileNetV3  MobileNetV3  /        /           /    (1,3,48,192)       yaml    ckpt
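
The data shape column in the tables above gives the full (N, C, H, W) input shape, whereas the export commands in section 3 pass only the height and width to --data_shape (for example, --data_shape 736 1280 for DBNet). A minimal sketch of that mapping follows; the helper name data_shape_args is hypothetical and is not part of MindOCR.

```python
def data_shape_args(nchw):
    """Turn an (N, C, H, W) tuple from the tables into --data_shape tokens.

    Hypothetical helper for illustration; it only formats command-line tokens.
    """
    if len(nchw) != 4:
        raise ValueError("expected a 4-tuple (N, C, H, W), got %r" % (nchw,))
    _, _, height, width = nchw
    return ["--data_shape", str(height), str(width)]


# DBNet ResNet-50 row of the detection table: (1,3,736,1280)
print(" ".join(data_shape_args((1, 3, 736, 1280))))  # --data_shape 736 1280
```

The batch and channel dimensions are dropped because the export commands shown in this document take only two values for --data_shape.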

2. Overview of MindOCR Inference

graph LR;
    subgraph Step 1
        A[ckpt] -- export --> B[MindIR];
    end

    subgraph Step 2
        B -- converter_lite --> C[MindSpore Lite MindIR];
    end

    subgraph Step 3
        C -- input --> D[inference];
    end

    subgraph Step 4
        D -- outputs --> E[evaluation];
    end

    F[images] -- input --> D;

As shown in the figure above, the inference process is divided into the following steps:

  1. Use tools/ to export the ckpt model to a MindIR model;
  2. Download and configure the model converter (converter_lite), and use it to convert the MindIR to a MindSpore Lite MindIR;
  3. With the MindSpore Lite MindIR and the input images prepared, use deploy/py_infer/ to perform inference;
  4. Depending on the model type, use deploy/eval_utils/ to evaluate the inference results of text detection models, or deploy/eval_utils/ for text recognition models.

Note: Step 1 runs on Ascend910, GPU or CPU. Steps 2, 3 and 4 run on Ascend310 or 310P.

3. MindOCR Inference Methods

3.1 Text Detection

Let's take DBNet ResNet-50 en in the model support list as an example to introduce the inference method:

  • Download the ckpt file in the model support list and use the following command to export to MindIR, or directly download the exported mindir file from the model support list:

    # Use the local ckpt file to export the MindIR of the `DBNet ResNet-50 en` model
    # For more parameter usage details, please execute `python tools/ -h`
    python tools/ --model_name_or_config dbnet_resnet50 --data_shape 736 1280 --local_ckpt_path /path/to/dbnet.ckpt

    In the above command, --model_name_or_config is either a model name in MindOCR or the path to a yaml config file (for example --model_name_or_config configs/rec/crnn/crnn_resnet34.yaml);

    The --data_shape 736 1280 parameter indicates that the model input image size is [736, 1280]. Each MindOCR model has a fixed export data shape; see the data shape column in the model support list;

    The --local_ckpt_path /path/to/dbnet.ckpt parameter indicates that the model file to be exported is /path/to/dbnet.ckpt.

  • Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:

    Run the following command:

    converter_lite \
         --saveType=MINDIR \
         --fmk=MINDIR \
         --optimize=ascend_oriented \
         --modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir \
         --outputFile=dbnet_resnet50_lite

    In the above command:

    --fmk=MINDIR indicates that the original format of the input model is MindIR, and the --fmk parameter also supports ONNX, etc.;

    --saveType=MINDIR indicates that the output model format is MindIR format;

    --optimize=ascend_oriented indicates that the model is optimized for Ascend devices;

    --modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir indicates that the model to be converted is dbnet_resnet50-c3a4aa24-fbf95c82.mindir;

    --outputFile=dbnet_resnet50_lite indicates that the output model path is dbnet_resnet50_lite; the .mindir suffix is appended automatically and does not need to be specified;

    After the above command is executed, the dbnet_resnet50_lite.mindir model file will be generated.

    Learn more about converter_lite

    Learn more about Model Conversion Tutorial

  • Perform inference using the code in deploy/py_infer/ and the dbnet_resnet50_lite.mindir file:

    python deploy/py_infer/ \
         --input_images_dir=/path/to/ic15/ch4_test_images \
         --det_model_path=/path/to/mindir/dbnet_resnet50_lite.mindir \
         --det_model_name_or_config=en_ms_det_dbnet_resnet50 \

    After the execution is completed, the prediction file det_results.txt will be generated in the directory specified by the --res_save_dir parameter.

    When doing inference, you can use the --vis_det_save_dir parameter to visualize the results:

    Visualization of text detection results

    Learn more about inference parameters

  • Evaluate the results with the following command:

    python deploy/eval_utils/ \
         --gt_path=/path/to/ic15/test_det_gt.txt \

    The result is: {'recall': 0.8348579682233991, 'precision': 0.8657014478282576, 'f-score': 0.85}
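
The f-score in this result can be cross-checked by hand, assuming it is the usual harmonic mean of precision and recall. The sketch below makes that assumption explicit; the function name f_score is illustrative and is not the evaluation script's API.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation result above.
result = {"recall": 0.8348579682233991, "precision": 0.8657014478282576}
print(round(f_score(result["precision"], result["recall"]), 2))  # 0.85
```

The value agrees with the 85.00 F-score listed for DBNet ResNet-50 en in the model support list.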

3.2 Text Recognition

Let's take CRNN ResNet34_vd en in the model support list as an example to introduce the inference method:

  • Download the MindIR file in the model support list;

  • Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:

    Run the following command:

    converter_lite \
         --saveType=MINDIR \
         --fmk=MINDIR \
         --optimize=ascend_oriented \
         --modelFile=crnn_resnet34-83f37f07-eb10a0c9.mindir \
         --outputFile=crnn_resnet34vd_lite

    After the above command is executed, the crnn_resnet34vd_lite.mindir model file will be generated;

    For a brief description of the converter_lite parameters, see the text detection example above.

    Learn more about converter_lite

    Learn more about Model Conversion Tutorial

  • Perform inference using the code in deploy/py_infer/ and the crnn_resnet34vd_lite.mindir file:

    python deploy/py_infer/.py \
         --input_images_dir=/path/to/ic15/ch4_test_word_images \
         --rec_model_path=/path/to/mindir/crnn_resnet34vd_lite.mindir \
         --rec_model_name_or_config=../../configs/rec/crnn/crnn_resnet34.yaml \

    After the execution is completed, the prediction file rec_results.txt will be generated in the directory pointed to by the parameter --res_save_dir.

    Learn more about inference parameters

  • Evaluate the results with the following command:

    python deploy/eval_utils/ \
         --gt_path=/path/to/ic15/rec_gt.txt \
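
Recognition accuracy, the Acc(%) column in section 1.2, is typically the fraction of images whose predicted text matches the ground truth. The sketch below illustrates that idea only. It assumes that predictions and labels have already been parsed into dictionaries keyed by image name and that matching is exact after lowercasing; both are assumptions, and the actual evaluation script may normalize text differently.

```python
def word_accuracy(predictions, ground_truth, ignore_case=True):
    """Fraction of ground-truth images whose predicted text matches exactly.

    Both arguments map image name -> text. Images missing from
    `predictions` count as wrong. Hypothetical helper, not MindOCR code.
    """
    if not ground_truth:
        raise ValueError("ground_truth is empty")
    correct = 0
    for name, label in ground_truth.items():
        pred = predictions.get(name)
        if pred is None:
            continue
        if ignore_case:
            pred, label = pred.lower(), label.lower()
        if pred == label:
            correct += 1
    return correct / len(ground_truth)


# Toy data, made up for this example.
gt = {"word_1.png": "JOINT", "word_2.png": "yourself", "word_3.png": "154"}
pred = {"word_1.png": "joint", "word_2.png": "yourselt", "word_3.png": "154"}
print(word_accuracy(pred, gt))
```

With the toy data, two of the three predictions match once case is ignored, and only one matches when ignore_case=False.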

3.3 Text Direction Classification

Let's take MobileNetV3 in the model support list as an example to introduce the inference method:

  • Download the ckpt file from the model support list;

  • Use tools/ to convert the ckpt to MindIR, with either a dynamic or a static input shape:

    • To dynamic MindIR
      python tools/ \
          --model_name_or_config configs/cls/mobilenetv3/cls_mv3.yaml \
          --save_dir /path/to/save/cls_mv3 \
          --is_dynamic_shape True \
          --model_type cls
    • To static MindIR
      python tools/ \
          --model_name_or_config configs/cls/mobilenetv3/cls_mv3.yaml \
          --save_dir /path/to/save/cls_mv3 \
          --is_dynamic_shape False \
          --data_shape 48 192
  • Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:

    Run the following command:

    converter_lite \
         --saveType=MINDIR \
         --fmk=MINDIR \
         --optimize=ascend_oriented \
         --modelFile=/path/to/save/cls_mv3.mindir \
         --outputFile=cls_mv3_lite

    After the above command is executed, the cls_mv3_lite.mindir model file will be generated;

    Learn more about converter_lite

    Learn more about Model Conversion Tutorial
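
The converter_lite invocations in sections 3.1 to 3.3 differ only in the input model and the output name, and, as noted in section 3.1, the tool appends the .mindir suffix to the --outputFile value itself. A minimal sketch that assembles the same argument list in Python (for example, to pass to subprocess.run) is shown below. The helper name converter_lite_args is hypothetical; only the flags come from this document.

```python
def converter_lite_args(model_file, output_stem):
    """Build the converter_lite argument list used in this document.

    `output_stem` is the --outputFile value without the .mindir suffix,
    which converter_lite appends itself. Hypothetical helper.
    """
    if output_stem.endswith(".mindir"):
        raise ValueError("pass the output name without the .mindir suffix")
    return [
        "converter_lite",
        "--saveType=MINDIR",
        "--fmk=MINDIR",
        "--optimize=ascend_oriented",
        f"--modelFile={model_file}",
        f"--outputFile={output_stem}",
    ]


args = converter_lite_args(
    "dbnet_resnet50-c3a4aa24-fbf95c82.mindir", "dbnet_resnet50_lite"
)
print(" ".join(args))
# To run it on a machine where converter_lite is installed and on PATH:
#   import subprocess; subprocess.run(args, check=True)
```

Rejecting an output name that already ends in .mindir is a design choice of this sketch, meant to avoid passing the suffix twice.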

3.4 End to End Inference

Prepare the MindSpore Lite MindIR files as described in Text Detection, Text Recognition, and Text Direction Classification above, then run the following command for end-to-end inference:

python deploy/py_infer/ \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/dbnet_resnet50_lite.mindir \
    --det_model_name_or_config=en_ms_det_dbnet_resnet50 \
    --cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
    --cls_model_name_or_config=configs/cls/mobilenetv3/cls_mv3.yaml \
    --rec_model_path=/path/to/mindir/crnn_resnet34vd_lite.mindir \
    --rec_model_name_or_config=configs/rec/crnn/crnn_resnet34.yaml \

4. FAQ about Conversion and Inference

For problems with model conversion or inference, please refer to the FAQ for solutions.