| Model | Backbone | Language | Dataset | F-score(%) | FPS | data shape (NCHW) | Config | Download |
|---|---|---|---|---|---|---|---|---|
| DBNet | MobileNetV3 | en | IC15 | 76.96 | 26.19 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet | ResNet-18 | en | IC15 | 81.73 | 24.04 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet | ResNet-50 | en | IC15 | 85.00 | 21.69 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet | ResNet-50 | ch + en | 12 Datasets | 83.41 | 21.69 | (1,3,736,1280) | yaml | ckpt \| mindir |
| DBNet++ | ResNet-50 | en | IC15 | 86.79 | 8.46 | (1,3,1152,2048) | yaml | ckpt \| mindir |
| DBNet++ | ResNet-50 | ch + en | 12 Datasets | 84.30 | 8.46 | (1,3,1152,2048) | yaml | ckpt \| mindir |
| EAST | ResNet-50 | en | IC15 | 86.86 | 6.72 | (1,3,720,1280) | yaml | ckpt \| mindir |
| EAST | MobileNetV3 | en | IC15 | 75.32 | 26.77 | (1,3,720,1280) | yaml | ckpt \| mindir |
| PSENet | ResNet-152 | en | IC15 | 82.50 | 2.52 | (1,3,1472,2624) | yaml | ckpt \| mindir |
| PSENet | ResNet-50 | en | IC15 | 81.37 | 10.16 | (1,3,736,1312) | yaml | ckpt \| mindir |
| PSENet | MobileNetV3 | en | IC15 | 70.56 | 10.38 | (1,3,736,1312) | yaml | ckpt \| mindir |
| FCENet | ResNet50 | en | IC15 | 78.94 | 14.59 | (1,3,736,1280) | yaml | ckpt \| mindir |
| Model | Backbone | Dict File | Dataset | Acc(%) | FPS | data shape (NCHW) | Config | Download |
|---|---|---|---|---|---|---|---|---|
| CRNN | VGG7 | Default | IC15 | 66.01 | 465.64 | (1,3,32,100) | yaml | ckpt \| mindir |
| CRNN | ResNet34_vd | Default | IC15 | 69.67 | 397.29 | (1,3,32,100) | yaml | ckpt \| mindir |
| CRNN | ResNet34_vd | ch_dict.txt | / | / | / | (1,3,32,320) | yaml | ckpt \| mindir |
| SVTR | Tiny | Default | IC15 | 79.92 | 338.04 | (1,3,64,256) | yaml | ckpt \| mindir |
| Rare | ResNet34_vd | Default | IC15 | 69.47 | 273.23 | (1,3,32,100) | yaml | ckpt \| mindir |
| Rare | ResNet34_vd | ch_dict.txt | / | / | / | (1,3,32,320) | yaml | ckpt \| mindir |
| RobustScanner | ResNet-31 | en_dict90.txt | IC15 | 73.71 | 22.30 | (1,3,48,160) | yaml | ckpt \| mindir |
| VisionLAN | ResNet-45 | Default | IC15 | 80.07 | 321.37 | (1,3,64,256) | yaml(LA) | ckpt(LA) \| mindir(LA) |
| Model | Backbone | Dataset | F-score(%) | FPS | data shape (NCHW) | Config | Download |
|---|---|---|---|---|---|---|---|
| MobileNetV3 | MobileNetV3 | / | / | / | (1,3,48,192) | yaml | ckpt |
```mermaid
graph LR;
    subgraph Step 1
        A[ckpt] -- export.py --> B[MindIR]
    end
    subgraph Step 2
        B -- converter_lite --> C[MindSpore Lite MindIR];
    end
    subgraph Step 3
        C -- input --> D[infer.py];
    end
    subgraph Step 4
        D -- outputs --> E[eval_rec.py/eval_det.py];
    end
    F[images] -- input --> D;
```
As shown in the figure above, the inference process is divided into the following steps:

- Use `tools/export.py` to export the ckpt model to a MindIR model;
- Download and configure the model converter (i.e. converter_lite), and use the converter_lite tool to convert the MindIR to a MindSpore Lite MindIR;
- After preparing the MindSpore Lite MindIR and the input images, use `deploy/py_infer/infer.py` to perform inference;
- Depending on the type of model, use `deploy/eval_utils/eval_det.py` to evaluate the inference results of text detection models, or use `deploy/eval_utils/eval_rec.py` for text recognition models.

Note: Step 1 runs on Ascend 910, GPU or CPU. Steps 2, 3 and 4 run on Ascend 310 or 310P.
Let's take `DBNet ResNet-50 en` in the model support list as an example to introduce the inference method:
- Download the ckpt file from the model support list and use the following command to export it to MindIR, or directly download the exported mindir file from the model support list:

  ```shell
  # Use the local ckpt file to export the MindIR of the `DBNet ResNet-50 en` model
  # For more parameter usage details, please execute `python tools/export.py -h`
  python tools/export.py --model_name_or_config dbnet_resnet50 --data_shape 736 1280 --local_ckpt_path /path/to/dbnet.ckpt
  ```

  In the above command:

  - `--model_name_or_config` is the model name in MindOCR, or a path to a model yaml config file (for example, `--model_name_or_config configs/rec/crnn/crnn_resnet34.yaml`);
  - `--data_shape 736 1280` indicates that the size of the model input image is [736, 1280]; each MindOCR model corresponds to a fixed export data shape, see the data shape column in the model support list for details;
  - `--local_ckpt_path /path/to/dbnet.ckpt` indicates that the model file to be exported is `/path/to/dbnet.ckpt`.
- Use the converter_lite tool on Ascend 310 or 310P to convert the MindIR to MindSpore Lite MindIR.

  Run the following command:

  ```shell
  converter_lite \
      --saveType=MINDIR \
      --fmk=MINDIR \
      --optimize=ascend_oriented \
      --modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir \
      --outputFile=dbnet_resnet50_lite
  ```

  In the above command:

  - `--fmk=MINDIR` indicates that the original format of the input model is MindIR; the `--fmk` parameter also supports ONNX, etc.;
  - `--saveType=MINDIR` indicates that the output model is saved in MindIR format;
  - `--optimize=ascend_oriented` enables optimization for Ascend devices;
  - `--modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir` is the path of the model to be converted, i.e. `dbnet_resnet50-c3a4aa24-fbf95c82.mindir`;
  - `--outputFile=dbnet_resnet50_lite` is the path of the output model; the `.mindir` suffix does not need to be added, it is generated automatically.

  After the above command is executed, the `dbnet_resnet50_lite.mindir` model file will be generated.

  Learn more about converter_lite

  Learn more about Model Conversion Tutorial
- Perform inference using the `deploy/py_infer/infer.py` code and the `dbnet_resnet50_lite.mindir` file:

  ```shell
  python deploy/py_infer/infer.py \
      --input_images_dir=/path/to/ic15/ch4_test_images \
      --det_model_path=/path/to/mindir/dbnet_resnet50_lite.mindir \
      --det_model_name_or_config=en_ms_det_dbnet_resnet50 \
      --res_save_dir=/path/to/dbnet_resnet50_results
  ```

  After the execution is completed, the prediction file `det_results.txt` will be generated in the directory pointed to by the parameter `--res_save_dir`.

  When doing inference, you can use the `--vis_det_save_dir` parameter to visualize the results:

  Visualization of text detection results

  Learn more about infer.py inference parameters
- Evaluate the results with the following command:

  ```shell
  python deploy/eval_utils/eval_det.py \
      --gt_path=/path/to/ic15/test_det_gt.txt \
      --pred_path=/path/to/dbnet_resnet50_results/det_results.txt
  ```

  The result is:

  ```
  {'recall': 0.8348579682233991, 'precision': 0.8657014478282576, 'f-score': 0.85}
  ```
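The reported f-score is the harmonic mean of precision and recall; a quick sanity check in Python, using the values from the output above:

```python
# Recompute the f-score from the reported recall and precision
recall = 0.8348579682233991
precision = 0.8657014478282576

f_score = 2 * precision * recall / (precision + recall)
print(round(f_score, 2))  # 0.85
```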
Let's take `CRNN ResNet34_vd en` in the model support list as an example to introduce the inference method:
- Download the MindIR file from the model support list;
- Use the converter_lite tool on Ascend 310 or 310P to convert the MindIR to MindSpore Lite MindIR.

  Run the following command:

  ```shell
  converter_lite \
      --saveType=MINDIR \
      --fmk=MINDIR \
      --optimize=ascend_oriented \
      --modelFile=crnn_resnet34-83f37f07-eb10a0c9.mindir \
      --outputFile=crnn_resnet34vd_lite
  ```

  After the above command is executed, the `crnn_resnet34vd_lite.mindir` model file will be generated. For a brief description of the converter_lite parameters, see the text detection example above.

  Learn more about converter_lite

  Learn more about Model Conversion Tutorial
- Perform inference using the `deploy/py_infer/infer.py` code and the `crnn_resnet34vd_lite.mindir` file:

  ```shell
  python deploy/py_infer/infer.py \
      --input_images_dir=/path/to/ic15/ch4_test_word_images \
      --rec_model_path=/path/to/mindir/crnn_resnet34vd_lite.mindir \
      --rec_model_name_or_config=../../configs/rec/crnn/crnn_resnet34.yaml \
      --res_save_dir=/path/to/rec_infer_results
  ```

  After the execution is completed, the prediction file `rec_results.txt` will be generated in the directory pointed to by the parameter `--res_save_dir`.

  Learn more about infer.py inference parameters
- Evaluate the results with the following command:

  ```shell
  python deploy/eval_utils/eval_rec.py \
      --gt_path=/path/to/ic15/rec_gt.txt \
      --pred_path=/path/to/rec_infer_results/rec_results.txt
  ```
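Recognition evaluation is essentially an exact-match accuracy over per-image transcriptions. The sketch below illustrates that metric; it assumes both files use a tab-separated `image_name<TAB>text` line format and case-insensitive matching, which is an assumption here (see `deploy/eval_utils/eval_rec.py` for the actual implementation):

```python
# Illustrative exact-match accuracy; the tab-separated file format and
# case-insensitive comparison are assumptions, not MindOCR's exact logic.
def load_labels(lines):
    labels = {}
    for line in lines:
        name, _, text = line.rstrip("\n").partition("\t")
        labels[name] = text
    return labels

def rec_accuracy(gt_lines, pred_lines):
    gt, pred = load_labels(gt_lines), load_labels(pred_lines)
    correct = sum(
        1 for name, text in gt.items()
        if pred.get(name, "").lower() == text.lower()
    )
    return correct / max(len(gt), 1)

gt = ["word_1.png\tGenaxis Theatre", "word_2.png\t62-03"]
pred = ["word_1.png\tgenaxis theatre", "word_2.png\t62-08"]
print(rec_accuracy(gt, pred))  # 0.5
```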
Let's take `MobileNetV3` in the model support list as an example to introduce the inference method:
- Download the ckpt file from the model support list;
- Use `tools/export.py` to convert the ckpt to MindIR:

  - To dynamic MindIR:

    ```shell
    python tools/export.py \
        --model_name_or_config configs/cls/mobilenetv3/cls_mv3.yaml \
        --save_dir /path/to/save/cls_mv3 \
        --is_dynamic_shape True \
        --model_type cls
    ```

  - To static MindIR:

    ```shell
    python tools/export.py \
        --model_name_or_config configs/cls/mobilenetv3/cls_mv3.yaml \
        --save_dir /path/to/save/cls_mv3 \
        --is_dynamic_shape False \
        --data_shape 48 192
    ```
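A static MindIR fixes the network input to the NCHW shape (1,3,48,192), so every image crop must be brought to that shape before inference. The NumPy sketch below illustrates such a preprocessing step; the nearest-neighbor resize and the /255 normalization are illustrative assumptions, not MindOCR's actual pipeline:

```python
import numpy as np

TARGET_H, TARGET_W = 48, 192  # static data shape used in the export above

def to_static_input(img):
    """Resize an HxWx3 uint8 image to a (1, 3, 48, 192) float NCHW tensor.

    Illustration only: nearest-neighbor resize via index sampling and a
    simple /255 normalization stand in for the real preprocessing.
    """
    h, w, _ = img.shape
    rows = np.arange(TARGET_H) * h // TARGET_H
    cols = np.arange(TARGET_W) * w // TARGET_W
    resized = img[rows][:, cols]                        # (48, 192, 3)
    chw = resized.transpose(2, 0, 1).astype(np.float32) / 255.0
    return chw[None]                                    # (1, 3, 48, 192)

x = to_static_input(np.zeros((60, 240, 3), dtype=np.uint8))
print(x.shape)  # (1, 3, 48, 192)
```

A dynamic MindIR, by contrast, accepts varying input sizes, at the cost of requiring runtime support for dynamic shapes.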
- Use the converter_lite tool on Ascend 310 or 310P to convert the MindIR to MindSpore Lite MindIR.

  Run the following command:

  ```shell
  converter_lite \
      --saveType=MINDIR \
      --fmk=MINDIR \
      --optimize=ascend_oriented \
      --modelFile=/path/to/save/cls_mv3.mindir \
      --outputFile=cls_mv3_lite
  ```

  After the above command is executed, the `cls_mv3_lite.mindir` model file will be generated.

  Learn more about converter_lite

  Learn more about Model Conversion Tutorial
Prepare the MindIR files according to the Text Detection, Text Recognition and Text Direction Classification sections above, then run the following command to do end-to-end inference:

```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/dbnet_resnet50_lite.mindir \
    --det_model_name_or_config=en_ms_det_dbnet_resnet50 \
    --cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
    --cls_model_name_or_config=configs/cls/mobilenetv3/cls_mv3.yaml \
    --rec_model_path=/path/to/mindir/crnn_resnet34vd_lite.mindir \
    --rec_model_name_or_config=configs/rec/crnn/crnn_resnet34.yaml \
    --res_save_dir=/path/to/infer_results
```
For problems with model conversion and inference, please refer to the FAQ for solutions.