A cute toolkit for OCR with a GUI, including image preprocessing and text recognition. Works out of the box.

yukiyuqichen/OCR-Toolkit

OCR-Toolkit

A small, cute toolkit for OCR, including image preprocessing and text recognition. Works out of the box.

Download

The exe file is available for download: OCR Toolkit 2023.03.02

1. Preprocessing

1.1 Binary

Binarize the image by thresholding, which also provides simple denoising.
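The README does not say how the threshold is chosen, so the following is only a minimal sketch of threshold binarization, assuming an 8-bit grayscale image held in a NumPy array. The function names `otsu_threshold` and `binarize` are hypothetical, and picking the threshold with Otsu's method is an assumption rather than the toolkit's documented behavior.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Pick the threshold that maximizes between-class variance (Otsu's method).

    `gray` must be a uint8 array. Assumption: the toolkit may use a fixed or
    user-chosen threshold instead.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    weight_bg = np.cumsum(hist)               # pixels with value <= t
    weight_fg = hist.sum() - weight_bg        # pixels with value > t
    cum_sum = np.cumsum(hist * levels)
    mean_bg = cum_sum / np.maximum(weight_bg, 1)
    mean_fg = (cum_sum[-1] - cum_sum) / np.maximum(weight_fg, 1)
    between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
    return int(np.argmax(between))


def binarize(gray: np.ndarray, threshold=None) -> np.ndarray:
    """Map pixels above the threshold to 255 (paper) and the rest to 0 (ink)."""
    if threshold is None:
        threshold = otsu_threshold(gray)
    return np.where(gray > threshold, 255, 0).astype(np.uint8)
```

Passing an explicit `threshold` skips the automatic selection, which can help on pages with uneven lighting where a single global threshold chosen from the histogram works poorly.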

1.2 Split

Detect the central divider line with the Hough transform and split the image into two parts along it. This is handy for documents laid out in two columns, such as dictionaries.
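As a rough illustration of the idea, here is a NumPy-only sketch of a Hough vote restricted to near-vertical lines in the middle third of the page. It is not the toolkit's implementation (which is not shown in this README); the function names, the tilt range, and the middle-third search band are all assumptions made for the example. The input is a binary array in which nonzero pixels are ink.

```python
import numpy as np


def find_divider_x(binary: np.ndarray, max_tilt_deg: float = 2.0, n_angles: int = 9) -> int:
    """Return the x position (at mid-height) of the strongest near-vertical line.

    Each ink pixel votes for rho = x*cos(theta) + y*sin(theta) over a small set
    of angles around vertical; the (rho, theta) cell with the most votes wins.
    """
    height, width = binary.shape
    ys, xs = np.nonzero(binary)
    offset = height                      # keeps slightly negative rho values indexable
    lo, hi = width // 3, 2 * width // 3  # assumed search band: middle third of the page
    best_votes, best_rho, best_theta = -1, lo, 0.0
    for theta in np.deg2rad(np.linspace(-max_tilt_deg, max_tilt_deg, n_angles)):
        rho = np.rint(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        votes = np.bincount(rho + offset, minlength=offset + width + height + 1)
        band = votes[offset + lo: offset + hi + 1]
        peak = int(np.argmax(band))
        if band[peak] > best_votes:
            best_votes, best_rho, best_theta = int(band[peak]), lo + peak, float(theta)
    y_mid = height / 2
    # Solve rho = x*cos(theta) + y*sin(theta) for x at the vertical midpoint.
    return int(round((best_rho - y_mid * np.sin(best_theta)) / np.cos(best_theta)))


def split_at(image: np.ndarray, x: int):
    """Split an image into left and right parts at column x."""
    return image[:, :x], image[:, x:]
```

Restricting the angles and the search band keeps text strokes and page borders from outvoting the divider. A general-purpose Hough implementation would scan all angles and leave that filtering to a later step.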

2. OCR

2.1 Offline: PaddleOCR

Run PaddleOCR models locally to perform OCR.
No API key is needed. The result is saved as a structured CSV file.
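The README does not document the CSV layout, so the sketch below only illustrates what "structured" output could look like. It assumes each recognized line arrives as a pair of a four-point box and a `(text, score)` tuple, which resembles what PaddleOCR 2.x returns per text line. The column names and the function `write_ocr_csv` are hypothetical.

```python
import csv


def write_ocr_csv(lines, out) -> None:
    """Write recognized lines to an open text stream as CSV.

    `lines` is an iterable of (box, (text, score)) where box is four [x, y]
    points. The column layout is an assumption, not the toolkit's format.
    """
    writer = csv.writer(out)
    writer.writerow(["index", "text", "confidence", "x_min", "y_min", "x_max", "y_max"])
    for index, (box, (text, score)) in enumerate(lines, start=1):
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        # Reduce the quadrilateral to its axis-aligned bounding box.
        writer.writerow([index, text, f"{score:.4f}", min(xs), min(ys), max(xs), max(ys)])
```

When writing to a file, opening it with `open(path, "w", newline="", encoding="utf-8-sig")` tends to keep Chinese text readable when the CSV is opened in spreadsheet software.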

2.2 Online: Baidu API

Use the Baidu AI high-accuracy text recognition API to perform OCR and parse the response. The result is saved as a structured CSV file.
Users need to provide their own API_KEY and SECRET_KEY.
More APIs are planned.
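The flow described above (exchange the two keys for an access token, then post the image) might look roughly like the following standard-library sketch. The endpoint URLs and the `words_result` response field reflect Baidu's public OCR documentation as best recalled and should be checked against the current docs. All function names here are hypothetical and are not taken from the toolkit.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed endpoints; verify against Baidu's current documentation.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic"


def build_token_request(api_key: str, secret_key: str) -> urllib.request.Request:
    """Build the request that exchanges API_KEY and SECRET_KEY for an access token."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return urllib.request.Request(f"{TOKEN_URL}?{query}", method="POST")


def build_ocr_request(image_bytes: bytes, access_token: str) -> urllib.request.Request:
    """Build the OCR request: the image is sent base64-encoded in a form body."""
    body = urllib.parse.urlencode(
        {"image": base64.b64encode(image_bytes).decode("ascii")}
    ).encode("ascii")
    return urllib.request.Request(
        f"{OCR_URL}?access_token={urllib.parse.quote(access_token)}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_words(response_text: str) -> list:
    """Extract the recognized text lines from a JSON response."""
    payload = json.loads(response_text)
    if "error_code" in payload:
        raise RuntimeError(f"{payload['error_code']}: {payload.get('error_msg')}")
    return [item["words"] for item in payload.get("words_result", [])]


def recognize(image_path: str, api_key: str, secret_key: str) -> list:
    """Run the full round trip (performs network calls)."""
    with urllib.request.urlopen(build_token_request(api_key, secret_key)) as resp:
        token = json.load(resp)["access_token"]
    with open(image_path, "rb") as f:
        request = build_ocr_request(f.read(), token)
    with urllib.request.urlopen(request) as resp:
        return parse_words(resp.read().decode("utf-8"))
```

Keeping request building and response parsing separate from the network call makes it possible to check both without credentials. Only `recognize` needs a valid key pair.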
