Streamlit project with Tesseract OCR running on Streamlit Cloud.
- Upload an image with text on it
 - Select the language
 - Select the image preprocessing options (if needed) and check the result in the preview
 - Crop the image to the text area (if needed)
 - Run the OCR and check the result in the text preview
 - Adjust the settings or image preprocessing and run the OCR again (if needed)
 - Download the result as a text file or copy from the text preview
 
Installed languages for Tesseract OCR
Streamlit application is working - 04.06.2024
- Change layout of the app
 - Change checkboxes to toggle buttons
 - Add cropping functionality: https://github.com/turner-anderson/streamlit-cropper
 - Add more CSS styling
 - Cleanup of python app and repository
 
- Use Pillow for image preprocessing instead of OpenCV
  - any advantages?
- Add Ace Editor for text preview
  - any advantages?
- Add other OCR engines and test them
  - Add `easyocr` and test it
  - Try `tesserocr` instead of `pytesseract`
  - Add `PyMuPDF` and test it
  - Add `ocrmypdf` and test it
  - Add `PaddleOCR` and test it
  - Add `keras-ocr` and test it
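
For the open question about Pillow, a minimal sketch of what a Pillow-only preprocessing step could look like. It is not the app's current code: the function name `preprocess_with_pillow` and the `threshold` default are assumptions chosen for illustration, and only grayscale conversion, contrast stretching, and a fixed threshold are shown.

```python
from PIL import Image, ImageOps


def preprocess_with_pillow(img, threshold=128):
    """Hypothetical Pillow-only pipeline: grayscale, autocontrast, binarize."""
    # Convert to a single-channel 8-bit image (mode "L").
    gray = ImageOps.grayscale(img)
    # Stretch the histogram so the darkest pixel maps to 0 and the brightest to 255.
    stretched = ImageOps.autocontrast(gray)
    # Fixed global threshold; the value 128 is an assumption, not a tuned setting.
    return stretched.point(lambda p: 255 if p >= threshold else 0)


# Tiny synthetic image: light background with one dark pixel.
sample = Image.new("RGB", (4, 4), (200, 200, 200))
sample.putpixel((0, 0), (10, 10, 10))
result = preprocess_with_pillow(sample)
```

One possible advantage is a smaller dependency footprint, since Pillow is already needed to load uploaded images. Adaptive thresholding and CLAHE have no direct Pillow counterpart, so that trade-off would need testing.
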
- Tesseract Documentation
 - pytesseract Documentation
 - OCR with Tesseract
 
OpenCV is used for image preprocessing before running OCR with Tesseract.
- OpenCV Image Processing Documentation
 - OpenCV Python Tutorial
 - OCR in Python Tutorials
 
import cv2
import numpy as np
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# or
coefficients = [1, 0, 0]  # gives the blue channel all the weight (OpenCV stores channels as BGR)
# for standard gray conversion, coefficients = [0.114, 0.587, 0.299]
m = np.array(coefficients, dtype=np.float32).reshape((1, 3))
blue = cv2.transform(image, m)
- CLAHE (Contrast Limited Adaptive Histogram Equalization)
 - https://www.tutorialspoint.com/how-to-change-the-contrast-and-brightness-of-an-image-using-opencv-in-python
 - https://stackoverflow.com/questions/50474302/how-do-i-adjust-brightness-contrast-and-vibrance-with-opencv-python
 - https://stackoverflow.com/questions/32609098/how-to-fast-change-image-brightness-with-python-opencv
 - https://github.com/milahu/document-photo-auto-threshold
 - https://stackoverflow.com/questions/56905592/automatic-contrast-and-brightness-adjustment-of-a-color-photo-of-a-sheet-of-pape
 - https://stackoverflow.com/questions/39308030/how-do-i-increase-the-contrast-of-an-image-in-python-opencv
 - https://stackoverflow.com/questions/63243202/how-to-auto-adjust-contrast-and-brightness-of-a-scanned-image-with-opencv-python
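
The links above mostly come down to a linear transform `output = alpha * input + beta`, where `alpha` controls contrast and `beta` controls brightness. The sketch below shows that idea in plain NumPy so it runs without OpenCV; `cv2.convertScaleAbs(image, alpha=alpha, beta=beta)` gives a comparable result for non-negative values. The function names and the percentile-based `auto_contrast` helper are assumptions for illustration, not code from this app, and this is not CLAHE.

```python
import numpy as np


def adjust_contrast_brightness(image, alpha=1.0, beta=0.0):
    """Apply alpha * image + beta and clip the result to the uint8 range."""
    out = image.astype(np.float32) * alpha + beta
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def auto_contrast(gray, clip_percent=1.0):
    """Stretch the histogram so the clipped min/max map to 0 and 255."""
    lo, hi = np.percentile(gray, [clip_percent, 100 - clip_percent])
    if hi <= lo:
        # Flat image: nothing to stretch.
        return gray.copy()
    alpha = 255.0 / (hi - lo)
    beta = -lo * alpha
    return adjust_contrast_brightness(gray, alpha, beta)


sample = np.array([[50, 100], [150, 200]], dtype=np.uint8)
brighter = adjust_contrast_brightness(sample, alpha=1.5, beta=10)
stretched = auto_contrast(sample, clip_percent=0.0)
```
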
 
Methods to rotate an image with different libraries.
https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.rotate
from PIL import Image
with Image.open("hopper.jpg") as im:
    # Rotate the image by 60 degrees counter clockwise
    theta = 60
    white = (255,255,255)
    # Angle is in degrees counter clockwise
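    im_rotated = im.rotate(angle=theta, resample=Image.Resampling.BICUBIC, expand=True, fillcolor=white)
Destructive rotation, loses image data (the output keeps the original width and height, so the corners are cut off):
import cv2
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h))
Non-destructive rotation, keeps image data (the canvas grows to fit the rotated image):
import imutils
rotate = imutils.rotate_bound(image, angle)
Destructive or non-destructive rotation, chosen by the parameter reshape (reshape=True enlarges the output to hold the whole image, reshape=False keeps the input shape):
from scipy.ndimage import rotate as rotate_image
# reshape, mode and cval must be passed by keyword; the third positional argument is axes
rotated_img1 = rotate_image(image, angle, reshape=reshape, mode=mode, cval=cval)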
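
The difference between destructive and non-destructive rotation comes down to the output canvas size. A minimal sketch of the bounding-box calculation that a non-destructive rotation needs, using only the standard library. The function name `rotated_canvas_size` is an assumption for illustration, and the libraries above may round slightly differently.

```python
import math


def rotated_canvas_size(width, height, angle_degrees):
    """Return the (width, height) of the smallest canvas holding the rotated image."""
    theta = math.radians(angle_degrees)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    # Project the original width and height onto the new horizontal and vertical axes.
    new_width = round(height * sin_t + width * cos_t)
    new_height = round(height * cos_t + width * sin_t)
    return new_width, new_height


print(rotated_canvas_size(200, 100, 90))  # (100, 200)
print(rotated_canvas_size(200, 100, 45))  # (212, 212)
```

A destructive rotation keeps the canvas at `(width, height)` instead, so everything outside that rectangle is discarded.
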