[Feature]: OCR only if there is no text #1463

electro-logic · 2025-01-24T21:03:42Z

OCR if PDF doesn't contain text

I have a folder containing both digital and scanned PDFs. I only want to OCR the scanned files. Is there a way to do this?

jbarlow83 · 2025-01-24T22:45:28Z

Just run ocrmypdf on each PDF with its default settings. If text is detected, the default behavior is to refuse and leave the PDF unchanged. There are also options in the manual to do this on a per-page basis.
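For a whole folder, the advice above could be scripted roughly as follows. This is a minimal sketch, not something from the ocrmypdf project: the function name `ocr_folder`, the result labels, and the injectable `run` parameter are hypothetical, and it assumes `ocrmypdf` is on your PATH and that the installed version reports "text already present" with exit code 6, as the manual documents.

```python
import subprocess
from pathlib import Path

# Assumption: exit code 6 is what ocrmypdf documents for "already has text".
ALREADY_HAS_TEXT = 6


def ocr_folder(src, dst, run=subprocess.run):
    """Run ocrmypdf with default settings on every PDF in src.

    Returns a dict mapping each file name to 'ocr', 'skipped', or 'failed'.
    The `run` parameter is a hypothetical hook so the loop can be tested
    without ocrmypdf installed.
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    results = {}
    for pdf in sorted(src.glob("*.pdf")):
        # Default settings: ocrmypdf refuses files that already contain text.
        proc = run(["ocrmypdf", str(pdf), str(dst / pdf.name)])
        if proc.returncode == 0:
            results[pdf.name] = "ocr"
        elif proc.returncode == ALREADY_HAS_TEXT:
            results[pdf.name] = "skipped"
        else:
            results[pdf.name] = "failed"
    return results
```

Calling `ocr_folder("scans", "out")` would then OCR the scanned files into `out` and report the digital ones as skipped. For the per-page behavior mentioned above, the manual describes a `--skip-text` option that could be added to the command list; check that it applies to your installed version.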

electro-logic added the enhancement and triage labels Jan 24, 2025

electro-logic assigned jbarlow83 Jan 24, 2025

jbarlow83 closed this as completed Jan 24, 2025

github-actions bot removed the triage label Jan 24, 2025
