[Feature]: OCR only if there is no text #1463

Closed
electro-logic opened this issue Jan 24, 2025 · 1 comment

electro-logic commented Jan 24, 2025

OCR if PDF doesn't contain text

I have a folder containing both digital and scanned PDFs. I only want to OCR the scanned files. Is there a way to do this?

jbarlow83 (Collaborator) commented

Just run ocrmypdf on each PDF with its default settings. If text is detected, the default behavior is to refuse to process the file and leave the PDF unchanged. There are also options in the manual to do this on a per-page basis.
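For a whole folder, the advice above could be scripted roughly as follows. This is only a sketch, not something taken from the ocrmypdf manual: the function name `ocr_folder`, the status labels, and the folder layout are made up for illustration, and it assumes that `ocrmypdf` is on the PATH and that exit code 6 is the one it returns when a file already contains text (worth checking against the documented exit codes for your version).

```python
import subprocess
from pathlib import Path

# Assumption: ocrmypdf exits with code 6 when the input already has text.
ALREADY_HAS_TEXT = 6


def ocr_folder(src, dst, run=subprocess.run):
    """Run ocrmypdf with default settings on every PDF in src.

    Returns a dict mapping each file name to "ocr", "skipped", or an
    error string. `run` is injectable so the loop can be tried without
    ocrmypdf installed.
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    results = {}
    for pdf in sorted(src.glob("*.pdf")):
        # Default settings: ocrmypdf refuses files that already contain text.
        proc = run(["ocrmypdf", str(pdf), str(dst / pdf.name)])
        if proc.returncode == 0:
            results[pdf.name] = "ocr"
        elif proc.returncode == ALREADY_HAS_TEXT:
            results[pdf.name] = "skipped"  # digital PDF, original left as-is
        else:
            results[pdf.name] = f"error ({proc.returncode})"
    return results
```

Calling `ocr_folder("inbox", "ocr_out")` (hypothetical folder names) would write OCRed copies of the scanned files into `ocr_out` and report the digital ones as skipped. For PDFs that mix scanned and digital pages, adding `--skip-text` to the command should make ocrmypdf OCR only the pages without text; see the manual for the exact behavior.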

github-actions bot removed the triage label Jan 24, 2025