[FEAT] Make installation of non-essential dependencies optional #789

Open
Lrakotoson opened this issue Jan 22, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@Lrakotoson

Requested feature

The code already treats some packages as optional: they are not installed by default, and an error message asks the user to add them when they are needed.
This behavior should be extended by removing such non-essential packages from the required dependencies.
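
To illustrate the pattern described above, here is a minimal sketch of a guarded import for an optional dependency. The helper name `require_optional` and the wording of the error message are assumptions made for illustration; they are not taken from docling's code.

```python
import importlib


def require_optional(module_name: str, extra: str, package: str = "docling"):
    """Import an optional dependency, or fail with an actionable message.

    `require_optional` is a hypothetical helper, not part of docling.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Tell the user which extra provides the missing module.
        raise ImportError(
            f"'{module_name}' is not installed. "
            f'Install it with: pip install "{package}[{extra}]"'
        ) from exc
```

With a guard like this, the heavy package is imported only when the matching feature is actually used, so it does not need to be listed among the required dependencies.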

Some packages are heavy, cause unnecessary conflicts, and pull in unneeded dependencies when the corresponding option is not used.
Examples are scikit-learn (with its numpy versioning problems) and the Python ninja distribution, which docling does not use anywhere but which is pulled in by easyocr.

In #648, it is stated that shipping an OCR engine by default is preferable, and easyocr was chosen for this reason.
However, @jaluma's original idea is a good one: keep easyocr as the default unless an engine is chosen explicitly.

  1. Base installation (with EasyOCR):
    pip install docling

  2. Specific OCR engines:
    pip install docling[easyocr] (default behaviour)
    pip install docling[tesseract]
    pip install docling[rapidocr]
    pip install docling[ocrmac]
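
The "default unless explicit" behaviour proposed above could be sketched at runtime roughly as follows. This is only an illustration under stated assumptions: the function `resolve_ocr_engine`, the `ENGINE_MODULES` mapping, and the module names in it are hypothetical and do not describe docling's actual API.

```python
from importlib.util import find_spec

# Hypothetical mapping from extra name to the top-level module it provides.
ENGINE_MODULES = {
    "easyocr": "easyocr",
    "tesseract": "tesserocr",
    "rapidocr": "rapidocr_onnxruntime",
    "ocrmac": "ocrmac",
}
DEFAULT_ENGINE = "easyocr"


def _installed(module_name: str) -> bool:
    """Return True if the top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def resolve_ocr_engine(explicit=None, is_installed=_installed) -> str:
    """Pick the explicitly requested engine, else fall back to the default."""
    engine = explicit or DEFAULT_ENGINE
    if engine not in ENGINE_MODULES:
        raise ValueError(f"Unknown OCR engine: {engine!r}")
    if not is_installed(ENGINE_MODULES[engine]):
        # Point the user at the extra that would provide the engine.
        raise ImportError(
            f"OCR engine '{engine}' is not installed. "
            f'Install it with: pip install "docling[{engine}]"'
        )
    return engine
```

The `is_installed` parameter exists only so the check can be replaced in tests; a real implementation might organise this differently.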

Alternatives

Install docling with --no-deps and manually cherry-pick only the dependencies needed for the desired engine, which avoids the other engines' incompatibility problems. This has to be repeated on every install and every update.

@Lrakotoson Lrakotoson added the enhancement New feature or request label Jan 22, 2025