Requested feature
The code already handles certain packages that are not installed by default: when one of them is missing, an error message asks the user to install it.
This behavior should be extended by removing these unnecessary packages from the required dependencies.
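For illustration, the "not installed by default, with an error message" pattern can be sketched as below. This is a minimal sketch, not docling's actual code: the helper name `require_optional` is hypothetical, and the install hint assumes the extras proposed in this issue exist.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional module, or raise an error with an install hint.

    Hypothetical helper: `extra` is assumed to be the name of a docling
    extra such as the ones proposed in this issue.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message telling the user what to install.
        raise ImportError(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install 'docling[{extra}]'"
        ) from exc
```

With a guard like this, a package only needs to be present when the feature that uses it is actually invoked, so it does not have to be a required dependency.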
Some packages are cumbersome, cause unnecessary conflicts and bring useless dependencies when the option is not used.
This is the case, for example, with scikit-learn (and its numpy versioning problems) and the Python ninja distribution, which is not used anywhere by docling but is brought in by easyocr.
In #648, it is stated that it is preferable to integrate an OCR engine by default, and easyocr was chosen for this reason.
However, @jaluma's original idea is a good one, so just make easyocr the default except if the choice of engine is explicit.
Base installation (with EasyOCR):
pip install docling
Specific OCR models:
pip install docling[easyocr] (default behaviour)
pip install docling[tesseract]
pip install docling[rapidocr]
pip install docling[ocrmac]
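The "default unless explicit" rule could be sketched at runtime as follows. This is only an illustration under stated assumptions: `resolve_engine`, `ENGINE_MODULES`, and the module names mapped to each extra are hypothetical and not taken from docling's code.

```python
from importlib.util import find_spec
from typing import Callable, Optional

# Assumed mapping from proposed extra name to the module it would provide.
ENGINE_MODULES = {
    "easyocr": "easyocr",
    "tesseract": "tesserocr",
    "rapidocr": "rapidocr_onnxruntime",
    "ocrmac": "ocrmac",
}
DEFAULT_ENGINE = "easyocr"


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def resolve_engine(
    requested: Optional[str] = None,
    installed: Callable[[str], bool] = is_installed,
) -> str:
    """Pick the OCR engine: the explicit choice if given, else the default."""
    engine = requested or DEFAULT_ENGINE
    if engine not in ENGINE_MODULES:
        raise ValueError(f"Unknown OCR engine: {engine!r}")
    if not installed(ENGINE_MODULES[engine]):
        raise ImportError(
            f"OCR engine '{engine}' is not installed. "
            f"Install it with: pip install 'docling[{engine}]'"
        )
    return engine
```

The `installed` parameter is injectable so the selection logic can be exercised without any of the engines being present.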
Alternatives
Install docling with --no-deps and cherry-pick only the dependencies needed for the chosen engine, which avoids the other engines' incompatibility problems. This has to be repeated on every install and every update.