Description
Your Feature Request
https://tesseract-ocr.github.io/tessdoc/Data-Files.html is a bit confusing. The first few links lead
to a different GitHub repository, but I only wanted to download language data files.
I then ended up using these two URLs:
https://github.com/tesseract-ocr/tessdata/raw/4.00/eng.traineddata
and for German:
https://github.com/tesseract-ocr/tessdata/raw/4.00/deu.traineddata
I'm not sure whether these are outdated.
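For anyone wanting to script this, here is a minimal Python sketch that fetches the same two files. It assumes the URLs above still resolve and that the `4.00` path segment is the one you want; whether that is the recommended version is exactly the open question here. The helper names `traineddata_url` and `download_traineddata` and the `tessdata` destination folder are made up for illustration.

```python
import urllib.request
from pathlib import Path

# URL pattern copied from the two links above; "4.00" is the path segment
# they use. Whether it is the current one is an assumption.
BASE_URL = "https://github.com/tesseract-ocr/tessdata/raw/{branch}/{lang}.traineddata"


def traineddata_url(lang: str, branch: str = "4.00") -> str:
    """Build the download URL for one language code (hypothetical helper)."""
    return BASE_URL.format(branch=branch, lang=lang)


def download_traineddata(lang: str, dest_dir: str = "tessdata", branch: str = "4.00") -> Path:
    """Download <lang>.traineddata into dest_dir and return the local path."""
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{lang}.traineddata"
    # urllib follows HTTP redirects by default.
    urllib.request.urlretrieve(traineddata_url(lang, branch), target)
    return target
```

Calling `download_traineddata("eng")` and `download_traineddata("deu")` would then place both files in a local `tessdata` folder.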
It would be nice if someone could go through the whole page systematically.
I think it would be better to show users, as early as possible, which trained data
they are RECOMMENDED to download. For example, in the first paragraph: "most users
may prefer the following language data: NEW_LINE add URL here".
The following paragraphs can then explain more. Right now I am faced with a wall
of text; I click on something at random and end up in another GitHub repository,
when I only wanted to get the latest language file. Finding the proper language
data file feels like Russian roulette.
(I understand that a reference is made to Linux distributions, but I am a solo hobbyist
dev, so I think it would be best to focus on "users wanting to download the
language data files as quickly as possible" as the first priority, before showing much else.)