hocr: only add space if boxwidth is positive #1446
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1446      +/-   ##
==========================================
+ Coverage   90.18%   90.20%   +0.01%
==========================================
  Files          95       95
  Lines        7073     7073
  Branches      722      722
==========================================
+ Hits         6379     6380       +1
  Misses        491      491
+ Partials      203      202       -1
☔ View full report in Codecov by Sentry.
This is great work - I just want to take some time to review it.
This seems to have caused a test failure for me, but I'm not entirely sure why:
Ah, it is one of the slow tests, so I think it's not being run in CI?
It should pass. At some point a difference emerged in how those two rendering modes handle straight and smart quotes, which appears to be the issue. I changed the test to be tolerant of a few minor differences in commit 32322a9f.
I don't see that commit; is it not pushed out, or a typo?
If the hOCR contains overlapping words in the same line, or the words are not ordered in reading direction, `space_box.width` might be negative. This leads to further problems in the calculation of font width etc., producing text boxes that are far larger than the actual bbox set within the hOCR file.
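To illustrate the condition described above, here is a minimal sketch of the guard. It is not the project's actual code: the `Box` class and the `space_between` and `should_emit_space` helpers are hypothetical names chosen for this example, and the only assumption taken from the description is that the gap between two adjacent words is measured from the right edge of the first word to the left edge of the next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in hOCR pixel coordinates (hypothetical helper)."""

    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        # Negative when right < left, i.e. the "box" is inverted.
        return self.right - self.left


def space_between(prev_word: Box, next_word: Box) -> Box:
    """Box spanning the gap from prev_word's right edge to next_word's left edge."""
    return Box(
        left=prev_word.right,
        top=min(prev_word.top, next_word.top),
        right=next_word.left,
        bottom=max(prev_word.bottom, next_word.bottom),
    )


def should_emit_space(prev_word: Box, next_word: Box) -> bool:
    """Only add a space when the gap has a positive width."""
    return space_between(prev_word, next_word).width > 0


# Words in reading order with a real gap: a space is emitted.
ordered = should_emit_space(Box(0, 0, 50, 20), Box(60, 0, 100, 20))

# Overlapping words: the gap width is 40 - 50 = -10, so no space is emitted.
overlapping = should_emit_space(Box(0, 0, 50, 20), Box(40, 0, 90, 20))

# Words listed out of reading order: the gap width is 0 - 100 = -100.
out_of_order = should_emit_space(Box(60, 0, 100, 20), Box(0, 0, 50, 20))
```

Skipping the space in the overlapping and out-of-order cases keeps a negative width out of any later font-size or scaling arithmetic, which is where the oversized text boxes would otherwise come from.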