Finding small boxes in grobid coordinates #79

Open
minump opened this issue Apr 23, 2024 · 3 comments

minump commented Apr 23, 2024

Find small rectangles in the GROBID coordinates. The goal is to check whether GROBID misses sentences or returns wrong coordinates for them.

minump self-assigned this Apr 23, 2024

minump commented Apr 23, 2024

Small boxes (width < 5) are usually reference markers, e.g. [2].
The scale (canvasWidth / pageWidth) is usually 1 or close to 1 (0.9, 0.87, etc.).
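
A minimal sketch of the width check, not the code from the branch. It assumes the `coords` string follows GROBID's documented layout, `page,x,y,width,height` per box with boxes separated by `;`. The names `Box`, `parse_coords` and `small_boxes` are hypothetical, and applying the scale to the width before comparing it with the threshold is also an assumption.

```python
from typing import NamedTuple


class Box(NamedTuple):
    page: int
    x: float
    y: float
    width: float
    height: float


def parse_coords(coords: str) -> list[Box]:
    """Parse a GROBID coords string ("page,x,y,w,h;page,x,y,w,h;...")."""
    boxes = []
    for chunk in coords.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        page, x, y, w, h = chunk.split(",")
        boxes.append(Box(int(page), float(x), float(y), float(w), float(h)))
    return boxes


def small_boxes(boxes: list[Box], max_width: float = 5.0, scale: float = 1.0) -> list[Box]:
    """Return boxes whose scaled width is below max_width.

    scale stands for canvasWidth / pageWidth (assumed to apply to widths).
    """
    return [b for b in boxes if b.width * scale < max_width]
```

For example, `small_boxes(parse_coords("1,53.80,194.57,292.04,8.74;1,120.10,190.20,3.50,6.00"))` keeps only the second box, whose width is 3.50.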

minump commented Apr 24, 2024

GROBID includes references (superscripts and subscripts) in the sentence text, but gives each reference its own box/coordinates, separate from the rest of the sentence. That separate box is usually very small (width < 5).
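
A rough sketch of how such sentences could be flagged, assuming the TEI output was produced with sentence segmentation and coordinates enabled, so that each sentence is an `<s coords="...">` element in the TEI namespace. The function name `sentences_with_small_boxes` and the rule "at least one small box plus at least one normal box" are assumptions for illustration, not the code from the branch.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def sentences_with_small_boxes(tei_xml: str, max_width: float = 5.0):
    """Return (sentence text, small widths) for sentences mixing normal and small boxes."""
    root = ET.fromstring(tei_xml)
    flagged = []
    for s in root.iter(f"{TEI_NS}s"):
        coords = s.get("coords", "")
        # Each box is "page,x,y,width,height"; index 3 is the width.
        widths = [float(c.split(",")[3]) for c in coords.split(";") if c.strip()]
        small = [w for w in widths if w < max_width]
        # Flag only sentences that also have at least one normal-sized box.
        if small and len(small) < len(widths):
            flagged.append(("".join(s.itertext()).strip(), small))
    return flagged
```

Sentences whose boxes are all small are deliberately skipped here, since those may point to a different problem than a superscript or subscript reference.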

minump commented May 21, 2024

The code for finding small boxes is in the feature-branch, but it has not been integrated yet.
