TF response to HTML #29
Comments
TableFormer generates structure predictions in OTSL+ format (OTSL with header support). The OTSL format is described in our paper, "Optimized Table Tokenization for Table Structure Recognition"; using it brings big benefits in quality and performance. OTSL+ is an extension of OTSL with extra tags or instructions that describe cells of: The model predicts these tags sequentially in the tag decoder, simultaneously with bounding boxes from the bbox decoder. By the way, higher-level usage of docling-ibm-models can be seen in docling itself: https://github.com/DS4SD/docling
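To make the OTSL-to-HTML step more concrete, here is a minimal sketch of how a flat OTSL token sequence could be turned into an HTML table. It is not the code used in docling-ibm-models. It assumes the five base tokens from the OTSL paper, spelled "C" (new cell), "L" (merge with left neighbor), "U" (merge with upper neighbor), "X" (merge with both) and "NL" (end of row); the tokens actually emitted by the predictor may be spelled differently, and the extra OTSL+ tags are not handled. The function name `otsl_to_html` and the `texts` argument are hypothetical.

```python
from html import escape


def otsl_to_html(tokens, texts=None):
    """Convert base OTSL tokens ("C", "L", "U", "X", "NL") to an HTML table.

    `texts` optionally supplies one string per "C" cell, in reading order.
    Token spellings are an assumption taken from the OTSL paper.
    """
    # Split the flat sequence into rows at every "NL" token.
    rows, row = [], []
    for tok in tokens:
        if tok == "NL":
            rows.append(row)
            row = []
        else:
            row.append(tok)
    if row:  # tolerate a sequence that lacks the final "NL"
        rows.append(row)

    cell_texts = iter(texts or [])
    out = ["<table>"]
    for r, cells in enumerate(rows):
        out.append("<tr>")
        for c, tok in enumerate(cells):
            if tok != "C":
                continue  # "L", "U", "X" belong to a span anchored at a "C"
            # Column span: count consecutive "L" tokens to the right.
            colspan = 1
            while c + colspan < len(cells) and cells[c + colspan] == "L":
                colspan += 1
            # Row span: count consecutive "U" tokens directly below.
            rowspan = 1
            while (r + rowspan < len(rows)
                   and c < len(rows[r + rowspan])
                   and rows[r + rowspan][c] == "U"):
                rowspan += 1
            attrs = ""
            if rowspan > 1:
                attrs += f' rowspan="{rowspan}"'
            if colspan > 1:
                attrs += f' colspan="{colspan}"'
            out.append(f"<td{attrs}>{escape(next(cell_texts, ''))}</td>")
        out.append("</tr>")
    out.append("</table>")
    return "".join(out)


# A 2x3 grid: the first cell spans two columns, the last spans two rows.
example = ["C", "L", "C", "NL", "C", "C", "U", "NL"]
print(otsl_to_html(example, ["a", "b", "c", "d"]))
```

Only "C" tokens produce a `<td>`; the span sizes are recovered by looking right for "L" and down for "U", which is sufficient because a 2D merged region carries "L" along its first row and "U" along its first column.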
@maxmnemonic, can you link to the code that does the same, or add this as a test or sample notebook to the current repo? It would be really helpful for everyone. Thanks.
Thanks for the suggestion, @mllife. Indeed, we can add some good examples purely related to tables in this repo.
@maxmnemonic, any update on this? Can you add some sample code for this, or a test like this one: https://github.com/DS4SD/docling-ibm-models/blob/main/tests/test_tf_predictor.py
Any helper code available in repo to do this?
I see some code (related to dataset conversion?):
docling-ibm-models/docling_ibm_models/tableformer/otsl.py (line 125 in 620ce42)
docling-ibm-models/docling_ibm_models/tableformer/data_management/tf_dataset.py (line 751 in 620ce42)
Not sure about this one:
docling-ibm-models/docling_ibm_models/tableformer/data_management/tf_predictor.py (line 1086 in 620ce42)
Any insight will be helpful.