How to save table data as HTML instead of images. #905

Closed
JongHyeonPart opened this issue Nov 8, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@JongHyeonPart

I attempted to convert PDF files to Markdown, but the table data keeps getting converted into images.

I have already updated the 'table-config' parameter in 'magic-pdf.template.json':

"table-config": {
    "model": "TableMaster",
    "is_table_recog_enable": true,
    "max_time": 400
}

But it didn't work.

Is there any other way to ensure that tables are converted to HTML instead of images, so that the Markdown file includes them as HTML tables?

Thank you very much.

@JongHyeonPart JongHyeonPart added the enhancement New feature or request label Nov 8, 2024
@myhloli
Collaborator

myhloli commented Nov 8, 2024

@rickymaggio02

Hi,
right now I'm using this config:
{
    // other config
    "layout-config": {
        "model": "layoutlmv3" // Please change to "doclayout_yolo" when using doclayout_yolo.
    },
    "formula-config": {
        "mfd_model": "yolo_v8_mfd",
        "mfr_model": "unimernet_small",
        "enable": true // The formula recognition feature is enabled by default. If you need to disable it, please change the value here to "false".
    },
    "table-config": {
        "model": "rapid_table", // Default to using "rapid_table", can be switched to "tablemaster" or "struct_eqtable".
        "enable": true, // The table recognition feature is disabled by default. If you need to enable it, please change the value here to "true".
        "max_time": 400
    }
}
The tables are recognized, but they are saved as images rather than as HTML in the Markdown. What should I change?
Thanks in advance.
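
For anyone checking the same symptom, here is a minimal sketch for inspecting the converted Markdown. It only counts `<table` tags and Markdown image links in the text. The function name `count_table_outputs` is made up for illustration and is not part of magic-pdf. Image links can also come from ordinary figures, so the counts are a rough hint, not proof.

```python
import re


def count_table_outputs(markdown: str) -> dict:
    """Count HTML tables and Markdown image links in converted output.

    Hypothetical helper for illustration; it is not part of magic-pdf.
    """
    # Opening <table> tags, with or without attributes, case-insensitive.
    html_tables = len(re.findall(r"<table\b", markdown, flags=re.IGNORECASE))
    # Markdown image links of the form ![alt](path).
    image_links = len(re.findall(r"!\[[^\]]*\]\([^)]*\)", markdown))
    return {"html_tables": html_tables, "image_links": image_links}
```

If `html_tables` stays at zero after a conversion while `image_links` is non-zero, the table settings are probably not being picked up.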

@myhloli
Collaborator

myhloli commented Nov 29, 2024

@rickymaggio02 you should edit the magic-pdf.json in your user directory.
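
A rough sketch of that change done from Python, under a few assumptions: the file in the user directory is strict JSON (no `//` comments), it uses the `table-config` / `enable` keys shown in the config quoted earlier in this thread, and it sits directly in the home directory. The helper name `enable_table_recognition` is hypothetical and not part of magic-pdf. Key names appear to differ between versions (the first post uses `is_table_recog_enable`), so check which ones your installed version reads.

```python
import json
from pathlib import Path


def enable_table_recognition(config_path: Path, model: str = "rapid_table") -> dict:
    """Switch on table recognition in a magic-pdf.json style config file.

    Hypothetical helper; key names are taken from the config quoted in
    this thread and may differ in other versions.
    """
    # Assumes strict JSON: a file containing // comments will fail to parse.
    config = json.loads(config_path.read_text(encoding="utf-8"))
    table_config = config.setdefault("table-config", {})
    table_config["model"] = model
    table_config["enable"] = True  # written out as lowercase JSON `true`
    config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

Usage would be something like `enable_table_recognition(Path.home() / "magic-pdf.json")`, where the path is an assumption about where the user-directory file lives. Other keys such as `max_time` are left untouched.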

@rickymaggio02

Thanks!

@myhloli myhloli closed this as completed Jan 5, 2025